Allow setting random state for reproducibility #59
Comments
Or even better: Letting the user select the number of Monte Carlo (“bootstrap”) samples. The reason is given in the documentation of R function clusGap:
Hi! I suppose one could use the `clusterer` param to pass their own callable that takes a random state? Anyway, I'm open to this addition and have no strong opinions on how it ought to be done. So please feel free to open another PR and we'll see how it goes. 👍
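A rough sketch of that idea, using only the standard library: a user-supplied clustering callable with its seed bound ahead of time via `functools.partial`. The function name `seeded_kmeans`, the `n_iter` argument, and the `(X, k) -> (centers, labels)` signature are assumptions for illustration and may not match what the `clusterer` param actually expects. Note that this fixes only the randomness of the clustering step.

```python
import random
from functools import partial


def seeded_kmeans(X, k, random_state=None, n_iter=10):
    """Tiny k-means returning (centers, labels).

    Hypothetical clusterer: the signature is an assumption, not the
    library's documented interface.
    """
    rng = random.Random(random_state)  # private generator, seeded once
    centers = [list(p) for p in rng.sample(X, k)]
    labels = [0] * len(X)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in X
        ]
        # Move each center to the mean of its members (unchanged if empty).
        for j in range(k):
            members = [p for p, lab in zip(X, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Bind the seed so the callable only needs (X, k).
clusterer = partial(seeded_kmeans, random_state=0)

X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.8]]
centers, labels = clusterer(X, 2)
```

Calling `clusterer(X, 2)` twice returns identical centers and labels, because each call builds a fresh generator from the same seed.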
Dear Miles, One can run R from inside Python, via package
So, I have to study the way they do that. Have a nice Sunday! Paulo
Added this functionality in #61.
Dear Miles,

I have used `gap-stat` on the same dataset. However, the optimal number of clusters that `gap-stat` returns is not always the same. I guess this happens because the reference distribution is randomly generated (in fact, the code uses `numpy` for that). So, for reproducibility, it seems reasonable for the `optimalK` function to accept a `random_state` argument.

If you agree, I may be able to change the code accordingly, with your directions and help.
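To make the proposal concrete, here is a minimal sketch of where the randomness enters and how a seed would pin it down. It is not the `gap-stat` code: it uses the standard library instead of `numpy`, computes the gap value only for k = 1 (so no clustering is needed), and the names `gap_k1`, `dispersion`, `n_refs`, and the exact meaning of `random_state` are assumptions for illustration.

```python
import math
import random


def dispersion(points):
    """Sum of squared distances to the centroid (the k = 1 dispersion)."""
    dims = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return sum((p[d] - centroid[d]) ** 2 for p in points for d in range(dims))


def gap_k1(points, n_refs=10, random_state=None):
    """Gap value for k = 1 from n_refs uniform reference datasets.

    random_state is a hypothetical argument: an int seed, or None for
    the usual non-reproducible behaviour.
    """
    rng = random.Random(random_state)  # private generator, seeded once
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    ref_logs = []
    for _ in range(n_refs):
        # Draw a reference dataset uniformly over the data's bounding box.
        ref = [[rng.uniform(lows[d], highs[d]) for d in range(dims)]
               for _ in points]
        ref_logs.append(math.log(dispersion(ref)))

    # Mean log-dispersion of the references minus that of the data.
    return sum(ref_logs) / n_refs - math.log(dispersion(points))


data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
        [5.0, 5.1], [5.2, 4.9], [4.9, 5.3]]

# Same seed, same reference datasets, same gap value.
assert gap_k1(data, random_state=42) == gap_k1(data, random_state=42)
```

With `random_state=None` the reference datasets differ from run to run, which is the kind of variation described above. Raising `n_refs` reduces that variation but does not remove it.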
Thanks!