-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Working weights are not updated properly in bayesglm.R #110
Comments
Thanks for this report. @amcdavid can you confirm? |
Running this I also found it differs from the output of |
Thanks Nik for the report. I think I agree that this may be a bug, but I definitely don't understand this code well (because we didn't write it). As such, I am hesitant to make changes without a failing unit test. Do we have an example, and an expected result to show what exactly is wrong about this, and verify that changing this line fixes it? This is code from Gelman et al's arm package that we directly included into MAST. (We did this rather than Importing arm because there were changes being made to |
The main loop calls |
Where do you see fortran in the repo? I don't see any. FWIW, here's the diff between our version of arm and the latest: cran/arm@82ebb3c I would rather move away from arm altogether and implement the logistic model in stan or rstanarm. Arm is legacy code, and doesn't have any unit or regression testing, whereas stan and rstanarm are tested and more modern. They may be faster anyways, since they are in C. If we do that we can add an option to use the "old version". |
Sorry, you're right, not sure what the heck I saw. |
And actually this is the diff between our version of bayesglm and the latest version. It has been refactored to the point of being unrecognizable. Presumably for the better, but I am hesitant to mess with things given that lack of unit tests. |
Okay, I think we have to update things. I like the idea of using rstanarm. |
👍 It's been on my todo for a while. I probably won't be able to get to it until middle of next week, so feel free to take a stab. In any case, we wouldn't want to push anything to bioconductor until after the bioconductor release. |
Hello Greg and Andrew: Thanks for responding. Nik |
It's not clear to me yet what the impact of this particular bug was on that paper, but it was present at the time. That said, it was a bug in arm as well, when MAST was released. It's obviously a priority to address it. |
Please make sure a similar issue is not present in rstanarm or whatever package you are going to use. |
@NikTuzov Let me be clear, that this was a bug in |
@NikTuzov We have decided to revert bayesglm back to an older version (when Andrew just wrote it from scratch) because the new one was buggy. We made several mistakes when we tried to clean the codes up using many functions embedded in the bayesglm(). In any case, bayesglm() and bayesglm.fit() in arm_1.10.1 should be a good (if not buggy) version to use. Having said this, I use the zip file in your post to test it, the current bayesglm() is working. I'm not sure if I overlooked anything mentioned in your post. If you can be more clear about the bug and whether or not the bug still affects the current version, you can help us make arm better. Thanks! |
@gfinak What is the present state of things? Are the weights updated properly? Regards, |
I've pushed a fix to a feature branch reintroducing dependence on the latest arm package. |
We should consider keeping the old, albeit buggy behavior around, at least for a bioconductor version or two.
… On Mar 18, 2020, at 3:32 PM, Greg Finak ***@***.***> wrote:
I've pushed a fix to a feature branch reintroducing dependence on the latest arm package.
That's waiting for code review.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#110 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AALLAHSEHQMEY2S3HTCXRWDRIEOUXANCNFSM4HGOOKQQ>.
|
If it's a bug, why? |
1. Because it will break reproducibility, and many users aren't sophisticated enough to maintain parallel installs of packages. 2. it's been there for several years. 3. It corresponds to a well defined model -- in effect we have been using Gaussian priors rather than t priors. 4. it is reflected in the independent benchmarks that have been run.
At a minimum. we need to understand the impact of the change so we know what to put in the NEWS.
… On Mar 18, 2020, at 3:49 PM, Greg Finak ***@***.***> wrote:
If it's a bug, why?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#110 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AALLAHQVCYI43H7XAISZBDTRIEQTHANCNFSM4HGOOKQQ>.
|
@amcdavid any thoughts on merging this particular fix now that we've had several Bioc release cycles? |
That would be much appreciated. |
I would still like to have a failing test, or at least a diff of old vs new results to understand the impact of this change. Beyond it being "wrong" can @NikTuzov explain why this is important to him? That said, I don't feel that strongly against it as long as we indicate a reproducibility-breaking change somewhere prominent (maybe in a message that is printed out ala tidyverse). |
Sorry I didn't reply back then. https://www.partek.com/partek-flow/ Back then, I didn't want to fix it on my own because the result wouldn't have matched the official R package. Is the fixed version available now? |
Hello Greg: I take it the fix is still on this topic branch: https://github.com/RGLab/MAST/tree/fix/bayesglm Are there any plans to push it into master? Regards, |
I would still like to see a failing unit test, and some larger scale integration tests so we fully understand the implication of changing the backend. Really we probably need these integration tests implemented at some point no matter what, but it's been very much on the backburner for me. I credit that the code does not reflect the intended model, but at this point it's been used for so long I don't think we should change it without the appropriate groundwork laid. You are of course, free to fork if you need this functionality. |
Hello:
I believe there is a bug in bayesglm.R. I attached bayesglm_A_1.R where I inserted a few cat statements. If you run it using the code from the other file, you'll see from the output that the last two weights that correspond to pseudo-data are initialized but never updated.
The problem is that in .bayesglm.fit.loop.updateState() the working weights are computed as:
w.star <- c(w, sqrt(state$dispersion)/priors$scale)
but to update them one needs to use state$prior.sd in the denominator, which should be equal to priors$scale only in the beginning. It looks like you only need to change one line to fix it.
Regards,
Nik Tuzov
bayesglm_issue.zip
The text was updated successfully, but these errors were encountered: