Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ggml : fix quants nans when all the group weights are very close to zero #7313

Merged
merged 3 commits into from
May 18, 2024

Conversation

slaren
Copy link
Collaborator

@slaren slaren commented May 15, 2024

When the group abs max value is very close to zero but not zero, it may still result in a division by zero when computing the scale, which ends with a nan scale. To avoid this, we check the max value against an epsilon instead of zero. With the IQ quants, this could also result in a Oops: found point %u not on grid error.

While doing this, I noticed that there was already a similar check with 1e-30 epsilon in make_qx_quants, however values this small can still result in nan, so I bumped it to 1e-20 and extended it to all the cases that I could find. I used the commented code in test-backend-ops to find these cases. It is possible that an even higher epsilon may be necessary.

I don't expect this to result in lower precision in the quants since the epsilon is so small, but it may be worth checking.

Fixes #7311.

Copy link
Collaborator

@JohannesGaessler JohannesGaessler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no way the change in threshold has any significant effects on the results. Even a threshold of $10^{-10}$ should still be fine.

@slaren
Copy link
Collaborator Author

slaren commented May 15, 2024

I tried progressively higher values and found that some quants still fail with 1e-8, so I increased the eps to 1e-7.

@JohannesGaessler
Copy link
Collaborator

While increasing GROUP_MAX_EPS by factors of 10, I could go as high as $10^{-4}$ before the results started to change for LLaMA 3 8b q6_K.

@mofosyne mofosyne added bugfix fixes an issue or bug Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level labels May 16, 2024
@JohannesGaessler
Copy link
Collaborator

To clarify, I'm not sure how generalizable my results are to other models; I think the model for which the fix is needed at least should also be checked since that particular model seems to have some blocks with only very small values.

@slaren
Copy link
Collaborator Author

slaren commented May 17, 2024

I have tried to find the lowest possible eps for the quants that require lower than 1e-15, so that only the quants that actually require it use the higher eps, to reduce the risk of introducing errors. Mostly that's the IQ quants.

I don't really like this solution, I think the best way to handle this would be to check for zero before doing the division, but that would require deeper changes, the code is not very easy to follow, and don't want to risk introducing bugs that may cause models with bad quants to be distributed.

@slaren slaren merged commit 0583484 into master May 18, 2024
68 of 73 checks passed
@slaren slaren deleted the sl/fix-quant-near-zero branch May 18, 2024 00:39
Nexesenex pushed a commit to Nexesenex/kobold.cpp that referenced this pull request May 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ggml_validate_row_data finding nan value for IQ4_NL
4 participants