

What magnitude of avg loss indicates a relatively good result for a quantized model? #649

Open · ehuaa opened this issue Apr 26, 2024 · 6 comments


ehuaa commented Apr 26, 2024

When I quantize a model, the avg loss is lower in the earlier layers (around 0.02) than in the later layers (around 2.0). I'm curious whether the quantization has failed because of the large avg loss.
And from experience, what magnitude of avg loss indicates a good quantized model?

Qubitium commented Apr 26, 2024

My rule of thumb: if your losses are > 1.0 for the early [1-3] layers, the calibration data is off or the tokenizer is not properly configured. Each module in each layer has its own loss trend in my experience. Some modules are just harder to quantize. MoE models are the worst case for GPTQ due to the gating/router layer.

  • use the running quant avg loss as a guide to a usable quant
  • run PPL after quant (test 1; a rough sketch of this check is below)
  • run a HumanEval test (test 2)
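
A minimal sketch of the wikitext-2 PPL check, assuming a Hugging Face-style quantized checkpoint; the model path, context window, and stride are placeholders, not anything specific to this repo:

```python
# Quick post-quant perplexity estimate on the wikitext-2 test split.
# The model path, window size, and stride are illustrative placeholders.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/your-quantized-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

# Concatenate the test split into one long token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

seq_len, stride = 2048, 2048  # non-overlapping windows for a quick estimate
nlls = []
for begin in range(0, input_ids.size(1) - seq_len, stride):
    chunk = input_ids[:, begin : begin + seq_len].to(model.device)
    with torch.no_grad():
        # labels == input_ids: HF shifts internally and returns mean cross-entropy
        nlls.append(model(chunk, labels=chunk).loss)

ppl = torch.exp(torch.stack(nlls).mean())
print(f"wikitext-2 PPL: {ppl.item():.3f}")
```

Run the same script on the unquantized model as well; the post-quant number should stay close to the pre-quant one on the same data.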

ehuaa commented Apr 26, 2024

> My rule of thumb: if your losses are > 1.0 for the early [1-3] layers, the calibration data is off or the tokenizer is not properly configured. Each module in each layer has its own loss trend in my experience. Some modules are just harder to quantize. MoE models are the worst case for GPTQ due to the gating/router layer.
>
>   • use the running quant avg loss as a guide to a usable quant
>   • run PPL after quant (test 1)
>   • run a HumanEval test (test 2)

Thanks for your quick reply! @Qubitium
My losses are lower than 0.05 in the first three layers, as you mentioned above, but they eventually climb above 10.0 in the last 40 layers. Is that normal in your experience?
(PS: my model is a finetuned version of Qwen-72b-chat, which has 80 layers in total.)
I'll run the two tests you mentioned above after I finish quantizing my model.

ehuaa commented May 8, 2024

> My rule of thumb: if your losses are > 1.0 for the early [1-3] layers, the calibration data is off or the tokenizer is not properly configured. Each module in each layer has its own loss trend in my experience. Some modules are just harder to quantize. MoE models are the worst case for GPTQ due to the gating/router layer.
>
>   • use the running quant avg loss as a guide to a usable quant
>   • run PPL after quant (test 1)
>   • run a HumanEval test (test 2)

@Qubitium I have finished the two tests you mentioned above and found that the PPL result (test 1) is reasonable, but the HumanEval result (test 2) drops by about 50% after quantization. Do you have any advice on how to fix this? Thanks.

Qubitium commented May 8, 2024

What is your PPL before and after quantization?

ehuaa commented May 8, 2024

> What is your PPL before and after quantization?

My PPL before quantization on wiki2 is 5.334, and after quantization it is 5.415; my model is a finetuned version of Qwen1.5-72b.
The HumanEval result before quantization is 0.677, while after quantization it drops to 0.372. @Qubitium

Qubitium commented May 8, 2024

5.33 for pre-quant PPL is already very suspect, in my opinion, for such a huge model. Forget the quant for now and troubleshoot your PPL/inference pre-quant. Make sure your PPL evaluation is not using the same dataset as calibration, but data from your real use case.
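
As a rough illustration of that separation, a minimal sketch assuming the AutoGPTQ API; the model path and dataset choices are placeholders (calibrate on data that looks like your real inputs, then measure PPL/HumanEval on different data):

```python
# Sketch: calibration data and evaluation data should come from different sources.
# The model path and dataset picks are placeholders for illustration only.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from datasets import load_dataset
from transformers import AutoTokenizer

model_id = "path/to/your-finetuned-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Calibration set: ~128 samples that resemble the model's real inputs.
examples = []
for row in load_dataset("allenai/c4", "en", split="train", streaming=True):
    enc = tokenizer(row["text"], truncation=True, max_length=2048, return_tensors="pt")
    examples.append({"input_ids": enc.input_ids, "attention_mask": enc.attention_mask})
    if len(examples) >= 128:
        break

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=True)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config, trust_remote_code=True)
model.quantize(examples)                 # per-layer avg loss is logged here
model.save_quantized("model-gptq-4bit")

# Evaluation: compute PPL on a held-out corpus (e.g. wikitext-2, as in the earlier
# sketch) and run HumanEval separately; never reuse the calibration samples.
```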
