

What magnitude of avg loss indicates a relatively good result for a quantized model? #649

Open · ehuaa opened this issue Apr 26, 2024 · 6 comments


ehuaa commented Apr 26, 2024

When I quantize a model, the avg loss is lower in the earlier layers (around 0.02) than in the later layers (around 2.0). I'm curious whether the quantization has failed because of the large avg loss.
And from experience, what magnitude of avg loss indicates a good quantized model?

Qubitium commented Apr 26, 2024

My rule of thumb: if your losses are > 1.0 for the early [1-3] layers, the calibration data is off or the tokenizer is not properly configured. Each module in each layer has its own loss trend in my experience. Some modules are just harder to quantize. MoE models are the worst case for GPTQ due to the gating/router layer.

  • use the running quant avg loss as a guide to a usable quant
  • run PPL after quant (test 1; a rough sketch of this check is below)
  • run a HumanEval test (test 2)
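
A minimal sketch of the wikitext-2 PPL check, assuming a Hugging Face-style quantized checkpoint; the model path, context window, and stride are placeholders, not anything specific to this repo:

```python
# Quick post-quant perplexity estimate on the wikitext-2 test split.
# The model path, window size, and stride are illustrative placeholders.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/your-quantized-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

# Concatenate the test split into one long token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

seq_len, stride = 2048, 2048  # non-overlapping windows for a quick estimate
nlls = []
for begin in range(0, input_ids.size(1) - seq_len, stride):
    chunk = input_ids[:, begin : begin + seq_len].to(model.device)
    with torch.no_grad():
        # labels == input_ids: HF shifts internally and returns mean cross-entropy
        nlls.append(model(chunk, labels=chunk).loss)

ppl = torch.exp(torch.stack(nlls).mean())
print(f"wikitext-2 PPL: {ppl.item():.3f}")
```

Run the same script on the unquantized model as well; the post-quant number should stay close to the pre-quant one on the same data.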

ehuaa commented Apr 26, 2024

> My rule of thumb: if your losses are > 1.0 for the early [1-3] layers, the calibration data is off or the tokenizer is not properly configured. Each module in each layer has its own loss trend in my experience. Some modules are just harder to quantize. MoE models are the worst case for GPTQ due to the gating/router layer.
>
>   • use the running quant avg loss as a guide to a usable quant
>   • run PPL after quant (test 1)
>   • run a HumanEval test (test 2)

Thanks for your quick reply! @Qubitium
My losses are lower than 0.05 in the first three layers, as you mentioned above, but they eventually climb above 10.0 in the last 40 layers. Is that normal in your experience?
(PS: my model is a finetuned version of Qwen-72b-chat, which has 80 layers in total.)
I'll run the two tests you mentioned above after I finish quantizing my model.

ehuaa commented May 8, 2024

> My rule of thumb: if your losses are > 1.0 for the early [1-3] layers, the calibration data is off or the tokenizer is not properly configured. Each module in each layer has its own loss trend in my experience. Some modules are just harder to quantize. MoE models are the worst case for GPTQ due to the gating/router layer.
>
>   • use the running quant avg loss as a guide to a usable quant
>   • run PPL after quant (test 1)
>   • run a HumanEval test (test 2)

@Qubitium I have finished the two tests you mentioned above and found that the PPL result (test 1) is reasonable, but the HumanEval result (test 2) drops by about 50% after quantization. Do you have any advice on how to fix this? Thanks.

Qubitium commented May 8, 2024

What is your PPL before and after quantization?

ehuaa commented May 8, 2024

> What is your PPL before and after quantization?

My PPL before quantization on wiki2 is 5.334, and after quantization it is 5.415; my model is a finetuned version of Qwen1.5-72b.
The HumanEval result before quantization is 0.677, while after quantization it drops to 0.372. @Qubitium

Qubitium commented May 8, 2024

5.33 for pre-quant PPL is already very suspect, in my opinion, for such a huge model. Forget the quant for now and troubleshoot your PPL/inference pre-quant. Make sure your PPL evaluation is not using the same dataset as calibration, but data from your real use case.
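
As a rough illustration of that separation, a minimal sketch assuming the AutoGPTQ API; the model path and dataset choices are placeholders (calibrate on data that looks like your real inputs, then measure PPL/HumanEval on different data):

```python
# Sketch: calibration data and evaluation data should come from different sources.
# The model path and dataset picks are placeholders for illustration only.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from datasets import load_dataset
from transformers import AutoTokenizer

model_id = "path/to/your-finetuned-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Calibration set: ~128 samples that resemble the model's real inputs.
examples = []
for row in load_dataset("allenai/c4", "en", split="train", streaming=True):
    enc = tokenizer(row["text"], truncation=True, max_length=2048, return_tensors="pt")
    examples.append({"input_ids": enc.input_ids, "attention_mask": enc.attention_mask})
    if len(examples) >= 128:
        break

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=True)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config, trust_remote_code=True)
model.quantize(examples)                 # per-layer avg loss is logged here
model.save_quantized("model-gptq-4bit")

# Evaluation: compute PPL on a held-out corpus (e.g. wikitext-2, as in the earlier
# sketch) and run HumanEval separately; never reuse the calibration samples.
```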
