[REQUEST] Convert.py: Option to skip measurement when setting 8.0/8.0 #673
Comments
If you want to work around this, just grab a measurement.json from Hugging Face for the same base model. Bartowski does quants for most of the big models and usually includes the measurement.json in the quant repo, e.g. https://huggingface.co/bartowski/Qwen2.5-Coder-32B-Instruct-exl2/tree/main. Then just pass -m /path/to/measurement.json to the conversion script when you're doing 8 bpw.
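The workaround above boils down to one extra flag. A sketch of the invocation follows; all paths are placeholders you would substitute yourself, and nothing is downloaded or executed here (the script just assembles and prints the command):

```shell
# Placeholder paths -- substitute your own; nothing here is downloaded or run.
MEASUREMENT=/path/to/measurement.json
MODEL_DIR=/path/to/Qwen2.5-Coder-32B-Instruct
OUT_DIR=/path/to/exl2-out

# The conversion invocation the comment above describes: -m reuses an existing
# measurement.json, so the measurement pass is skipped entirely.
CMD="python convert.py -i $MODEL_DIR -o $OUT_DIR -m $MEASUREMENT -b 8.0"
echo "$CMD"
```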
What concerns me the most is the lack of an option to manually override the optimization process. The system decides on its own which layers to quantize and to what degree, sometimes doing so in situations where it's not entirely appropriate. For instance, I want to set a maximum quantization level of 8+ by specifying the parameters -b 8 [9,10,16,255]. However, this doesn't seem to matter, as the system still arbitrarily quantizes many layers to 4, 5, or 6 bpw.

What's more frustrating is that every time I run the process, it selects layers for quantization in a random order. For example, in one run it might choose layers 3, 5, and 39, but after restarting with the same parameters, it could switch to layers 4, 9, and 28, and so on.

It would be great to have an option to explicitly specify which layers should not be optimized and should instead be quantized at the maximum value. Additionally, it would be useful to define specific quantization ranges for particular layers. For instance, an additional configuration file where such quantization ranges could be defined would make the process much more convenient and flexible.
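The per-layer override file requested above could take a shape like the following sketch. This is entirely hypothetical; no such option or file format exists in convert.py today, and every key name here is invented for illustration:

```python
# Hypothetical per-layer override spec (NOT a real convert.py feature):
# "fixed_bpw" pins a layer to one bitrate; "min_bpw"/"max_bpw" restrict
# the range the optimizer would be allowed to search.
overrides = {
    "default": {"min_bpw": 4.0, "max_bpw": 8.0},
    "layers": {
        "3":  {"fixed_bpw": 8.0},                 # never optimize, always max
        "5":  {"fixed_bpw": 8.0},
        "39": {"min_bpw": 6.0, "max_bpw": 8.0},   # restrict the search range
    },
}

def allowed_range(layer: int) -> tuple[float, float]:
    """Resolve the bpw range a layer would be allowed to use."""
    spec = overrides["layers"].get(str(layer), overrides["default"])
    if "fixed_bpw" in spec:
        return spec["fixed_bpw"], spec["fixed_bpw"]
    return spec["min_bpw"], spec["max_bpw"]

print(allowed_range(3))   # (8.0, 8.0)
print(allowed_range(10))  # (4.0, 8.0) -- falls back to the default range
```

An optimizer honoring such a file would simply clamp its candidate settings for each layer to the resolved range before annealing.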
Part of this is because 8 bpw requires some layers to use less than the maximum bitrate. The bitrate specified is the actual number of bits per weight including overhead. With that overhead, the actual maximum is about 8.05 bpw (it varies a bit depending on tensor shapes).

I just checked, and there was a slight inaccuracy in the optimizer which made it ever so slightly undershoot the target bitrate if the last annealing step left a tiny bit of the cost budget unused. This shouldn't happen, so I fixed it in the latest commit to the dev branch. With that, you should be able to set a target bitrate of e.g. 9 and always get the largest setting for each layer.

Note that it's highly unlikely to make any practical difference, since the reason this happens in the first place is that the measured difference between the highest and next-highest setting for a given layer is below the noise floor. I might add a shortcut to skip measurement and simply use the max bitrate as an option, but I'm also looking at completely reworking the quantization scheme anyway.
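The overhead point above can be made concrete with a bit of arithmetic. This is an illustrative sketch, not EXL2's exact storage format: quantized weights carry per-group metadata (scales, etc.), so the stored bits per weight exceed the nominal width, and a target of exactly 8.0 overall forces some layers below the maximum setting to pay for it:

```python
# Illustrative arithmetic (not EXL2's exact format): quantized weights carry
# per-group metadata, so the *stored* bits per weight exceed the nominal width.
def effective_bpw(weight_bits: float, group_size: int, scale_bits: int) -> float:
    """Nominal weight bits plus per-group scale overhead."""
    return weight_bits + scale_bits / group_size

# With hypothetical 16-bit scales over groups of 256 weights, an 8-bit layer
# actually costs a bit over 8 bpw of storage:
print(round(effective_bpw(8, 256, 16), 3))  # 8.062
```

This is why the real ceiling quoted above is around 8.05 bpw rather than 8.0, and why it varies with tensor shapes: the overhead amortizes differently over differently sized groups.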
Problem
Measurement is still performed when the target is set to 8.0 bpw.
Solution
Skip the measurement pass, or generate a dummy measurement file.
Alternatives
No response
Explanation
What's the point of measurement if you're using 8.0 on all layers anyway? Or is there some acceptable loss threshold that will cause a lower bpw, like 5 or 6, to be used even when 8 is set?
Examples
No response
Additional context
No response
Acknowledgements