It is recommended to set the tokenizer's vocab_size to a multiple of 32 (and to adjust the dimensions of the embedding and the final lm_head, i.e. the language_model.output, accordingly). Otherwise, after quantization with bitsandbytes (bnb), the model may encounter errors when computing gradients (backward) on GPUs with compute capability 8.0 or higher (Ampere and newer). This is because bnb pads tensor shapes up to the nearest multiple of 32, which leads to shape mismatches.
As mentioned in #129
https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/functional.py#L508-L512
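The alignment described above can be sketched as follows — a minimal example of rounding a vocab size up to the nearest multiple of 32, the same padding rule bnb applies internally. The `pad_vocab_size` helper name is illustrative, not part of any library:

```python
def pad_vocab_size(vocab_size: int, multiple: int = 32) -> int:
    """Round vocab_size up to the nearest multiple (matching bnb's shape padding)."""
    return ((vocab_size + multiple - 1) // multiple) * multiple

# A vocab of 32011 is padded up; 32032 is already aligned.
print(pad_vocab_size(32011))  # 32032
print(pad_vocab_size(32032))  # 32032
```

With Hugging Face transformers, the embedding and lm_head can be resized in one step via `model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=32)`, which avoids the mismatch without retraining the tokenizer.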