Requesting support for IBM's OpenSource Granite models #441
Comments
Oh interesting!

Fine-tuning for both.
I noticed the other day, when I was attempting to quantize the larger 34B models, that the Granite models are two different types. The 3B, 7B, and 8B models are llama, while the 20B and 34B are gpt-bigcode models. Not sure how that would or wouldn't affect fine-tuning since I haven't looked into it yet, but I figured it was worth mentioning.
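The llama vs. gpt-bigcode split described above can be sketched as a small lookup helper. This is a hypothetical illustration based only on the sizes listed in this comment; in practice you would confirm a checkpoint's architecture by inspecting `AutoConfig.from_pretrained(repo_id).model_type` from the `transformers` library:

```python
# Size -> architecture mapping, taken from the comment above (assumption:
# this reflects the ibm-granite code-model family at time of writing).
GRANITE_ARCH = {
    "3b": "llama",
    "7b": "llama",
    "8b": "llama",
    "20b": "gpt_bigcode",
    "34b": "gpt_bigcode",
}

def granite_architecture(size: str) -> str:
    """Return the reported architecture family for a Granite model size."""
    return GRANITE_ARCH[size.lower()]
```

Whether a fine-tuning stack supports a checkpoint generally depends on this `model_type`, which is why the llama-based sizes worked here before the gpt-bigcode ones.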
Cool! That helps. With the 7b-base model, the output is meaningful now. Thanks @danielhanchen
Great, it worked!!
These open source models were just released yesterday at Red Hat Summit.
https://huggingface.co/ibm-granite
https://arxiv.org/abs/2405.04324
If this ends up being a bigger ask than I think it is, and there's something I can do to help in making this happen, let me know.