Add Granite to model builder #1153

Open
wants to merge 13 commits into
base: main

kunal-vaishnavi
Contributor

@kunal-vaishnavi kunal-vaishnavi commented Dec 17, 2024

Description

This PR adds IBM's Granite models to the model builder and makes the following improvements to it:

  1. Always unpack any packed weights in the attention and MLP layers
  2. Insert optional Add nodes if the MLP layer has a bias (the Granite code models use the LLaMA architecture but with biases included)
  3. Use gate_up_proj or dense_h_to_4h as the attribute name when unpacking weights in the MLP layer
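
The three improvements above can be sketched in plain Python. This is only an illustration, not the code in builder.py: the helper names (`find_packed_mlp`, `unpack_gate_up`, `plan_projection_nodes`) are hypothetical, and the sketch assumes the packed weight stacks the gate rows first and the up rows second along the output dimension, which may not hold for every checkpoint.

```python
from types import SimpleNamespace

# Attribute names the description mentions for packed MLP weights.
PACKED_MLP_ATTRS = ("gate_up_proj", "dense_h_to_4h")


def find_packed_mlp(mlp):
    """Return (attribute name, packed projection), or (None, None) if not packed."""
    for name in PACKED_MLP_ATTRS:
        if hasattr(mlp, name):
            return name, getattr(mlp, name)
    return None, None


def unpack_gate_up(packed):
    """Split a packed projection into separate gate and up projections.

    Assumption: gate rows come first and up rows second along the output
    dimension; real checkpoints may use a different layout.
    """
    rows = len(packed.weight)
    if rows % 2 != 0:
        raise ValueError("packed weight must have an even number of rows")
    half = rows // 2
    bias = getattr(packed, "bias", None)
    gate = SimpleNamespace(
        weight=packed.weight[:half],
        bias=bias[:half] if bias is not None else None,
    )
    up = SimpleNamespace(
        weight=packed.weight[half:],
        bias=bias[half:] if bias is not None else None,
    )
    return gate, up


def plan_projection_nodes(name, proj):
    """List the graph ops for one projection: MatMul, plus Add only if a bias exists."""
    ops = [f"{name}/MatMul"]
    if proj.bias is not None:
        ops.append(f"{name}/Add")
    return ops


# Toy MLP with a packed gate/up projection that carries a bias.
mlp = SimpleNamespace(
    gate_up_proj=SimpleNamespace(
        weight=[[1, 2], [3, 4], [5, 6], [7, 8]],
        bias=[0.1, 0.2, 0.3, 0.4],
    )
)
attr_name, packed = find_packed_mlp(mlp)
gate, up = unpack_gate_up(packed)
```

With a bias present, `plan_projection_nodes("gate_proj", gate)` yields a MatMul followed by an Add. For a projection whose bias is `None`, only the MatMul is listed, which is what makes the Add node optional.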

Motivation and Context

Granite is a family of foundation models from IBM.
