
ALibi & Flash Attention #864

Merged: 21 commits merged into main from 848-alibi-flash-attention on Apr 11, 2023

Conversation

dashstander
Contributor

The current version of Flash Attention does not support the ALiBi positional embedding. This PR adds support for the Triton-based Flash Attention implementation, which does allow a bias to be added to the self-attention calculation.

This can be enabled simply by setting "pos_emb": "alibi" and adding "flash" to the attention config in your model's configuration, as in the sketch below.
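For reference, a minimal sketch of what those keys might look like, assuming a GPT-NeoX-style YAML config. The attention_config pattern shown (applying "flash" to every layer) and the surrounding layout are illustrative assumptions, not part of this PR; adjust them to your model.

```yaml
{
  # Use the ALiBi positional embedding.
  "pos_emb": "alibi",
  # Assumed pattern: Triton-based flash attention applied to all layers.
  "attention_config": [[["flash"], "all"]]
}
```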

To support this feature we bumped the pinned Triton dependency to triton==2.0.0.dev20221202. In the course of implementing it we discovered that SparseAttention had become broken at some point in the past; see #863.
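If you need to install that pre-release by hand, a minimal sketch follows, assuming the package is fetched from PyPI; the repository's own requirements files remain the authoritative source.

```bash
# Pinning the exact dev version lets pip resolve the pre-release build.
pip install triton==2.0.0.dev20221202
```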

Signed-off-by: Dashiell Stander <[email protected]>
@dashstander dashstander requested a review from a team as a code owner March 29, 2023 19:09
@dashstander dashstander linked an issue Mar 29, 2023 that may be closed by this pull request
@Quentin-Anthony Quentin-Anthony merged commit f3d65b5 into main Apr 11, 2023
@Quentin-Anthony Quentin-Anthony deleted the 848-alibi-flash-attention branch April 11, 2023 22:31
Successfully merging this pull request may close these issues: AliBi + Flash Attention Support