LinearAttention Module #169
Comments
Hey Reduan, thank you for the issue! About the implementation details: The ... In code, we may get away by requiring to use ... Otherwise, we could also implement a canonizer (or a meta-rule) for the most popular library implementing attention layers.
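For illustration, such a canonizer could be sketched roughly like this, assuming zennit's AttributeCanonizer is used the way the zennit.torchvision canonizers use it (attribute_map returns a dict of attributes to overwrite for matching modules, or None otherwise); the LinearAttention match and the replacement forward are placeholders, not a finished solution:

```python
from zennit.canonizers import AttributeCanonizer


class LinearAttentionCanonizer(AttributeCanonizer):
    """Sketch: temporarily overwrite the forward of matching attention modules
    with an explanation-friendly variant while attributing."""

    def __init__(self):
        super().__init__(self._attribute_map)

    @classmethod
    def _attribute_map(cls, name, module):
        # match by class name here because the attention implementation is
        # hypothetical; matching with isinstance is preferable in practice
        if module.__class__.__name__ == 'LinearAttention':
            return {'forward': cls.forward.__get__(module)}
        return None

    @staticmethod
    def forward(self, x):
        # placeholder: re-implement the module's forward here in a way the
        # LRP rules can handle, e.g. with detached attention weights
        raise NotImplementedError
```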
Hi Christopher,
I hope you're doing well. I'm really glad that the zennit community is growing, congratulations!
With a growing community, more nn.Modules need to be explained, which is why I'm writing this issue.
A student in our department is trying to explain a LinearAttention module. (The implementation is below for reference.)
It contains a series of torch.einsum and torch.transpose operations, and it uses the rearrange function of the einops library, a newer syntax for writing basic torch operations like transpose, reshape, etc.
I think zennit should be able to analyse a series of reshaping and transposing operations, but I am not completely sure.
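For orientation, a minimal sketch of such a linear attention block (hypothetical names, shapes and hyperparameters in the style of common open-source implementations, not the student's actual code) could look like this:

```python
import torch
from torch import nn
from einops import rearrange


class LinearAttention(nn.Module):
    """Hypothetical sketch of a linear attention block over feature maps;
    names, shapes and hyperparameters are assumptions, not the original code."""

    def __init__(self, dim, heads=4, dim_head=32):
        super().__init__()
        self.heads = heads
        hidden_dim = heads * dim_head
        self.to_qkv = nn.Conv2d(dim, hidden_dim * 3, 1, bias=False)
        self.to_out = nn.Conv2d(hidden_dim, dim, 1)

    def forward(self, x):
        b, c, height, width = x.shape
        # project to queries, keys and values and split the heads
        qkv = self.to_qkv(x).chunk(3, dim=1)
        q, k, v = (
            rearrange(t, 'b (h c) x y -> b h c (x y)', h=self.heads) for t in qkv
        )
        # normalising the keys over the spatial axis is what makes it 'linear'
        k = k.softmax(dim=-1)
        # aggregate values against keys, then read out with the queries
        context = torch.einsum('b h d n, b h e n -> b h d e', k, v)
        out = torch.einsum('b h d e, b h d n -> b h e n', context, q)
        out = rearrange(out, 'b h c (x y) -> b (h c) x y', x=height, y=width)
        return self.to_out(out)
```

In a block like this, everything besides the softmax over the keys and the 1x1 convolutions is einsum/rearrange bookkeeping, which is exactly the part I hope zennit can already propagate relevance through.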
I'd be glad if you could give your opinion on analyzing such a linear attention module. If you don't know, that's also no problem (: Then it's the beginning of a new research topic.
(And the softmax function is also a problem, but maybe Arras et al. have a solution to this which the student could implement...)
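One possible workaround in that spirit (my assumption, loosely inspired by the gates-as-constants treatment in Arras et al., not something the student has implemented) would be to treat the softmax attention weights as constants during relevance propagation, e.g. by detaching them:

```python
import torch


def attention_with_constant_weights(q, k, v):
    """Scaled dot-product attention whose softmax weights are detached, so
    gradient-based LRP rules only propagate relevance through the value path.
    Whether this is the right treatment for attention is the open question."""
    scale = q.shape[-1] ** -0.5
    weights = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
    # .detach() removes the softmax from the backward graph
    return weights.detach() @ v
```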
Best,
Reduan