
Questions about positional embedding and whether channel-wise attention can be transferred to cross-attention #110

Open
JarvisLee0423 opened this issue Sep 13, 2024 · 0 comments

Comments

@JarvisLee0423

Thanks for your excellent work!
I have two questions about your implementation.
1) I could not find any positional embedding in your code. Why does the channel-aware attention not need to specify the order of the input tokens?
2) I noticed that your work focuses on the self-attention scheme. Can this scheme also be used for cross-attention? For example, if I want to compute the pixel-wise (spatial) similarity between two input vision features, can the channel-wise attention (covariance) be applied to obtain a similarity comparable to spatial-wise attention? A rough sketch of what I have in mind is included after this list.
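To make question 2 concrete, here is a minimal sketch of what I have in mind (my own illustration, not code from this repository). It splits the channels into heads, L2-normalises queries and keys along the token dimension, and forms a d x d channel-by-channel attention map; taking the queries from one feature map and the keys/values from the other turns it into cross-attention. The function name `channel_cross_attention` and the head count are just placeholders. Note that the token dimension N is summed out when the attention map is built, so the weights never index spatial positions, which is also what puzzles me in question 1.

```python
import torch
import torch.nn.functional as F


def channel_cross_attention(x_q, x_kv, num_heads=4):
    """Channel-wise (covariance-style) attention between two feature maps.

    x_q, x_kv: (B, N, C) token sequences, e.g. flattened H*W pixels.
    Queries come from x_q, keys/values from x_kv, so this acts as
    cross-attention; setting x_kv = x_q recovers the self-attention case.
    """
    B, N, C = x_q.shape
    d = C // num_heads

    # Split channels into heads: (B, heads, d, N)
    q = x_q.reshape(B, N, num_heads, d).permute(0, 2, 3, 1)
    k = x_kv.reshape(B, N, num_heads, d).permute(0, 2, 3, 1)
    v = x_kv.reshape(B, N, num_heads, d).permute(0, 2, 3, 1)

    # L2-normalise along the token dimension, a common choice in
    # channel-wise / cross-covariance attention formulations.
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)

    # The attention map is d x d (channel by channel), not N x N:
    # the token dimension N is contracted away, so the weights do not
    # index spatial positions at all.
    attn = (q @ k.transpose(-2, -1)).softmax(dim=-1)   # (B, heads, d, d)

    out = attn @ v                                     # (B, heads, d, N)
    return out.permute(0, 3, 1, 2).reshape(B, N, C)


# Toy usage: two feature maps of the same shape.
feat_a = torch.randn(2, 196, 64)   # e.g. 14x14 patches, 64 channels
feat_b = torch.randn(2, 196, 64)
fused = channel_cross_attention(feat_a, feat_b)
print(fused.shape)                 # torch.Size([2, 196, 64])
```

Is this kind of generalisation compatible with your channel-wise attention, or does the covariance formulation break down when the two inputs come from different feature maps?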
