Thanks for your excellent work!
I have two questions about your implementation.
1) I could not find any implementation of positional embedding in your code. Why does the channel-aware attention not need to specify the order of the input tokens?
2) I noticed that your work focuses on the self-attention scheme. Can this scheme also be used for cross-attention? For example, if I want to compute the pixel-wise (spatial) similarity between two input vision features, can channel-wise attention (covariance) produce a similarity comparable to spatial-wise attention? A rough sketch of what I have in mind is below.
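To make question 2) concrete, here is a minimal sketch of what I mean, written in plain PyTorch with made-up shapes. The function `channel_cross_attention` and its details (head splitting, L2 normalization, no learnable temperature or projections) are my own guess at a cross variant of channel-wise attention, not taken from your code:

```python
import torch
import torch.nn.functional as F

def channel_cross_attention(feat_a, feat_b, num_heads=4):
    """Hypothetical cross variant of channel-wise (covariance) attention.

    feat_a, feat_b: (B, C, H, W) feature maps; queries come from feat_a,
    keys/values from feat_b, and attention is a per-head C x C map.
    """
    B, C, H, W = feat_a.shape
    # Flatten spatial dims: (B, heads, C_per_head, H*W)
    q = feat_a.reshape(B, num_heads, C // num_heads, H * W)
    k = feat_b.reshape(B, num_heads, C // num_heads, H * W)
    v = feat_b.reshape(B, num_heads, C // num_heads, H * W)

    # L2-normalize along the spatial dimension so the attention map behaves
    # like a (scaled) channel covariance between the two features.
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)

    # (B, heads, C_per_head, C_per_head) channel-to-channel attention
    attn = torch.softmax(q @ k.transpose(-2, -1), dim=-1)

    out = attn @ v                      # (B, heads, C_per_head, H*W)
    return out.reshape(B, C, H, W)

# Example usage with made-up shapes
a = torch.randn(2, 64, 32, 32)
b = torch.randn(2, 64, 32, 32)
print(channel_cross_attention(a, b).shape)  # torch.Size([2, 64, 32, 32])
```

Is something along these lines a sensible way to get a spatial similarity between the two features, or does the channel-wise (covariance) formulation fundamentally lose the pixel-wise correspondence that spatial attention provides?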