
ACNN implementation has only 1 Conv layer #3

Open

gerardsimons opened this issue Jan 21, 2022 · 5 comments

gerardsimons commented Jan 21, 2022

I think something is wrong with the ACNN implementation, as the entire CNN consists of a single Conv1d layer:

# (batch, channels, length)
self.cnn = nn.Conv1d(in_channels=self.in_channels, 
                    out_channels=self.out_channels, 
                    kernel_size=16, 
                    stride=4)
hsd1503 (Owner) commented Feb 7, 2022

The attention mechanism (self-attention) is implemented at line 102 with raw torch.matmul operations.
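
For context, a minimal sketch of what a matmul-based self-attention over the Conv1d output can look like; the function name, the permute, and the scaling here are illustrative assumptions, not code copied from the repo:

import torch

def self_attention_sketch(x):
    # x: Conv1d output of shape (batch, channels, length), e.g. (batch, 128, 9)
    x = x.permute(0, 2, 1)                         # (batch, length, channels)
    scores = torch.matmul(x, x.transpose(1, 2))    # pairwise dot products between time steps
    weights = torch.softmax(scores / x.shape[-1] ** 0.5, dim=-1)
    return torch.matmul(weights, x)                # attention-weighted sum, (batch, length, channels)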

gerardsimons (Author) commented Feb 7, 2022

Yes, but shouldn't there still be more than one Conv layer? I would expect a stack of at least 10+ Conv blocks, after which the transformer layers would follow, no?

If I run test_physionet_acnn.py as is, I get the following summary. I understand that the attention may not be printed in the summary, but shouldn't there be more CNN blocks?

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv1d-1               [-1, 128, 9]           2,176
            Linear-2                    [-1, 4]             516
================================================================
Total params: 2,692
Trainable params: 2,692
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.01
Params size (MB): 0.01
Estimated Total Size (MB): 0.03
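
(For reference, those parameter counts are consistent with exactly these two layers: the Conv1d has 1 × 128 × 16 + 128 = 2,176 parameters, which points to a single input channel, and the Linear has 128 × 4 + 4 = 516; no other layer with learnable weights is registered in the summary.)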

hsd1503 (Owner) commented Feb 7, 2022

Just one CNN layer with one attention layer, which was the common implementation style before the Attention Is All You Need paper. You can see this style in https://arxiv.org/abs/1703.03130
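
To make that structure concrete, below is a minimal sketch of a single-Conv-plus-attention model whose layer sizes match the summary above. The class name, the parameter-free matmul attention, and the mean pooling are assumptions for illustration, not the repo's actual code (the attention at line 102 may well differ in detail):

import torch
import torch.nn as nn

class ACNNSketch(nn.Module):
    # Illustrative reconstruction: Conv1d(1 -> 128, kernel 16, stride 4) = 2,176 params,
    # Linear(128 -> 4) = 516 params, matching the summary above.
    def __init__(self, in_channels=1, out_channels=128, n_classes=4):
        super().__init__()
        self.cnn = nn.Conv1d(in_channels, out_channels, kernel_size=16, stride=4)
        self.fc = nn.Linear(out_channels, n_classes)

    def forward(self, x):
        x = self.cnn(x)                          # (batch, 128, length_out)
        x = x.permute(0, 2, 1)                   # (batch, length_out, 128)
        scores = torch.matmul(x, x.transpose(1, 2))
        weights = torch.softmax(scores, dim=-1)  # attention over time steps
        x = torch.matmul(weights, x)             # (batch, length_out, 128)
        x = x.mean(dim=1)                        # pool over time steps -> (batch, 128)
        return self.fc(x)                        # (batch, 4)

An attention block built only from torch.matmul and softmax adds no parameters of its own, which would be consistent with the summary's total of 2,692.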

hsd1503 (Owner) commented Feb 7, 2022

Here is an early, minimal transformer1d repo: https://github.com/hsd1503/transformer1d

gerardsimons (Author) commented

Thanks for the resources, Hong, I will read up on that! What performance are you able to attain with this model? If you have time, a table with all the different architectures and their performance on the PhysioNet data would be very useful! 🙏🏻
