
lightseq's Transformer expects an extra layer_norm at both the encoder and decoder level #509

Open

yuting-wang-1000 opened this issue May 26, 2023 · 0 comments
Hi lightseq team, I notice that lightseq's Transformer architecture has an extra layer_norm at both the encoder and decoder level (outside the individual encoder/decoder layers):

```python
self.layer_norm = nn.LayerNorm(embed_dim)
```

In Fairseq, this final layer_norm is only created when normalize_before == True:
https://github.com/facebookresearch/fairseq/blob/b30980349bcb2e870481d783ac8cb3f338361601/fairseq/models/transformer/transformer_encoder.py#L100
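
For reference, here is a minimal sketch of that Fairseq behavior (the class name and constructor arguments are illustrative, not the actual Fairseq signature): only pre-norm models get the final layer_norm, while post-norm models leave it as None.

```python
from torch import nn


class EncoderFinalNormSketch(nn.Module):
    def __init__(self, embed_dim: int, normalize_before: bool):
        super().__init__()
        # Fairseq only builds the final layer_norm for pre-norm models
        # (normalize_before=True); post-norm models set it to None.
        self.layer_norm = nn.LayerNorm(embed_dim) if normalize_before else None

    def forward(self, x):
        # Applied once after all encoder layers, and only if it exists.
        return self.layer_norm(x) if self.layer_norm is not None else x
```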

Due to this architectural difference, I'm unable to export a native Fairseq Transformer with post layer norm to protobuf/hdf5 format using
https://github.com/bytedance/lightseq/blob/master/examples/inference/python/export/fairseq/native_fs_transformer_export.py, because my model was trained with Fairseq with normalize_before == False and therefore doesn't have this extra layer_norm at the encoder/decoder level.
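
To make the mismatch concrete, a small diagnostic along these lines (the checkpoint path is a placeholder and the key names assume the standard Fairseq parameter layout) shows that a post-norm checkpoint simply has no final encoder/decoder layer_norm weights for the export script to read:

```python
import torch

# Load the Fairseq checkpoint and inspect its parameter names.
# "checkpoint_best.pt" is a placeholder path for my trained model.
state = torch.load("checkpoint_best.pt", map_location="cpu")["model"]

for prefix in ("encoder", "decoder"):
    present = f"{prefix}.layer_norm.weight" in state
    print(f"{prefix}.layer_norm present: {present}")  # False for post-norm models
```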

I wonder why lightseq requires this extra layer_norm at the encoder/decoder level. Thanks!
