No. | Model Name | Title | Links | Pub. | Organization | Release Time |
---|---|---|---|---|---|---|
1 | HiBERT | HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization | paper | ACL 2019 | Microsoft Research Asia | 16 May 2019 |
2 | Star-Transformer | Star-Transformer | paper | NAACL 2019 | Shanghai Key Laboratory of Intelligent Information Processing, Fudan University | 25 Feb 2019 |
3 | ETC | ETC: Encoding Long and Structured Inputs in Transformers | paper | EMNLP 2020 | Google AI | 16 Nov 2020 |
4 | BP-Transformer | BP-Transformer: Modelling Long-Range Context via Binary Partitioning | paper code | arXiv | AWS Shanghai AI Lab | 11 Nov 2019 |
5 | Routing Transformer | Efficient Content-Based Sparse Attention with Routing Transformers | paper code | TACL 2021 | Google AI | 1 Feb 2021 |
6 | Compressive Transformer | Compressive Transformers for Long-Range Sequence Modelling | paper code | ICLR 2020 | DeepMind | 25 Sep 2019 |
7 | Transformer-XL | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | paper code | ACL 2019 | CMU | 9 Jan 2019 |
8 | Big Bird | Big Bird: Transformers for Longer Sequences | paper code | NeurIPS 2020 | Google Research | 8 Jan 2021 |
9 | Adaptive-Span | Adaptive Attention Span in Transformers | paper code | ACL 2019 | Facebook AI | 19 May 2019 |
10 | Reformer | Reformer: The Efficient Transformer | paper code | ICLR 2020 | Google AI | 13 Jan 2020 |
11 | Longformer | Longformer: The Long-Document Transformer | paper code | arXiv | Allen Institute for Artificial Intelligence | 2 Dec 2020 |
12 | - | Parameter Efficient Multimodal Transformers for Video Representation Learning | paper code | ICLR 2021 | Seoul National University | 8 Dec 2020 |
13 | ALBERT | ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | paper code | ICLR 2020 | Google Research | 26 Sep 2019 |
14 | DEQ | Deep Equilibrium Models | paper code | NeurIPS 2019 | CMU | 3 Sep 2019 |
15 | Universal Transformer | Universal Transformers | paper code | ICLR 2019 | University of Amsterdam | 5 May 2019 |
16 | Linear Transformer | Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | paper code | ICML 2020 | Idiap Research Institute | 31 Aug 2020 |
17 | ∞-former | ∞-former: Infinite Memory Transformer | paper | arXiv | Instituto de Telecomunicações | 1 Sep 2021 |
18 | ATS | ATS: Adaptive Token Sampling For Efficient Vision Transformers | paper | arXiv | Microsoft | 30 Nov 2021 |
19 | TerViT | TerViT: An Efficient Ternary Vision Transformer | paper | arXiv | Beihang University | 20 Jan 2022 |
20 | Lite Transformer | Lite Transformer with Long-Short Range Attention | paper code | ICLR 2020 | MIT | 24 Apr 2020 |
21 | UVC | Unified Visual Transformer Compression | paper code | ICLR 2022 | University of Texas at Austin | 29 Sep 2021 |
22 | MobileViT | MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformers | paper | ICLR 2022 | Apple | 5 Oct 2021 |