Available tasks

The following tables give you an overview of the tasks in MTEB.
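The Languages, # Samples, and Avg. Length cells in the rows below look like Python literals (a list of language codes and dictionaries keyed by split name), so they can be read back with the standard library. The following is a minimal sketch under that assumption; `parse_cell` is a hypothetical helper, not part of MTEB, and it does not handle the Domains column, whose entries (for example `[Legal]`) are unquoted and are not valid Python literals.

```python
import ast


def parse_cell(cell: str):
    """Parse a Languages, # Samples, or Avg. Length cell.

    Hypothetical helper: assumes the cell is a Python literal.
    Empty cells (rows with no recorded statistics) yield None.
    """
    cell = cell.strip()
    return ast.literal_eval(cell) if cell else None


# Cells copied from the AmazonCounterfactualClassification row.
languages = parse_cell("['deu', 'eng', 'jpn']")
n_samples = parse_cell("{'validation': 335, 'test': 670}")
avg_length = parse_cell("{'validation': 109.2, 'test': 106.1}")

# Total number of samples across all listed splits.
total_samples = sum(n_samples.values())
print(languages, total_samples)  # ['deu', 'eng', 'jpn'] 1005
```

Rows that leave the last two columns blank, such as AFQMC, would need the `None` case handled before summing.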

Name Languages Type Category Domains # Samples Avg. Length (Char.)
AFQMC ['cmn'] STS s2s
AILACasedocs ['eng'] Retrieval p2p [Legal]
AILAStatutes ['eng'] Retrieval p2p [Legal]
AJGT (Alomari et al., 2017) ['ara'] Classification s2s [Social] {'train': 1800} {'train': 46.81}
ARCChallenge (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 1172} {'test': 161.7}
ATEC ['cmn'] STS s2s
AfriSentiClassification ['amh', 'arq', 'ary', 'hau', 'ibo', 'kin', 'pcm', 'por', 'swa', 'tso', 'twi', 'yor'] Classification s2s [Social] {'test': 2048} {'test': 74.77}
AfriSentiLangClassification ['amh', 'arq', 'ary', 'hau', 'ibo', 'kin', 'pcm', 'por', 'swa', 'tso', 'twi', 'yor'] Classification s2s [Social] {'test': 5754} {'test': 77.84}
AllegroReviews ['pol'] Classification s2s {'test': 1006} {'test': 477.2}
AlloProfClusteringP2P.v2 (Lefebvre-Brossard et al., 2023) ['fra'] Clustering p2p [Encyclopaedic] {'test': 2556} {'test': 3539.5}
AlloProfClusteringS2S.v2 (Lefebvre-Brossard et al., 2023) ['fra'] Clustering s2s [Encyclopaedic] {'test': 2556} {'test': 32.8}
AlloprofReranking (Lefebvre-Brossard et al., 2023) ['fra'] Reranking s2p [Web, Academic] {'test': 2316, 'train': 9264}
AlloprofRetrieval (Lefebvre-Brossard et al., 2023) ['fra'] Retrieval s2p [Encyclopaedic] {'train': 2048}
AlphaNLI (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 1532} {'test': 147.8}
AmazonCounterfactualClassification ['deu', 'eng', 'jpn'] Classification s2s [Reviews] {'validation': 335, 'test': 670} {'validation': 109.2, 'test': 106.1}
AmazonPolarityClassification (Julian McAuley, 2013) ['eng'] Classification s2s [Reviews] {'test': 400000} {'test': 431.4}
AmazonReviewsClassification (Phillip Keung, 2020) ['cmn', 'deu', 'eng', 'fra', 'jpn', 'spa'] Classification s2s [Reviews] {'validation': 30000, 'test': 30000} {'validation': 159.2, 'test': 160.4}
AngryTweetsClassification (Pauli et al., 2021) ['dan'] Classification s2s [Social] {'test': 1050} {'test': 156.1}
ArEntail (Obeidat et al., 2024) ['ara'] PairClassification s2s [News] {'test': 1000} {'test': 65.77}
ArXivHierarchicalClusteringP2P ['eng'] Clustering p2p [Academic] {'test': 2048} {'test': 1009.98}
ArXivHierarchicalClusteringS2S ['eng'] Clustering p2p [Academic] {'test': 2048} {'test': 1009.98}
ArguAna (Boteva et al., 2016) ['eng'] Retrieval s2p
ArguAna-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
ArmenianParaphrasePC (Arthur Malajyan, 2020) ['hye'] PairClassification s2s [News] {'train': 4023, 'test': 1470} {'train': 243.81, 'test': 241.37}
ArxivClassification (He et al., 2019) ['eng'] Classification s2s [Academic] {'test': 2048}
AskUbuntuDupQuestions ['eng'] Reranking s2s {'test': 2255} {'test': 52.5}
Assin2RTE (Real et al., 2020) ['por'] PairClassification s2s {'test': 2448} {'test': 53.55}
Assin2STS (Real et al., 2020) ['por'] STS s2s {'test': 2448} {'test': 53.55}
BIOSSES (Soğancıoğlu et al., 2017) ['eng'] STS s2s
BQ ['cmn'] STS s2s
BSARDRetrieval (Louis et al., 2022) ['fra'] Retrieval s2p [Legal] {'test': 222} {'test': 71.94}
BUCC.v2 ['cmn', 'deu', 'eng', 'fra', 'rus'] BitextMining s2s {'test': 641684} {'test': 101.3}
Banking77Classification ['eng'] Classification s2s {'test': 3080} {'test': 54.2}
BelebeleRetrieval (Lucas Bandarkar, 2023) ['acm', 'afr', 'als', 'amh', 'apc', 'arb', 'ars', 'ary', 'arz', 'asm', 'azj', 'bam', 'ben', 'bod', 'bul', 'cat', 'ceb', 'ces', 'ckb', 'dan', 'deu', 'ell', 'eng', 'est', 'eus', 'fin', 'fra', 'fuv', 'gaz', 'grn', 'guj', 'hat', 'hau', 'heb', 'hin', 'hrv', 'hun', 'hye', 'ibo', 'ilo', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kac', 'kan', 'kat', 'kaz', 'kea', 'khk', 'khm', 'kin', 'kir', 'kor', 'lao', 'lin', 'lit', 'lug', 'luo', 'lvs', 'mal', 'mar', 'mkd', 'mlt', 'mri', 'mya', 'nld', 'nob', 'npi', 'nso', 'nya', 'ory', 'pan', 'pbt', 'pes', 'plt', 'pol', 'por', 'ron', 'rus', 'shn', 'sin', 'slk', 'slv', 'sna', 'snd', 'som', 'sot', 'spa', 'srp', 'ssw', 'sun', 'swe', 'swh', 'tam', 'tel', 'tgk', 'tgl', 'tha', 'tir', 'tsn', 'tso', 'tur', 'ukr', 'urd', 'uzn', 'vie', 'war', 'wol', 'xho', 'yor', 'zho', 'zsm', 'zul'] Retrieval s2p [Web, News] {'test': 103500} {'test': 568.0}
BengaliDocumentClassification ['ben'] Classification s2s [News] {'test': 2048} {'test': 1658.1}
BengaliHateSpeechClassification (Karim et al., 2020) ['ben'] Classification s2s [News] {'train': 3418} {'train': 103.42}
BengaliSentimentAnalysis (Sazzed et al., 2020) ['ben'] Classification s2s [Reviews] {'train': 11807} {'train': 69.66}
BibleNLPBitextMining (Akerman et al., 2023) ['aai', 'aak', 'aau', 'aaz', 'abt', 'abx', 'aby', 'acf', 'acr', 'acu', 'adz', 'aer', 'aey', 'agd', 'agg', 'agm', 'agn', 'agr', 'agt', 'agu', 'aia', 'aii', 'aka', 'ake', 'alp', 'alq', 'als', 'aly', 'ame', 'amf', 'amk', 'amm', 'amn', 'amo', 'amp', 'amr', 'amu', 'amx', 'anh', 'anv', 'aoi', 'aoj', 'aom', 'aon', 'apb', 'ape', 'apn', 'apr', 'apu', 'apw', 'apz', 'arb', 'are', 'arl', 'arn', 'arp', 'asm', 'aso', 'ata', 'atb', 'atd', 'atg', 'att', 'auc', 'aui', 'auy', 'avt', 'awb', 'awk', 'awx', 'azb', 'azg', 'azz', 'bao', 'bba', 'bbb', 'bbr', 'bch', 'bco', 'bdd', 'bea', 'bef', 'bel', 'ben', 'beo', 'beu', 'bgs', 'bgt', 'bhg', 'bhl', 'big', 'bjk', 'bjp', 'bjr', 'bjv', 'bjz', 'bkd', 'bki', 'bkq', 'bkx', 'blw', 'blz', 'bmh', 'bmk', 'bmr', 'bmu', 'bnp', 'boa', 'boj', 'bon', 'box', 'bpr', 'bps', 'bqc', 'bqp', 'bre', 'bsj', 'bsn', 'bsp', 'bss', 'buk', 'bus', 'bvd', 'bvr', 'bxh', 'byr', 'byx', 'bzd', 'bzh', 'bzj', 'caa', 'cab', 'cac', 'caf', 'cak', 'cao', 'cap', 'car', 'cav', 'cax', 'cbc', 'cbi', 'cbk', 'cbr', 'cbs', 'cbt', 'cbu', 'cbv', 'cco', 'ceb', 'cek', 'ces', 'cgc', 'cha', 'chd', 'chf', 'chk', 'chq', 'chz', 'cjo', 'cjv', 'ckb', 'cle', 'clu', 'cme', 'cmn', 'cni', 'cnl', 'cnt', 'cof', 'con', 'cop', 'cot', 'cpa', 'cpb', 'cpc', 'cpu', 'cpy', 'crn', 'crx', 'cso', 'csy', 'cta', 'cth', 'ctp', 'ctu', 'cub', 'cuc', 'cui', 'cuk', 'cut', 'cux', 'cwe', 'cya', 'daa', 'dad', 'dah', 'dan', 'ded', 'deu', 'dgc', 'dgr', 'dgz', 'dhg', 'dif', 'dik', 'dji', 'djk', 'djr', 'dob', 'dop', 'dov', 'dwr', 'dww', 'dwy', 'ebk', 'eko', 'emi', 'emp', 'eng', 'enq', 'epo', 'eri', 'ese', 'esk', 'etr', 'ewe', 'faa', 'fai', 'far', 'ffm', 'for', 'fra', 'fue', 'fuf', 'fuh', 'gah', 'gai', 'gam', 'gaw', 'gdn', 'gdr', 'geb', 'gfk', 'ghs', 'glk', 'gmv', 'gng', 'gnn', 'gnw', 'gof', 'grc', 'gub', 'guh', 'gui', 'guj', 'gul', 'gum', 'gun', 'guo', 'gup', 'gux', 'gvc', 'gvf', 'gvn', 'gvs', 'gwi', 'gym', 'gyr', 'hat', 'hau', 'haw', 'hbo', 'hch', 'heb', 'heg', 'hin', 'hix', 'hla', 
'hlt', 'hmo', 'hns', 'hop', 'hot', 'hrv', 'hto', 'hub', 'hui', 'hun', 'hus', 'huu', 'huv', 'hvn', 'ian', 'ign', 'ikk', 'ikw', 'ilo', 'imo', 'inb', 'ind', 'ino', 'iou', 'ipi', 'isn', 'ita', 'iws', 'ixl', 'jac', 'jae', 'jao', 'jic', 'jid', 'jiv', 'jni', 'jpn', 'jvn', 'kan', 'kaq', 'kbc', 'kbh', 'kbm', 'kbq', 'kdc', 'kde', 'kdl', 'kek', 'ken', 'kew', 'kgf', 'kgk', 'kgp', 'khs', 'khz', 'kik', 'kiw', 'kiz', 'kje', 'kjs', 'kkc', 'kkl', 'klt', 'klv', 'kmg', 'kmh', 'kmk', 'kmo', 'kms', 'kmu', 'kne', 'knf', 'knj', 'knv', 'kos', 'kpf', 'kpg', 'kpj', 'kpr', 'kpw', 'kpx', 'kqa', 'kqc', 'kqf', 'kql', 'kqw', 'ksd', 'ksj', 'ksr', 'ktm', 'kto', 'kud', 'kue', 'kup', 'kvg', 'kvn', 'kwd', 'kwf', 'kwi', 'kwj', 'kyc', 'kyf', 'kyg', 'kyq', 'kyz', 'kze', 'lac', 'lat', 'lbb', 'lbk', 'lcm', 'leu', 'lex', 'lgl', 'lid', 'lif', 'lin', 'lit', 'llg', 'lug', 'luo', 'lww', 'maa', 'maj', 'mal', 'mam', 'maq', 'mar', 'mau', 'mav', 'maz', 'mbb', 'mbc', 'mbh', 'mbj', 'mbl', 'mbs', 'mbt', 'mca', 'mcb', 'mcd', 'mcf', 'mco', 'mcp', 'mcq', 'mcr', 'mdy', 'med', 'mee', 'mek', 'meq', 'met', 'meu', 'mgc', 'mgh', 'mgw', 'mhl', 'mib', 'mic', 'mie', 'mig', 'mih', 'mil', 'mio', 'mir', 'mit', 'miz', 'mjc', 'mkj', 'mkl', 'mkn', 'mks', 'mle', 'mlh', 'mlp', 'mmo', 'mmx', 'mna', 'mop', 'mox', 'mph', 'mpj', 'mpm', 'mpp', 'mps', 'mpt', 'mpx', 'mqb', 'mqj', 'msb', 'msc', 'msk', 'msm', 'msy', 'mti', 'mto', 'mux', 'muy', 'mva', 'mvn', 'mwc', 'mwe', 'mwf', 'mwp', 'mxb', 'mxp', 'mxq', 'mxt', 'mya', 'myk', 'myu', 'myw', 'myy', 'mzz', 'nab', 'naf', 'nak', 'nas', 'nbq', 'nca', 'nch', 'ncj', 'ncl', 'ncu', 'ndg', 'ndj', 'nfa', 'ngp', 'ngu', 'nhe', 'nhg', 'nhi', 'nho', 'nhr', 'nhu', 'nhw', 'nhy', 'nif', 'nii', 'nin', 'nko', 'nld', 'nlg', 'nna', 'nnq', 'noa', 'nop', 'not', 'nou', 'npi', 'npl', 'nsn', 'nss', 'ntj', 'ntp', 'ntu', 'nuy', 'nvm', 'nwi', 'nya', 'nys', 'nyu', 'obo', 'okv', 'omw', 'ong', 'ons', 'ood', 'opm', 'ory', 'ote', 'otm', 'otn', 'otq', 'ots', 'pab', 'pad', 'pah', 'pan', 'pao', 'pes', 'pib', 'pio', 'pir', 'piu', 
'pjt', 'pls', 'plu', 'pma', 'poe', 'poh', 'poi', 'pol', 'pon', 'por', 'poy', 'ppo', 'prf', 'pri', 'ptp', 'ptu', 'pwg', 'qub', 'quc', 'quf', 'quh', 'qul', 'qup', 'qvc', 'qve', 'qvh', 'qvm', 'qvn', 'qvs', 'qvw', 'qvz', 'qwh', 'qxh', 'qxn', 'qxo', 'rai', 'reg', 'rgu', 'rkb', 'rmc', 'rmy', 'ron', 'roo', 'rop', 'row', 'rro', 'ruf', 'rug', 'rus', 'rwo', 'sab', 'san', 'sbe', 'sbk', 'sbs', 'seh', 'sey', 'sgb', 'sgz', 'shj', 'shp', 'sim', 'sja', 'sll', 'smk', 'snc', 'snn', 'snp', 'snx', 'sny', 'som', 'soq', 'soy', 'spa', 'spl', 'spm', 'spp', 'sps', 'spy', 'sri', 'srm', 'srn', 'srp', 'srq', 'ssd', 'ssg', 'ssx', 'stp', 'sua', 'sue', 'sus', 'suz', 'swe', 'swh', 'swp', 'sxb', 'tac', 'taj', 'tam', 'tav', 'taw', 'tbc', 'tbf', 'tbg', 'tbo', 'tbz', 'tca', 'tcs', 'tcz', 'tdt', 'tee', 'tel', 'ter', 'tet', 'tew', 'tfr', 'tgk', 'tgl', 'tgo', 'tgp', 'tha', 'tif', 'tim', 'tiw', 'tiy', 'tke', 'tku', 'tlf', 'tmd', 'tna', 'tnc', 'tnk', 'tnn', 'tnp', 'toc', 'tod', 'tof', 'toj', 'ton', 'too', 'top', 'tos', 'tpa', 'tpi', 'tpt', 'tpz', 'trc', 'tsw', 'ttc', 'tte', 'tuc', 'tue', 'tuf', 'tuo', 'tur', 'tvk', 'twi', 'txq', 'txu', 'tzj', 'tzo', 'ubr', 'ubu', 'udu', 'uig', 'ukr', 'uli', 'ulk', 'upv', 'ura', 'urb', 'urd', 'uri', 'urt', 'urw', 'usa', 'usp', 'uvh', 'uvl', 'vid', 'vie', 'viv', 'vmy', 'waj', 'wal', 'wap', 'wat', 'wbi', 'wbp', 'wed', 'wer', 'wim', 'wiu', 'wiv', 'wmt', 'wmw', 'wnc', 'wnu', 'wol', 'wos', 'wrk', 'wro', 'wrs', 'wsk', 'wuv', 'xav', 'xbi', 'xed', 'xla', 'xnn', 'xon', 'xsi', 'xtd', 'xtm', 'yaa', 'yad', 'yal', 'yap', 'yaq', 'yby', 'ycn', 'yka', 'yle', 'yml', 'yon', 'yor', 'yrb', 'yre', 'yss', 'yuj', 'yut', 'yuw', 'yva', 'zaa', 'zab', 'zac', 'zad', 'zai', 'zaj', 'zam', 'zao', 'zap', 'zar', 'zas', 'zat', 'zav', 'zaw', 'zca', 'zga', 'zia', 'ziw', 'zlm', 'zos', 'zpc', 'zpl', 'zpm', 'zpo', 'zpq', 'zpu', 'zpv', 'zpz', 'zsr', 'ztq', 'zty', 'zyp'] BitextMining s2s [Religious] {'train': 256} {'train': 120.0}
BigPatentClustering.v2 (Eva Sharma and Chen Li and Lu Wang, 2019) ['eng'] Clustering p2p [Legal] {'test': 2048} {'test': 30995.5}
BiorxivClusteringP2P.v2 ['eng'] Clustering p2p [Academic] {'test': 2151} {'test': 1664.0}
BiorxivClusteringS2S.v2 ['eng'] Clustering s2s [Academic] {'test': 2151} {'test': 101.7}
BlurbsClusteringP2P.v2 (Steffen Remus, 2019) ['deu'] Clustering p2p [Fiction] {'test': 2048} {'test': 664.09}
BlurbsClusteringS2S.v2 (Steffen Remus, 2019) ['deu'] Clustering s2s [Fiction] {'test': 2048} {'test': 23.02}
BornholmBitextMining ['dan'] BitextMining s2s [Web, Social, Fiction] {'test': 500} {'test': 89.7}
BrazilianToxicTweetsClassification (Joao Augusto Leite and Diego F. Silva and Kalina Bontcheva and Carolina Scarton, 2020) ['por'] MultilabelClassification s2s [Constructed] {'test': 2048} {'test': 85.05}
BulgarianStoreReviewSentimentClassfication (Georgieva-Trifonova et al., 2018) ['bul'] Classification s2s [Reviews] {'test': 182} {'test': 316.7}
CBD ['pol'] Classification s2s {'test': 1000} {'test': 93.2}
CDSC-E ['pol'] PairClassification s2s
CDSC-R ['pol'] STS s2s [Web] {'test': 1000} {'test': 75.24}
CEDRClassification (Sboev et al., 2021) ['rus'] MultilabelClassification s2s [Web, Social, Blog] {'test': 1882} {'test': 91.2}
CLSClusteringP2P.v2 (Yudong Li, 2022) ['cmn'] Clustering p2p [Academic] {'test': 2048}
CLSClusteringS2S.v2 (Yudong Li, 2022) ['cmn'] Clustering s2s [Academic] {'test': 2048}
CMedQAv1-reranking (Zhang et al., 2017) ['cmn'] Reranking s2s [Medical] {'test': 2000} {'test': 165.0}
CMedQAv2-reranking (S. Zhang, 2018) ['cmn'] Reranking s2s
CQADupstackAndroidRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackEnglishRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackGamingRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackGisRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackMathematicaRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackPhysicsRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackProgrammersRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackStatsRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackTexRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackUnixRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackWebmastersRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CQADupstackWordpressRetrieval (Hoogeveen et al., 2015) ['eng'] Retrieval s2p
CSFDCZMovieReviewSentimentClassification (Michal Štefánik, 2023) ['ces'] Classification s2s [Reviews] {'test': 2048} {'test': 386.5}
CSFDSKMovieReviewSentimentClassification (Michal Štefánik, 2023) ['slk'] Classification s2s [Reviews] {'test': 2048} {'test': 366.2}
CTKFactsNLI (Ullrich et al., 2023) ['ces'] PairClassification s2s [News] {'test': 375, 'validation': 305} {'test': 225.62, 'validation': 219.32}
CUADAffiliateLicenseLicenseeLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 198} {'test': 484.11}
CUADAffiliateLicenseLicensorLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 88} {'test': 633.4}
CUADAntiAssignmentLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1172} {'test': 340.81}
CUADAuditRightsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1216} {'test': 337.14}
CUADCapOnLiabilityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1246} {'test': 375.74}
CUADChangeOfControlLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 416} {'test': 391.96}
CUADCompetitiveRestrictionExceptionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 220} {'test': 433.04}
CUADCovenantNotToSueLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 308} {'test': 402.97}
CUADEffectiveDateLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 236} {'test': 277.62}
CUADExclusivityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 762} {'test': 369.17}
CUADExpirationDateLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 876} {'test': 309.27}
CUADGoverningLawLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 876} {'test': 289.87}
CUADIPOwnershipAssignmentLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 576} {'test': 414.0}
CUADInsuranceLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1030} {'test': 365.54}
CUADIrrevocableOrPerpetualLicenseLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 280} {'test': 473.4}
CUADJointIPOwnershipLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 192} {'test': 374.17}
CUADLicenseGrantLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1396} {'test': 409.89}
CUADLiquidatedDamagesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 220} {'test': 351.76}
CUADMinimumCommitmentLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 772} {'test': 364.16}
CUADMostFavoredNationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 64} {'test': 418.75}
CUADNoSolicitOfCustomersLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 84} {'test': 392.89}
CUADNoSolicitOfEmployeesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 142} {'test': 417.94}
CUADNonCompeteLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 442} {'test': 383.2}
CUADNonDisparagementLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 100} {'test': 403.08}
CUADNonTransferableLicenseLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 542} {'test': 399.16}
CUADNoticePeriodToTerminateRenewalLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 222} {'test': 354.85}
CUADPostTerminationServicesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 808} {'test': 422.53}
CUADPriceRestrictionsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 46} {'test': 324.71}
CUADRenewalTermLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 386} {'test': 340.87}
CUADRevenueProfitSharingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 774} {'test': 371.55}
CUADRofrRofoRofnLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 690} {'test': 395.46}
CUADSourceCodeEscrowLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 118} {'test': 399.18}
CUADTerminationForConvenienceLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 430} {'test': 326.3}
CUADThirdPartyBeneficiaryLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 68} {'test': 261.04}
CUADUncappedLiabilityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 294} {'test': 441.04}
CUADUnlimitedAllYouCanEatLicenseLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 48} {'test': 368.08}
CUADVolumeRestrictionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 322} {'test': 306.27}
CUADWarrantyDurationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 320} {'test': 352.27}
CanadaTaxCourtOutcomesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 244} {'test': 622.6}
CataloniaTweetClassification ['cat', 'spa'] Classification s2s [Social, Government] {'validation': 2000, 'test': 2000} {'validation': 202.61, 'test': 200.49}
ClimateFEVER (Thomas Diggelmann, 2021) ['eng'] Retrieval s2p
CmedqaRetrieval ['cmn'] Retrieval s2p
Cmnli ['cmn'] PairClassification s2s
CodeEditSearchRetrieval (Niklas Muennighoff, 2023) ['c', 'c++', 'go', 'java', 'javascript', 'php', 'python', 'ruby', 'rust', 'scala', 'shell', 'swift', 'typescript'] Retrieval p2p [Programming] {'train': 13000} {'train': 553.5}
CodeSearchNetRetrieval (Husain et al., 2019) ['go', 'java', 'javascript', 'php', 'python', 'ruby'] Retrieval p2p [Programming] {'test': 1000} {'test': 1196.4609}
ContractNLIConfidentialityOfAgreementLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 82} {'test': 473.17}
ContractNLIExplicitIdentificationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 109} {'test': 506.12}
ContractNLIInclusionOfVerballyConveyedInformationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 139} {'test': 525.75}
ContractNLILimitedUseLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 208} {'test': 407.51}
ContractNLINoLicensingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 162} {'test': 419.42}
ContractNLINoticeOnCompelledDisclosureLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 142} {'test': 503.45}
ContractNLIPermissibleAcquirementOfSimilarInformationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 178} {'test': 427.4}
ContractNLIPermissibleCopyLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 87} {'test': 386.84}
ContractNLIPermissibleDevelopmentOfSimilarInformationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 136} {'test': 396.4}
ContractNLIPermissiblePostAgreementPossessionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 111} {'test': 529.09}
ContractNLIReturnOfConfidentialInformationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 66} {'test': 478.29}
ContractNLISharingWithEmployeesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 170} {'test': 548.63}
ContractNLISharingWithThirdPartiesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 180} {'test': 517.29}
ContractNLISurvivalOfObligationsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 157} {'test': 417.64}
Core17InstructionRetrieval (Orion Weller, 2024) ['eng'] InstructionRetrieval s2p [News] {'eng': 39838} {'eng': 2768.749235474006}
CorporateLobbyingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 490} {'test': 6039.85}
CovidRetrieval ['cmn'] Retrieval s2p
CrossLingualSemanticDiscriminationWMT19 ['deu', 'fra'] Retrieval s2s [News] {'test': 2946} {'test': 161.0}
CrossLingualSemanticDiscriminationWMT21 ['deu', 'fra'] Retrieval s2s [News] {'test': 1786} {'test': 159.0}
CyrillicTurkicLangClassification (Goldhahn et al., 2012) ['bak', 'chv', 'kaz', 'kir', 'krc', 'rus', 'sah', 'tat', 'tyv'] Classification s2s [Web] {'test': 2048} {'test': 92.22}
CzechProductReviewSentimentClassification ['ces'] Classification s2s [Reviews] {'test': 2048} {'test': 153.26}
CzechSoMeSentimentClassification ['ces'] Classification s2s [Reviews] {'test': 1000} {'test': 59.89}
CzechSubjectivityClassification ['ces'] Classification s2s [Reviews] {'validation': 500, 'test': 2000} {'validation': 108.2, 'test': 108.3}
DBPedia (Hasibi et al., 2017) ['eng'] Retrieval s2p
DBPedia-PL (Hasibi et al., 2017) ['pol'] Retrieval s2p
DBpediaClassification (Zhang et al., 2015) ['eng'] Classification s2s [Encyclopaedic] {'test': 70000} {'test': 281.4}
DKHateClassification ['dan'] Classification s2s [Social] {'test': 329} {'test': 104.0}
DalajClassification ['swe'] Classification s2s [Non-fiction] {'test': 444} {'test': 243.8}
DanFEVER ['dan'] Retrieval p2p [Encyclopaedic, Non-fiction] {'train': 8897} {'train': 124.84}
DanishPoliticalCommentsClassification (Mads Guldborg Kjeldgaard Kongsbak, 2019) ['dan'] Classification s2s [Social] {'train': 9010} {'train': 69.9}
DefinitionClassificationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1337} {'test': 253.72}
DiaBlaBitextMining (González et al., 2019) ['eng', 'fra'] BitextMining s2s [Social]
Diversity1LegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 300} {'test': 103.21}
Diversity2LegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 300} {'test': 0.0}
Diversity3LegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 300} {'test': 135.46}
Diversity4LegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 300} {'test': 144.52}
Diversity5LegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 300} {'test': 174.77}
Diversity6LegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 300} {'test': 301.01}
DuRetrieval (Yifu Qiu, 2022) ['cmn'] Retrieval s2p
DutchBookReviewSentimentClassification (Benjamin et al., 2019) ['nld'] Classification s2s [Reviews] {'test': 2224} {'test': 1443.0}
EcomRetrieval ['cmn'] Retrieval s2p
EightTagsClustering.v2 ['pol'] Clustering s2s [Social] {'test': 2048} {'test': 78.73}
EmotionClassification ['eng'] Classification s2s [Social] {'validation': 2000, 'test': 2000} {'validation': 95.3, 'test': 95.6}
EstQA ['est'] Retrieval s2p [Encyclopaedic] {'test': 603} {'test': 772.5331950207469}
EstonianValenceClassification ['est'] Classification s2s [News] {'train': 3270, 'test': 818} {'train': 226.70642201834863, 'test': 231.5085574572127}
FEVER ['eng'] Retrieval s2p
FQuADRetrieval ['fra'] Retrieval s2p [Encyclopaedic] {'test': 400, 'validation': 100} {'test': 937.0, 'validation': 930.0}
FaithDial (Dziri et al., 2022) ['eng'] Retrieval s2p [Encyclopaedic] {'test': 2042} {'test': 74.0}
FalseFriendsGermanEnglish ['deu'] PairClassification s2s {'test': 1524} {'test': 40.3}
FaroeseSTS ['fao'] STS s2s [News, Web] {'train': 729} {'train': 43.6}
FarsTail (Amirkhani et al., 2023) ['fas'] PairClassification s2s [Academic] {'test': 1029} {'test': 125.84}
FeedbackQARetrieval ['eng'] Retrieval s2p [Web, Government, Medical] {'test': 1992} {'test': 1175.0}
FiQA-PL (Nandan Thakur, 2021) ['pol'] Retrieval s2p
FiQA2018 (Nandan Thakur, 2021) ['eng'] Retrieval s2p
FilipinoHateSpeechClassification (Neil Vicente Cabasag et al., 2019) ['fil'] Classification s2s [Social] {'validation': 2048, 'test': 2048} {'validation': 88.1, 'test': 87.4}
FilipinoShopeeReviewsClassification ['fil'] Classification s2s [Social] {'validation': 2250, 'test': 2250} {'validation': 143.8, 'test': 145.1}
FinParaSTS ['fin'] STS s2s [News, Subtitles] {'test': 1000, 'validation': 1000} {'test': 59.0, 'validation': 58.8}
FinToxicityClassification ['fin'] Classification s2s [News] {'train': 2048, 'test': 2048} {'train': 432.63, 'test': 401.03}
FinancialPhrasebankClassification (P. Malo, 2014) ['eng'] Classification s2s [News] {'train': 4840} {'train': 121.96}
FloresBitextMining (Goyal et al., 2022) ['ace', 'acm', 'acq', 'aeb', 'afr', 'ajp', 'aka', 'als', 'amh', 'apc', 'arb', 'ars', 'ary', 'arz', 'asm', 'ast', 'awa', 'ayr', 'azb', 'azj', 'bak', 'bam', 'ban', 'bel', 'bem', 'ben', 'bho', 'bjn', 'bod', 'bos', 'bug', 'bul', 'cat', 'ceb', 'ces', 'cjk', 'ckb', 'crh', 'cym', 'dan', 'deu', 'dik', 'dyu', 'dzo', 'ell', 'eng', 'epo', 'est', 'eus', 'ewe', 'fao', 'fij', 'fin', 'fon', 'fra', 'fur', 'fuv', 'gaz', 'gla', 'gle', 'glg', 'grn', 'guj', 'hat', 'hau', 'heb', 'hin', 'hne', 'hrv', 'hun', 'hye', 'ibo', 'ilo', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kab', 'kac', 'kam', 'kan', 'kas', 'kat', 'kaz', 'kbp', 'kea', 'khk', 'khm', 'kik', 'kin', 'kir', 'kmb', 'kmr', 'knc', 'kon', 'kor', 'lao', 'lij', 'lim', 'lin', 'lit', 'lmo', 'ltg', 'ltz', 'lua', 'lug', 'luo', 'lus', 'lvs', 'mag', 'mai', 'mal', 'mar', 'min', 'mkd', 'mlt', 'mni', 'mos', 'mri', 'mya', 'nld', 'nno', 'nob', 'npi', 'nso', 'nus', 'nya', 'oci', 'ory', 'pag', 'pan', 'pap', 'pbt', 'pes', 'plt', 'pol', 'por', 'prs', 'quy', 'ron', 'run', 'rus', 'sag', 'san', 'sat', 'scn', 'shn', 'sin', 'slk', 'slv', 'smo', 'sna', 'snd', 'som', 'sot', 'spa', 'srd', 'srp', 'ssw', 'sun', 'swe', 'swh', 'szl', 'tam', 'taq', 'tat', 'tel', 'tgk', 'tgl', 'tha', 'tir', 'tpi', 'tsn', 'tso', 'tuk', 'tum', 'tur', 'twi', 'tzm', 'uig', 'ukr', 'umb', 'urd', 'uzn', 'vec', 'vie', 'war', 'wol', 'xho', 'ydd', 'yor', 'yue', 'zho', 'zsm', 'zul'] BitextMining s2s [Non-fiction, Encyclopaedic] {'dev': 997, 'devtest': 1012}
FrenchBookReviews ['fra'] Classification s2s [Reviews] {'train': 2048} {'train': 311.5}
FrenkEnClassification (Nikola Ljubešić, 2019) ['eng'] Classification s2s [Social] {'test': 2300} {'test': 188.75}
FrenkHrClassification (Nikola Ljubešić, 2019) ['hrv'] Classification s2s [Social] {'test': 2120} {'test': 89.86}
FrenkSlClassification (Nikola Ljubešić, 2019) ['slv'] Classification s2s [Social] {'test': 2177} {'test': 136.61}
FunctionOfDecisionSectionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 367} {'test': 551.07}
GeoreviewClassification ['rus'] Classification p2p [Reviews] {'test': 2048} {'test': 409.0}
GeoreviewClusteringP2P ['rus'] Clustering p2p [Reviews] {'test': 2000} {'test': 384.5}
GeorgianFAQRetrieval ['kat'] Retrieval s2p [Web] {'test': 2566} {'test': 572.0}
GerDaLIR ['deu'] Retrieval s2p
GerDaLIRSmall ['deu'] Retrieval p2p [Legal]
GermanDPR (Timo Möller, 2021) ['deu'] Retrieval s2p
GermanGovServiceRetrieval ['deu'] Retrieval s2p [Government] {'test': 357} {'test': 1211.69}
GermanPoliticiansTwitterSentimentClassification ['deu'] Classification s2s [Social, Government] {'test': 357} {'test': 302.48}
GermanQuAD-Retrieval (Timo Möller, 2021) ['deu'] Retrieval s2p
GermanSTSBenchmark (Philip May, 2021) ['deu'] STS s2s
GreekCivicsQA ['ell'] Retrieval s2p [Academic] {'default': 407} {'default': 2226.85}
GreekLegalCodeClassification ['ell'] Classification s2s [Legal] {'validation': 2048, 'test': 2048} {'validation': 4046.8, 'test': 4200.8}
GujaratiNewsClassification ['guj'] Classification s2s [News] {'train': 5269, 'test': 1318} {'train': 61.95, 'test': 61.91}
HALClusteringS2S.v2 (Mathieu Ciancone, 2024) ['fra'] Clustering s2s [Academic] {'test': 2048} {'test': 86.6}
HagridRetrieval (Ehsan Kamalloo, 2023) ['eng'] Retrieval s2p [Encyclopaedic] {'train': 1922} {'train': 14.53}
HateSpeechPortugueseClassification ['por'] Classification s2s [Social] {'train': 2048} {'train': 101.02}
HeadlineClassification ['rus'] Classification s2s [News] {'test': 2048} {'test': 61.6}
HebrewSentimentAnalysis ['heb'] Classification s2s [Reviews] {'test': 2048} {'test': 113.57}
HellaSwag (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 10042} {'test': 366.1}
HinDialectClassification (Bafna et al., 2022) ['anp', 'awa', 'ben', 'bgc', 'bhb', 'bhd', 'bho', 'bjj', 'bns', 'bra', 'gbm', 'guj', 'hne', 'kfg', 'kfy', 'mag', 'mar', 'mup', 'noe', 'pan', 'raj'] Classification s2s [Social, Spoken] {'test': 1152} {'test': 583.82}
HindiDiscourseClassification ['hin'] Classification s2s [Fiction, Social] {'train': 2048} {'train': 79.23828125}
HotelReviewSentimentClassification (Elnagar et al., 2018) ['ara'] Classification s2s [Reviews] {'train': 2048} {'train': 137.2}
HotpotQA ['eng'] Retrieval s2p
HotpotQA-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
HunSum2AbstractiveRetrieval (Botond Barta, 2024) ['hun'] Retrieval s2p [News] {'test': 1998} {'test': 2462.2177177177177}
IFlyTek ['cmn'] Classification s2s
IN22ConvBitextMining (Jay Gala, 2023) ['asm', 'ben', 'brx', 'doi', 'eng', 'gom', 'guj', 'hin', 'kan', 'kas', 'mai', 'mal', 'mar', 'mni', 'npi', 'ory', 'pan', 'san', 'sat', 'snd', 'tam', 'tel', 'urd'] BitextMining s2s [Social, Spoken, Fiction] {'test': 1503} {'test': 54.3}
IN22GenBitextMining (Jay Gala, 2023) ['asm', 'ben', 'brx', 'doi', 'eng', 'gom', 'guj', 'hin', 'kan', 'kas', 'mai', 'mal', 'mar', 'mni', 'npi', 'ory', 'pan', 'san', 'sat', 'snd', 'tam', 'tel', 'urd'] BitextMining s2s [Web, Legal, Government, News, Religious, Non-fiction] {'test': 1024} {'test': 156.7}
IWSLT2017BitextMining ['ara', 'cmn', 'deu', 'eng', 'fra', 'ita', 'jpn', 'kor', 'nld', 'ron'] BitextMining s2s [Non-fiction, Fiction] {'validation': 21928} {'validation': 95.4}
ImdbClassification ['eng'] Classification p2p [Reviews] {'test': 25000} {'test': 1293.8}
InappropriatenessClassification ['rus'] Classification s2s [Web, Social] {'test': 2048} {'test': 97.7}
IndicCrosslingualSTS (Ramesh et al., 2022) ['asm', 'ben', 'eng', 'guj', 'hin', 'kan', 'mal', 'mar', 'ory', 'pan', 'tam', 'tel', 'urd'] STS s2s [News, Non-fiction, Web, Spoken, Government] {'test': 10020} {'test': 76.22}
IndicGenBenchFloresBitextMining (Harman Singh, 2024) ['asm', 'awa', 'ben', 'bgc', 'bho', 'bod', 'boy', 'eng', 'gbm', 'gom', 'guj', 'hin', 'hne', 'kan', 'mai', 'mal', 'mar', 'mni', 'mup', 'mwr', 'nep', 'ory', 'pan', 'pus', 'raj', 'san', 'sat', 'tam', 'tel', 'urd'] BitextMining s2s [Web, News] {'validation': 997, 'test': 1012} {'validation': 126.25, 'test': 130.84}
IndicLangClassification ['asm', 'ben', 'brx', 'doi', 'gom', 'guj', 'hin', 'kan', 'kas', 'mai', 'mal', 'mar', 'mni', 'npi', 'ory', 'pan', 'san', 'sat', 'snd', 'tam', 'tel', 'urd'] Classification s2s [Web, Non-fiction] {'test': 30418} {'test': 106.5}
IndicNLPNewsClassification (Anoop Kunchukuttan, 2020) ['guj', 'kan', 'mal', 'mar', 'ori', 'pan', 'tam', 'tel'] Classification s2s [News] {'test': 2048} {'test': 1169.053974484789}
IndicQARetrieval (Sumanth Doddapaneni, 2022) ['asm', 'ben', 'guj', 'hin', 'kan', 'mal', 'mar', 'ory', 'pan', 'tam', 'tel'] Retrieval s2p [Web] {'test': 18586} {'test': 930.6}
IndicReviewsClusteringP2P (Sumanth Doddapaneni, 2022) ['asm', 'ben', 'brx', 'guj', 'hin', 'kan', 'mal', 'mar', 'ory', 'pan', 'tam', 'tel', 'urd'] Clustering p2p [Reviews] {'test': 1000} {'test': 137.6}
IndicSentimentClassification (Sumanth Doddapaneni, 2022) ['asm', 'ben', 'brx', 'guj', 'hin', 'kan', 'mal', 'mar', 'ory', 'pan', 'tam', 'tel', 'urd'] Classification s2s [Reviews] {'test': 1000} {'test': 137.6}
IndonesianIdClickbaitClassification ['ind'] Classification s2s [News] {'train': 2048} {'train': 64.28}
IndonesianMongabayConservationClassification ['ind'] Classification s2s [Web] {'validation': 984, 'test': 970} {'validation': 1675.8, 'test': 1675.5}
InsurancePolicyInterpretationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 133} {'test': 521.88}
InternationalCitizenshipQuestionsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2048} {'test': 206.18}
IsiZuluNewsClassification (Madodonga et al., 2023) ['zul'] Classification s2s [News] {'train': 752} {'train': 43.1}
ItaCaseholdClassification (Licari et al., 2023) ['ita'] Classification s2s [Legal, Government] {'test': 221} {'test': 4207.9}
Itacola ['ita'] Classification s2s [Non-fiction, Spoken] {'train': 7801, 'test': 975} {'train': 35.95, 'test': 36.67}
JCrewBlockerLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 54} {'test': 1092.22}
JDReview ['cmn'] Classification s2s
JSICK (Yanaka et al., 2022) ['jpn'] STS s2s [Web] {'test': 1986} {'test': 21.47}
JSTS ['jpn'] STS s2s [Web] {'validation': 1457} {'validation': 46.34}
JaGovFaqsRetrieval ['jpn'] Retrieval s2s [Web] {'test': 2048} {'test': 210.02}
JaQuADRetrieval (ByungHoon So, 2022) ['jpn'] Retrieval p2p [Encyclopaedic, Non-fiction] {'validation': 2048} {'validation': 400.75}
JavaneseIMDBClassification (Wongso et al., 2021) ['jav'] Classification s2s [Reviews] {'test': 25000} {'test': 481.83}
KLUE-NLI (Sungjoon Park, 2021) ['kor'] PairClassification s2s [News, Encyclopaedic] {'validation': 2000} {'validation': 35.01}
KLUE-STS (Sungjoon Park, 2021) ['kor'] STS s2s [Reviews, News, Spoken] {'validation': 519} {'validation': 33.178227360308284}
KLUE-TC (Sungjoon Park, 2021) ['kor'] Classification s2s [News] {'validation': 2048} {'validation': 27.079609091907326}
KannadaNewsClassification (Anoop Kunchukuttan, 2020) ['kan'] Classification s2s [News] {'train': 6460} {'train': 65.88}
KinopoiskClassification (Blinov et al., 2013) ['rus'] Classification p2p [Reviews] {'test': 1500} {'test': 1897.3}
Ko-StrategyQA (Geva et al., 2021) ['kor'] Retrieval s2p
Ko-miracl (Zhang et al., 2023) ['kor'] Retrieval s2p
KorFin (Son et al., 2023) ['kor'] Classification s2s [News] {'test': 2048} {'test': 75.28}
KorHateClassification (Jihyung Moon, 2020) ['kor'] Classification s2s [Social] {'train': 2048, 'test': 471} {'train': 38.57, 'test': 38.86}
KorHateSpeechMLClassification ['kor'] MultilabelClassification s2s [Social] {'train': 8192, 'test': 2048} {'train': 33.67, 'test': 34.67}
KorSTS (Ham et al., 2020) ['kor'] STS s2s [News, Web] {'test': 1379} {'test': 29.279433139534884}
KorSarcasmClassification (Kim et al., 2019) ['kor'] Classification s2s [Social] {'train': 2048, 'test': 301} {'train': 48.45, 'test': 46.77}
KurdishSentimentClassification (Badawi et al., 2024) ['kur'] Classification s2s [Web] {'train': 6000, 'test': 1987} {'train': 59.38, 'test': 56.11}
LCQMC ['cmn'] STS s2s
LEMBNarrativeQARetrieval ['eng'] Retrieval s2p [Fiction, Non-fiction] {'test': 10804} {'test': 326399.3}
LEMBNeedleRetrieval (Zhu et al., 2024) ['eng'] Retrieval s2p [Academic, Blog] {'test_256': 150, 'test_512': 150, 'test_1024': 150, 'test_2048': 150, 'test_4096': 150, 'test_8192': 150, 'test_16384': 150, 'test_32768': 150} {'test_256': 1074.4, 'test_512': 2067.0, 'test_1024': 4129.5, 'test_2048': 8513.4, 'test_4096': 17452.7, 'test_8192': 35261.6, 'test_16384': 72113.7, 'test_32768': 141829.0}
LEMBPasskeyRetrieval (Zhu et al., 2024) ['eng'] Retrieval s2p [Fiction] {'test_256': 150, 'test_512': 150, 'test_1024': 150, 'test_2048': 150, 'test_4096': 150, 'test_8192': 150, 'test_16384': 150, 'test_32768': 150} {'test_256': 914.9, 'test_512': 1823.0, 'test_1024': 3644.7, 'test_2048': 7280.0, 'test_4096': 14555.5, 'test_8192': 29108.1, 'test_16384': 58213.9, 'test_32768': 116417.9}
LEMBQMSumRetrieval ['eng'] Retrieval s2p [Spoken] {'test': 1724} {'test': 56136.4}
LEMBSummScreenFDRetrieval ['eng'] Retrieval s2p [Spoken] {'validation': 672} {'validation': 31445.8}
LEMBWikimQARetrieval (Ho et al., 2020) ['eng'] Retrieval s2p [Encyclopaedic] {'test': 500} {'test': 37513.0}
LanguageClassification (Conneau et al., 2018) ['ara', 'bul', 'cmn', 'deu', 'ell', 'eng', 'fra', 'hin', 'ita', 'jpn', 'nld', 'pol', 'por', 'rus', 'spa', 'swa', 'tha', 'tur', 'urd', 'vie'] Classification s2s [Reviews, Web, Non-fiction, Fiction, Government] {'test': 2048} {'test': 107.8}
LccSentimentClassification ['dan'] Classification s2s [News, Web] {'test': 150} {'test': 118.7}
LeCaRDv2 (Haitao Li, 2023) ['zho'] Retrieval p2p [Legal]
LearnedHandsBenefitsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 66} {'test': 1308.44}
LearnedHandsBusinessLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 174} {'test': 1144.51}
LearnedHandsConsumerLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 614} {'test': 1277.45}
LearnedHandsCourtsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 192} {'test': 1171.02}
LearnedHandsCrimeLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 688} {'test': 1212.9}
LearnedHandsDivorceLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 150} {'test': 1242.43}
LearnedHandsDomesticViolenceLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 174} {'test': 1360.83}
LearnedHandsEducationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 56} {'test': 1397.44}
LearnedHandsEmploymentLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 710} {'test': 1262.74}
LearnedHandsEstatesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 178} {'test': 1200.7}
LearnedHandsFamilyLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2048} {'test': 1338.27}
LearnedHandsHealthLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 226} {'test': 1472.59}
LearnedHandsHousingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2048} {'test': 1322.54}
LearnedHandsImmigrationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 134} {'test': 1216.31}
LearnedHandsTortsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 432} {'test': 1406.97}
LearnedHandsTrafficLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 556} {'test': 1182.91}
LegalBenchConsumerContractsQA (Koreeda et al., 2021) ['eng'] Retrieval s2p [Legal]
LegalBenchCorporateLobbying (Neel Guha, 2023) ['eng'] Retrieval s2p [Legal]
LegalBenchPC (Neel Guha, 2023) ['eng'] PairClassification s2s [Legal] {'test': 2048} {'test': 287.18}
LegalQuAD (Hoppe et al., 2021) ['deu'] Retrieval s2p [Legal]
LegalReasoningCausalityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 55} {'test': 1563.76}
LegalSummarization ['eng'] Retrieval s2p [Legal]
LinceMTBitextMining (Aguilar et al., 2020) ['eng', 'hin'] BitextMining s2s [Social] {'train': 8060} {'train': 58.67}
LivedoorNewsClustering ['jpn'] Clustering s2s [News] {'test': 1107} {'test': 1082.61}
MAUDLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2048} {'test': 1802.93}
MIRACLReranking (Zhang et al., 2023) ['ara', 'ben', 'deu', 'eng', 'fas', 'fin', 'fra', 'hin', 'ind', 'jpn', 'kor', 'rus', 'spa', 'swa', 'tel', 'tha', 'yor', 'zho'] Reranking s2s [Encyclopaedic] {'dev': 44608} {'dev': 506.3}
MIRACLRetrieval (Zhang et al., 2023) ['deu', 'spa'] Retrieval s2p
MLQARetrieval ['ara', 'deu', 'eng', 'hin', 'spa', 'vie', 'zho'] Retrieval s2p [Encyclopaedic] {'test': 158083, 'validation': 15747} {'test': 37352.28, 'validation': 36952.7}
MLQuestions ['eng'] Retrieval s2p [Encyclopaedic, Academic] {'dev': 1500, 'test': 1500} {'dev': 305.0, 'test': 307.0}
MLSUMClusteringP2P.v2 (Scialom et al., 2020) ['deu', 'fra', 'rus', 'spa'] Clustering p2p [News] {'validation': 2048, 'test': 2048} {'validation': 4613.0, 'test': 4810.0}
MLSUMClusteringS2S.v2 (Scialom et al., 2020) ['deu', 'fra', 'rus', 'spa'] Clustering s2s [News] {'validation': 2048, 'test': 2048} {'validation': 4613.0, 'test': 4810.0}
MMarcoReranking (Luiz Henrique Bonifacio, 2021) ['cmn'] Reranking s2s
MMarcoRetrieval (Shitao Xiao, 2024) ['cmn'] Retrieval s2p
MSMARCO (Nguyen et al., 2016) ['eng'] Retrieval s2p
MSMARCO-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
MSMARCOv2 (Nguyen et al., 2016) ['eng'] Retrieval s2p
MTOPDomainClassification ['deu', 'eng', 'fra', 'hin', 'spa', 'tha'] Classification s2s [Spoken] {'validation': 2235, 'test': 4386} {'validation': 36.5, 'test': 36.8}
MTOPIntentClassification ['deu', 'eng', 'fra', 'hin', 'spa', 'tha'] Classification s2s [Spoken] {'validation': 2235, 'test': 4386} {'validation': 36.5, 'test': 36.8}
MacedonianTweetSentimentClassification ['mkd'] Classification s2s [Social] {'test': 1139} {'test': 67.6}
MalayalamNewsClassification (Anoop Kunchukuttan, 2020) ['mal'] Classification s2s [News] {'train': 5036, 'test': 1260} {'train': 79.48, 'test': 80.44}
MalteseNewsClassification ['mlt'] MultilabelClassification s2s [Constructed] {'train': 10784, 'test': 2297} {'train': 1595.63, 'test': 1752.1}
MarathiNewsClassification (Anoop Kunchukuttan, 2020) ['mar'] Classification s2s [News] {'test': 2048} {'test': 52.37}
MasakhaNEWSClassification (David Ifeoluwa Adelani, 2023) ['amh', 'eng', 'fra', 'hau', 'ibo', 'lin', 'lug', 'orm', 'pcm', 'run', 'sna', 'som', 'swa', 'tir', 'xho', 'yor'] Classification s2s [News] {'test': 422} {'test': 5116.6}
MasakhaNEWSClusteringP2P (David Ifeoluwa Adelani, 2023) ['amh', 'eng', 'fra', 'hau', 'ibo', 'lin', 'lug', 'orm', 'pcm', 'run', 'sna', 'som', 'swa', 'tir', 'xho', 'yor'] Clustering p2p
MasakhaNEWSClusteringS2S (David Ifeoluwa Adelani, 2023) ['amh', 'eng', 'fra', 'hau', 'ibo', 'lin', 'lug', 'orm', 'pcm', 'run', 'sna', 'som', 'swa', 'tir', 'xho', 'yor'] Clustering s2s
MassiveIntentClassification (Jack FitzGerald, 2022) ['afr', 'amh', 'ara', 'aze', 'ben', 'cmo', 'cym', 'dan', 'deu', 'ell', 'eng', 'fas', 'fin', 'fra', 'heb', 'hin', 'hun', 'hye', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kan', 'kat', 'khm', 'kor', 'lav', 'mal', 'mon', 'msa', 'mya', 'nld', 'nob', 'pol', 'por', 'ron', 'rus', 'slv', 'spa', 'sqi', 'swa', 'swe', 'tam', 'tel', 'tgl', 'tha', 'tur', 'urd', 'vie'] Classification s2s [Spoken] {'validation': 2033, 'test': 2974} {'validation': 34.8, 'test': 34.6}
MassiveScenarioClassification (Jack FitzGerald, 2022) ['afr', 'amh', 'ara', 'aze', 'ben', 'cmo', 'cym', 'dan', 'deu', 'ell', 'eng', 'fas', 'fin', 'fra', 'heb', 'hin', 'hun', 'hye', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kan', 'kat', 'khm', 'kor', 'lav', 'mal', 'mon', 'msa', 'mya', 'nld', 'nob', 'pol', 'por', 'ron', 'rus', 'slv', 'spa', 'sqi', 'swa', 'swe', 'tam', 'tel', 'tgl', 'tha', 'tur', 'urd', 'vie'] Classification s2s [Spoken] {'validation': 2033, 'test': 2974} {'validation': 34.8, 'test': 34.6}
MedicalQARetrieval (Asma et al., 2019) ['eng'] Retrieval s2s [Medical] {'test': 2048} {'test': 1205.9619140625}
MedicalRetrieval ['cmn'] Retrieval s2p
MedrxivClusteringP2P.v2 ['eng'] Clustering p2p [Academic, Medical] {'test': 1500} {'test': 1984.7}
MedrxivClusteringS2S.v2 ['eng'] Clustering s2s [Academic, Medical] {'test': 1500} {'test': 114.9}
MewsC16JaClustering ['jpn'] Clustering s2s [News] {'test': 992} {'test': 95.0}
MindSmallReranking ['eng'] Reranking s2s [News] {'test': 107968} {'test': 70.9}
MintakaRetrieval ['ara', 'deu', 'fra', 'hin', 'ita', 'jpn', 'por', 'spa'] Retrieval s2p
Moroco (Andrei M. Butnaru, 2019) ['ron'] Classification s2s [News] {'test': 2048} {'test': 1710.94}
MovieReviewSentimentClassification (Théophile Blard, 2020) ['fra'] Classification s2s [Reviews] {'validation': 1024, 'test': 1024} {'validation': 550.3, 'test': 558.1}
MultiEURLEXMultilabelClassification (Chalkidis et al., 2021) ['bul', 'ces', 'dan', 'deu', 'ell', 'eng', 'est', 'fin', 'fra', 'hrv', 'hun', 'ita', 'lav', 'lit', 'mlt', 'nld', 'pol', 'por', 'ron', 'slk', 'slv', 'spa', 'swe'] MultilabelClassification p2p [Legal, Government] {'test': 5000} {'test': 12014.41}
MultiHateClassification ['ara', 'cmn', 'deu', 'eng', 'fra', 'hin', 'ita', 'nld', 'pol', 'por', 'spa'] Classification s2s [Constructed] {'test': 10000} {'test': 45.9}
MultiLongDocRetrieval (Jianlv Chen, 2024) ['ara', 'cmn', 'deu', 'eng', 'fra', 'hin', 'ita', 'jpn', 'kor', 'por', 'rus', 'spa', 'tha'] Retrieval s2p
MultilingualSentiment ['cmn'] Classification s2s
MultilingualSentimentClassification ['ara', 'bam', 'bul', 'cmn', 'cym', 'deu', 'dza', 'ell', 'eng', 'eus', 'fas', 'fin', 'heb', 'hrv', 'ind', 'jpn', 'kor', 'mlt', 'nor', 'pol', 'rus', 'slk', 'spa', 'tha', 'tur', 'uig', 'urd', 'vie', 'zho'] Classification s2s [Reviews] {'test': 7000} {'test': 56.0}
MyanmarNews (A. H. Khine, 2017) ['mya'] Classification p2p [News] {'train': 2048} {'train': 174.2}
NFCorpus (Boteva et al., 2016) ['eng'] Retrieval s2p
NFCorpus-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
NLPJournalAbsIntroRetrieval ['jpn'] Retrieval s2s [Academic] {'test': 404} {'test': 1246.49}
NLPJournalTitleAbsRetrieval ['jpn'] Retrieval s2s [Academic] {'test': 404} {'test': 234.59}
NLPJournalTitleIntroRetrieval ['jpn'] Retrieval s2s [Academic] {'test': 404} {'test': 1040.19}
NQ (Tom Kwiatkowski, 2019) ['eng'] Retrieval s2p
NQ-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
NTREXBitextMining ['afr', 'amh', 'arb', 'aze', 'bak', 'bel', 'bem', 'ben', 'bod', 'bos', 'bul', 'cat', 'ces', 'ckb', 'cym', 'dan', 'deu', 'div', 'dzo', 'ell', 'eng', 'eus', 'ewe', 'fao', 'fas', 'fij', 'fil', 'fin', 'fra', 'fuc', 'gle', 'glg', 'guj', 'hau', 'heb', 'hin', 'hmn', 'hrv', 'hun', 'hye', 'ibo', 'ind', 'isl', 'ita', 'jpn', 'kan', 'kat', 'kaz', 'khm', 'kin', 'kir', 'kmr', 'kor', 'lao', 'lav', 'lit', 'ltz', 'mal', 'mar', 'mey', 'mkd', 'mlg', 'mlt', 'mon', 'mri', 'msa', 'mya', 'nde', 'nep', 'nld', 'nno', 'nob', 'nso', 'nya', 'orm', 'pan', 'pol', 'por', 'prs', 'pus', 'ron', 'rus', 'shi', 'sin', 'slk', 'slv', 'smo', 'sna', 'snd', 'som', 'spa', 'sqi', 'srp', 'ssw', 'swa', 'swe', 'tah', 'tam', 'tat', 'tel', 'tgk', 'tha', 'tir', 'ton', 'tsn', 'tuk', 'tur', 'uig', 'ukr', 'urd', 'uzb', 'ven', 'vie', 'wol', 'xho', 'yor', 'yue', 'zho', 'zul'] BitextMining s2s [News] {'test': 3826252} {'test': 120.0}
NYSJudicialEthicsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 292} {'test': 159.45}
NaijaSenti ['hau', 'ibo', 'pcm', 'yor'] Classification s2s [Social] {'test': 4800} {'test': 72.81}
NarrativeQARetrieval (Tomáš Kočiský, 2017) ['eng'] Retrieval s2p
NepaliNewsClassification ['nep'] Classification s2s [News] {'train': 5975, 'test': 1495} {'train': 196.61, 'test': 196.017}
NeuCLIR2022Retrieval (Lawrie et al., 2023) ['fas', 'rus', 'zho'] Retrieval s2p [News] {'fas': 2232130, 'zho': 3179323, 'rus': 4627657} {'fas': 3500.5143969099317, 'zho': 2543.1140667919617, 'rus': 3214.755239654659}
NeuCLIR2023Retrieval (Dawn Lawrie, 2024) ['fas', 'rus', 'zho'] Retrieval s2p [News] {'fas': 2232092, 'zho': 3179285, 'rus': 4627619} {'fas': 3579.508213937439, 'zho': 2704.44834488453, 'rus': 3466.8192213553616}
News21InstructionRetrieval (Orion Weller, 2024) ['eng'] InstructionRetrieval s2p [News] {'eng': 61906} {'eng': 2983.724665391969}
NewsClassification (Zhang et al., 2015) ['eng'] Classification s2s [News] {'test': 7600} {'test': 235.29}
NoRecClassification ['nob'] Classification s2s {'test': 2050} {'test': 82.0}
NollySentiBitextMining (Shode et al., 2023) ['eng', 'hau', 'ibo', 'pcm', 'yor'] BitextMining s2s [Social, Reviews] {'train': 1640} {'train': 135.91}
NorQuadRetrieval ['nob'] Retrieval p2p [Encyclopaedic, Non-fiction] {'test': 2602} {'test': 502.19}
NordicLangClassification ['dan', 'fao', 'isl', 'nno', 'nob', 'swe'] Classification s2s {'test': 3000} {'test': 78.2}
NorwegianCourtsBitextMining (Tiedemann et al., 2020) ['nno', 'nob'] BitextMining s2s [Legal] {'test': 2050} {'test': 1884.0}
NorwegianParliamentClassification ['nob'] Classification s2s {'test': 1200, 'validation': 1200} {'test': 1884.0, 'validation': 1911.0}
NusaParagraphEmotionClassification ['bbc', 'bew', 'bug', 'jav', 'mad', 'mak', 'min', 'mui', 'rej', 'sun'] Classification s2s [Non-fiction, Fiction] {'train': 15516, 'validation': 2948, 'test': 6250} {'train': 740.24, 'validation': 740.66, 'test': 740.71}
NusaParagraphTopicClassification ['bbc', 'bew', 'bug', 'jav', 'mad', 'mak', 'min', 'mui', 'rej', 'sun'] Classification s2s [Non-fiction, Fiction] {'train': 15516, 'validation': 2948, 'test': 6250} {'train': 740.24, 'validation': 740.66, 'test': 740.71}
NusaTranslationBitextMining (Cahyawijaya et al., 2023) ['abs', 'bbc', 'bew', 'bhp', 'ind', 'jav', 'mad', 'mak', 'min', 'mui', 'rej', 'sun'] BitextMining s2s [Social] {'train': 50200} {'train': 147.01}
NusaX-senti (Winata et al., 2022) ['ace', 'ban', 'bbc', 'bjn', 'bug', 'eng', 'ind', 'jav', 'mad', 'min', 'nij', 'sun'] Classification s2s [Reviews, Web, Social, Constructed] {'test': 4800} {'test': 52.4}
NusaXBitextMining (Winata et al., 2023) ['ace', 'ban', 'bbc', 'bjn', 'bug', 'eng', 'ind', 'jav', 'mad', 'min', 'nij', 'sun'] BitextMining s2s [Reviews] {'train': 5500} {'train': 157.15}
OPP115DataRetentionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 88} {'test': 195.2}
OPP115DataSecurityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1334} {'test': 246.69}
OPP115DoNotTrackLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 110} {'test': 223.16}
OPP115FirstPartyCollectionUseLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2086} {'test': 204.25}
OPP115InternationalAndSpecificAudiencesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 980} {'test': 327.71}
OPP115PolicyChangeLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 431} {'test': 200.99}
OPP115ThirdPartySharingCollectionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1590} {'test': 223.64}
OPP115UserAccessEditAndDeletionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 462} {'test': 218.59}
OPP115UserChoiceControlLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 1546} {'test': 210.62}
Ocnli (Hai Hu, 2020) ['cmn'] PairClassification s2s
OdiaNewsClassification (Anoop Kunchukuttan, 2020) ['ory'] Classification s2s [News] {'test': 2048} {'test': 49.24}
OnlineShopping ['cmn'] Classification s2s
OnlineStoreReviewSentimentClassification ['ara'] Classification s2s [Reviews] {'train': 2048} {'train': 137.2}
OpusparcusPC (Mathias Creutz, 2018) ['deu', 'eng', 'fin', 'fra', 'rus', 'swe'] PairClassification s2s
OralArgumentQuestionPurposeLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 312} {'test': 269.71}
OverrulingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2048} {'test': 167.2}
PAC ['pol'] Classification p2p {'test': 3453} {'test': 185.3}
PAWSX ['cmn'] STS s2s
PIQA (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 1838} {'test': 134.3}
PROALegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 95} {'test': 251.73}
PSC ['pol'] PairClassification s2s
PatentClassification ['eng'] Classification s2s [Legal] {'test': 5000} {'test': 18620.44}
PawsX (Yinfei Yang, 2019) ['cmn', 'deu', 'eng', 'fra', 'jpn', 'kor', 'spa'] PairClassification s2s
PersianFoodSentimentClassification (Mehrdad Farahani et al., 2020) ['fas'] Classification s2s [Reviews] {'validation': 2048, 'test': 2048} {'validation': 90.37, 'test': 90.58}
PersonalJurisdictionLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 50} {'test': 381.14}
PhincBitextMining (Srivastava et al., 2020) ['eng', 'hin'] BitextMining s2s [Social] {'train': 13738} {'train': 75.32}
PlscClusteringP2P.v2 ['pol'] Clustering s2s [Academic] {'test': 2048} {'test': 1023.21}
PlscClusteringS2S.v2 ['pol'] Clustering s2s [Academic] {'test': 2048} {'test': 84.34}
PoemSentimentClassification (Emily Sheng, 2020) ['eng'] Classification s2s [Reviews] {'validation': 105, 'test': 104} {'validation': 45.3, 'test': 42.4}
PolEmo2.0-IN ['pol'] Classification s2s
PolEmo2.0-OUT ['pol'] Classification s2s {'test': 722} {'test': 756.2}
PpcPC (Sławomir Dadas, 2022) ['pol'] PairClassification s2s
PublicHealthQA ['ara', 'eng', 'fra', 'kor', 'rus', 'spa', 'vie', 'zho'] Retrieval s2p [Medical, Government, Web] {'test': 888} {'test': 778.1655}
PunjabiNewsClassification (Anoop Kunchukuttan, 2020) ['pan'] Classification s2s [News] {'train': 627, 'test': 157} {'train': 4222.22, 'test': 4115.14}
QBQTC ['cmn'] STS s2s
Quail (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 2720} {'test': 1983.3}
Quora-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2s
QuoraRetrieval (DataCanary et al., 2017) ['eng'] Retrieval s2s
RARbCode (Xiao et al., 2024) ['eng'] Retrieval s2p [Programming] {'test': 1484} {'test': 621.2}
RARbMath (Xiao et al., 2024) ['eng'] Retrieval s2p [Encyclopaedic] {'test': 6319} {'test': 682.9}
RTE3 ['deu', 'eng', 'fra', 'ita'] PairClassification s2s [News, Web, Encyclopaedic] {'test': 1923} {'test': 124.79}
RUParaPhraserSTS (Pivovarova et al., 2017) ['rus'] STS s2s [News] {'test': 1924} {'test': 61.25}
RedditClustering.v2 (Gregor Geigle, 2021) ['eng'] Clustering s2s [Web, Social] {'test': 32768} {'test': 64.7}
RedditClusteringP2P.v2 (Gregor Geigle, 2021) ['eng'] Clustering p2p [Web, Social] {'test': 18375} {'test': 727.7}
RestaurantReviewSentimentClassification (ElSahar et al., 2015) ['ara'] Classification s2s [Reviews] {'train': 2048} {'train': 231.4}
RiaNewsRetrieval (Gavrilov et al., 2019) ['rus'] Retrieval s2p [News] {'test': 10000} {'test': 1230.8}
Robust04InstructionRetrieval (Orion Weller, 2024) ['eng'] InstructionRetrieval s2p [News] {'eng': 95088} {'eng': 2471.0398058252426}
RomaTalesBitextMining ['hun', 'rom'] BitextMining s2s [Fiction] {'test': 215} {'test': 316.8046511627907}
RomaniBibleClustering ['rom'] Clustering p2p [Religious] {'test': 2048} {'test': 132.2}
RomanianReviewsSentiment (Anca Maria Tache, 2021) ['ron'] Classification s2s [Reviews] {'test': 2048} {'test': 588.6}
RomanianSentimentClassification (Dumitrescu et al., 2020) ['ron'] Classification s2s [Reviews] {'test': 2048} {'test': 67.6}
RonSTS (Dumitrescu et al., 2021) ['ron'] STS s2s [News, Social, Web] {'test': 1379} {'test': 60.5}
RuBQReranking (Ivan Rybin, 2021) ['rus'] Reranking s2p [Encyclopaedic] {'test': 1551} {'test': 499.9}
RuBQRetrieval (Ivan Rybin, 2021) ['rus'] Retrieval s2p [Encyclopaedic] {'test': 2845} {'test': 509.5}
RuReviewsClassification (Sergey Smetanin, 2019) ['rus'] Classification p2p [Reviews] {'test': 2048} {'test': 133.2}
RuSTSBenchmarkSTS (Philip May, 2021) ['rus'] STS s2s [News, Social, Web] {'test': 1264} {'test': 54.2}
RuSciBenchGRNTIClassification ['rus'] Classification p2p [Academic] {'test': 2048} {'test': 890.1}
RuSciBenchGRNTIClusteringP2P ['rus'] Clustering p2p [Academic] {'test': 2048} {'test': 890.1}
RuSciBenchOECDClassification ['rus'] Classification p2p [Academic] {'test': 2048} {'test': 838.9}
RuSciBenchOECDClusteringP2P ['rus'] Clustering p2p [Academic] {'test': 2048} {'test': 838.9}
SCDBPAccountabilityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3520.0}
SCDBPAuditsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3507.0}
SCDBPCertificationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 378} {'test': 3507.0}
SCDBPTrainingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3506.0}
SCDBPVerificationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3498.0}
SCDDAccountabilityLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 378} {'test': 3522.0}
SCDDAuditsLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3506.0}
SCDDCertificationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 378} {'test': 3518.0}
SCDDTrainingLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3499.0}
SCDDVerificationLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 379} {'test': 3503.0}
SCIDOCS (Arman Cohan, 2020) ['eng'] Retrieval s2p
SCIDOCS-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
SIB200Classification (Adelani et al., 2023) ['ace', 'acm', 'acq', 'aeb', 'afr', 'ajp', 'aka', 'als', 'amh', 'apc', 'arb', 'ars', 'ary', 'arz', 'asm', 'ast', 'awa', 'ayr', 'azb', 'azj', 'bak', 'bam', 'ban', 'bel', 'bem', 'ben', 'bho', 'bjn', 'bod', 'bos', 'bug', 'bul', 'cat', 'ceb', 'ces', 'cjk', 'ckb', 'crh', 'cym', 'dan', 'deu', 'dik', 'dyu', 'dzo', 'ell', 'eng', 'epo', 'est', 'eus', 'ewe', 'fao', 'fij', 'fin', 'fon', 'fra', 'fur', 'fuv', 'gaz', 'gla', 'gle', 'glg', 'grn', 'guj', 'hat', 'hau', 'heb', 'hin', 'hne', 'hrv', 'hun', 'hye', 'ibo', 'ilo', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kab', 'kac', 'kam', 'kan', 'kas', 'kat', 'kaz', 'kbp', 'kea', 'khk', 'khm', 'kik', 'kin', 'kir', 'kmb', 'kmr', 'knc', 'kon', 'kor', 'lao', 'lij', 'lim', 'lin', 'lit', 'lmo', 'ltg', 'ltz', 'lua', 'lug', 'luo', 'lus', 'lvs', 'mag', 'mai', 'mal', 'mar', 'min', 'mkd', 'mlt', 'mni', 'mos', 'mri', 'mya', 'nld', 'nno', 'nob', 'npi', 'nqo', 'nso', 'nus', 'nya', 'oci', 'ory', 'pag', 'pan', 'pap', 'pbt', 'pes', 'plt', 'pol', 'por', 'prs', 'quy', 'ron', 'run', 'rus', 'sag', 'san', 'sat', 'scn', 'shn', 'sin', 'slk', 'slv', 'smo', 'sna', 'snd', 'som', 'sot', 'spa', 'srd', 'srp', 'ssw', 'sun', 'swe', 'swh', 'szl', 'tam', 'taq', 'tat', 'tel', 'tgk', 'tgl', 'tha', 'tir', 'tpi', 'tsn', 'tso', 'tuk', 'tum', 'tur', 'twi', 'tzm', 'uig', 'ukr', 'umb', 'urd', 'uzn', 'vec', 'vie', 'war', 'wol', 'xho', 'ydd', 'yor', 'yue', 'zho', 'zsm', 'zul'] Classification s2s [News] {'train': 701, 'validation': 99, 'test': 204} {'train': 111.24, 'validation': 97.11, 'test': 135.53}
SIB200ClusteringS2S (Adelani et al., 2023) ['ace', 'acm', 'acq', 'aeb', 'afr', 'ajp', 'aka', 'als', 'amh', 'apc', 'arb', 'ars', 'ary', 'arz', 'asm', 'ast', 'awa', 'ayr', 'azb', 'azj', 'bak', 'bam', 'ban', 'bel', 'bem', 'ben', 'bho', 'bjn', 'bod', 'bos', 'bug', 'bul', 'cat', 'ceb', 'ces', 'cjk', 'ckb', 'crh', 'cym', 'dan', 'deu', 'dik', 'dyu', 'dzo', 'ell', 'eng', 'epo', 'est', 'eus', 'ewe', 'fao', 'fij', 'fin', 'fon', 'fra', 'fur', 'fuv', 'gaz', 'gla', 'gle', 'glg', 'grn', 'guj', 'hat', 'hau', 'heb', 'hin', 'hne', 'hrv', 'hun', 'hye', 'ibo', 'ilo', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kab', 'kac', 'kam', 'kan', 'kas', 'kat', 'kaz', 'kbp', 'kea', 'khk', 'khm', 'kik', 'kin', 'kir', 'kmb', 'kmr', 'knc', 'kon', 'kor', 'lao', 'lij', 'lim', 'lin', 'lit', 'lmo', 'ltg', 'ltz', 'lua', 'lug', 'luo', 'lus', 'lvs', 'mag', 'mai', 'mal', 'mar', 'min', 'mkd', 'mlt', 'mni', 'mos', 'mri', 'mya', 'nld', 'nno', 'nob', 'npi', 'nqo', 'nso', 'nus', 'nya', 'oci', 'ory', 'pag', 'pan', 'pap', 'pbt', 'pes', 'plt', 'pol', 'por', 'prs', 'quy', 'ron', 'run', 'rus', 'sag', 'san', 'sat', 'scn', 'shn', 'sin', 'slk', 'slv', 'smo', 'sna', 'snd', 'som', 'sot', 'spa', 'srd', 'srp', 'ssw', 'sun', 'swe', 'swh', 'szl', 'tam', 'taq', 'tat', 'tel', 'tgk', 'tgl', 'tha', 'tir', 'tpi', 'tsn', 'tso', 'tuk', 'tum', 'tur', 'twi', 'tzm', 'uig', 'ukr', 'umb', 'urd', 'uzn', 'vec', 'vie', 'war', 'wol', 'xho', 'ydd', 'yor', 'yue', 'zho', 'zsm', 'zul'] Clustering s2s [News] {'test': 1004} {'test': 114.78}
SICK-BR-PC ['por'] PairClassification s2s [Web] {'test': 1000} {'test': 54.89}
SICK-BR-STS ['por'] STS s2s [Web] {'test': 1000} {'test': 54.89}
SICK-E-PL ['pol'] PairClassification s2s
SICK-R ['eng'] STS s2s
SICK-R-PL ['pol'] STS s2s [Web] {'test': 9812} {'test': 42.8}
SICKFr ['fra'] STS s2s
SIQA (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 0} {'test': 0.0}
SNLHierarchicalClusteringP2P (Navjord et al., 2023) ['nob'] Clustering p2p [Encyclopaedic, Non-fiction] {'test': 1300} {'test': 1986.9453846153847}
SNLHierarchicalClusteringS2S (Navjord et al., 2023) ['nob'] Clustering s2s [Encyclopaedic, Non-fiction] {'test': 1300} {'test': 242.22384615384615}
SNLRetrieval (Navjord et al., 2023) ['nob'] Retrieval p2p [Encyclopaedic, Non-fiction] {'test': 2048} {'test': 1101.3}
SRNCorpusBitextMining (Zwennicker et al., 2022) ['nld', 'srn'] BitextMining s2s [Social, Web] {'test': 256} {'test': 55.0}
STS12 (Agirre et al., 2012) ['eng'] STS s2s [Encyclopaedic, News] {'test': 6216} {'test': 64.7}
STS13 (Eneko Agirre, 2013) ['eng'] STS s2s [Web, News, Non-fiction] {'test': 3000} {'test': 54.0}
STS14 ['eng'] STS s2s [Blog, Web] {'test': 7500} {'test': 54.3}
STS15 ['eng'] STS s2s [Blog, News, Web] {'test': 6000} {'test': 57.7}
STS16 ['eng'] STS s2s [Blog, Web] {'test': 2372} {'test': 65.3}
STS17 ['ara', 'deu', 'eng', 'fra', 'ita', 'kor', 'nld', 'spa', 'tur'] STS s2s [News, Web] {'test': 500} {'test': 43.3}
STS22 ['ara', 'cmn', 'deu', 'eng', 'fra', 'ita', 'pol', 'rus', 'spa', 'tur'] STS p2p [News] {'test': 8056} {'test': 1993.6}
STSB ['cmn'] STS s2s
STSBenchmark (Philip May, 2021) ['eng'] STS s2s
STSBenchmarkMultilingualSTS (Philip May, 2021) ['cmn', 'deu', 'eng', 'fra', 'ita', 'nld', 'pol', 'por', 'rus', 'spa'] STS s2s [News, Social, Web] {'dev': 30000, 'test': 27580} {'dev': 66.5, 'test': 56.1}
STSES (Agirre et al., 2015) ['spa'] STS s2s
SanskritShlokasClassification ['san'] Classification s2s [Religious] {'train': 383, 'validation': 96} {'train': 98.415, 'validation': 96.635}
ScalaClassification ['dan', 'nno', 'nob', 'swe'] Classification s2s [Fiction, News, Non-fiction, Blog, Spoken, Web] {'test': 4096} {'test': 102.72}
SciDocsRR ['eng'] Reranking s2s [Academic, Non-fiction] {'test': 19599} {'test': 69.0}
SciFact (Arman Cohan, 2020) ['eng'] Retrieval s2p
SciFact-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
SemRel24STS (Nedjma Ousidhoum, 2024) ['afr', 'amh', 'arb', 'arq', 'ary', 'eng', 'hau', 'hin', 'ind', 'kin', 'mar', 'tel'] STS s2s {'dev': 2089, 'test': 7498} {'dev': 163.1, 'test': 145.9}
SensitiveTopicsClassification ['rus'] MultilabelClassification s2s [Web, Social] {'test': 2048} {'test': 95.3}
SentimentAnalysisHindi (Shantipriya Parida, 2023) ['hin'] Classification s2s [Reviews] {'train': 2497} {'train': 81.29}
SinhalaNewsClassification (Nisansa de Silva, 2015) ['sin'] Classification s2s [News] {'train': 3327} {'train': 148.04}
SinhalaNewsSourceClassification (Dhananjaya et al., 2022) ['sin'] Classification s2s [News] {'train': 24094} {'train': 56.08}
SiswatiNewsClassification (Madodonga et al., 2023) ['ssw'] Classification s2s [News] {'train': 80} {'train': 354.2}
SlovakMovieReviewSentimentClassification (Štefánik et al., 2023) ['svk'] Classification s2s [Reviews] {'test': 2048} {'test': 366.17}
SlovakSumRetrieval ['slk'] Retrieval s2s [News, Social, Web] {'test': 600} {'test': 238.44}
SouthAfricanLangClassification (ExploreAI Academy et al., 2022) ['afr', 'eng', 'nbl', 'nso', 'sot', 'ssw', 'tsn', 'tso', 'ven', 'xho', 'zul'] Classification s2s [Web, Non-fiction] {'test': 2048} {'test': 247.49}
SpanishNewsClassification ['spa'] Classification s2s [News] {'train': 2048} {'train': 4218.2}
SpanishNewsClusteringP2P ['spa'] Clustering p2p
SpanishPassageRetrievalS2P ['spa'] Retrieval s2p
SpanishPassageRetrievalS2S ['spa'] Retrieval s2s
SpanishSentimentClassification ['spa'] Classification s2s [Reviews] {'validation': 147, 'test': 296} {'validation': 85.02, 'test': 87.91}
SpartQA (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 0} {'test': 0.0}
SprintDuplicateQuestions ['eng'] PairClassification s2s {'validation': 101000, 'test': 101000} {'validation': 65.2, 'test': 67.9}
StackExchangeClustering.v2 (Gregor Geigle, 2021) ['eng'] Clustering s2s [Web] {'test': 32768} {'test': 57.0}
StackExchangeClusteringP2P.v2 (Gregor Geigle, 2021) ['eng'] Clustering p2p [Web] {'test': 2996} {'test': 1090.7}
StackOverflowDupQuestions (Xueqing Liu, 2018) ['eng'] Reranking s2s {'test': 3467} {'test': 49.8}
StatcanDialogueDatasetRetrieval ['eng', 'fra'] Retrieval s2p [Government, Web] {'dev': 1000, 'test': 1011, 'corpus': 5907} {'dev': 776.58, 'test': 857.13, 'corpus': 6806.97}
SummEval (Fabbri et al., 2020) ['eng'] Summarization p2p {'test': 2800} {'test': 359.8}
SummEvalFr (Fabbri et al., 2020) ['fra'] Summarization p2p
SweFaqRetrieval ['swe'] Retrieval s2s [Government, Non-fiction] {'test': 1024} {'test': 195.44}
SweRecClassification ['swe'] Classification s2s {'test': 1024} {'test': 318.8}
SwedishSentimentClassification ['swe'] Classification s2s [Reviews] {'validation': 1024, 'test': 1024} {'validation': 499.3, 'test': 498.1}
SwednClusteringP2P (Monsen et al., 2021) ['swe'] Clustering p2p [News, Non-fiction] {'all': 2048} {'all': 1619.71}
SwednClusteringS2S (Monsen et al., 2021) ['swe'] Clustering s2s [News, Non-fiction] {'all': 2048} {'all': 1619.71}
SwednRetrieval (Monsen et al., 2021) ['swe'] Retrieval p2p [News, Non-fiction] {'test': 2048} {'test': 1946.35}
SwissJudgementClassification (Joel Niklaus, 2022) ['deu', 'fra', 'ita'] Classification s2s [Legal] {'test': 2048} {'test': 3411.72}
SyntecReranking (Mathieu Ciancone, 2024) ['fra'] Reranking s2p [Legal]
SyntecRetrieval (Mathieu Ciancone, 2024) ['fra'] Retrieval s2p [Legal] {'test': 90} {'test': 62.0}
T2Reranking (Xiaohui Xie, 2023) ['cmn'] Reranking s2s
T2Retrieval (Xiaohui Xie, 2023) ['cmn'] Retrieval s2p
TERRa (Shavrina et al., 2020) ['rus'] PairClassification s2s [News, Web] {'dev': 307} {'dev': 138.2}
TNews ['cmn'] Classification s2s
TRECCOVID (Kirk Roberts, 2021) ['eng'] Retrieval s2p
TRECCOVID-PL (Konrad Wojtasik, 2024) ['pol'] Retrieval s2p
TV2Nordretrieval ['dan'] Retrieval p2p [News, Non-fiction] {'test': 4096} {'test': 784.11}
TamilNewsClassification (Anoop Kunchukuttan, 2020) ['tam'] Classification s2s [News] {'train': 14521, 'test': 3631} {'train': 56.5, 'test': 56.52}
Tatoeba (Tatoeba community, 2021) ['afr', 'amh', 'ang', 'ara', 'arq', 'arz', 'ast', 'awa', 'aze', 'bel', 'ben', 'ber', 'bos', 'bre', 'bul', 'cat', 'cbk', 'ceb', 'ces', 'cha', 'cmn', 'cor', 'csb', 'cym', 'dan', 'deu', 'dsb', 'dtp', 'ell', 'eng', 'epo', 'est', 'eus', 'fao', 'fin', 'fra', 'fry', 'gla', 'gle', 'glg', 'gsw', 'heb', 'hin', 'hrv', 'hsb', 'hun', 'hye', 'ido', 'ile', 'ina', 'ind', 'isl', 'ita', 'jav', 'jpn', 'kab', 'kat', 'kaz', 'khm', 'kor', 'kur', 'kzj', 'lat', 'lfn', 'lit', 'lvs', 'mal', 'mar', 'max', 'mhr', 'mkd', 'mon', 'nds', 'nld', 'nno', 'nob', 'nov', 'oci', 'orv', 'pam', 'pes', 'pms', 'pol', 'por', 'ron', 'rus', 'slk', 'slv', 'spa', 'sqi', 'srp', 'swe', 'swg', 'swh', 'tam', 'tat', 'tel', 'tgl', 'tha', 'tuk', 'tur', 'tzl', 'uig', 'ukr', 'urd', 'uzb', 'vie', 'war', 'wuu', 'xho', 'yid', 'yue', 'zsm'] BitextMining s2s {'test': 2000} {'test': 39.4}
TbilisiCityHallBitextMining ['eng', 'kat'] BitextMining s2s [News] {'test': 1820} {'test': 78.0}
TelemarketingSalesRuleLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 47} {'test': 348.29}
TeluguAndhraJyotiNewsClassification ['tel'] Classification s2s [News] {'test': 4329} {'test': 1428.28}
TempReasonL1 (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 4000} {'test': 59.2}
TempReasonL2Context (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 0} {'test': 0.0}
TempReasonL2Fact (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 5397} {'test': 854.8}
TempReasonL2Pure (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 5397} {'test': 80.0}
TempReasonL3Context (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 4426} {'test': 13448.4}
TempReasonL3Fact (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 4426} {'test': 919.9}
TempReasonL3Pure (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 4426} {'test': 98.2}
TenKGnadClassification ['deu'] Classification p2p [News] {'test': 1028} {'test': 2627.31}
TenKGnadClusteringP2P.v2 ['deu'] Clustering p2p [News, Non-fiction] {'test': 10275} {'test': 2641.03}
TenKGnadClusteringS2S.v2 ['deu'] Clustering s2s [News, Non-fiction] {'test': 10275} {'test': 50.96}
TextualismToolDictionariesLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 107} {'test': 943.23}
TextualismToolPlainLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 165} {'test': 997.97}
ThuNewsClusteringP2P.v2 (Sun et al., 2016) ['cmn'] Clustering p2p [News] {'test': 2048}
ThuNewsClusteringS2S.v2 (Sun et al., 2016) ['cmn'] Clustering s2s [News] {'test': 2048}
TopiOCQA (Vaibhav Adlakha, 2022) ['eng'] Retrieval s2p [Encyclopaedic] {'dev': 2514} {'validation': 708.0}
Touche2020 ['eng'] Retrieval s2p
ToxicChatClassification (Zi Lin, 2023) ['eng'] Classification s2s [Constructed] {'test': 1427} {'test': 189.4}
ToxicConversationsClassification (cjadams et al., 2019) ['eng'] Classification s2s [Social] {'test': 50000} {'test': 296.6}
TswanaNewsClassification (Vukosi Marivate, 2023) ['tsn'] Classification s2s [News] {'validation': 487, 'test': 487} {'validation': 2417.72, 'test': 2369.52}
TurHistQuadRetrieval (Soygazi et al., 2021) ['tur'] Retrieval p2p [Encyclopaedic, Non-fiction, Academic] {'test': 1330} {'test': 1513.83}
TurkicClassification ['bak', 'kaz', 'kir'] Classification s2s [News] {'train': 193056} {'train': 1103.13}
TurkishMovieSentimentClassification (Erkin Demirtas, 2013) ['tur'] Classification s2s [Reviews] {'test': 2644} {'test': 141.5}
TurkishProductSentimentClassification (Erkin Demirtas, 2013) ['tur'] Classification s2s [Reviews] {'test': 800} {'test': 246.85}
TweetEmotionClassification (Al-Khatib et al., 2018) ['ara'] Classification s2s [Social] {'train': 2048} {'train': 78.8}
TweetSarcasmClassification ['ara'] Classification s2s [Social] {'test': 2110} {'test': 102.1}
TweetSentimentClassification ['ara', 'deu', 'eng', 'fra', 'hin', 'ita', 'por', 'spa'] Classification s2s [Social] {'test': 2048} {'test': 83.51}
TweetSentimentExtractionClassification (Maggie et al., 2020) ['eng'] Classification s2s [Social] {'test': 3534} {'test': 67.8}
TweetTopicSingleClassification ['eng'] Classification s2s [Social, News] {'test_2021': 1693} {'test_2021': 167.66}
TwentyNewsgroupsClustering.v2 (Ken Lang, 1995) ['eng'] Clustering s2s [News] {'test': 2381} {'test': 32.0}
TwitterHjerneRetrieval (Holm et al., 2024) ['dan'] Retrieval p2p [Social] {'train': 340} {'train': 138.23}
TwitterSemEval2015 ['eng'] PairClassification s2s {'test': 16777} {'test': 38.3}
TwitterURLCorpus ['eng'] PairClassification s2s {'test': 51534} {'test': 79.5}
UCCVCommonLawLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 94} {'test': 114.127}
UkrFormalityClassification ['ukr'] Classification s2s [News] {'train': 2048, 'test': 2048} {'train': 52.1, 'test': 53.07}
UnfairTOSLegalBenchClassification (Neel Guha, 2023) ['eng'] Classification s2s [Legal] {'test': 2048} {'test': 184.69}
UrduRomanSentimentClassification (Sharf, Zareen, 2018) ['urd'] Classification s2s [Social] {'train': 2048} {'train': 68.248}
VGHierarchicalClusteringP2P (Navjord et al., 2023) ['nob'] Clustering p2p [News, Non-fiction] {'test': 2048} {'test': 2670.3243084794544}
VGHierarchicalClusteringS2S (Navjord et al., 2023) ['nob'] Clustering p2p [News, Non-fiction] {'test': 2048} {'test': 139.31247668283325}
VideoRetrieval ['cmn'] Retrieval s2p
VieMedEVBitextMining (Nhu Vo, 2024) ['eng', 'vie'] BitextMining s2s [Medical] {'test': 2048} {'test': 139.23}
VieQuADRetrieval ['vie'] Retrieval s2p [Encyclopaedic, Non-fiction] {'validation': 2048} {'validation': 790.24}
VieStudentFeedbackClassification (Nguyen et al., 2018) ['vie'] Classification s2s [Reviews] {'test': 2048} {'test': 14.22}
WRIMEClassification ['jpn'] Classification s2s [Social] {'test': 2048} {'test': 47.78}
Waimai ['cmn'] Classification s2s
WebLINXCandidatesReranking (Xing Han Lù, 2024) ['eng'] Reranking p2p [Academic, Web] {'validation': 1301, 'test_iid': 1438, 'test_cat': 3560, 'test_web': 3144, 'test_vis': 5298, 'test_geo': 4916} {'validation': 1647.52, 'test_iid': 1722.63, 'test_cat': 2149.66, 'test_web': 1831.46, 'test_vis': 1737.26, 'test_geo': 1742.66}
WikiCitiesClustering ['eng'] Clustering p2p
WikiClusteringP2P.v2 ['bos', 'cat', 'ces', 'dan', 'eus', 'glv', 'ilo', 'kur', 'lav', 'min', 'mlt', 'sco', 'sqi', 'wln'] Clustering p2p [Encyclopaedic] {'test': 2048} {'test': 625.3}
WikipediaRerankingMultilingual ['ben', 'bul', 'ces', 'dan', 'deu', 'eng', 'fas', 'fin', 'hin', 'ita', 'nld', 'nor', 'por', 'ron', 'srp', 'swe'] Reranking s2p [Encyclopaedic] {'en': 1500, 'de': 1500, 'it': 1500, 'pt': 1500, 'nl': 1500, 'cs': 1500, 'ro': 1500, 'bg': 1500, 'sr': 1500, 'fi': 1500, 'da': 1500, 'fa': 1500, 'hi': 1500, 'bn': 1500, 'no': 1500, 'sv': 1500} {'test': 452.0}
WikipediaRetrievalMultilingual ['ben', 'bul', 'ces', 'dan', 'deu', 'eng', 'fas', 'fin', 'hin', 'ita', 'nld', 'nor', 'por', 'ron', 'srp', 'swe'] Retrieval s2p [Encyclopaedic] {'en': 1500, 'de': 1500, 'it': 1500, 'pt': 1500, 'nl': 1500, 'cs': 1500, 'ro': 1500, 'bg': 1500, 'sr': 1500, 'fi': 1500, 'da': 1500, 'fa': 1500, 'hi': 1500, 'bn': 1500, 'no': 1500, 'sv': 1500} {'test': 452.0}
WinoGrande (Xiao et al., 2024) ['eng'] Retrieval s2s [Encyclopaedic] {'test': 0} {'test': 0.0}
WisesightSentimentClassification ['tha'] Classification s2s [Social, News] {'train': 2048} {'train': 103.42}
XMarket (Bonab et al., 2021) ['deu', 'eng', 'spa'] Retrieval s2p
XNLI (Conneau et al., 2018) ['ara', 'bul', 'deu', 'ell', 'eng', 'fra', 'hin', 'rus', 'spa', 'swa', 'tha', 'tur', 'vie', 'zho'] PairClassification s2s [Non-fiction, Fiction, Government] {'validation': 2163, 'test': 2460} {'validation': 106.5, 'test': 106.5}
XNLIV2 (Upadhyay et al., 2023) ['asm', 'ben', 'bho', 'ell', 'guj', 'kan', 'mar', 'ory', 'pan', 'rus', 'san', 'tam', 'tur'] PairClassification s2s [Non-fiction, Fiction, Government] {'test': 5010} {'test': 80.06}
XPQARetrieval (Shen et al., 2023) ['ara', 'cmn', 'deu', 'eng', 'fra', 'hin', 'ita', 'jpn', 'kor', 'pol', 'por', 'spa', 'tam'] Retrieval s2p [Reviews] {'test': 19801} {'test': 104.68}
XQuADRetrieval (Mikel Artetxe, 2019) ['arb', 'deu', 'ell', 'eng', 'hin', 'ron', 'rus', 'spa', 'tha', 'tur', 'vie', 'zho'] Retrieval s2p [Web] {'test': 1190} {'test': 788.7}
XStance ['deu', 'fra', 'ita'] PairClassification s2s [Social] {'test': 2048} {'test': 152.41}
YahooAnswersTopicsClassification (Zhang et al., 2015) ['eng'] Classification s2s [Web] {'test': 60000} {'test': 346.35}
YelpReviewFullClassification (Zhang et al., 2015) ['eng'] Classification s2s [Reviews] {'test': 50000}
YueOpenriceReviewClassification (Xiang et al., 2019) ['yue'] Classification s2s [Reviews] {'test': 6161} {'test': 173.0}
indonli ['ind'] PairClassification s2s [Encyclopaedic, Web, News] {'test_expert': 2040} {'test_expert': 145.88}

Task Count per Language

Language BitextMining Classification Clustering InstructionRetrieval MultilabelClassification PairClassification Reranking Retrieval STS Summarization
aai 1 0 0 0 0 0 0 0 0 0
aak 1 0 0 0 0 0 0 0 0 0
aau 1 0 0 0 0 0 0 0 0 0
aaz 1 0 0 0 0 0 0 0 0 0
abs 1 0 0 0 0 0 0 0 0 0
abt 1 0 0 0 0 0 0 0 0 0
abx 1 0 0 0 0 0 0 0 0 0
aby 1 0 0 0 0 0 0 0 0 0
ace 2 2 1 0 0 0 0 0 0 0
acf 1 0 0 0 0 0 0 0 0 0
acm 1 1 1 0 0 0 0 1 0 0
acq 1 1 1 0 0 0 0 0 0 0
acr 1 0 0 0 0 0 0 0 0 0
acu 1 0 0 0 0 0 0 0 0 0
adz 1 0 0 0 0 0 0 0 0 0
aeb 1 1 1 0 0 0 0 0 0 0
aer 1 0 0 0 0 0 0 0 0 0
aey 1 0 0 0 0 0 0 0 0 0
afr 3 4 1 0 0 0 0 1 1 0
agd 1 0 0 0 0 0 0 0 0 0
agg 1 0 0 0 0 0 0 0 0 0
agm 1 0 0 0 0 0 0 0 0 0
agn 1 0 0 0 0 0 0 0 0 0
agr 1 0 0 0 0 0 0 0 0 0
agt 1 0 0 0 0 0 0 0 0 0
agu 1 0 0 0 0 0 0 0 0 0
aia 1 0 0 0 0 0 0 0 0 0
aii 1 0 0 0 0 0 0 0 0 0
ajp 1 1 1 0 0 0 0 0 0 0
aka 2 1 1 0 0 0 0 0 0 0
ake 1 0 0 0 0 0 0 0 0 0
alp 1 0 0 0 0 0 0 0 0 0
alq 1 0 0 0 0 0 0 0 0 0
als 2 1 1 0 0 0 0 1 0 0
aly 1 0 0 0 0 0 0 0 0 0
ame 1 0 0 0 0 0 0 0 0 0
amf 1 0 0 0 0 0 0 0 0 0
amh 3 6 3 0 0 0 0 1 1 0
amk 1 0 0 0 0 0 0 0 0 0
amm 1 0 0 0 0 0 0 0 0 0
amn 1 0 0 0 0 0 0 0 0 0
amo 1 0 0 0 0 0 0 0 0 0
amp 1 0 0 0 0 0 0 0 0 0
amr 1 0 0 0 0 0 0 0 0 0
amu 1 0 0 0 0 0 0 0 0 0
amx 1 0 0 0 0 0 0 0 0 0
ang 1 0 0 0 0 0 0 0 0 0
anh 1 0 0 0 0 0 0 0 0 0
anp 0 1 0 0 0 0 0 0 0 0
anv 1 0 0 0 0 0 0 0 0 0
aoi 1 0 0 0 0 0 0 0 0 0
aoj 1 0 0 0 0 0 0 0 0 0
aom 1 0 0 0 0 0 0 0 0 0
aon 1 0 0 0 0 0 0 0 0 0
apb 1 0 0 0 0 0 0 0 0 0
apc 1 1 1 0 0 0 0 1 0 0
ape 1 0 0 0 0 0 0 0 0 0
apn 1 0 0 0 0 0 0 0 0 0
apr 1 0 0 0 0 0 0 0 0 0
apu 1 0 0 0 0 0 0 0 0 0
apw 1 0 0 0 0 0 0 0 0 0
apz 1 0 0 0 0 0 0 0 0 0
ara 2 12 0 0 0 2 1 5 2 0
arb 3 1 1 0 0 0 0 2 1 0
are 1 0 0 0 0 0 0 0 0 0
arl 1 0 0 0 0 0 0 0 0 0
arn 1 0 0 0 0 0 0 0 0 0
arp 1 0 0 0 0 0 0 0 0 0
arq 1 2 0 0 0 0 0 0 1 0
ars 1 1 1 0 0 0 0 1 0 0
ary 1 3 1 0 0 0 0 1 1 0
arz 2 1 1 0 0 0 0 1 0 0
asm 5 3 2 0 0 1 0 2 1 0
aso 1 0 0 0 0 0 0 0 0 0
ast 2 1 1 0 0 0 0 0 0 0
ata 1 0 0 0 0 0 0 0 0 0
atb 1 0 0 0 0 0 0 0 0 0
atd 1 0 0 0 0 0 0 0 0 0
atg 1 0 0 0 0 0 0 0 0 0
att 1 0 0 0 0 0 0 0 0 0
auc 1 0 0 0 0 0 0 0 0 0
aui 1 0 0 0 0 0 0 0 0 0
auy 1 0 0 0 0 0 0 0 0 0
avt 1 0 0 0 0 0 0 0 0 0
awa 3 2 1 0 0 0 0 0 0 0
awb 1 0 0 0 0 0 0 0 0 0
awk 1 0 0 0 0 0 0 0 0 0
awx 1 0 0 0 0 0 0 0 0 0
ayr 1 1 1 0 0 0 0 0 0 0
azb 2 1 1 0 0 0 0 0 0 0
aze 2 2 0 0 0 0 0 0 0 0
azg 1 0 0 0 0 0 0 0 0 0
azj 1 1 1 0 0 0 0 1 0 0
azz 1 0 0 0 0 0 0 0 0 0
bak 2 3 1 0 0 0 0 0 0 0
bam 1 2 1 0 0 0 0 1 0 0
ban 2 2 1 0 0 0 0 0 0 0
bao 1 0 0 0 0 0 0 0 0 0
bba 1 0 0 0 0 0 0 0 0 0
bbb 1 0 0 0 0 0 0 0 0 0
bbc 2 3 0 0 0 0 0 0 0 0
bbr 1 0 0 0 0 0 0 0 0 0
bch 1 0 0 0 0 0 0 0 0 0
bco 1 0 0 0 0 0 0 0 0 0
bdd 1 0 0 0 0 0 0 0 0 0
bea 1 0 0 0 0 0 0 0 0 0
bef 1 0 0 0 0 0 0 0 0 0
bel 4 1 1 0 0 0 0 0 0 0
bem 2 1 1 0 0 0 0 0 0 0
ben 7 9 2 0 0 1 2 3 1 0
beo 1 0 0 0 0 0 0 0 0 0
ber 1 0 0 0 0 0 0 0 0 0
beu 1 0 0 0 0 0 0 0 0 0
bew 1 2 0 0 0 0 0 0 0 0
bgc 1 1 0 0 0 0 0 0 0 0
bgs 1 0 0 0 0 0 0 0 0 0
bgt 1 0 0 0 0 0 0 0 0 0
bhb 0 1 0 0 0 0 0 0 0 0
bhd 0 1 0 0 0 0 0 0 0 0
bhg 1 0 0 0 0 0 0 0 0 0
bhl 1 0 0 0 0 0 0 0 0 0
bho 2 2 1 0 0 1 0 0 0 0
bhp 1 0 0 0 0 0 0 0 0 0
big 1 0 0 0 0 0 0 0 0 0
bjj 0 1 0 0 0 0 0 0 0 0
bjk 1 0 0 0 0 0 0 0 0 0
bjn 2 2 1 0 0 0 0 0 0 0
bjp 1 0 0 0 0 0 0 0 0 0
bjr 1 0 0 0 0 0 0 0 0 0
bjv 1 0 0 0 0 0 0 0 0 0
bjz 1 0 0 0 0 0 0 0 0 0
bkd 1 0 0 0 0 0 0 0 0 0
bki 1 0 0 0 0 0 0 0 0 0
bkq 1 0 0 0 0 0 0 0 0 0
bkx 1 0 0 0 0 0 0 0 0 0
blw 1 0 0 0 0 0 0 0 0 0
blz 1 0 0 0 0 0 0 0 0 0
bmh 1 0 0 0 0 0 0 0 0 0
bmk 1 0 0 0 0 0 0 0 0 0
bmr 1 0 0 0 0 0 0 0 0 0
bmu 1 0 0 0 0 0 0 0 0 0
bnp 1 0 0 0 0 0 0 0 0 0
bns 0 1 0 0 0 0 0 0 0 0
boa 1 0 0 0 0 0 0 0 0 0
bod 3 1 1 0 0 0 0 1 0 0
boj 1 0 0 0 0 0 0 0 0 0
bon 1 0 0 0 0 0 0 0 0 0
bos 3 1 2 0 0 0 0 0 0 0
box 1 0 0 0 0 0 0 0 0 0
boy 1 0 0 0 0 0 0 0 0 0
bpr 1 0 0 0 0 0 0 0 0 0
bps 1 0 0 0 0 0 0 0 0 0
bqc 1 0 0 0 0 0 0 0 0 0
bqp 1 0 0 0 0 0 0 0 0 0
bra 0 1 0 0 0 0 0 0 0 0
bre 2 0 0 0 0 0 0 0 0 0
brx 2 2 1 0 0 0 0 0 0 0
bsj 1 0 0 0 0 0 0 0 0 0
bsn 1 0 0 0 0 0 0 0 0 0
bsp 1 0 0 0 0 0 0 0 0 0
bss 1 0 0 0 0 0 0 0 0 0
bug 2 4 1 0 0 0 0 0 0 0
buk 1 0 0 0 0 0 0 0 0 0
bul 3 4 1 0 1 1 1 2 0 0
bus 1 0 0 0 0 0 0 0 0 0
bvd 1 0 0 0 0 0 0 0 0 0
bvr 1 0 0 0 0 0 0 0 0 0
bxh 1 0 0 0 0 0 0 0 0 0
byr 1 0 0 0 0 0 0 0 0 0
byx 1 0 0 0 0 0 0 0 0 0
bzd 1 0 0 0 0 0 0 0 0 0
bzh 1 0 0 0 0 0 0 0 0 0
bzj 1 0 0 0 0 0 0 0 0 0
caa 1 0 0 0 0 0 0 0 0 0
cab 1 0 0 0 0 0 0 0 0 0
cac 1 0 0 0 0 0 0 0 0 0
caf 1 0 0 0 0 0 0 0 0 0
cak 1 0 0 0 0 0 0 0 0 0
cao 1 0 0 0 0 0 0 0 0 0
cap 1 0 0 0 0 0 0 0 0 0
car 1 0 0 0 0 0 0 0 0 0
cat 3 2 2 0 0 0 0 1 0 0
cav 1 0 0 0 0 0 0 0 0 0
cax 1 0 0 0 0 0 0 0 0 0
cbc 1 0 0 0 0 0 0 0 0 0
cbi 1 0 0 0 0 0 0 0 0 0
cbk 2 0 0 0 0 0 0 0 0 0
cbr 1 0 0 0 0 0 0 0 0 0
cbs 1 0 0 0 0 0 0 0 0 0
cbt 1 0 0 0 0 0 0 0 0 0
cbu 1 0 0 0 0 0 0 0 0 0
cbv 1 0 0 0 0 0 0 0 0 0
cco 1 0 0 0 0 0 0 0 0 0
ceb 3 1 1 0 0 0 0 1 0 0
cek 1 0 0 0 0 0 0 0 0 0
ces 4 5 2 0 1 1 1 2 0 0
cgc 1 0 0 0 0 0 0 0 0 0
cha 2 0 0 0 0 0 0 0 0 0
chd 1 0 0 0 0 0 0 0 0 0
chf 1 0 0 0 0 0 0 0 0 0
chk 1 0 0 0 0 0 0 0 0 0
chq 1 0 0 0 0 0 0 0 0 0
chv 0 1 0 0 0 0 0 0 0 0
chz 1 0 0 0 0 0 0 0 0 0
cjk 1 1 1 0 0 0 0 0 0 0
cjo 1 0 0 0 0 0 0 0 0 0
cjv 1 0 0 0 0 0 0 0 0 0
ckb 3 1 1 0 0 0 0 1 0 0
cle 1 0 0 0 0 0 0 0 0 0
clu 1 0 0 0 0 0 0 0 0 0
cme 1 0 0 0 0 0 0 0 0 0
cmn 4 10 4 0 0 3 4 10 9 0
cmo 0 2 0 0 0 0 0 0 0 0
cni 1 0 0 0 0 0 0 0 0 0
cnl 1 0 0 0 0 0 0 0 0 0
cnt 1 0 0 0 0 0 0 0 0 0
code 0 0 0 0 0 0 0 19 0 0
cof 1 0 0 0 0 0 0 0 0 0
con 1 0 0 0 0 0 0 0 0 0
cop 1 0 0 0 0 0 0 0 0 0
cor 1 0 0 0 0 0 0 0 0 0
cot 1 0 0 0 0 0 0 0 0 0
cpa 1 0 0 0 0 0 0 0 0 0
cpb 1 0 0 0 0 0 0 0 0 0
cpc 1 0 0 0 0 0 0 0 0 0
cpu 1 0 0 0 0 0 0 0 0 0
cpy 1 0 0 0 0 0 0 0 0 0
crh 1 1 1 0 0 0 0 0 0 0
crn 1 0 0 0 0 0 0 0 0 0
crx 1 0 0 0 0 0 0 0 0 0
csb 1 0 0 0 0 0 0 0 0 0
cso 1 0 0 0 0 0 0 0 0 0
csy 1 0 0 0 0 0 0 0 0 0
cta 1 0 0 0 0 0 0 0 0 0
cth 1 0 0 0 0 0 0 0 0 0
ctp 1 0 0 0 0 0 0 0 0 0
ctu 1 0 0 0 0 0 0 0 0 0
cub 1 0 0 0 0 0 0 0 0 0
cuc 1 0 0 0 0 0 0 0 0 0
cui 1 0 0 0 0 0 0 0 0 0
cuk 1 0 0 0 0 0 0 0 0 0
cut 1 0 0 0 0 0 0 0 0 0
cux 1 0 0 0 0 0 0 0 0 0
cwe 1 0 0 0 0 0 0 0 0 0
cya 1 0 0 0 0 0 0 0 0 0
cym 3 4 1 0 0 0 0 0 0 0
daa 1 0 0 0 0 0 0 0 0 0
dad 1 0 0 0 0 0 0 0 0 0
dah 1 0 0 0 0 0 0 0 0 0
dan 5 9 2 0 1 0 1 5 0 0
ded 1 0 0 0 0 0 0 0 0 0
deu 6 14 7 0 1 6 2 17 4 0
dgc 1 0 0 0 0 0 0 0 0 0
dgr 1 0 0 0 0 0 0 0 0 0
dgz 1 0 0 0 0 0 0 0 0 0
dhg 1 0 0 0 0 0 0 0 0 0
dif 1 0 0 0 0 0 0 0 0 0
dik 2 1 1 0 0 0 0 0 0 0
div 1 0 0 0 0 0 0 0 0 0
dji 1 0 0 0 0 0 0 0 0 0
djk 1 0 0 0 0 0 0 0 0 0
djr 1 0 0 0 0 0 0 0 0 0
dob 1 0 0 0 0 0 0 0 0 0
doi 2 1 0 0 0 0 0 0 0 0
dop 1 0 0 0 0 0 0 0 0 0
dov 1 0 0 0 0 0 0 0 0 0
dsb 1 0 0 0 0 0 0 0 0 0
dtp 1 0 0 0 0 0 0 0 0 0
dwr 1 0 0 0 0 0 0 0 0 0
dww 1 0 0 0 0 0 0 0 0 0
dwy 1 0 0 0 0 0 0 0 0 0
dyu 1 1 1 0 0 0 0 0 0 0
dza 0 1 0 0 0 0 0 0 0 0
dzo 2 1 1 0 0 0 0 0 0 0
ebk 1 0 0 0 0 0 0 0 0 0
eko 1 0 0 0 0 0 0 0 0 0
ell 3 6 1 0 1 2 0 3 0 0
emi 1 0 0 0 0 0 0 0 0 0
emp 1 0 0 0 0 0 0 0 0 0
eng 16 143 16 3 1 8 7 71 13 1
enq 1 0 0 0 0 0 0 0 0 0
epo 3 1 1 0 0 0 0 0 0 0
eri 1 0 0 0 0 0 0 0 0 0
ese 1 0 0 0 0 0 0 0 0 0
esk 1 0 0 0 0 0 0 0 0 0
est 2 2 1 0 1 0 0 2 0 0
etr 1 0 0 0 0 0 0 0 0 0
eus 3 2 2 0 0 0 0 1 0 0
ewe 3 1 1 0 0 0 0 0 0 0
faa 1 0 0 0 0 0 0 0 0 0
fai 1 0 0 0 0 0 0 0 0 0
fao 3 2 1 0 0 0 0 0 1 0
far 1 0 0 0 0 0 0 0 0 0
fas 1 4 0 0 0 1 2 3 0 0
ffm 1 0 0 0 0 0 0 0 0 0
fij 2 1 1 0 0 0 0 0 0 0
fil 1 2 0 0 0 0 0 0 0 0
fin 3 5 1 0 1 1 2 2 1 0
fon 1 1 1 0 0 0 0 0 0 0
for 1 0 0 0 0 0 0 0 0 0
fra 7 13 8 0 1 5 3 12 4 1
fry 1 0 0 0 0 0 0 0 0 0
fuc 1 0 0 0 0 0 0 0 0 0
fue 1 0 0 0 0 0 0 0 0 0
fuf 1 0 0 0 0 0 0 0 0 0
fuh 1 0 0 0 0 0 0 0 0 0
fur 1 1 1 0 0 0 0 0 0 0
fuv 1 1 1 0 0 0 0 1 0 0
gah 1 0 0 0 0 0 0 0 0 0
gai 1 0 0 0 0 0 0 0 0 0
gam 1 0 0 0 0 0 0 0 0 0
gaw 1 0 0 0 0 0 0 0 0 0
gaz 1 1 1 0 0 0 0 1 0 0
gbm 1 1 0 0 0 0 0 0 0 0
gdn 1 0 0 0 0 0 0 0 0 0
gdr 1 0 0 0 0 0 0 0 0 0
geb 1 0 0 0 0 0 0 0 0 0
gfk 1 0 0 0 0 0 0 0 0 0
ghs 1 0 0 0 0 0 0 0 0 0
gla 2 1 1 0 0 0 0 0 0 0
gle 3 1 1 0 0 0 0 0 0 0
glg 3 1 1 0 0 0 0 0 0 0
glk 1 0 0 0 0 0 0 0 0 0
glv 0 0 1 0 0 0 0 0 0 0
gmv 1 0 0 0 0 0 0 0 0 0
gng 1 0 0 0 0 0 0 0 0 0
gnn 1 0 0 0 0 0 0 0 0 0
gnw 1 0 0 0 0 0 0 0 0 0
gof 1 0 0 0 0 0 0 0 0 0
gom 3 1 0 0 0 0 0 0 0 0
grc 1 0 0 0 0 0 0 0 0 0
grn 1 1 1 0 0 0 0 1 0 0
gsw 1 0 0 0 0 0 0 0 0 0
gub 1 0 0 0 0 0 0 0 0 0
guh 1 0 0 0 0 0 0 0 0 0
gui 1 0 0 0 0 0 0 0 0 0
guj 6 6 2 0 0 1 0 2 1 0
gul 1 0 0 0 0 0 0 0 0 0
gum 1 0 0 0 0 0 0 0 0 0
gun 1 0 0 0 0 0 0 0 0 0
guo 1 0 0 0 0 0 0 0 0 0
gup 1 0 0 0 0 0 0 0 0 0
gux 1 0 0 0 0 0 0 0 0 0
gvc 1 0 0 0 0 0 0 0 0 0
gvf 1 0 0 0 0 0 0 0 0 0
gvn 1 0 0 0 0 0 0 0 0 0
gvs 1 0 0 0 0 0 0 0 0 0
gwi 1 0 0 0 0 0 0 0 0 0
gym 1 0 0 0 0 0 0 0 0 0
gyr 1 0 0 0 0 0 0 0 0 0
hat 2 1 1 0 0 0 0 1 0 0
hau 4 5 3 0 0 0 0 1 1 0
haw 1 0 0 0 0 0 0 0 0 0
hbo 1 0 0 0 0 0 0 0 0 0
hch 1 0 0 0 0 0 0 0 0 0
heb 4 5 1 0 0 0 0 1 0 0
heg 1 0 0 0 0 0 0 0 0 0
hin 9 12 2 0 0 1 2 8 2 0
hix 1 0 0 0 0 0 0 0 0 0
hla 1 0 0 0 0 0 0 0 0 0
hlt 1 0 0 0 0 0 0 0 0 0
hmn 1 0 0 0 0 0 0 0 0 0
hmo 1 0 0 0 0 0 0 0 0 0
hne 2 2 1 0 0 0 0 0 0 0
hns 1 0 0 0 0 0 0 0 0 0
hop 1 0 0 0 0 0 0 0 0 0
hot 1 0 0 0 0 0 0 0 0 0
hrv 4 3 1 0 1 0 0 1 0 0
hsb 1 0 0 0 0 0 0 0 0 0
hto 1 0 0 0 0 0 0 0 0 0
hub 1 0 0 0 0 0 0 0 0 0
hui 1 0 0 0 0 0 0 0 0 0
hun 5 3 1 0 1 0 0 2 0 0
hus 1 0 0 0 0 0 0 0 0 0
huu 1 0 0 0 0 0 0 0 0 0
huv 1 0 0 0 0 0 0 0 0 0
hvn 1 0 0 0 0 0 0 0 0 0
hye 3 3 1 0 0 1 0 1 0 0
ian 1 0 0 0 0 0 0 0 0 0
ibo 3 5 3 0 0 0 0 1 0 0
ido 1 0 0 0 0 0 0 0 0 0
ign 1 0 0 0 0 0 0 0 0 0
ikk 1 0 0 0 0 0 0 0 0 0
ikw 1 0 0 0 0 0 0 0 0 0
ile 1 0 0 0 0 0 0 0 0 0
ilo 2 1 2 0 0 0 0 1 0 0
imo 1 0 0 0 0 0 0 0 0 0
ina 1 0 0 0 0 0 0 0 0 0
inb 1 0 0 0 0 0 0 0 0 0
ind 6 7 1 0 0 1 1 1 1 0
ino 1 0 0 0 0 0 0 0 0 0
iou 1 0 0 0 0 0 0 0 0 0
ipi 1 0 0 0 0 0 0 0 0 0
isl 3 4 1 0 0 0 0 1 0 0
isn 1 0 0 0 0 0 0 0 0 0
ita 5 9 1 0 1 2 1 5 3 0
iws 1 0 0 0 0 0 0 0 0 0
ixl 1 0 0 0 0 0 0 0 0 0
jac 1 0 0 0 0 0 0 0 0 0
jae 1 0 0 0 0 0 0 0 0 0
jao 1 0 0 0 0 0 0 0 0 0
jav 4 7 1 0 0 0 0 1 0 0
jic 1 0 0 0 0 0 0 0 0 0
jid 1 0 0 0 0 0 0 0 0 0
jiv 1 0 0 0 0 0 0 0 0 0
jni 1 0 0 0 0 0 0 0 0 0
jpn 5 8 3 0 0 1 1 9 2 0
jvn 1 0 0 0 0 0 0 0 0 0
kab 2 1 1 0 0 0 0 0 0 0
kac 1 1 1 0 0 0 0 1 0 0
kam 1 1 1 0 0 0 0 0 0 0
kan 6 7 2 0 0 1 0 2 1 0
kaq 1 0 0 0 0 0 0 0 0 0
kas 3 2 1 0 0 0 0 0 0 0
kat 4 3 1 0 0 0 0 2 0 0
kaz 3 3 1 0 0 0 0 1 0 0
kbc 1 0 0 0 0 0 0 0 0 0
kbh 1 0 0 0 0 0 0 0 0 0
kbm 1 0 0 0 0 0 0 0 0 0
kbp 1 1 1 0 0 0 0 0 0 0
kbq 1 0 0 0 0 0 0 0 0 0
kdc 1 0 0 0 0 0 0 0 0 0
kde 1 0 0 0 0 0 0 0 0 0
kdl 1 0 0 0 0 0 0 0 0 0
kea 1 1 1 0 0 0 0 1 0 0
kek 1 0 0 0 0 0 0 0 0 0
ken 1 0 0 0 0 0 0 0 0 0
kew 1 0 0 0 0 0 0 0 0 0
kfg 0 1 0 0 0 0 0 0 0 0
kfy 0 1 0 0 0 0 0 0 0 0
kgf 1 0 0 0 0 0 0 0 0 0
kgk 1 0 0 0 0 0 0 0 0 0
kgp 1 0 0 0 0 0 0 0 0 0
khk 1 1 1 0 0 0 0 1 0 0
khm 3 3 1 0 0 0 0 1 0 0
khs 1 0 0 0 0 0 0 0 0 0
khz 1 0 0 0 0 0 0 0 0 0
kik 2 1 1 0 0 0 0 0 0 0
kin 2 3 1 0 0 0 0 1 1 0
kir 2 3 1 0 0 0 0 1 0 0
kiw 1 0 0 0 0 0 0 0 0 0
kiz 1 0 0 0 0 0 0 0 0 0
kje 1 0 0 0 0 0 0 0 0 0
kjs 1 0 0 0 0 0 0 0 0 0
kkc 1 0 0 0 0 0 0 0 0 0
kkl 1 0 0 0 0 0 0 0 0 0
klt 1 0 0 0 0 0 0 0 0 0
klv 1 0 0 0 0 0 0 0 0 0
kmb 1 1 1 0 0 0 0 0 0 0
kmg 1 0 0 0 0 0 0 0 0 0
kmh 1 0 0 0 0 0 0 0 0 0
kmk 1 0 0 0 0 0 0 0 0 0
kmo 1 0 0 0 0 0 0 0 0 0
kmr 2 1 1 0 0 0 0 0 0 0
kms 1 0 0 0 0 0 0 0 0 0
kmu 1 0 0 0 0 0 0 0 0 0
knc 1 1 1 0 0 0 0 0 0 0
kne 1 0 0 0 0 0 0 0 0 0
knf 1 0 0 0 0 0 0 0 0 0
knj 1 0 0 0 0 0 0 0 0 0
knv 1 0 0 0 0 0 0 0 0 0
kon 1 1 1 0 0 0 0 0 0 0
kor 4 8 1 0 1 2 1 6 3 0
kos 1 0 0 0 0 0 0 0 0 0
kpf 1 0 0 0 0 0 0 0 0 0
kpg 1 0 0 0 0 0 0 0 0 0
kpj 1 0 0 0 0 0 0 0 0 0
kpr 1 0 0 0 0 0 0 0 0 0
kpw 1 0 0 0 0 0 0 0 0 0
kpx 1 0 0 0 0 0 0 0 0 0
kqa 1 0 0 0 0 0 0 0 0 0
kqc 1 0 0 0 0 0 0 0 0 0
kqf 1 0 0 0 0 0 0 0 0 0
kql 1 0 0 0 0 0 0 0 0 0
kqw 1 0 0 0 0 0 0 0 0 0
krc 0 1 0 0 0 0 0 0 0 0
ksd 1 0 0 0 0 0 0 0 0 0
ksj 1 0 0 0 0 0 0 0 0 0
ksr 1 0 0 0 0 0 0 0 0 0
ktm 1 0 0 0 0 0 0 0 0 0
kto 1 0 0 0 0 0 0 0 0 0
kud 1 0 0 0 0 0 0 0 0 0
kue 1 0 0 0 0 0 0 0 0 0
kup 1 0 0 0 0 0 0 0 0 0
kur 1 1 1 0 0 0 0 0 0 0
kvg 1 0 0 0 0 0 0 0 0 0
kvn 1 0 0 0 0 0 0 0 0 0
kwd 1 0 0 0 0 0 0 0 0 0
kwf 1 0 0 0 0 0 0 0 0 0
kwi 1 0 0 0 0 0 0 0 0 0
kwj 1 0 0 0 0 0 0 0 0 0
kyc 1 0 0 0 0 0 0 0 0 0
kyf 1 0 0 0 0 0 0 0 0 0
kyg 1 0 0 0 0 0 0 0 0 0
kyq 1 0 0 0 0 0 0 0 0 0
kyz 1 0 0 0 0 0 0 0 0 0
kze 1 0 0 0 0 0 0 0 0 0
kzj 1 0 0 0 0 0 0 0 0 0
lac 1 0 0 0 0 0 0 0 0 0
lao 2 1 1 0 0 0 0 1 0 0
lat 2 0 0 0 0 0 0 0 0 0
lav 1 2 1 0 1 0 0 0 0 0
lbb 1 0 0 0 0 0 0 0 0 0
lbk 1 0 0 0 0 0 0 0 0 0
lcm 1 0 0 0 0 0 0 0 0 0
leu 1 0 0 0 0 0 0 0 0 0
lex 1 0 0 0 0 0 0 0 0 0
lfn 1 0 0 0 0 0 0 0 0 0
lgl 1 0 0 0 0 0 0 0 0 0
lid 1 0 0 0 0 0 0 0 0 0
lif 1 0 0 0 0 0 0 0 0 0
lij 1 1 1 0 0 0 0 0 0 0
lim 1 1 1 0 0 0 0 0 0 0
lin 2 2 3 0 0 0 0 1 0 0
lit 4 1 1 0 1 0 0 1 0 0
llg 1 0 0 0 0 0 0 0 0 0
lmo 1 1 1 0 0 0 0 0 0 0
ltg 1 1 1 0 0 0 0 0 0 0
ltz 2 1 1 0 0 0 0 0 0 0
lua 1 1 1 0 0 0 0 0 0 0
lug 2 2 3 0 0 0 0 1 0 0
luo 2 1 1 0 0 0 0 1 0 0
lus 1 1 1 0 0 0 0 0 0 0
lvs 2 1 1 0 0 0 0 1 0 0
lww 1 0 0 0 0 0 0 0 0 0
maa 1 0 0 0 0 0 0 0 0 0
mad 2 3 0 0 0 0 0 0 0 0
mag 1 2 1 0 0 0 0 0 0 0
mai 4 2 1 0 0 0 0 0 0 0
maj 1 0 0 0 0 0 0 0 0 0
mak 1 2 0 0 0 0 0 0 0 0
mal 7 7 2 0 0 0 0 2 1 0
mam 1 0 0 0 0 0 0 0 0 0
maq 1 0 0 0 0 0 0 0 0 0
mar 7 6 2 0 0 1 0 2 2 0
mau 1 0 0 0 0 0 0 0 0 0
mav 1 0 0 0 0 0 0 0 0 0
max 1 0 0 0 0 0 0 0 0 0
maz 1 0 0 0 0 0 0 0 0 0
mbb 1 0 0 0 0 0 0 0 0 0
mbc 1 0 0 0 0 0 0 0 0 0
mbh 1 0 0 0 0 0 0 0 0 0
mbj 1 0 0 0 0 0 0 0 0 0
mbl 1 0 0 0 0 0 0 0 0 0
mbs 1 0 0 0 0 0 0 0 0 0
mbt 1 0 0 0 0 0 0 0 0 0
mca 1 0 0 0 0 0 0 0 0 0
mcb 1 0 0 0 0 0 0 0 0 0
mcd 1 0 0 0 0 0 0 0 0 0
mcf 1 0 0 0 0 0 0 0 0 0
mco 1 0 0 0 0 0 0 0 0 0
mcp 1 0 0 0 0 0 0 0 0 0
mcq 1 0 0 0 0 0 0 0 0 0
mcr 1 0 0 0 0 0 0 0 0 0
mdy 1 0 0 0 0 0 0 0 0 0
med 1 0 0 0 0 0 0 0 0 0
mee 1 0 0 0 0 0 0 0 0 0
mek 1 0 0 0 0 0 0 0 0 0
meq 1 0 0 0 0 0 0 0 0 0
met 1 0 0 0 0 0 0 0 0 0
meu 1 0 0 0 0 0 0 0 0 0
mey 1 0 0 0 0 0 0 0 0 0
mgc 1 0 0 0 0 0 0 0 0 0
mgh 1 0 0 0 0 0 0 0 0 0
mgw 1 0 0 0 0 0 0 0 0 0
mhl 1 0 0 0 0 0 0 0 0 0
mhr 1 0 0 0 0 0 0 0 0 0
mib 1 0 0 0 0 0 0 0 0 0
mic 1 0 0 0 0 0 0 0 0 0
mie 1 0 0 0 0 0 0 0 0 0
mig 1 0 0 0 0 0 0 0 0 0
mih 1 0 0 0 0 0 0 0 0 0
mil 1 0 0 0 0 0 0 0 0 0
min 3 4 2 0 0 0 0 0 0 0
mio 1 0 0 0 0 0 0 0 0 0
mir 1 0 0 0 0 0 0 0 0 0
mit 1 0 0 0 0 0 0 0 0 0
miz 1 0 0 0 0 0 0 0 0 0
mjc 1 0 0 0 0 0 0 0 0 0
mkd 3 2 1 0 0 0 0 1 0 0
mkj 1 0 0 0 0 0 0 0 0 0
mkl 1 0 0 0 0 0 0 0 0 0
mkn 1 0 0 0 0 0 0 0 0 0
mks 1 0 0 0 0 0 0 0 0 0
mle 1 0 0 0 0 0 0 0 0 0
mlg 1 0 0 0 0 0 0 0 0 0
mlh 1 0 0 0 0 0 0 0 0 0
mlp 1 0 0 0 0 0 0 0 0 0
mlt 2 2 2 0 2 0 0 1 0 0
mmo 1 0 0 0 0 0 0 0 0 0
mmx 1 0 0 0 0 0 0 0 0 0
mna 1 0 0 0 0 0 0 0 0 0
mni 4 2 1 0 0 0 0 0 0 0
mon 2 2 0 0 0 0 0 0 0 0
mop 1 0 0 0 0 0 0 0 0 0
mos 1 1 1 0 0 0 0 0 0 0
mox 1 0 0 0 0 0 0 0 0 0
mph 1 0 0 0 0 0 0 0 0 0
mpj 1 0 0 0 0 0 0 0 0 0
mpm 1 0 0 0 0 0 0 0 0 0
mpp 1 0 0 0 0 0 0 0 0 0
mps 1 0 0 0 0 0 0 0 0 0
mpt 1 0 0 0 0 0 0 0 0 0
mpx 1 0 0 0 0 0 0 0 0 0
mqb 1 0 0 0 0 0 0 0 0 0
mqj 1 0 0 0 0 0 0 0 0 0
mri 2 1 1 0 0 0 0 1 0 0
msa 1 2 0 0 0 0 0 0 0 0
msb 1 0 0 0 0 0 0 0 0 0
msc 1 0 0 0 0 0 0 0 0 0
msk 1 0 0 0 0 0 0 0 0 0
msm 1 0 0 0 0 0 0 0 0 0
msy 1 0 0 0 0 0 0 0 0 0
mti 1 0 0 0 0 0 0 0 0 0
mto 1 0 0 0 0 0 0 0 0 0
mui 1 2 0 0 0 0 0 0 0 0
mup 1 1 0 0 0 0 0 0 0 0
mux 1 0 0 0 0 0 0 0 0 0
muy 1 0 0 0 0 0 0 0 0 0
mva 1 0 0 0 0 0 0 0 0 0
mvn 1 0 0 0 0 0 0 0 0 0
mwc 1 0 0 0 0 0 0 0 0 0
mwe 1 0 0 0 0 0 0 0 0 0
mwf 1 0 0 0 0 0 0 0 0 0
mwp 1 0 0 0 0 0 0 0 0 0
mwr 1 0 0 0 0 0 0 0 0 0
mxb 1 0 0 0 0 0 0 0 0 0
mxp 1 0 0 0 0 0 0 0 0 0
mxq 1 0 0 0 0 0 0 0 0 0
mxt 1 0 0 0 0 0 0 0 0 0
mya 3 4 1 0 0 0 0 1 0 0
myk 1 0 0 0 0 0 0 0 0 0
myu 1 0 0 0 0 0 0 0 0 0
myw 1 0 0 0 0 0 0 0 0 0
myy 1 0 0 0 0 0 0 0 0 0
mzz 1 0 0 0 0 0 0 0 0 0
nab 1 0 0 0 0 0 0 0 0 0
naf 1 0 0 0 0 0 0 0 0 0
nak 1 0 0 0 0 0 0 0 0 0
nas 1 0 0 0 0 0 0 0 0 0
nbl 0 1 0 0 0 0 0 0 0 0
nbq 1 0 0 0 0 0 0 0 0 0
nca 1 0 0 0 0 0 0 0 0 0
nch 1 0 0 0 0 0 0 0 0 0
ncj 1 0 0 0 0 0 0 0 0 0
ncl 1 0 0 0 0 0 0 0 0 0
ncu 1 0 0 0 0 0 0 0 0 0
nde 1 0 0 0 0 0 0 0 0 0
ndg 1 0 0 0 0 0 0 0 0 0
ndj 1 0 0 0 0 0 0 0 0 0
nds 1 0 0 0 0 0 0 0 0 0
nep 2 1 0 0 0 0 0 0 0 0
nfa 1 0 0 0 0 0 0 0 0 0
ngp 1 0 0 0 0 0 0 0 0 0
ngu 1 0 0 0 0 0 0 0 0 0
nhe 1 0 0 0 0 0 0 0 0 0
nhg 1 0 0 0 0 0 0 0 0 0
nhi 1 0 0 0 0 0 0 0 0 0
nho 1 0 0 0 0 0 0 0 0 0
nhr 1 0 0 0 0 0 0 0 0 0
nhu 1 0 0 0 0 0 0 0 0 0
nhw 1 0 0 0 0 0 0 0 0 0
nhy 1 0 0 0 0 0 0 0 0 0
nif 1 0 0 0 0 0 0 0 0 0
nii 1 0 0 0 0 0 0 0 0 0
nij 1 1 0 0 0 0 0 0 0 0
nin 1 0 0 0 0 0 0 0 0 0
nko 1 0 0 0 0 0 0 0 0 0
nld 6 6 1 0 1 0 1 2 2 0
nlg 1 0 0 0 0 0 0 0 0 0
nna 1 0 0 0 0 0 0 0 0 0
nno 4 3 1 0 0 0 0 0 0 0
nnq 1 0 0 0 0 0 0 0 0 0
noa 1 0 0 0 0 0 0 0 0 0
nob 4 7 5 0 0 0 0 3 0 0
noe 0 1 0 0 0 0 0 0 0 0
nop 1 0 0 0 0 0 0 0 0 0
nor 0 1 0 0 0 0 1 1 0 0
not 1 0 0 0 0 0 0 0 0 0
nou 1 0 0 0 0 0 0 0 0 0
nov 1 0 0 0 0 0 0 0 0 0
npi 4 2 1 0 0 0 0 1 0 0
npl 1 0 0 0 0 0 0 0 0 0
nqo 0 1 1 0 0 0 0 0 0 0
nsn 1 0 0 0 0 0 0 0 0 0
nso 2 2 1 0 0 0 0 1 0 0
nss 1 0 0 0 0 0 0 0 0 0
ntj 1 0 0 0 0 0 0 0 0 0
ntp 1 0 0 0 0 0 0 0 0 0
ntu 1 0 0 0 0 0 0 0 0 0
nus 1 1 1 0 0 0 0 0 0 0
nuy 1 0 0 0 0 0 0 0 0 0
nvm 1 0 0 0 0 0 0 0 0 0
nwi 1 0 0 0 0 0 0 0 0 0
nya 3 1 1 0 0 0 0 1 0 0
nys 1 0 0 0 0 0 0 0 0 0
nyu 1 0 0 0 0 0 0 0 0 0
obo 1 0 0 0 0 0 0 0 0 0
oci 2 1 1 0 0 0 0 0 0 0
okv 1 0 0 0 0 0 0 0 0 0
omw 1 0 0 0 0 0 0 0 0 0
ong 1 0 0 0 0 0 0 0 0 0
ons 1 0 0 0 0 0 0 0 0 0
ood 1 0 0 0 0 0 0 0 0 0
opm 1 0 0 0 0 0 0 0 0 0
ori 0 1 0 0 0 0 0 0 0 0
orm 1 1 2 0 0 0 0 0 0 0
orv 1 0 0 0 0 0 0 0 0 0
ory 5 4 2 0 0 1 0 2 1 0
ote 1 0 0 0 0 0 0 0 0 0
otm 1 0 0 0 0 0 0 0 0 0
otn 1 0 0 0 0 0 0 0 0 0
otq 1 0 0 0 0 0 0 0 0 0
ots 1 0 0 0 0 0 0 0 0 0
pab 1 0 0 0 0 0 0 0 0 0
pad 1 0 0 0 0 0 0 0 0 0
pag 1 1 1 0 0 0 0 0 0 0
pah 1 0 0 0 0 0 0 0 0 0
pam 1 0 0 0 0 0 0 0 0 0
pan 6 6 2 0 0 1 0 2 1 0
pao 1 0 0 0 0 0 0 0 0 0
pap 1 1 1 0 0 0 0 0 0 0
pbt 1 1 1 0 0 0 0 1 0 0
pcm 1 4 2 0 0 0 0 0 0 0
pes 3 1 1 0 0 0 0 1 0 0
pib 1 0 0 0 0 0 0 0 0 0
pio 1 0 0 0 0 0 0 0 0 0
pir 1 0 0 0 0 0 0 0 0 0
piu 1 0 0 0 0 0 0 0 0 0
pjt 1 0 0 0 0 0 0 0 0 0
pls 1 0 0 0 0 0 0 0 0 0
plt 1 1 1 0 0 0 0 1 0 0
plu 1 0 0 0 0 0 0 0 0 0
pma 1 0 0 0 0 0 0 0 0 0
pms 1 0 0 0 0 0 0 0 0 0
poe 1 0 0 0 0 0 0 0 0 0
poh 1 0 0 0 0 0 0 0 0 0
poi 1 0 0 0 0 0 0 0 0 0
pol 4 11 4 0 1 4 0 13 4 0
pon 1 0 0 0 0 0 0 0 0 0
por 4 9 1 0 2 2 1 5 3 0
poy 1 0 0 0 0 0 0 0 0 0
ppo 1 0 0 0 0 0 0 0 0 0
prf 1 0 0 0 0 0 0 0 0 0
pri 1 0 0 0 0 0 0 0 0 0
prs 2 1 1 0 0 0 0 0 0 0
ptp 1 0 0 0 0 0 0 0 0 0
ptu 1 0 0 0 0 0 0 0 0 0
pus 2 0 0 0 0 0 0 0 0 0
pwg 1 0 0 0 0 0 0 0 0 0
qub 1 0 0 0 0 0 0 0 0 0
quc 1 0 0 0 0 0 0 0 0 0
quf 1 0 0 0 0 0 0 0 0 0
quh 1 0 0 0 0 0 0 0 0 0
qul 1 0 0 0 0 0 0 0 0 0
qup 1 0 0 0 0 0 0 0 0 0
quy 1 1 1 0 0 0 0 0 0 0
qvc 1 0 0 0 0 0 0 0 0 0
qve 1 0 0 0 0 0 0 0 0 0
qvh 1 0 0 0 0 0 0 0 0 0
qvm 1 0 0 0 0 0 0 0 0 0
qvn 1 0 0 0 0 0 0 0 0 0
qvs 1 0 0 0 0 0 0 0 0 0
qvw 1 0 0 0 0 0 0 0 0 0
qvz 1 0 0 0 0 0 0 0 0 0
qwh 1 0 0 0 0 0 0 0 0 0
qxh 1 0 0 0 0 0 0 0 0 0
qxn 1 0 0 0 0 0 0 0 0 0
qxo 1 0 0 0 0 0 0 0 0 0
rai 1 0 0 0 0 0 0 0 0 0
raj 1 1 0 0 0 0 0 0 0 0
reg 1 0 0 0 0 0 0 0 0 0
rej 1 2 0 0 0 0 0 0 0 0
rgu 1 0 0 0 0 0 0 0 0 0
rkb 1 0 0 0 0 0 0 0 0 0
rmc 1 0 0 0 0 0 0 0 0 0
rmy 1 0 0 0 0 0 0 0 0 0
rom 1 0 1 0 0 0 0 0 0 0
ron 5 6 1 0 1 0 1 3 1 0
roo 1 0 0 0 0 0 0 0 0 0
rop 1 0 0 0 0 0 0 0 0 0
row 1 0 0 0 0 0 0 0 0 0
rro 1 0 0 0 0 0 0 0 0 0
ruf 1 0 0 0 0 0 0 0 0 0
rug 1 0 0 0 0 0 0 0 0 0
run 1 2 3 0 0 0 0 0 0 0
rus 5 13 6 0 2 4 2 8 4 0
rwo 1 0 0 0 0 0 0 0 0 0
sab 1 0 0 0 0 0 0 0 0 0
sag 1 1 1 0 0 0 0 0 0 0
sah 0 1 0 0 0 0 0 0 0 0
san 5 3 1 0 0 1 0 0 0 0
sat 4 2 1 0 0 0 0 0 0 0
sbe 1 0 0 0 0 0 0 0 0 0
sbk 1 0 0 0 0 0 0 0 0 0
sbs 1 0 0 0 0 0 0 0 0 0
scn 1 1 1 0 0 0 0 0 0 0
sco 0 0 1 0 0 0 0 0 0 0
seh 1 0 0 0 0 0 0 0 0 0
sey 1 0 0 0 0 0 0 0 0 0
sgb 1 0 0 0 0 0 0 0 0 0
sgz 1 0 0 0 0 0 0 0 0 0
shi 1 0 0 0 0 0 0 0 0 0
shj 1 0 0 0 0 0 0 0 0 0
shn 1 1 1 0 0 0 0 1 0 0
shp 1 0 0 0 0 0 0 0 0 0
sim 1 0 0 0 0 0 0 0 0 0
sin 2 3 1 0 0 0 0 1 0 0
sja 1 0 0 0 0 0 0 0 0 0
slk 3 3 1 0 1 0 0 2 0 0
sll 1 0 0 0 0 0 0 0 0 0
slv 3 4 1 0 1 0 0 1 0 0
smk 1 0 0 0 0 0 0 0 0 0
smo 2 1 1 0 0 0 0 0 0 0
sna 2 2 3 0 0 0 0 1 0 0
snc 1 0 0 0 0 0 0 0 0 0
snd 4 2 1 0 0 0 0 1 0 0
snn 1 0 0 0 0 0 0 0 0 0
snp 1 0 0 0 0 0 0 0 0 0
snx 1 0 0 0 0 0 0 0 0 0
sny 1 0 0 0 0 0 0 0 0 0
som 3 2 3 0 0 0 0 1 0 0
soq 1 0 0 0 0 0 0 0 0 0
sot 1 2 1 0 0 0 0 1 0 0
soy 1 0 0 0 0 0 0 0 0 0
spa 4 13 4 0 1 2 1 11 4 0
spl 1 0 0 0 0 0 0 0 0 0
spm 1 0 0 0 0 0 0 0 0 0
spp 1 0 0 0 0 0 0 0 0 0
sps 1 0 0 0 0 0 0 0 0 0
spy 1 0 0 0 0 0 0 0 0 0
sqi 2 2 1 0 0 0 0 0 0 0
srd 1 1 1 0 0 0 0 0 0 0
sri 1 0 0 0 0 0 0 0 0 0
srm 1 0 0 0 0 0 0 0 0 0
srn 2 0 0 0 0 0 0 0 0 0
srp 4 1 1 0 0 0 1 2 0 0
srq 1 0 0 0 0 0 0 0 0 0
ssd 1 0 0 0 0 0 0 0 0 0
ssg 1 0 0 0 0 0 0 0 0 0
ssw 2 3 1 0 0 0 0 1 0 0
ssx 1 0 0 0 0 0 0 0 0 0
stp 1 0 0 0 0 0 0 0 0 0
sua 1 0 0 0 0 0 0 0 0 0
sue 1 0 0 0 0 0 0 0 0 0
sun 3 4 1 0 0 0 0 1 0 0
sus 1 0 0 0 0 0 0 0 0 0
suz 1 0 0 0 0 0 0 0 0 0
svk 0 1 0 0 0 0 0 0 0 0
swa 1 6 2 0 0 1 1 0 0 0
swe 4 8 3 0 1 1 1 4 0 0
swg 1 0 0 0 0 0 0 0 0 0
swh 3 1 1 0 0 0 0 1 0 0
swp 1 0 0 0 0 0 0 0 0 0
sxb 1 0 0 0 0 0 0 0 0 0
szl 1 1 1 0 0 0 0 0 0 0
tac 1 0 0 0 0 0 0 0 0 0
tah 1 0 0 0 0 0 0 0 0 0
taj 1 0 0 0 0 0 0 0 0 0
tam 7 7 2 0 0 1 0 3 1 0
taq 1 1 1 0 0 0 0 0 0 0
tat 3 2 1 0 0 0 0 0 0 0
tav 1 0 0 0 0 0 0 0 0 0
taw 1 0 0 0 0 0 0 0 0 0
tbc 1 0 0 0 0 0 0 0 0 0
tbf 1 0 0 0 0 0 0 0 0 0
tbg 1 0 0 0 0 0 0 0 0 0
tbo 1 0 0 0 0 0 0 0 0 0
tbz 1 0 0 0 0 0 0 0 0 0
tca 1 0 0 0 0 0 0 0 0 0
tcs 1 0 0 0 0 0 0 0 0 0
tcz 1 0 0 0 0 0 0 0 0 0
tdt 1 0 0 0 0 0 0 0 0 0
tee 1 0 0 0 0 0 0 0 0 0
tel 7 7 2 0 0 0 1 2 2 0
ter 1 0 0 0 0 0 0 0 0 0
tet 1 0 0 0 0 0 0 0 0 0
tew 1 0 0 0 0 0 0 0 0 0
tfr 1 0 0 0 0 0 0 0 0 0
tgk 3 1 1 0 0 0 0 1 0 0
tgl 3 3 1 0 0 0 0 1 0 0
tgo 1 0 0 0 0 0 0 0 0 0
tgp 1 0 0 0 0 0 0 0 0 0
tha 4 8 1 0 0 1 1 3 0 0
tif 1 0 0 0 0 0 0 0 0 0
tim 1 0 0 0 0 0 0 0 0 0
tir 2 2 3 0 0 0 0 1 0 0
tiw 1 0 0 0 0 0 0 0 0 0
tiy 1 0 0 0 0 0 0 0 0 0
tke 1 0 0 0 0 0 0 0 0 0
tku 1 0 0 0 0 0 0 0 0 0
tlf 1 0 0 0 0 0 0 0 0 0
tmd 1 0 0 0 0 0 0 0 0 0
tna 1 0 0 0 0 0 0 0 0 0
tnc 1 0 0 0 0 0 0 0 0 0
tnk 1 0 0 0 0 0 0 0 0 0
tnn 1 0 0 0 0 0 0 0 0 0
tnp 1 0 0 0 0 0 0 0 0 0
toc 1 0 0 0 0 0 0 0 0 0
tod 1 0 0 0 0 0 0 0 0 0
tof 1 0 0 0 0 0 0 0 0 0
toj 1 0 0 0 0 0 0 0 0 0
ton 2 0 0 0 0 0 0 0 0 0
too 1 0 0 0 0 0 0 0 0 0
top 1 0 0 0 0 0 0 0 0 0
tos 1 0 0 0 0 0 0 0 0 0
tpa 1 0 0 0 0 0 0 0 0 0
tpi 2 1 1 0 0 0 0 0 0 0
tpt 1 0 0 0 0 0 0 0 0 0
tpz 1 0 0 0 0 0 0 0 0 0
trc 1 0 0 0 0 0 0 0 0 0
tsn 2 3 1 0 0 0 0 1 0 0
tso 1 4 1 0 0 0 0 1 0 0
tsw 1 0 0 0 0 0 0 0 0 0
ttc 1 0 0 0 0 0 0 0 0 0
tte 1 0 0 0 0 0 0 0 0 0
tuc 1 0 0 0 0 0 0 0 0 0
tue 1 0 0 0 0 0 0 0 0 0
tuf 1 0 0 0 0 0 0 0 0 0
tuk 3 1 1 0 0 0 0 0 0 0
tum 1 1 1 0 0 0 0 0 0 0
tuo 1 0 0 0 0 0 0 0 0 0
tur 4 7 1 0 0 2 0 3 2 0
tvk 1 0 0 0 0 0 0 0 0 0
twi 2 3 1 0 0 0 0 0 0 0
txq 1 0 0 0 0 0 0 0 0 0
txu 1 0 0 0 0 0 0 0 0 0
tyv 0 1 0 0 0 0 0 0 0 0
tzj 1 0 0 0 0 0 0 0 0 0
tzl 1 0 0 0 0 0 0 0 0 0
tzm 1 1 1 0 0 0 0 0 0 0
tzo 1 0 0 0 0 0 0 0 0 0
ubr 1 0 0 0 0 0 0 0 0 0
ubu 1 0 0 0 0 0 0 0 0 0
udu 1 0 0 0 0 0 0 0 0 0
uig 4 2 1 0 0 0 0 0 0 0
ukr 4 2 1 0 0 0 0 1 0 0
uli 1 0 0 0 0 0 0 0 0 0
ulk 1 0 0 0 0 0 0 0 0 0
umb 1 1 1 0 0 0 0 0 0 0
upv 1 0 0 0 0 0 0 0 0 0
ura 1 0 0 0 0 0 0 0 0 0
urb 1 0 0 0 0 0 0 0 0 0
urd 7 8 2 0 0 0 0 1 1 0
uri 1 0 0 0 0 0 0 0 0 0
urt 1 0 0 0 0 0 0 0 0 0
urw 1 0 0 0 0 0 0 0 0 0
usa 1 0 0 0 0 0 0 0 0 0
usp 1 0 0 0 0 0 0 0 0 0
uvh 1 0 0 0 0 0 0 0 0 0
uvl 1 0 0 0 0 0 0 0 0 0
uzb 2 0 0 0 0 0 0 0 0 0
uzn 1 1 1 0 0 0 0 1 0 0
vec 1 1 1 0 0 0 0 0 0 0
ven 1 1 0 0 0 0 0 0 0 0
vid 1 0 0 0 0 0 0 0 0 0
vie 5 6 1 0 0 1 0 5 0 0
viv 1 0 0 0 0 0 0 0 0 0
vmy 1 0 0 0 0 0 0 0 0 0
waj 1 0 0 0 0 0 0 0 0 0
wal 1 0 0 0 0 0 0 0 0 0
wap 1 0 0 0 0 0 0 0 0 0
war 2 1 1 0 0 0 0 1 0 0
wat 1 0 0 0 0 0 0 0 0 0
wbi 1 0 0 0 0 0 0 0 0 0
wbp 1 0 0 0 0 0 0 0 0 0
wed 1 0 0 0 0 0 0 0 0 0
wer 1 0 0 0 0 0 0 0 0 0
wim 1 0 0 0 0 0 0 0 0 0
wiu 1 0 0 0 0 0 0 0 0 0
wiv 1 0 0 0 0 0 0 0 0 0
wln 0 0 1 0 0 0 0 0 0 0
wmt 1 0 0 0 0 0 0 0 0 0
wmw 1 0 0 0 0 0 0 0 0 0
wnc 1 0 0 0 0 0 0 0 0 0
wnu 1 0 0 0 0 0 0 0 0 0
wol 3 1 1 0 0 0 0 1 0 0
wos 1 0 0 0 0 0 0 0 0 0
wrk 1 0 0 0 0 0 0 0 0 0
wro 1 0 0 0 0 0 0 0 0 0
wrs 1 0 0 0 0 0 0 0 0 0
wsk 1 0 0 0 0 0 0 0 0 0
wuu 1 0 0 0 0 0 0 0 0 0
wuv 1 0 0 0 0 0 0 0 0 0
xav 1 0 0 0 0 0 0 0 0 0
xbi 1 0 0 0 0 0 0 0 0 0
xed 1 0 0 0 0 0 0 0 0 0
xho 3 3 3 0 0 0 0 1 0 0
xla 1 0 0 0 0 0 0 0 0 0
xnn 1 0 0 0 0 0 0 0 0 0
xon 1 0 0 0 0 0 0 0 0 0
xsi 1 0 0 0 0 0 0 0 0 0
xtd 1 0 0 0 0 0 0 0 0 0
xtm 1 0 0 0 0 0 0 0 0 0
yaa 1 0 0 0 0 0 0 0 0 0
yad 1 0 0 0 0 0 0 0 0 0
yal 1 0 0 0 0 0 0 0 0 0
yap 1 0 0 0 0 0 0 0 0 0
yaq 1 0 0 0 0 0 0 0 0 0
yby 1 0 0 0 0 0 0 0 0 0
ycn 1 0 0 0 0 0 0 0 0 0
ydd 1 1 1 0 0 0 0 0 0 0
yid 1 0 0 0 0 0 0 0 0 0
yka 1 0 0 0 0 0 0 0 0 0
yle 1 0 0 0 0 0 0 0 0 0
yml 1 0 0 0 0 0 0 0 0 0
yon 1 0 0 0 0 0 0 0 0 0
yor 4 5 3 0 0 0 1 1 0 0
yrb 1 0 0 0 0 0 0 0 0 0
yre 1 0 0 0 0 0 0 0 0 0
yss 1 0 0 0 0 0 0 0 0 0
yue 3 2 1 0 0 0 0 0 0 0
yuj 1 0 0 0 0 0 0 0 0 0
yut 1 0 0 0 0 0 0 0 0 0
yuw 1 0 0 0 0 0 0 0 0 0
yva 1 0 0 0 0 0 0 0 0 0
zaa 1 0 0 0 0 0 0 0 0 0
zab 1 0 0 0 0 0 0 0 0 0
zac 1 0 0 0 0 0 0 0 0 0
zad 1 0 0 0 0 0 0 0 0 0
zai 1 0 0 0 0 0 0 0 0 0
zaj 1 0 0 0 0 0 0 0 0 0
zam 1 0 0 0 0 0 0 0 0 0
zao 1 0 0 0 0 0 0 0 0 0
zap 1 0 0 0 0 0 0 0 0 0
zar 1 0 0 0 0 0 0 0 0 0
zas 1 0 0 0 0 0 0 0 0 0
zat 1 0 0 0 0 0 0 0 0 0
zav 1 0 0 0 0 0 0 0 0 0
zaw 1 0 0 0 0 0 0 0 0 0
zca 1 0 0 0 0 0 0 0 0 0
zga 1 0 0 0 0 0 0 0 0 0
zho 2 2 1 0 0 1 1 7 0 0
zia 1 0 0 0 0 0 0 0 0 0
ziw 1 0 0 0 0 0 0 0 0 0
zlm 1 0 0 0 0 0 0 0 0 0
zos 1 0 0 0 0 0 0 0 0 0
zpc 1 0 0 0 0 0 0 0 0 0
zpl 1 0 0 0 0 0 0 0 0 0
zpm 1 0 0 0 0 0 0 0 0 0
zpo 1 0 0 0 0 0 0 0 0 0
zpq 1 0 0 0 0 0 0 0 0 0
zpu 1 0 0 0 0 0 0 0 0 0
zpv 1 0 0 0 0 0 0 0 0 0
zpz 1 0 0 0 0 0 0 0 0 0
zsm 2 1 1 0 0 0 0 1 0 0
zsr 1 0 0 0 0 0 0 0 0 0
ztq 1 0 0 0 0 0 0 0 0 0
zty 1 0 0 0 0 0 0 0 0 0
zul 2 3 1 0 0 0 0 1 0 0
zyp 1 0 0 0 0 0 0 0 0 0
Total 1394 793 304 3 28 67 46 356 85 2