- docs: Update links in README.md (#296) (76056b5)
-
fix: Added tasks from SEB (#287)
-
Added tasks from SEB
-
docs: fix link
-
fix: ran linting
-
fix typing for 3.8
-
fixed annotation for v3.8 (39cff49)
- fix: updated version in transition to semantic release ci (238ab82)
- feat: Updating version
BREAKING CHANGE: Bump version (caee2e9)
- ci: disable changelog (b7d3cde)
- ci: moved release to the correct folder (b4fa85a)
- ci: renamed test job and workflow (#282)
ci: Added tests (6675bb8)
- docs: typos in readme (#268) (aa9234c)
-
docs: add dataset schemas (#255)
-
docs: update AbsTaskClassification.py document schema for classification task
-
update AbsTaskBitextMining.py
-
update BornholmskBitextMining.py
-
update AbsTaskClustering.py and BlurbsClusteringP2P.py
-
update 8 files
-
update 9 files
-
update AbsTaskReranking.py
-
update BlurbsClusteringP2P.py
-
update CMTEBPairClassification.py
-
update GerDaLIRRetrieval.py
-
update 7 files
-
update AbsTaskBitextMining.py
-
update AbsTaskClassification.py (
c3ce1ac
) -
docs: Add development installation instructions (#246)
-
docs: Add development installation instructions
-
removed unused requirements file
I don't believe this is necessary with the setup.py specifying the same dependencies
-
docs: Updated make file with new dependencies
-
ci: Update ci to use make commands
This ensures that the user runs exactly what the CI expects
-
ci: Avoid specifying tests folder as it causes issues with tests
-
ci: removed unnecessary args for test ci
-
Added dev install (
0048878
)
-
fix: dead link in readme (
ecbb776
) -
fix: Added sizes to the metadata (#276)
-
restructuring the readme
-
added mmteb
-
removed unnecessary method
-
Added docstring to metadata
-
Updated outdated examples
-
formatting documents
-
fix: Updated form to be parsed correctly
-
fix: Added sizes to the metadata
this allows for automatic metadata generation
-
Updated based on feedback
-
Apply suggestions from code review
Co-authored-by: Niklas Muennighoff <[email protected]>
-
updated based on feedback
-
Added suggestion from review
-
added correction based on review
-
reformatted empty fields to None
Co-authored-by: Niklas Muennighoff <[email protected]> (cd4a012
)
-
refactor: add metadata basemodel (#260)
-
refactor: rename description to metadata dict
-
refactor: add TaskMetadata and first example
-
update 9 files
-
update TaskMetadata.py
-
update TaskMetadata.py
-
update TaskMetadata.py
-
update LICENSE, TaskMetadata.py and requirements.dev.txt
-
update 151 files
-
update 150 files
-
update 43 files and delete 1 file
-
update 106 files
-
update 45 files
-
update 6 files
-
update 14 files
-
Added model results to repo and updated CLI to create consistent folder structure. (#254)
-
Added model results to repo and updated CLI to create consistent folder structure.
-
ci: updated ci to use make install
-
Added missing pytest dependencies
-
Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
-
Restructuring the readme (#262)
-
restructuring the readme
-
removed double specification of versions and moved all setup to pyproject.toml
-
correctly use flat-layout for the package
-
build(deps): update TaskMetadata.py and pyproject.toml
-
update 221 files
-
build(deps): update pyproject.toml
-
build(deps): update pyproject.toml
-
build(deps): update pyproject.toml
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (dd5d617
)
- overwrite version (bc60c9d)
- v1.3.0 (50b856c)
- v1.3.0 (61c12d8)
- Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (7b0a766)
-
Ci-fix (#289)
-
added release pipeline
-
v1.3.0
-
ci: moved release to the correct folder (
7f56c1a
) -
Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (
57f500f
) -
v1.3.0
-
added release pipeline
-
v1.3.0 (
5e4d10e
) -
v1.3.0 (
cdda2f2
) -
added release pipeline (
69a440b
) -
tests: speed up tests (#283)
update Makefile and test_all_abstasks.py (2155bf6)
- update TaskMetadata.py (#281) (acfd7d4)
- Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (c9d1a03)
-
Enable ruff ci (#279)
-
restructuring the readme
-
added mmteb
-
removed unnecessary method
-
Added docstring to metadata
-
Updated outdated examples
-
formatting documents
-
fix: Updated form to be parsed correctly
-
fix: Added sizes to the metadata
this allows for automatic metadata generation
-
Updated based on feedback
-
Apply suggestions from code review
Co-authored-by: Niklas Muennighoff <[email protected]>
-
updated based on feedback
-
Added suggestion from review
-
added correction based on review
-
reformatted empty fields to None
-
CI: Enable linter
Co-authored-by: Niklas Muennighoff <[email protected]> (a16eb07
)
-
Added MMTEB (#275)
-
restructuring the readme
-
added mmteb
-
removed unnecessary method
-
Added docstring to metadata
-
Updated outdated examples
-
formatting documents
-
fix: Updated form to be parsed correctly
-
Updated based on feedback
-
Apply suggestions from code review
Co-authored-by: Niklas Muennighoff <[email protected]>
-
updated based on feedback
-
Added suggestion from review
-
added correction based on review
Co-authored-by: Niklas Muennighoff <[email protected]> (c0dc49a
)
-
dev: add ruff as suggested extension (#274) (
b08913f
) -
dev: add isort (#271)
-
dev: add isort
-
dev: add isort (
845099d
) -
dev: run tests on pull request towards any branch (
13f759a
) -
Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (
b42abe4
) -
replaced linter with ruff (#265)
-
restructuring the readme
-
removed double specification of versions and moved all setup to pyproject.toml
-
correctly use flat-layout for the package
-
replaced linter with ruff
-
rerun tests
-
ci: Added in newer workflow
some of them are disabled as they require other issues to be solved
- Update Makefile
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (023e881
)
-
Restructuring the readme (#262)
-
restructuring the readme
-
removed double specification of versions and moved all setup to pyproject.toml
-
correctly use flat-layout for the package (
769157b
) -
restructuring the readme (
364be7f
) -
Added model results to repo and updated CLI to create consistent folder structure. (#254)
-
Added model results to repo and updated CLI to create consistent folder structure.
-
ci: updated ci to use make install
-
Added missing pytest dependencies
-
Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (8a758bc
)
-
dev: add workspace defaults in VSCode (#253)
-
dev: add black as default formatter in vscode
-
Update .vscode/settings.json
Co-authored-by: Kenneth Enevoldsen <[email protected]> (30e5b9e
)
-
Add Danish Discourse dataset (#247)
-
misc.
-
update ddisco.py
-
chore: delete ddisco.py, ddisco.test.tsv and ddisco.train.tsv
-
Update mteb/tasks/Classification/DdiscoCohesionClassification.py
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- Update mteb/tasks/Classification/DdiscoCohesionClassification.py
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- Update mteb/tasks/Classification/DdiscoCohesionClassification.py
Co-authored-by: Imene Kerboua <[email protected]>
- Update mteb/tasks/Classification/DdiscoCohesionClassification.py
Co-authored-by: Imene Kerboua <[email protected]>
- Update mteb/tasks/Classification/DdiscoCohesionClassification.py
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]> (d46d0f5
)
-
Update structure of mteb/tasks to mteb/tasks/{type}/{language} (#245)
-
Fix structure of mteb/tasks Fixes #243
-
fix: Added missing init files (
b1c78c1
) -
tests: do not run tests on collection (#249)
test: update test_all_abstasks.py (236614a)
- Update README.md with correct DRESModel location (399edf4)
- Fix typo (9610378)
- Set dev version (716f59c)
- Release: 1.2.0 (9e9dca8)
- Rmv superfluous file (d772fed)
- Remove duplicate & outdated code (12bcb83)
- Adapt scripts (36b9234)
- Add example (273ff4a)
-
Simplify retrieval (#233)
-
Simplify retrieval
-
Simplify
-
Make call method
-
Add splits
-
Rmv outdated test
-
Fix name & \n
-
Add qrels
-
Add revisions
Co-authored-by: Imene Kerboua <[email protected]>
-
Add hf hub org
-
Add test
-
Add missing revision
-
Rename test
Co-authored-by: Imene Kerboua <[email protected]>
- log dres compat
Co-authored-by: Imene Kerboua <[email protected]> (c9fccbc
)
-
Fixed missing revision error on Norwegian Bitext Mining (#221)
-
Removed revision specification from Norwegian Bitext Mining task
-
Update to latest revision
Co-authored-by: Niklas Muennighoff <[email protected]> (b249c67
)
-
Remove HAGRID from french benchmark (#235)
-
add Masakhane dataset config
-
add trigram lang code for datasets that use it
-
create french script eval
-
fix French word
-
add some documentation
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
4 pair classification (#10)
-
add Opusparcus dataset
-
multilingual usage
-
use eval_split of config files
-
change eval_split according to data
Co-authored-by: Gabriel Sequeira <[email protected]>
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
Clustering with HAL S2S dataset (#11)
HAL S2S dataset creation and evaluation on clustering task.
-
adding BSARD dataset
-
add BSARD to benchmark
-
adding Hagrid dataset
-
DiaBLa and Flores Bitext Mining evaluation (#12)
-
Add DiaBLa dataset for bitext mining
-
Add DiaBLa dataset for bitext mining
-
deduplicate bitext task
-
add Flores
-
format files
-
add flores to evaluation script
-
remove prints
-
add revision
Co-authored-by: Gabriel Sequeira <[email protected]>
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
adding dataset processing for mteb
-
adding BSARD dataset
-
add BSARD to benchmark
-
adding Hagrid dataset
-
fix change on langmapping
-
reset alphabetical order
-
add revision handling
-
Clustering: Add AlloProf dataset (#17)
AlloProf dataset for clustering task
-
handling of revision
-
change split + add revision handling
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
adding dataset processing for mteb
-
adding BSARD dataset
-
add BSARD to benchmark
-
adding Hagrid dataset
-
add script to process and upload alloprof on HF
-
adding dataset processing for mteb
-
refactor a few things
-
reset alphabetical order
-
add revision handling
-
handling of revision
-
change split + add revision handling
-
use eval variable
-
alphabetic order
-
Add MLSUM dataset for clustering task (#21)
-
Use Masakhane dataset for clustering task (#23)
-
16 add datasets to readmemd (#18)
-
run task table
-
run task table
-
Add MLSUM dataset for clustering task (#21)
-
Use Masakhane dataset for clustering task (#23)
-
run task table
-
refresh readme
-
refresh readme
-
run task table
-
refresh readme
Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>
- load only test split (#25)
Co-authored-by: Gabriel Sequeira <[email protected]>
- Update mteb/tasks/BitextMining/DiaBLaBitextMining.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Clustering/HALClusteringS2S.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- renaming masakhane (#28)
Co-authored-by: Gabriel Sequeira <[email protected]>
-
Syntec dataset addition (#26)
-
add script to process & load to HF
-
add script to enable download of data from HF
-
add syntec dataset files to gitignore
-
add syntecretrieval
-
add syntec retrieval
-
build dataloading script
-
remove datasets
-
correct typo
Co-authored-by: Sequeira Gabriel <[email protected]>
-
30 add syntec reranking (#31)
-
change name to specify retrieval
-
add reranking tasks
-
create script to upload dataset for reranking task
-
create reranking task
-
add reranking tasks
-
add model name in description
-
SummEval translated to French (#32)
-
7 sts (#33)
-
take into account multilingual tasks
-
add stsbenchmark multilingual dataset
-
add STS tasks
-
take into account multilingual tasks
-
add stsbenchmark multilingual dataset
-
add STS tasks
-
add comma
-
Adding sick fr dataset to sts tasks (#34)
-
Adding sick fr dataset to sts tasks
-
modifying dataset in load function to have the right column names
-
Fix alloprof dataset (#36)
-
change revision to use
-
remove duplicate data
-
change main metric because dataset is hard (#37)
-
Fix alloprof dataset (#40)
-
change revision to use
-
remove duplicate data
-
change revision
-
handle queries train test split
-
change dataset creation method
-
change revision
-
handle queries train test split
-
change dataset creation method
-
Fix DiaBLa by inheriting CrossLingual class (#42)
-
Fix DiaBLa by inheriting CrossLingual class
-
remove remaining print
-
Fix DiaBLa integration
-
Update mteb/tasks/BitextMining/FloresBitextMining.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Classification/MasakhaNEWSClassification.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
-
Update README.md
-
Update mteb/tasks/BitextMining/FloresBitextMining.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/evaluation/MTEB.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/abstasks/AbsTaskPairClassification.py
Co-authored-by: Imene Kerboua <[email protected]>
-
Update README.md
-
Update scripts/data/syntec/create_data_reranking.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/data/alloprof/create_data_reranking.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/run_mteb_french.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/run_mteb_french.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/evaluation/MTEB.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/evaluation/MTEB.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Retrieval/HagridRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Clustering/MLSUMClusteringP2P.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Clustering/MLSUMClusteringS2S.py
Co-authored-by: Niklas Muennighoff <[email protected]>
-
Update mteb/tasks/Clustering/MasakhaNEWSClusteringP2P.py
-
Update mteb/tasks/Clustering/MasakhaNEWSClusteringS2S.py
-
Update mteb/tasks/STS/SickFrSTS.py
-
Inherit OpusparcusPC init from MultilingualTask
-
remove unnecessary init
-
Remove train split from evaluation on MasakhaNEWSClassification (#52)
remove train split from evaluation
-
put script on HF dataset repos (#56)
-
put script on HF dataset repos
-
remove scripts
-
49 fix dictionary in syntecretrieval (#54)
-
add trust remote code arg
-
leave corpus as dict
-
remove trust remote code
-
add Tatoeba & BUCC BitextMining tasks (#57)
add bucc and tatoeba bitextmining tasks
-
46 add other languages to masakhaneweclusterings2s and p2p (#58)
-
add other language to clustering tasks
-
fix main score and S2S task
-
update run fr benchmark script
-
Update run_mteb_french.py
-
Update AbsTaskClustering.py
-
remove train and validation splits
-
remove Hagrid (#60)
Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Sequeira Gabriel <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: wissam-sib <[email protected]>
Co-authored-by: Wissam Siblini <[email protected]> (d01d053
)
-
Restore TRECCOVID import (
9f8e897
) -
Extend MTEB with French datasets (#218)
-
add Masakhane dataset config
-
add trigram lang code for datasets that use it
-
create french script eval
-
fix French word
-
add some documentation
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
4 pair classification (#10)
-
add Opusparcus dataset
-
multilingual usage
-
use eval_split of config files
-
change eval_split according to data
Co-authored-by: Gabriel Sequeira <[email protected]>
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
Clustering with HAL S2S dataset (#11)
HAL S2S dataset creation and evaluation on clustering task.
-
adding BSARD dataset
-
add BSARD to benchmark
-
adding Hagrid dataset
-
DiaBLa and Flores Bitext Mining evaluation (#12)
-
Add DiaBLa dataset for bitext mining
-
Add DiaBLa dataset for bitext mining
-
deduplicate bitext task
-
add Flores
-
format files
-
add flores to evaluation script
-
remove prints
-
add revision
Co-authored-by: Gabriel Sequeira <[email protected]>
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
adding dataset processing for mteb
-
adding BSARD dataset
-
add BSARD to benchmark
-
adding Hagrid dataset
-
fix change on langmapping
-
reset alphabetical order
-
add revision handling
-
Clustering: Add AlloProf dataset (#17)
AlloProf dataset for clustering task
-
handling of revision
-
change split + add revision handling
-
add script to process and upload alloprof on HF
-
build script for HF
-
adding dataset processing for mteb
-
refactor a few things
-
remove whitespaces
-
adding dataset processing for mteb
-
adding BSARD dataset
-
add BSARD to benchmark
-
adding Hagrid dataset
-
add script to process and upload alloprof on HF
-
adding dataset processing for mteb
-
refactor a few things
-
reset alphabetical order
-
add revision handling
-
handling of revision
-
change split + add revision handling
-
use eval variable
-
alphabetic order
-
Add MLSUM dataset for clustering task (#21)
-
Use Masakhane dataset for clustering task (#23)
-
16 add datasets to readmemd (#18)
-
run task table
-
run task table
-
Add MLSUM dataset for clustering task (#21)
-
Use Masakhane dataset for clustering task (#23)
-
run task table
-
refresh readme
-
refresh readme
-
run task table
-
refresh readme
Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>
- load only test split (#25)
Co-authored-by: Gabriel Sequeira <[email protected]>
- Update mteb/tasks/BitextMining/DiaBLaBitextMining.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Clustering/HALClusteringS2S.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- renaming masakhane (#28)
Co-authored-by: Gabriel Sequeira <[email protected]>
-
Syntec dataset addition (#26)
-
add script to process & load to HF
-
add script to enable download of data from HF
-
add syntec dataset files to gitignore
-
add syntecretrieval
-
add syntec retrieval
-
build dataloading script
-
remove datasets
-
correct typo
Co-authored-by: Sequeira Gabriel <[email protected]>
-
30 add syntec reranking (#31)
-
change name to specify retrieval
-
add reranking tasks
-
create script to upload dataset for reranking task
-
create reranking task
-
add reranking tasks
-
add model name in description
-
SummEval translated to French (#32)
-
7 sts (#33)
-
take into account multilingual tasks
-
add stsbenchmark multilingual dataset
-
add STS tasks
-
take into account multilingual tasks
-
add stsbenchmark multilingual dataset
-
add STS tasks
-
add comma
-
Adding sick fr dataset to sts tasks (#34)
-
Adding sick fr dataset to sts tasks
-
modifying dataset in load function to have the right column names
-
Fix alloprof dataset (#36)
-
change revision to use
-
remove duplicate data
-
change main metric because dataset is hard (#37)
-
Fix alloprof dataset (#40)
-
change revision to use
-
remove duplicate data
-
change revision
-
handle queries train test split
-
change dataset creation method
-
change revision
-
handle queries train test split
-
change dataset creation method
-
Fix DiaBLa by inheriting CrossLingual class (#42)
-
Fix DiaBLa by inheriting CrossLingual class
-
remove remaining print
-
Fix DiaBLa integration
-
Update mteb/tasks/BitextMining/FloresBitextMining.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Classification/MasakhaNEWSClassification.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update README.md
Co-authored-by: Niklas Muennighoff <[email protected]>
-
Update README.md
-
Update mteb/tasks/BitextMining/FloresBitextMining.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/evaluation/MTEB.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/abstasks/AbsTaskPairClassification.py
Co-authored-by: Imene Kerboua <[email protected]>
-
Update README.md
-
Update scripts/data/syntec/create_data_reranking.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/data/alloprof/create_data_reranking.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/run_mteb_french.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/run_mteb_french.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/evaluation/MTEB.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/evaluation/MTEB.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Retrieval/HagridRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Clustering/MLSUMClusteringP2P.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/Clustering/MLSUMClusteringS2S.py
Co-authored-by: Niklas Muennighoff <[email protected]>
-
Update mteb/tasks/Clustering/MasakhaNEWSClusteringP2P.py
-
Update mteb/tasks/Clustering/MasakhaNEWSClusteringS2S.py
-
Update mteb/tasks/STS/SickFrSTS.py
-
Inherit OpusparcusPC init from MultilingualTask
-
remove unnecessary init
-
Remove train split from evaluation on MasakhaNEWSClassification (#52)
remove train split from evaluation
-
put script on HF dataset repos (#56)
-
put script on HF dataset repos
-
remove scripts
-
49 fix dictionary in syntecretrieval (#54)
-
add trust remote code arg
-
leave corpus as dict
-
remove trust remote code
-
add Tatoeba & BUCC BitextMining tasks (#57)
add bucc and tatoeba bitextmining tasks
-
46 add other languages to masakhaneweclusterings2s and p2p (#58)
-
add other language to clustering tasks
-
fix main score and S2S task
-
update run fr benchmark script
-
Update run_mteb_french.py
-
Update AbsTaskClustering.py
-
remove train and validation splits
Co-authored-by: Gabriel Sequeira <[email protected]>
Co-authored-by: Marion Schaeffer <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Imene Kerboua <[email protected]>
Co-authored-by: mciancone <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: wissam-sib <[email protected]>
Co-authored-by: Wissam Siblini <[email protected]> (3d8b8ec
)
-
dev (
c16eddc
) -
Dev (
08c7317
) -
Add tasks for Spanish Embedding Evaluation (#227)
-
feat: add xmarket es dataset
-
refactor: use multilingual dataset
-
fix: update revision id
-
refactor: add constant for language
-
feat: add two clustering datasets
Signed-off-by: jupyterjazz <[email protected]>
- feat: import classes
Signed-off-by: jupyterjazz <[email protected]>
- refactor: flores dataset
Signed-off-by: jupyterjazz <[email protected]>
-
feat: add miracl reranking task for spanish
-
feat: use hf repo with all reranking langs
-
feat: update revision hash
-
refactor: use description for language
-
feat: add stses task
-
fix: get scores from label column
-
refactor: add revision to data loading
-
Added Spanish passage retrieval
-
feat: mintaka and xpqa retrieval tasks
Signed-off-by: jupyterjazz <[email protected]>
- feat: import classes
Signed-off-by: jupyterjazz <[email protected]>
-
fix: typo in data loading
-
fix: id
Signed-off-by: jupyterjazz <[email protected]>
- refactor: try out multilingual task
Signed-off-by: jupyterjazz <[email protected]>
- refactor: multilingual task import
Signed-off-by: jupyterjazz <[email protected]>
- refactor: cmon man
Signed-off-by: jupyterjazz <[email protected]>
- refactor: go back to monolingual tasks
Signed-off-by: jupyterjazz <[email protected]>
- refactor: remove unused import
Signed-off-by: jupyterjazz <[email protected]>
- refactor: loading logic
Signed-off-by: jupyterjazz <[email protected]>
-
feat: add miracl as retrieval task
-
fix: nested corpus
-
refactor: get lang from description
-
Update mteb/tasks/Retrieval/MIRACLRetrieval.py
Co-authored-by: Michael Günther <[email protected]>
-
feat: allow multilingual reranking tasks
-
feat: make miraclreranking multilingual
-
refactor: rename miraclretrieval
Co-authored-by: Niklas Muennighoff <[email protected]>
-
style: add missing eof empty line
-
feat: make xmarket retrieval task multilingual
-
refactor: rename xmarket
-
refactor: turn spanish tasks multilingual (#11)
-
refactor: make xpqa retrieval multilingual
-
fix: formatting of xpqa dataset
-
refactor: make mintaka into multilingual task
-
refactor: make miracl retrieval multilingual
-
feat: add revision ids for hf datasets
-
refactor: remove patool
-
Update mteb/tasks/Reranking/__init__.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update mteb/tasks/STS/__init__.py
Co-authored-by: Niklas Muennighoff <[email protected]>
Signed-off-by: jupyterjazz <[email protected]>
Co-authored-by: guenthermi <[email protected]>
Co-authored-by: jupyterjazz <[email protected]>
Co-authored-by: Markus Krimmel <[email protected]>
Co-authored-by: Michael Günther <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (52d5c9f
)
- feat: update revision id of wikicitiesclustering task (fb90c02)
- fix: remove debugging print statement (d292d93)
- fix: pass parallel_retrieval kwarg to use DenseRetrievalParallelExactSearch (19b8f66)
- Release: 1.1.2 (def3c91)
-
Add task list (#228)
-
Add task list
-
Update mteb/__init__.py
-
Update README.md (
10bf6f8
) -
Update BeIRPLTask.py (#225)
-
Update BeIRPLTask.py
-
Update BeIRPLTask.py (
a8922c1
) -
Allow multiple languages (
2cc222e
) -
Add Korean Text Search Tasks to MTEB (#210)
-
add Ko-miracl, Ko-StrategyQA, Ko-mrtydi tasks
-
Update mteb/abstasks/AbsTaskRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]>
-
Update AbsTaskRetrieval.py
-
Update mteb/abstasks/AbsTaskRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]>
- Update scripts/run_mteb_korean.py
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (dadf2da
)
-
Add MultiLongDocRetrieval task to MTEB. (#224)
-
Update AbsTaskRetrieval.py.
-
Add Retrieval Task: MultiLongDocRetrieval
-
Update AbsTaskRetrieval.py and MLDR task
-
Update reference of MLDR (
2f65179
) -
Fix name (
2989f76
) -
only save top-k (#209)
-
Update AbsTaskRetrieval.py
-
Add json import; rename kwarg
-
Pass OF
-
Update mteb/abstasks/AbsTaskRetrieval.py
-
Update AbsTaskRetrieval.py
-
Update AbsTaskRetrieval.py
-
Update mteb/abstasks/AbsTaskRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]> (f58888d
)
-
Add tasks for German Embedding Evaluation (#214)
-
chore: solve merge conflict
-
fix: gerdalir dataset
-
fix: lang from en to de
-
chore: solve merge conflict
-
chore: add ir datasets to requirements
-
refactor: limit queries to 10k
-
refactor: update description of task with limit
-
revert style changes
-
feat: add german stsbenchmarksts task
-
feat: update revision id
-
refactor: update revision id after changes in scores
-
add XMarket dataset
-
add xmarket to init file
-
feat: add revision id
-
add paws x dataset
-
Add ir_datasets as dependency
-
add GermanDPR dataset
-
fix loading
-
Update mteb/tasks/Retrieval/GermanDPRRetrieval.py
Co-authored-by: Saba Sturua <[email protected]>
-
feat: add miracl reranking task for german
-
refactor: cleanup task
-
prevent duplicate pos docs
-
fix: use test split in MIRACL (#13)
Fixes mismatch between description and HuggingFace dataset
-
refactor: remove WikiCLIR
-
fix: double import; xmarket name
-
add German tasks to run_mteb_german script
-
update revisions and style
-
update MIRACL to work with latest version
-
revert adding ir_datasets
-
support multilingual pair classification
-
remove print statement
-
Apply suggestions from code review
Co-authored-by: Niklas Muennighoff <[email protected]>
-
fix monolingual pair classification
-
remove lang for monolingual tasks
Co-authored-by: Isabelle Mohr <[email protected]>
Co-authored-by: Markus Krimmel <[email protected]>
Co-authored-by: Saba Sturua <[email protected]>
Co-authored-by: Markus Krimmel <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (9aba9ee
)
-
Simplify (
1cd07db
) -
Refer to other works (
8f28bcb
) -
Update mteb/tasks/Retrieval/GermanQuADRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]> (09a9cb0
)
-
clean up (
51c40fd
) -
WIP: implement requested changes (
58baad2
) -
remove code for writing JSONL dataset (
d23eac3
) -
add docstring, remove local qrels (
af7ee50
) -
fix query id in qrel dataset, ready to merge (
33c9dd4
) -
WIP: use HF dataset instead of local JSONL (
db3fea1
) -
rename BeIRDETask (
e56cf86
) -
Update scripts/run_mteb_german.py
Co-authored-by: Niklas Muennighoff <[email protected]> (4b18a7e
)
- Update mteb/tasks/Retrieval/GermanRetrieval.py
Co-authored-by: Niklas Muennighoff <[email protected]> (3fef61a
)
-
add reference to GermanQuAD (
ae268e0
) -
fix results folder path (
dc7fc01
) -
copy from local (
9c0880d
) -
Update mteb/abstasks/AbsTaskRetrieval.py (
be1fcc1
) -
Pass OF (
b0e6316
) -
Add json import; rename kwarg (
d39c21c
) -
Update AbsTaskRetrieval.py (
4eb8e02
) -
Added Norwegian Bokmål-Nynorsk bitext mining task (
c3fb742
) -
Add STS revisions (
38277ae
) -
Add RTR revisions (
8da9487
) -
Add RRK revisions (
2011cd8
) -
Add PCLF revisions (
9b6f4b9
) -
Add CLST revisions (
da73236
) -
Add CLF revisions (
fd91a9c
) -
Update Revision (
6b0fae5
) -
Fix SweFAQ linkage (
2341c48
) -
Fix SummEval linkage (
7252322
) -
Fix Dalaj linkage (
fb9ccd8
) -
Fix medrxiv mislinkage (
620defc
) -
Fix stripping (
02e84b2
) -
add datasets for long document evaluation
Co-authored-by: Isabelle Mohr <[email protected]> (88beb46
)
-
Do not enforce rich import (
aa11fe7
) -
fix RerankingEvaluator's compute_metrics_individual (
fd7bfac
) -
Fix SummEval import (
859d38e
) -
Increment version (
4d75ddf
)
- fix: msmarco-v2 uses dev.tsv, not dev1.tsv (6908d21)
- fix: add missing task-langs attribute (#152) (bc22909)
- Release: 1.1.1 (d3aaf4f)
- Merge branch 'main' into fixconversion (d292258)
- Fix eval_lang (7836148)
- Simplify code snippets (d434f52)
- Simplify wording (3adb0b5)
- Clarify multi-gpu usage (5a2da23)
- Fix splits (93f6f85)
- Improve Cust Model explanation (52c1fd8)
- Add bs to Clustering test (4df0d2e)
- Rely on auto-conversion to tensor in score function (d8512f7)
- Rely on standard encode kwargs only (4c1660e)
- Improve Cust Model explanation (23d758f)
- Add bs to Clustering test (6e0c0d2)
- Rely on auto-conversion to tensor in score function (7ec4c57)
- Rely on standard encode kwargs only (2fad0f9)
- Update README.md (d9aa70f)
- Update README.md (2211f83)
- Simplify assertion (f7fcbc1)
- Default to false (d64f6c7)
- Add multi gpu eval to readme (#140)
update readme (1b1c9d3)
-
Support Multi-node Evaluation (#132)
-
styling
-
USE_HF_DATASETS
-
Support DRPES
-
we use beir.datasets.data_loader_hf in case of non dist
-
distributed fixes
-
update run command
-
cleanup
-
.
-
sugg
-
ruff (
0dd82a9
) -
Add Chinese tasks (C-MTEB) (#134)
-
add C_MTEB
-
add C_MTEB
-
rename MMarcoReranking
-
rename MMarcoReranking
-
Update mteb/tasks/Retrieval/CMTEBRetrieval.py
-
Update README.md
-
Allow custom encode functions
Co-authored-by: shitao <[email protected]>
Co-authored-by: Nouamane Tazi <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (071974a
)
-
Add Polish tasks (PL-MTEB) (#137)
-
Add Polish tasks (PL-MTEB)
-
Add Polish datasets to README
-
Add newline
Co-authored-by: rposwiata <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (2779344
)
-
Add BEIR-PL datasets to MTEB (#121)
-
Add BEIR-PL benchmark
-
Update README with BEIR-PL datasets
-
Update names
-
Add tasks to init to be visible during evaluation
Co-authored-by: Konrad Wojtasik <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]> (5972c02
)
-
Replaced prints with logging (#133)
-
Make sure that main score is added to bitext mining tasks
-
Added scandinavian languages: da, no, sv
-
merge upstream main
-
fix: Replaced prints with logging statements
-
chore: removed accidental commits (
d7ca378
) -
add logging (
6412a6a
) -
Merge pull request #131 from embeddings-benchmark/nouamane/quick-fixes
Code cleanup (4fb97d0
)
-
. (
3ebb039
) -
add eval_splits arg (
c407c4b
) -
quick fixes (
6c5a3fa
) -
clean MTEB tasks (
b276f1d
) -
clean args (
9365755
) -
styling (
dd02b48
) -
black (
652d07c
) -
Set dev version (
bf98c2c
)
-
Release: 1.1.0 (
80d0344
) -
Bump version ID and update PyPI (#128)
Bump version ID and update PyPI after adding additional tasks. (4a4b54b
)
-
Fix typo (
33a3140
) -
Sort imports (
ab2eef8
) -
Sort imports (
3432374
) -
Raise error first (
0b1bfd2
) -
Added support for Scandinavian Languages (#124)
-
Make sure that main score is added to bitext mining tasks
-
Added scandinavian languages: da, no, sv
-
Updated readme with scandinavian tasks
-
Changes n samples for the nordic lang CLF
-
Added scandinavian models to init
-
Added error logs to gitignore
-
fix import error
-
fix dataset columns
-
rename dataset columns
-
remove swefaq
-
fix: Added functionality to raise error
-
fix: Updated names
-
fix: Removed no as a language
-
Added missing data transformation
-
Fix spelling error (
acb0f59
) -
Install beir (
c50b8ab
) -
Update README.md (
29ffedf
) -
ruff (
6a58b5d
) -
Update README.md (
5825536
) -
fix revision hash for TenKGnadClusteringP2P dataset
Co-authored-by: Niklas Muennighoff <[email protected]> (eb622f8
)
- change dataset order for BlurbsClustering in README
Co-authored-by: Niklas Muennighoff <[email protected]> (f6e49ba
)
- change dataset order for TenKGnadClustering in README
Co-authored-by: Niklas Muennighoff <[email protected]> (2a2c47f
)
-
fix descriptions for German clustering datasets (
30a966c
) -
add German clustering tasks to README (
62457e3
) -
update reference & category for TenKGnad datasets (
2174a47
) -
add German clustering tasks (
ab469be
) -
Allow abs path (
b56528c
) -
Add @property annotation to description method of AbsTask (
98b0443
) -
fix typo (
37a986b
) -
fix extend lang pairs (
865dffc
) -
Fix clustering eval, black, isort (
bc43665
) -
Add 'auto' to sklearn clustering, add test, fix warning (
15ce352
) -
Update MSMARCORetrieval.py (
d913f56
) -
Revert to old split (
1f3ff6e
) -
Add wheel instruction (
62fad9b
) -
Dev version (
d988e48
)
-
Release: 1.0.2 (
e189bae
) -
Add comment
Co-authored-by: Nouamane Tazi <[email protected]> (3e72ee8
)
-
Fix naming (
33f2db9
) -
Cleaner logging & tqdm usage (
542d871
) -
Add kwargs (
e0b801d
) -
Produce embeddings in one go (
e88bcf2
) -
Fix naming (
6c62f18
) -
Make inputs always List[str] & call in one (
bdeeedf
) -
Fix SummEval description (
0c2b1be
) -
fix SummEval description
Unless I'm missing something, I think the SummEval description is incorrect; the dataset consists of summaries of news articles, not biomedical abstracts. (1ccc068
)
-
Clarify script for running all of MTEB English (
9f72434
) -
Update run_mteb_english.py (
6ff57d3
) -
Update run_mteb_english.py (
7803eea
) -
Point to English benchmarking script (
57f3371
) -
Example script for benchmarking all of MTEB English (
77e6b22
) -
Clarify MSMARCO split (
bbeada8
) -
Allow re-merging (
b0ce501
) -
Set dataset name; Sort imports (
2a5a661
) -
Standardize CQA merging script (
5d5a2fb
) -
Update merge_cqadupstack.py (
b0304c1
) -
Update README.md (
8c60c22
) -
Update README.md (
6255449
) -
Remove validation split (
875a98e
) -
Remove validation set (
b3f9585
) -
Update ClassificationEvaluator.py (
93b89b6
) -
Set dev version (
8a0d6b1
)
-
Release: 1.0.1 (
b9f423b
) -
Delete mteb_diagram.png (
76dc363
) -
Deactivate beir (
b263157
) -
Update BeIRTask.py (
37b7b79
) -
Remove validation (
6922840
) -
Fix typo (
7247233
) -
Add files via upload (
9d2bb67
) -
Increment version & use abslink (
a792a65
)
-
Release: 1.0.0 (
9c544a4
) -
Add paper (
b73457a
) -
Fix formatting (
c523d16
) -
print -> logging (
4f3a559
) -
Do not ignore data scripts (
891b455
) -
Reorganize scripts (
e157bb0
) -
Add release instructions & dev suffix to version (
164b9ae
)
-
Release: 0.9.1 (
5c438cc
) -
Merge pull request #80 from embeddings-benchmark/Muennighoff-patch-5
Update STS22CrosslingualSTS.py (1459309
)
-
Update installation (
f96ee73
) -
Update SummEvalSummrization.py (
d8f232d
) -
Update AmazonPolarityClassification.py (
114b0e3
) -
Update STS22CrosslingualSTS.py (
c8df727
) -
Temporarily change README installation instruction (
e53e77c
) -
Fix res keyword (
769ac67
) -
Update example to be visible for non-registered users (
d4f75fc
) -
Merge pull request #79 from Muennighoff/feature/leaderboardexp
Add leaderboard instructions (4d2683a
)
-
Move meta script (
7a8398f
) -
dataset_version -> dataset_revision & logging (
fe34f84
) -
Add leaderboard instructions (
f325aca
) -
Merge pull request #78 from embeddings-benchmark/feature/add-mteb-ds-name
Add ds name to res dict (53b763a
)
-
Update MTEB.py (
ae86e2f
) -
Merge pull request #73 from Muennighoff/fix/cqadupstackbeir11
Fallback to old dataloader for cqadupstack (7791b41
)
- Merge pull request #77 from Muennighoff/fix/bcpc
Update init imports (865bf47
)
-
Update init imports (
39b7712
) -
Merge pull request #76 from Muennighoff/fix/bcpc
BC -> PC (82d3228
)
-
Merge branch 'main' into fix/bcpc (
f18c6df
) -
BC -> PC (
7a430c2
) -
Merge pull request #75 from Muennighoff/feature/leaderboard
Add LB link (36dbd14
)
- Merge pull request #72 from Muennighoff/fix/revisions
Fix/revisions (4a8d3db
)
- Merge pull request #74 from Muennighoff/fix/mteblogo
Update logo files (d939de6
)
-
Add LB link (
6aeb7ed
) -
Update logo files (
5bfb65a
) -
Fallback to old dataloader for cqadupstack (
262930e
) -
Add revision (
488f1f7
) -
Add revisions 2/2 (
c8ba2b8
) -
Add revisions 1/2 (
c75a503
) -
Merge pull request #69 from Muennighoff/feature/custombeirmodel
Feature/custombeirmodel (da9ae9a
)
-
BeIRModel -> DRES (
ff554bb
) -
Do not wrap 2x (
255c416
) -
Adapt naming (
3c8f672
) -
Add explanation of BeIRModel (
3edad09
) -
Merge pull request #68 from Muennighoff/feature/beirmrr
Add MRR (7a0993d
)
Fix categories (03ed576
)
-
Fix categories (
08088d7
) -
Update RedditClusteringP2P.py (
77a1606
) -
Merge pull request #62 from Muennighoff/feature/hublinks
Feature/hublinks (4f04719
)
-
Fix hub mistakes (
02f9e6c
) -
Merge branch 'feature/hublinks' of https://github.com/Muennighoff/mteb into feature/hublinks (
c98b9a6
) -
Add dataset stats (
bbf2a82
) -
Add desc (
46078aa
) -
Add desc (
9ca92b0
) -
Update MSMARCOv2Retrieval.py (
f43cd1a
) -
Merge pull request #63 from embeddings-benchmark/Muennighoff-patch-4
Add desc (f93abff
)
-
Add desc (
c972cc9
) -
Merge pull request #61 from embeddings-benchmark/fix/nolangs
Fix no langs (c15e1a7
)
-
Merge branch 'main' into feature/hublinks (
c3990d6
) -
Simplify (
936eee2
) -
Add Hub links & descriptions (
b8182bb
) -
Update MTEB.py (
0be4a06
) -
Merge pull request #57 from embeddings-benchmark/Muennighoff-patch-2
Update README.md (1ebca84
)
- Merge pull request #59 from embeddings-benchmark/Muennighoff-patch-3
Update README.md (3f53c85
)
-
Update README.md (
8097f31
) -
Update README.md (
5b260a4
) -
Merge pull request #56 from Muennighoff/feature/readmelinks
Add README Links & Images (f473dbd
)
-
Center title (
1341db7
) -
Center title (
8b80471
) -
Beautify (
1ab8764
) -
Merge pull request #49 from Muennighoff/fix/cqadupstack
Fix CQADupstack (3a4dd84
)
- Merge pull request #50 from Muennighoff/fix/redditp2p
New RedditP2P Script (7bc547e
)
- Merge pull request #52 from Muennighoff/fix/bucc
Default to 1-indexed gold (9aff7f2
)
- Merge pull request #54 from embeddings-benchmark/Muennighoff-patch-1
Update MSMARCORetrieval.py (3951c41
)
-
Update MSMARCORetrieval.py (
6922be0
) -
Default to 1-indexed gold (
f29e1fb
) -
New RedditP2P Script (
f73b179
) -
Fix split (
e3ea40b
) -
Add CQADupStack subsets (
a32c00b
) -
Fix CQADupstack (
a26229f
) -
Merge pull request #46 from Muennighoff/fix/scidocs
Fix/scidocs (ea10703
)
-
Update README name (
afddfd3
) -
Merge pull request #45 from Muennighoff/feature/cachetestembs
Feature/cachetestembs (475420a
)
- Merge pull request #44 from Muennighoff/fix/silentskip
Fix/silentskip (f7d6fd1
)
- Merge pull request #43 from Muennighoff/main
Add flag to overwrite results (ece590f
)
- Merge pull request #33 from Muennighoff/fix/summeval
Fix SummEval NaN scores (48586e2
)
-
Merge branch 'main' into main (
e986cd1
) -
Merge pull request #42 from Muennighoff/feature/versioning
Feature/versioning (1aeaede
)
-
Update mteb/evaluation/MTEB.py (
23a473f
) -
Rename SciDocs (
edc2917
) -
Return test cache in all clf evaluators (
309a867
) -
Cache test embedding / exp for all clf evals (
7dd867f
) -
Add testcache (
08cb352
) -
Split into two lines (
f756399
) -
Sort tasks (
03658fa
) -
Log known tasks (
86f9cd6
) -
Log tasks not found (
9ab0a7a
) -
Add flag to overwrite (
529541d
) -
Version mteb & ds (
78b90e9
) -
Formatting (
67f6070
) -
Add versioning (
fa852de
) -
Merge pull request #41 from Muennighoff/fix/sts22 (
064e47c
) -
Rmv superfluous imports (
7e8ee18
) -
Make revision optional (
90afba5
) -
Remove space (
e0d22bc
) -
Modify script to invert scores (
9b9f43a
) -
Add revision to CL (
5f68fda
) -
Add revision kwarg (
3448d1e
) -
Merge pull request #26 from AmrMKayid/return-results (
8f3242c
) -
Merge pull request #38 from Muennighoff/fix/seeds (
720c597
) -
Update docs (
dd4a1f2
) -
Merge pull request #37 from embeddings-benchmark/mindref
Fix Mind Reference (1834041
)
-
Seed cuda (
d33d748
) -
Merge pull request #35 from embeddings-benchmark/bootstrap-logs (
3ff35c5
) -
Update mteb/abstasks/AbsTaskClassification.py
Co-authored-by: Niklas Muennighoff <[email protected]> (9255249
)
-
Remove superfluous import (
124bebe
) -
Remove superfluous comments (
bf5f912
) -
Add seed to task (
acf8b1c
) -
Add missing super calls (
b32195e
) -
Set evaluation seeds (
e69d40b
) -
Set seeds (
ef2985b
) -
Fix Mind Reference
Two other notes:
- The renaming can create confusion, as a test set does exist, though I assume we don't have its labels
- MIND uses AUC & MRR & NDCG scores, not MAP, see https://msnews.github.io/ (
7ce4bb1
)
- Update mteb/evaluation/evaluators/SummarizationEvaluator.py
Co-authored-by: Nouamane Tazi <[email protected]> (f667749
)
-
Merge pull request #36 from embeddings-benchmark/mindsmall-test (
6fc710b
) -
rename `validation` split to `test` (
9c4d5c6
) -
styling (
c66610e
) -
add logs for classification bootstrap experiments (
e4000e1
) -
Merge pull request #32 from Muennighoff/fixsplits (
39d0926
) -
Add consistent brackets (
2cdd283
) -
Remove debug leftovers (
c674d0a
) -
Remove superfluous imports (
68f7307
) -
Skip samples with no variance (
d39be65
) -
Drop nans (
20c22a9
) -
Fix BEIR splits (
752d49f
) -
Fix splits (
07bea18
) -
Merge branch 'main' into return-results (
314e5d7
) -
Merge pull request #30 from embeddings-benchmark:selected_tasks
fix printing selected tasks for evaluation (f1cab40
)
-
fix printing selected tasks for evaluation (
ba0dd76
) -
Merge pull request #29 from cycycc/fix-sickr-hf-hub-name (
cb87c7a
) -
fix sick-r huggingface hub name (
2ea195a
) -
Update mteb/evaluation/MTEB.py
Co-authored-by: holidaydrien <[email protected]> (a4d952b
)
- Update mteb/evaluation/MTEB.py
Co-authored-by: holidaydrien <[email protected]> (c4acb76
)
-
Returning Evaluation results (
3d60490
) -
Merge pull request #18 from Muennighoff/evalfix (
4dabbaf
) -
Merge pull request #19 from Muennighoff/patch-2 (
9e56ad3
) -
Merge pull request #20 from Muennighoff/updatemainscores (
a0fbd83
) -
Update to ndcg_at_10 (
8d010d0
) -
Update main scores (
c0e773a
) -
Update README.md (
8b495b6
) -
Fix task splits (
1755356
) -
Merge pull request #15 from Muennighoff/mainscorefix (
4b5fe2b
) -
Fix monolingual mainscore (
61647df
) -
Fix main score warning multilingual (
831a218
) -
Merge pull request #14 from Muennighoff/patch-1 (
6055ecc
) -
Fix task language example (
115c280
) -
styling (
2ff07d2
) -
update example (
b581d00
) -
we can now select all tasks of a specific language (
b36e58c
) -
update test (
53d123e
) -
keep only langs defined in task's description when loading (
efa189f
) -
better prints for multilingual and crosslingual evaluation (
5b86950
) -
styling (
8fd8fb0
) -
move scripts to respective folders (
028ed3e
) -
Update gitignore (
a3cee03
) -
update setup.py (
89aaa43
) -
update setup (
2645323
) -
update setup.cfg (
bc5ec1d
) -
Create first pip version (
210d012
) -
make default evaluation for classification 10 experiments each using 8 samples per label (
b062405
) -
use seed from init arg (
f58f8da
) -
styling (
4d1bd09
) -
add error message when trying to load beir (
a3d58f3
) -
add argument to specify error logs path (
d6cef16
) -
make beir an optional package (
5bcee12
) -
quick modifications (
d774ce6
) -
add example (
21fc624
) -
make beir optional dependency (
fdd922a
) -
Smaller fixes in Classification task (
c6eda26
) -
update available tasks (
0923e50
) -
update available tasks (
e192823
) -
add evaluation time to final scores (
9a1ca7d
) -
quick fix loading beir task (
8e46cc8
) -
add available tasks (
b7a1987
) -
Merge pull request #11 from embeddings-benchmark/summarization (
bdb2691
) -
add more scores to summarization evaluator (
12ae05f
) -
add SummEval task (
3ba3e65
) -
add Summarization abstract task (
f2b0e53
) -
add specifying language for task example (
cdf1f18
) -
fix bitext mining evaluation (
073a254
) -
update README (
3b30e9b
) -
update README (
529ec6b
) -
add --available_tasks flag to CLI (
de97d9a
) -
styling (
324b94c
) -
fix missing params eval_splits in load_data (
ecb9d12
) -
CLI quick fixes (
693bffa
) -
Merge branch 'main' of https://github.com/embeddings-benchmark/mteb-draft into main (
bba225d
) -
quick fixes (
2c01099
) -
styling (
75d0449
) -
fix eval_splits loading using beir (
26ec6b9
) -
capture errors instead of failing (
c6aafa4
) -
quick fixes (
8a7e3ec
) -
update BeIRModel (
e8b5ff9
) -
load data and free it after each task evaluation (
aa467f2
) -
update reqs (
6005c10
) -
fixing beir imports (
5d74d42
) -
Merge pull request #10 from embeddings-benchmark/optimisation (
2b6caf2
) -
add multiproc test (
fe8b963
) -
update BitextMining main scores (
3b0f912
) -
support distributed evaluation for IR 🥳 (
5e91971
) -
remove "train" from eval_splits (
6da5ed1
) -
gather all nodes outputs in CPU after distributed computation (
5eb3661
) -
support DRPES for Parallel IR evaluation (
36962e9
) -
quick fix (
6e0e6bd
) -
set logistic regression default max_iter to 200 (
8963b83
) -
add evaluators logs 📜 (
e9d326f
) -
make style (
ab8f13e
) -
add Makefile and better styling tools ✨ (
156e828
) -
dataloading moved from init to run (
c2b7901
) -
Merge pull request #8 from embeddings-benchmark/beir-integration
Beir integration (f7f2426
)
-
Merge branch 'main' into beir-integration (
af12b49
) -
Merge pull request #9 from embeddings-benchmark/display
Display (11e5758
)
-
fixes (
8902f59
) -
fixes (
a394cc2
) -
fixes+black (
b0527a8
) -
beautiful task display (
0ff2db2
) -
rich library (
27cd4cb
) -
datasets (
895c23d
) -
fever (
0724070
) -
quora (
43b93e5
) -
dbpedia (
50d6700
) -
climatefever (
e506637
) -
cqadupstack (
217009f
) -
arguana (
019b2b7
) -
beir retrieval (
e52171b
) -
only save if output_folder argument is specified (
2e1eb24
) -
Update python-package.yml (
6c32b6b
) -
all tests are passing now ✅ (
6c41b75
) -
Create python-package.yml (
3cce88f
) -
Merge pull request #6 from embeddings-benchmark/testing (
06bd1df
) -
Merge branch 'main' into testing (
5226907
) -
normalize STS scores (
6f98396
) -
normalize score names (
6db134f
) -
format @k scores (
dcb77a0
) -
rename CrossLingual to Crosslingual (
3af3b4f
) -
remove train split from evaluation splits (
6317bb6
) -
bug fix (
ba8c906
) -
calculate AP only in binary classification (
bc293ca
) -
add kwargs and batch_size to evaluate funcs (
7926d3c
) -
update main scores for some tasks (
099a32b
) -
add limit argument to limit evaluation data (
92e5d09
) -
add test for PairClassificationEvaluator (
ecffd35
) -
use evaluators.PairClassificationEvaluator instead of sent-formers BinaryClassificationEvaluator (
9ffdf2b
) -
reformatting (
ca25e17
) -
add test for RerankingEvaluator (
9646892
) -
reformat RerankingEvaluator (
3ce99cf
) -
more docs (
8d07d59
) -
tests folder (
9cd9dc2
) -
add test_RetrievalEvaluator (
5236588
) -
more docs (
06ff3d1
) -
add AP score to ClassificationEvaluator (
c950ce8
) -
add nDCG score to RerankingEvaluator (
e4170c8
) -
Merge pull request #5 from embeddings-benchmark:update-reranking
Support multiple queries in Reranking tasks (cf51493
)
-
quick fix (
0d133e1
) -
use max cross similarity in case of multiple queries (
3f80a70
) -
support multiple queries in Reranking tasks (
47f871f
) -
bug fixes (
a3dc4f6
) -
rename binary classification to pair classification (
63374fe
) -
rename available_splits to eval_splits (
04b9f55
) -
rename available_langs to eval_langs (
99ad04c
) -
minor fixes (
c2307ef
) -
Merge pull request #4 from embeddings-benchmark/packaging (
297560d
) -
quick fix bug (
fc8ea9f
) -
report stderr in AbsTaskClassification in case of bootstrapping (
d3723a5
) -
add STS22CrosslingualSTS (
87f92e5
) -
add MindSmallReranking (
940642a
) -
precision recall f1 bitext evaluator (
4f4a9e2
) -
korean to sts17 (
4767300
) -
quick fix RetrievalEvaluator (
afb574a
) -
update README (
db6edde
) -
update example (
d363a49
) -
remove useless import (
4a8966f
) -
fix cmd.py arguments (
2b17b4a
) -
add kwargs where needed (
30f0efd
) -
add cli script (
5a77900
) -
adopt pbr packaging (
c2fc3c1
) -
quick fixes (
f5d3287
) -
rename kNNClassification to Classification (
44ceb4a
) -
add bootstrap parameters to AbsTaskKNNClassification (
8ecf9d4
) -
add EmotionClassification (
e03db39
) -
add TweetSentimentExtractionClassification (
5c7ef5c
) -
add ToxicConversationsClassification (
94bfb4f
) -
add AmazonCounterfactualClassification (
4052ae1
) -
add ImdbClassification task (
eb70842
) -
add AmazonPolarityClassification dataset (
75684ed
) -
hack fix bug loading tasks twice (
23cc372
) -
add AmazonReviewsClassification (
5f4731c
) -
add create data script for amazon reviews multi (
761b70e
) -
make shuffling reproducible in logReg-10-splits-5-intents (
52e4743
) -
add logReg-10-splits-5-intents for kNNClassificationEvaluator (
72c67e0
) -
quick fix batch size (
a88d8bf
) -
quick fixes (
d4e5549
) -
add batch size to kNNClassificationEvaluator (
c5127d8
) -
Merge pull request #3 from embeddings-benchmark/cross-lingual
Cross lingual (c844875
)
-
black (
a6ce618
) -
bitext mining evaluator (
db7a934
) -
bucc (
96848c1
) -
tatoeba (
4783ace
) -
bitext mining (
e0ec3a5
) -
bitext mining (
50b2f48
) -
add MTOP classification tasks (
1ecec57
) -
crosslingual tasks (
582aa15
) -
STS17 benchmark (
0c38bf0
) -
add methods (
49afe21
) -
formatting (
cebaf56
) -
quick fix (
2284d17
) -
Merge pull request #2 from embeddings-benchmark/knn-classification (
6a5faec
) -
add MultilingualTask (
5614a03
) -
fix loading for multilingual datasets (
8658f68
) -
skip task if results already exist (
24f83c1
) -
add banking77 and massive scenario datasets (
12d4d40
) -
add logRegClassificationEvaluator (
804e3b0
) -
add kNNClassificationEvaluatorPytorch (
31bf4d1
) -
cosine and euclidean distances in kNNClassificationEvaluator (
4720480
) -
add requirements dev file (
a446d3f
) -
update results json file format to account for multi langs (
1fe472a
) -
load_dataset directly inside AbsTask (
4475fee
) -
add default language as "en" for all tasks (
faee9db
) -
WIP add kNN Classification and MassiveIntentClassification task (
885c06d
) -
tasks can be provided as class now in task_list (
3bcb767
) -
add bs param in clusteringevaluator (
b4c83e0
) -
quick docs fixes (
1a48f29
) -
fix line length (
faa978f
) -
linting (
49a4138
) -
add reqs (
006756c
) -
redditp2p + sep2p (
2ec9c44
) -
clustering tasks (
b8d37a0
) -
scripts (
2760969
) -
first commit (
7fbd064
) -
loading scripts (
24a4310
) -
Update README.md (
c03618c
) -
init file (
fd182b6
) -
Update README.md (
97c6a99
) -
retrieval evaluator (
39db013
) -
removed results folder (
751e1fd
) -
reranking evaluator (
b62a0f5
) -
added custom evaluators (
1bf7c94
) -
STS datasets (
9093bc1
) -
gitignore (
8d309e4
) -
added STS (
3a2f4b9
) -
reranking (
0cf6e1a
) -
binary classification (
3a15b96
) -
added verbosity level (
1ee8f3d
) -
added file logging (
36f7cf3
) -
added available tasks/categories/selected list (
5cc63a5
) -
finegrained task selection (
7c0087b
) -
added retrieval (
22415e9
) -
fixed seed (
50ada77
) -
typos (
16dc4a9
) -
added clustering tasks (
1106c15
) -
seeded benchmarks (
518bc82
) -
evaluation schema (
bdb79d0
) -
basic tasks schema (
bacb9d0
) -
proof of concept (
6886d1b
) -
Create README.md (
26df27b
) -
Initial commit (
7841bca
)