Releases: embeddings-benchmark/mteb


1.28.7 (2025-01-13)

CI

  • ci: skip AfriSentiLID for now (#1785)

  • skip AfriSentiLID for now

  • skip relevant test case instead


Co-authored-by: Isaac Chung <[email protected]> (71dbd61)
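For context, skipping at the test level typically looks like the sketch below; the test name and reason string are assumptions, not the actual code from #1785.

```python
import pytest

# Minimal sketch of skipping a single test case rather than a whole suite.
@pytest.mark.skip(reason="AfriSentiLID skipped for now")  # reason assumed
def test_afri_senti_lid() -> None:
    ...
```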

Fix

  • fix: update max tokens for OpenAI (#1772)

update max tokens (0c5c3a5)
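The fix adjusts the token limit used when embedding with OpenAI models. A minimal sketch of client-side truncation to a token budget, assuming tiktoken's cl100k_base tokenizer and the commonly cited 8191-token input limit for the text-embedding-3 models:

```python
import tiktoken

MAX_TOKENS = 8191  # assumed input limit for the embedding model

def truncate_to_max_tokens(text: str, max_tokens: int = MAX_TOKENS) -> str:
    """Encode, clip to the token budget, and decode back to a string."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    return text if len(tokens) <= max_tokens else enc.decode(tokens[:max_tokens])
```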


1.28.6 (2025-01-11)

Fix

  • fix: added annotations for training data (#1742)

  • fix: Added annotations for arctic embed models

  • added google and bge

  • added cohere

  • Added e5

  • added bge based model2vec

  • annotated OpenAI

  • format and update annotations (3f093c8)
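For reference, the annotations in question record which datasets a model was trained on. An illustrative shape only; values are invented, and the exact field name and schema in mteb's ModelMeta may differ:

```python
# Hypothetical training-data annotation: task name -> splits trained on.
training_datasets = {
    "MSMARCO": ["train"],  # trained on the MS MARCO train split
    "NQ": ["train"],       # trained on the Natural Questions train split
}
```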


1.28.5 (2025-01-11)

Fix

  • fix: Leaderboard: K instead of M (#1761)

Fixes #1752 (972463e)
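The bug showed parameter counts with the wrong magnitude suffix. A minimal sketch of suffix selection; the function name is hypothetical:

```python
def format_n_parameters(n: int) -> str:
    """Render a raw parameter count with a K/M/B suffix."""
    if n >= 1_000_000_000:
        return f"{n / 1_000_000_000:.1f}B"
    if n >= 1_000_000:
        return f"{n / 1_000_000:.0f}M"
    if n >= 1_000:
        return f"{n / 1_000:.0f}K"
    return str(n)

assert format_n_parameters(118_000) == "118K"  # not a rounded "M" value
```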

Unknown

  • other: add script for leaderboard compare (#1758)

  • add script

  • remove changes

  • remove changes

  • add comment

  • lint

  • order like in benchmark object

  • round results (8bc80aa)


1.28.4 (2025-01-10)

Fix

  • fix: fixes implementation of similarity() (#1748)

  • fix(#1594): fixes implementation of similarity()

  • fix: add similarity to SentenceTransformerWrapper


Co-authored-by: sam021313 <[email protected]> (3fe9264)
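A similarity() method on an embedding wrapper conventionally computes pairwise cosine similarity. A minimal NumPy sketch of that convention, not the actual implementation from #1748:

```python
import numpy as np

def similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity: (n, d) x (m, d) -> (n, m)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T
```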


1.28.3 (2025-01-10)

Fix

  • fix: Fixed definition of zero-shot in ModelMeta (#1747)

  • Corrected zero_shot definition to be based on task names, not dataset path (407e205)
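Under the corrected definition, the overlap check keys on task names. A one-line sketch with hypothetical argument names:

```python
def is_zero_shot(training_task_names: set[str], benchmark_task_names: set[str]) -> bool:
    # Compare task *names*, not dataset paths, per this fix.
    return not (training_task_names & benchmark_task_names)
```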


1.28.2 (2025-01-10)

Fix

  • fix: Fixed task_type aggregation on leaderboard (#1746)

  • Fixed task_type aggregation in leaderboard

  • Fixed an error due to unnecessary indentation in get_score (76bb070)
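Task_type aggregation means averaging main scores within each task type before comparing across types. A hedged sketch, assuming a plain task-to-score mapping:

```python
from collections import defaultdict
from statistics import mean

def mean_per_task_type(
    scores: dict[str, float], task_types: dict[str, str]
) -> dict[str, float]:
    """Average main scores within each task type (e.g. all Retrieval
    tasks together), rather than averaging every task in one pool."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for task, score in scores.items():
        grouped[task_types[task]].append(score)
    return {t: mean(v) for t, v in grouped.items()}
```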


1.28.1 (2025-01-10)

Fix

  • fix: Leaderboard Speedup (#1745)

  • Added get_scores_fast

  • Made the leaderboard faster with a smarter dependency graph, event management, and caching

  • Changed print to logger.info (9eff8ca)
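Caching repeated score lookups is one of the speedups described. A toy sketch using functools.cache; the real signature of get_scores_fast in #1745 may differ:

```python
from functools import cache

@cache
def get_scores_fast(benchmark_name: str) -> tuple[float, ...]:
    print(f"loading scores for {benchmark_name}")  # runs once per benchmark
    return (0.5, 0.6)  # stands in for the real result loading

get_scores_fast("MTEB(eng)")  # loads
get_scores_fast("MTEB(eng)")  # cache hit, nothing reloaded
```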

Test

  • test: Add script to test model loading below n_parameters threshold (#1698)

  • add model loading test for models below 2B params

  • add failure message to include model name

  • use the real get_model_meta

  • use cache folder

  • teardown per function

  • fix directory removal

  • write to file

  • wip loading from before

  • wip

  • Rename model_loading_testing.py to model_loading.py

  • Delete tests/test_models/test_model_loading.py

  • checks for models below 2B

  • try not using cache folder

  • update script with scan_cache_dir and add args

  • add github CI: detect changed model files and run model loading test

  • install all model dependencies

  • dependency installations and move file location

  • should trigger a model load test in CI

  • find correct commit for diff

  • explicitly fetch base branch

  • add make command

  • try to run in python instead and add pytest

  • fix attribute error and add read mode

  • separate script calling

  • let pip install be cached and specify repo path

  • check ancestry

  • add cache and rebase

  • try to merge instead of rebase

  • try without merge base

  • check if file exists first

  • Apply suggestions from code review
    Co-authored-by: Kenneth Enevoldsen <[email protected]>

  • Update .github/workflows/model_loading.yml
    Co-authored-by: Kenneth Enevoldsen <[email protected]>

  • address review comments to run test once from CI and not pytest


Co-authored-by: Kenneth Enevoldsen <[email protected]> (8d033f3)
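The resulting check loads only models whose metadata reports fewer than 2B parameters. A sketch of the idea: mteb does expose get_model and get_model_meta, but treat the attribute handling here as an assumption:

```python
import mteb

N_PARAM_THRESHOLD = 2_000_000_000  # 2B, per the commit messages

def check_model_loads(model_name: str) -> None:
    meta = mteb.get_model_meta(model_name)
    if meta.n_parameters is None or meta.n_parameters >= N_PARAM_THRESHOLD:
        return  # skip large or unsized models
    model = mteb.get_model(model_name)  # should not raise
    assert model is not None, f"failed to load {model_name}"
```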

Unknown

  • Fixed result loading on leaderboard (#1739)

  • Only main_score gets loaded for the leaderboard, thereby avoiding OOM errors

  • Fixed plot failing because of missing embedding dimensions

  • Ran linting (752d2b8)
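Loading only main_score keeps one float per task in memory instead of the full per-split score dictionaries. A hypothetical sketch; the actual result-file schema may differ:

```python
import json
from pathlib import Path

def load_main_scores(results_dir: str) -> dict[str, float]:
    """Keep one float per result file instead of full score dictionaries."""
    scores: dict[str, float] = {}
    for path in Path(results_dir).glob("**/*.json"):
        data = json.loads(path.read_text())
        # Assumed layout: {"scores": {"test": [{"main_score": ...}, ...]}}
        splits = data.get("scores", {}).get("test") or [{}]
        main = splits[0].get("main_score")
        if main is not None:
            scores[path.stem] = main
    return scores
```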


1.28.0 (2025-01-09)

Feature

  • feat: Add nomic modern bert (#1684)

  • add nomic modern bert

  • use SentenceTransformerWrapper

  • use SentenceTransformerWrapper

  • try nomic wrapper

  • update

  • use all prompts

  • pass prompts

  • use fp16

  • lint

  • change to version

  • remove commented code (95f143a)
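The commits describe loading the model via SentenceTransformers in fp16 with per-task prompts. A sketch of that pattern; the prompt strings are assumptions and may not match the model's actual configuration:

```python
import torch
from sentence_transformers import SentenceTransformer

# fp16 weights plus named prompts; the prompt strings below are assumed.
model = SentenceTransformer(
    "nomic-ai/modernbert-embed-base",
    model_kwargs={"torch_dtype": torch.float16},  # "use fp16"
    prompts={"query": "search_query: ", "passage": "search_document: "},
)
emb = model.encode(["example query"], prompt_name="query")  # "pass prompts"
```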

Fix

  • fix: allow kwargs in init for RerankingWrapper (#1676)

  • allow kwargs in init

  • fix retrieval

  • convert corpus_in_pair to list (f5962c6)
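Accepting **kwargs in __init__ lets callers pass model-specific options through the wrapper. A minimal sketch; the real RerankingWrapper does more than this:

```python
class RerankingWrapperSketch:
    """Accept arbitrary keyword arguments instead of a fixed signature,
    so callers can pass model-specific options through the wrapper."""

    def __init__(self, model, **kwargs):
        self.model = model
        self.kwargs = kwargs  # forwarded on every encode call

    def encode(self, texts, **encode_kwargs):
        return self.model.encode(texts, **{**self.kwargs, **encode_kwargs})
```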


1.27.0 (2025-01-08)

Feature

  • feat: reduce logging for load_results()
  • redacts missing subsets to avoid printing 100+ subsets
  • reduce to logging.info
  • removed splits that are commonly never evaluated on, and thus also the errors about them being missing

The second part removed quite a few warnings (from 4930 to XX).

It also seems like the splits were accidentally included in some of the MMTEB benchmarks.

This will remove those splits from those benchmarks (which are all in beta). We will have to recompute the tables for the paper, though (we should do that anyway).

Another potential thing to consider:

  • SciFact is included in MTEB(Medical). I have removed the "train" split from it, as I think that was a mistake (checked other datasets in the benchmark).

Here is a count of the current top errors:

```python
{
    "MassiveScenarioClassification: Missing splits {'validation'}": 238,  # included in e.g. mteb(fra)
    "MassiveIntentClassification: Missing splits {'validation'}": 237,  # included in e.g. mteb(fra)
    "MassiveScenarioClassification: Missing subsets {'af', 'da', ...} for split test": 230,
    "AmazonReviewsClassification: Missing splits {'validation'}": 229,  # included in e.g. mteb(deu)
    "MassiveIntentClassification: Missing subsets {'af', 'da', ...} for split test": 228,
    "STS22: Missing subsets {'fr-pl', 'de-en', ...} for split test": 223,
    "AmazonReviewsClassification: Missing subsets {'es', 'ja', ...} for split test": 196,
    "MTOPDomainClassification: Missing splits {'validation'}": 195,  # included in mteb(fra)
    "MTOPIntentClassification: Missing splits {'validation'}": 194,  # included in mteb(fra)
    "AmazonCounterfactualClassification: Missing splits {'validation'}": 189,  # included in mteb(deu)
    "MTOPDomainClassification: Missing subsets {'es', 'th', ...} for split test": 165,
    "STS17: Missing subsets {'en-ar', 'es-es', ...} for split test": 164,
    "MTOPIntentClassification: Missing subsets {'es', 'th', ...} for split test": 164,
    "AmazonCounterfactualClassification: Missing subsets {'de', 'ja', ...} for split test": 148,
}
```

([`7e16fa2`](https://github.com/embeddings-benchmark/mteb/commit/7e16fa2565b2058e12303a1feedbd0d4dea96a41))
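Such a tally can be produced by collecting the warning strings emitted while loading results and counting duplicates; a small self-contained sketch:

```python
from collections import Counter

warnings = [
    "MassiveScenarioClassification: Missing splits {'validation'}",
    "MassiveScenarioClassification: Missing splits {'validation'}",
    "STS22: Missing subsets {'fr-pl', 'de-en', ...} for split test",
]
for message, n in Counter(warnings).most_common():
    print(f"{message}: {n}")
```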


1.26.6 (2025-01-08)

Fix

  • fix: Added zero shot tag to benchmark (#1710)

  • Added method for determining whether a model is zero shot

  • Added .items() where intended

  • Added filtering functions for zero shot models

  • Added a zero-shot filtering button and an error message for when the table is empty

  • Ran linting

  • Fixed docstring linting error

  • is_zero_shot returns None when no training data is specified

  • Added zero-shot emoji column to leaderboard

  • Added explanation for zero shot column

  • Added soft and hard zero-shot buttons

  • Added training data annotations to 24 models from HuggingFace Hub (8702815)
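The soft/hard distinction hinges on models with no training-data annotation. A hedged sketch of the hard variant, which excludes unannotated models (a soft filter would keep them); the field name is an assumption:

```python
def filter_zero_shot(models: list[dict], benchmark_tasks: set[str]) -> list[dict]:
    """'Hard' zero-shot filter: drop models trained on any benchmark
    task, and also drop models whose training data is unannotated."""
    kept = []
    for m in models:
        training = m.get("training_datasets")  # assumed annotation field
        if training is None:
            continue  # unknown training data: not provably zero-shot
        if not set(training) & benchmark_tasks:
            kept.append(m)
    return kept
```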