You can install mteb simply using pip. For more on installation, please see the documentation.

```bash
pip install mteb
```

Below we present a simple use-case example. For more information, see the documentation.

```python
import mteb
from sentence_transformers import SentenceTransformer
# Select a model
model_name = "sentence-transformers/all-MiniLM-L6-v2"
model = mteb.get_model(model_name)  # if the model is not implemented in MTEB, this is equivalent to SentenceTransformer(model_name)
# Select tasks
tasks = mteb.get_tasks(tasks=["Banking77Classification.v2"])
# Evaluate
results = mteb.evaluate(model, tasks=tasks)
```

You can also run it using the CLI:

```bash
mteb run \
  -m sentence-transformers/all-MiniLM-L6-v2 \
  -t "Banking77Classification.v2" \
  --output-folder results
```

For more on how to use the CLI, check out the related documentation.
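Beyond selecting tasks by name, you can also filter by task properties. A minimal sketch, assuming the `languages` and `task_types` arguments of `mteb.get_tasks` (the filter values below are illustrative):

```python
import mteb

# Filter tasks by language and task type instead of naming them explicitly.
# The argument values here are illustrative assumptions; see the
# "Selecting tasks" documentation for the authoritative list of filters.
tasks = mteb.get_tasks(languages=["eng"], task_types=["Classification"])
print(len(tasks), "matching tasks")
```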
| Documentation | |
| --- | --- |
| 📈 Leaderboard | The interactive leaderboard of the benchmark |
| **Get Started** | |
| 🏃 Get Started | Overview of how to use mteb |
| 🤖 Defining Models | How to use existing models and define custom ones |
| 📋 Selecting Tasks | How to select tasks, benchmarks, splits, etc. |
| 🏭 Running Evaluation | How to run evaluations, including cache management, speeding up evaluations, etc. |
| 📊 Loading Results | How to load and work with existing model results (see the sketch below the table) |
| **Overview** | |
| 📋 Tasks | Overview of available tasks |
| 📐 Benchmarks | Overview of available benchmarks |
| 🤖 Models | Overview of available models |
| **Contributing** | |
| 🤖 Adding a Model | How to submit a model to MTEB and to the leaderboard |
| 👩‍💻 Adding a Dataset | How to add a new task/dataset to MTEB |
| 👩‍💻 Adding a Benchmark | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 Contributing | How to contribute to MTEB and set it up for development |
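As a taste of the "Loading Results" workflow referenced above, here is a minimal sketch, assuming `mteb.load_results` and its `models`/`tasks` filters behave as in recent versions of the library:

```python
import mteb

# Load precomputed results, filtered to one model and one task.
# `load_results` and its filter arguments are assumptions based on recent
# mteb versions; consult the "Loading Results" documentation for the exact API.
results = mteb.load_results(
    models=["sentence-transformers/all-MiniLM-L6-v2"],
    tasks=["Banking77Classification.v2"],
)
print(results)
```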
MTEB was introduced in "MTEB: Massive Text Embedding Benchmark", and heavily expanded in "MMTEB: Massive Multilingual Text Embedding Benchmark". When using mteb, we recommend that you cite both articles.
<details>
  <summary>BibTeX citation (click to unfold)</summary>

```bibtex
@article{muennighoff2022mteb,
author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Loïc and Reimers, Nils},
title = {MTEB: Massive Text Embedding Benchmark},
publisher = {arXiv},
journal={arXiv preprint arXiv:2210.07316},
year = {2022},
url = {https://arxiv.org/abs/2210.07316},
doi = {10.48550/ARXIV.2210.07316},
}
@article{enevoldsen2025mmtebmassivemultilingualtext,
title={MMTEB: Massive Multilingual Text Embedding Benchmark},
author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
publisher = {arXiv},
journal={arXiv preprint arXiv:2502.13595},
year={2025},
url={https://arxiv.org/abs/2502.13595},
doi = {10.48550/arXiv.2502.13595},
}
```

</details>

If you use any of the specific benchmarks, we also recommend that you cite the authors of both the benchmark and its tasks:

```python
benchmark = mteb.get_benchmark("MTEB(eng, v2)")
benchmark.citation # get citation for a specific benchmark
# You can also create a LaTeX table of the tasks for the appendix:
benchmark.tasks.to_latex()
```
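If you also want the citations of the individual tasks, one option is to read them from each task's metadata. A minimal sketch, assuming each task exposes a `metadata.bibtex_citation` field as in recent mteb versions:

```python
import mteb

benchmark = mteb.get_benchmark("MTEB(eng, v2)")

# Collect the BibTeX entry of every task in the benchmark; tasks without
# a citation are skipped. `metadata.bibtex_citation` is an assumption
# based on recent mteb versions.
citations = [
    task.metadata.bibtex_citation
    for task in benchmark.tasks
    if task.metadata.bibtex_citation
]
print("\n\n".join(citations))
```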