add is_cross_encoder #1869

Open · wants to merge 2 commits into main
Conversation

@Samoed (Collaborator) commented Jan 24, 2025

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Small step toward #1841. There are a lot of T5 and LLM models that are used for reranking tasks, but I'm not sure how to annotate them.

@sam-hey (Contributor) commented Jan 26, 2025

Suggestion: If I understand the intent of this PR correctly, the primary focus is to enable filtering on the leaderboard. Wouldn't it make more sense to add a list of model types as an Enum? This way, it would also allow filtering for specific categories like Late Interaction models or other types that might be added in the future.
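
For illustration, a minimal sketch of what such an enum could look like (the names are purely hypothetical, not part of this PR):

```python
from enum import Enum


class ModelType(str, Enum):
    """Hypothetical model-type enum for leaderboard filtering."""

    BI_ENCODER = "bi_encoder"
    CROSS_ENCODER = "cross_encoder"
    LATE_INTERACTION = "late_interaction"  # e.g. ColBERT
    SPARSE = "sparse"  # e.g. BM25, SPLADE
```

Adding a new category would then be a one-line change to the enum rather than another boolean field on ModelMeta.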

@x-tabdeveloping (Collaborator) left a comment

Thanks for looking into this. I think setting False as a default would make more sense, since most models are not cross-encoders, but I'd love to hear your arguments about this.

@@ -103,6 +104,7 @@ class ModelMeta(BaseModel):
     training_datasets: dict[str, list[str]] | None
     adapted_from: str | None = None
     superseded_by: str | None = None
+    is_cross_encoder: bool | None = None
Collaborator:

Shouldn't the default be False? Doesn't None signal that we don't know?

@Samoed (Collaborator, Author) commented Jan 27, 2025

Yes, None means "we don't know". I think we should only specify True or False if someone has explicitly annotated the model, because in other cases it might be misleading.
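
To illustrate the three-state semantics, a hypothetical leaderboard filter (not code from this PR) could treat None separately:

```python
def filter_cross_encoders(models, include_unknown=False):
    # True  -> annotated as a cross-encoder
    # False -> annotated as not a cross-encoder
    # None  -> nobody has annotated the model yet
    return [
        m
        for m in models
        if m.is_cross_encoder is True
        or (include_unknown and m.is_cross_encoder is None)
    ]
```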

@@ -591,6 +601,7 @@ def get_prediction_tokens(self, *args, **kwargs):
     use_instructions=None,
     training_datasets=None,
     framework=["PyTorch"],
+    is_cross_encoder=None,
Collaborator:

Why is it None? Do we not know whether it is a cross-encoder or not?

@Samoed (Collaborator, Author):

This is an LLM (Mistral), and because it is not truly a cross-encoder I marked it as None (like the other LLM and T5 models).

Contributor:

I think I added this as a cross-encoder. I used it like the MonoT5 reranker but zero-shot (no retrieval training).

It's actually not bad zero-shot, although of course worse than if trained.

I'd add it as a cross-encoder in this setting, since it's doing full attention between query and doc. But I agree it's a weird case.
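
For context, this is roughly what MonoT5-style scoring looks like (a minimal sketch with a plain t5-base checkpoint, i.e. the zero-shot setting described above; monoT5 fine-tunes this same setup):

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def relevance(query: str, doc: str) -> float:
    # Query and document share one input sequence, so the encoder applies
    # full attention between them -- the cross-encoder property.
    prompt = f"Query: {query} Document: {doc} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    start = torch.tensor([[model.config.decoder_start_token_id]])
    logits = model(**inputs, decoder_input_ids=start).logits[0, -1]
    true_id = tokenizer.encode("true")[0]
    false_id = tokenizer.encode("false")[0]
    # Probability of "true" vs "false" as the relevance score.
    return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()
```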

@x-tabdeveloping (Collaborator) commented

@sam-hey Yeah, I think that would probably make sense (an enum or Literal[...]).
What categories would you then choose, though?

@sam-hey (Contributor) commented Jan 27, 2025

> Thanks for raising @sam-hey!
>
> I can definitely see the benefit! On the other hand, having it standardized makes it so each model class has the same function and is more reliable that way.
>
> I can see both sides, but personally I think I would prefer to keep the core search functions in MTEB, so users can see them there and assume each model searches the same within their own "class" (e.g. that all dense retrievers use the same base functionality). I think it'd be great if we made BM25 a first-class MTEB model so we didn't have to rely on that (and could also add other sparse non-neural versions like Pyserini).
>
> OTOH, there are probably 3-ish other model "classes" or types that would involve different search functionality: multi-vector (like ColBERT, as you say), and then perhaps neural sparse retrieval (like SPLADE) and generative retrieval.
>
> So we should definitely make it so that each of these could be added, which as @KennethEnevoldsen says likely involves a change to the interface. But since there are fewer than 10 model "classes", it seems like we could do that with an if statement. But perhaps it's too early in the morning and I'm missing something!

Originally posted by @orionw in #1826

I believe the categories would be more or less the same, @x-tabdeveloping.

@Samoed (Collaborator, Author) commented Jan 27, 2025

I think we can add these model types (rough sketch after the list):

  • Encoder
  • SparseEncoder (ColBERT, BM25)
  • BiEncoder
  • CrossEncoder
  • T5 (rename)
  • LLM
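
As a rough sketch of the Literal variant @x-tabdeveloping mentioned (the string names are placeholders, open to discussion):

```python
from typing import Literal

# Hypothetical annotation mirroring the list above.
ModelType = Literal[
    "encoder",
    "sparse_encoder",  # ColBERT, BM25
    "bi_encoder",
    "cross_encoder",
    "t5",              # to be renamed
    "llm",
]
```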

@orionw (Contributor) commented Jan 27, 2025

Catching up a bit, but are these supposed to be model "classes" or "types" or just general descriptions? This is for the leaderboard, so I could see either way.

If model "classes", it should probably be each type that has a unique way of searching (and correspond to searching code in the repo). e.g. for encoders it's a dense encode followed by some similarity func. For cross-encoders it's prediction over all (query, doc) pairs. For sparse models, it's an index-building step followed by a search step. And so on. If this is the case I think we can simplify that list a bit and exclude LLMs since they're cross-encoders.

If we're adding general descriptors then I think it makes sense to add fine-grained labels like LLM (zero-shot), Cross-Encoder (which implies fine-tuning), Bi-Encoder, Late-Interaction, etc.
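
To make the "unique way of searching" distinction concrete, here is a hedged sketch of the three flows described above (all model methods are hypothetical placeholders, not MTEB's actual interface):

```python
import numpy as np

def rank(model, query: str, docs: list[str], model_class: str) -> np.ndarray:
    """Return doc indices, best first; dispatch over hypothetical model classes."""
    if model_class == "encoder":
        # Dense encode, then a similarity function over the embeddings.
        q, d = model.encode([query]), model.encode(docs)
        return np.argsort(q @ d.T)[0][::-1]
    if model_class == "cross_encoder":
        # One prediction per (query, doc) pair, full attention inside the model.
        scores = model.predict([(query, doc) for doc in docs])
        return np.argsort(scores)[::-1]
    if model_class == "sparse":
        # Index-building step first, then a search step against the index.
        index = model.build_index(docs)
        return index.search(query)
    raise ValueError(f"unknown model class: {model_class}")
```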

@Samoed (Collaborator, Author) commented Jan 27, 2025

These names are only for descriptions and the leaderboard.

@Samoed (Collaborator, Author) commented Jan 31, 2025

@x-tabdeveloping What do you think we should do?
