[v2] Refactor task specific scores #2016

Open · Tracked by #1791
Samoed opened this issue Feb 7, 2025 · 0 comments
Samoed commented Feb 7, 2025

TASK_TO_HF_DATASET = {
    "Core17InstructionRetrieval": ("jhu-clsp/core17-instructions-mteb", False),
    "Robust04InstructionRetrieval": ("jhu-clsp/robust04-instructions-mteb", False),
    "News21InstructionRetrieval": ("jhu-clsp/news21-instructions-mteb", False),
    "mFollowIR": ("jhu-clsp/mfollowir-parquet-mteb", True),
    "mFollowIRCrossLingual": (
        "jhu-clsp/mfollowir-cross-lingual-parquet-mteb",
        True,
    ),
}
hf_path, is_multilingual = TASK_TO_HF_DATASET[task_name]
if is_multilingual:
    # figure out which of the languages this is: ["zho", "rus", "fas"]
    # gather the changed_qrels for each, and store the keys as a check
    for lang in ["zho", "rus", "fas"]:
        config_name = f"qrel_diff-{lang}"
        changed_qrels = {
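
The lookup above hardcodes, inside the retrieval evaluator, which HF dataset holds the changed qrels for each instruction-following task. One possible direction for the refactor is to let each task declare its own qrel-diff source. The sketch below is hypothetical: QrelDiffSource and the idea of hanging it off the task are assumptions, not existing mteb API.

from dataclasses import dataclass


@dataclass(frozen=True)
class QrelDiffSource:
    # Where the "changed qrels" for a task live and which configs to load.
    hf_path: str
    is_multilingual: bool = False
    languages: tuple[str, ...] = ()

    def config_names(self) -> list[str]:
        # Multilingual tasks ship one qrel_diff config per language.
        if self.is_multilingual:
            return [f"qrel_diff-{lang}" for lang in self.languages]
        return ["qrel_diff"]


# Declared on the task (or its metadata) rather than in a module-level dict:
CORE17_SOURCE = QrelDiffSource("jhu-clsp/core17-instructions-mteb")
MFOLLOWIR_SOURCE = QrelDiffSource(
    "jhu-clsp/mfollowir-parquet-mteb",
    is_multilingual=True,
    languages=("zho", "rus", "fas"),
)

print(MFOLLOWIR_SOURCE.config_names())
# ['qrel_diff-zho', 'qrel_diff-rus', 'qrel_diff-fas']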

task_scores = {}
if task_name in ["NevIR"]:
    paired_score = paired_accuracy(qrels, results, scores)
    task_scores["paired_accuracy"] = paired_score
if task_name in ["InstructIR"]:
    robustness_at_10_score = robustness_at_10(qrels, results, scores)
    task_scores["robustness_at_10"] = robustness_at_10_score
if task_name in [
    "mFollowIR",
    "mFollowIRCrossLingual",
    "Robust04InstructionRetrieval",
    "Core17InstructionRetrieval",
    "News21InstructionRetrieval",
]:
    p_mrr_and_consolidated_scores = evaluate_p_mrr_change(
        results, qrels, task_name, k_values
    )
    task_scores.update(p_mrr_and_consolidated_scores)
if task_name in ["MindSmallReranking"]:
    take_max_over_subqueries = max_over_subqueries(qrels, results, k_values)
    task_scores.update(take_max_over_subqueries)
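
The task-name checks above could likewise be replaced by a registry mapping task names to scoring callables, so the evaluator itself no longer needs to know individual tasks. The sketch below is hypothetical: register_scorer, TASK_SPECIFIC_SCORERS and compute_task_specific_scores are illustrative names, and it assumes the existing helpers shown above (paired_accuracy, evaluate_p_mrr_change, ...) are importable.

from typing import Callable

# Hypothetical registry; the names here are illustrative, not existing mteb API.
TASK_SPECIFIC_SCORERS: dict[str, Callable[..., dict[str, float]]] = {}


def register_scorer(*task_names: str):
    """Register a scoring callable for one or more task names."""
    def decorator(fn):
        for name in task_names:
            TASK_SPECIFIC_SCORERS[name] = fn
        return fn
    return decorator


@register_scorer("NevIR")
def _nevir_scores(qrels, results, scores, **kwargs):
    # paired_accuracy is the existing helper called in the snippet above
    return {"paired_accuracy": paired_accuracy(qrels, results, scores)}


@register_scorer(
    "mFollowIR",
    "mFollowIRCrossLingual",
    "Robust04InstructionRetrieval",
    "Core17InstructionRetrieval",
    "News21InstructionRetrieval",
)
def _p_mrr_scores(qrels, results, scores, *, task_name, k_values, **kwargs):
    # evaluate_p_mrr_change is the existing helper called in the snippet above
    return evaluate_p_mrr_change(results, qrels, task_name, k_values)


def compute_task_specific_scores(task_name, qrels, results, scores, **kwargs):
    scorer = TASK_SPECIFIC_SCORERS.get(task_name)
    if scorer is None:
        return {}
    return scorer(qrels, results, scores, task_name=task_name, **kwargs)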

Samoed added the `v2` label on Feb 7, 2025
Samoed added this to the v2.0.0 milestone on Feb 7, 2025