    Repositories list

    • Evaluate your LLM's response with Prometheus and GPT-4 💯
      Python
      60979121 Updated Apr 25, 2025
    • Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
      01210 Updated Mar 25, 2025
    • [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically designed for fine-grained evaluation on a customized score rubric, Prometheus-Vision is a good alternative to human evaluation and GPT-4V evaluation.
      Python
      77430 Updated Sep 13, 2024
    • .github (Public)
      Organization README for prometheus-eval
      0000 Updated Jun 11, 2024
    • BiGGen-Bench Leaderboard
      Python
      0000 Updated Jun 4, 2024
    • Documentation and blog posts for Prometheus
      1000 Updated May 1, 2024
    • [ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically designed for fine-grained evaluation on a customized score rubric, Prometheus is a good alternative to human evaluation and GPT-4 evaluation.
      Python
      1830340 Updated Nov 11, 2023