    Repositories list

    • Landing page + leaderboard for the SWE-bench benchmark
      HTML
      5311 Updated Mar 31, 2025
    • Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.
      Shell
      16515858 Updated Mar 31, 2025
    • SWE-bench

      Public
      SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?
      Python
      MIT License
      4622.7k305 Updated Mar 28, 2025
    • sb-cli

      Public
      Run SWE-bench evaluations remotely
      Python
      MIT License
      0830 Updated Mar 7, 2025
    • .github

      Public
      0000 Updated Feb 25, 2025
    • Evaluation data + results for SWE-agent inference on HumanEvalFix task
      Jupyter Notebook
      0000 Updated Jul 11, 2024