
Unable to use reasoning models using oaieval #1580

Open
kedhar-kb-ta opened this issue Mar 5, 2025 · 0 comments
Labels: bug (Something isn't working)

kedhar-kb-ta commented Mar 5, 2025

I have written an eval using the Model-Graded Classification template. Below are my eval YAML and modelgraded_spec:

custom_eval:
  id: custom_eval.dev.match-v1
  metrics: [accuracy]
  description: 

custom_eval.dev.match-v1:
  class: evals.elsuite.modelgraded.classify:ModelBasedClassify
  args:
    samples_jsonl: inputs.jsonl
    eval_type: classify
    modelgraded_spec: entity_identification
entity_identification:
  prompt: |-
    You are comparing a submitted answer to an expert answer on a given question. Here is the data:
    [BEGIN DATA]
    ************
    [Question]: {input}
    ************
    [Expert]: {ideal}
    ************
    [Submission]: {completion}
    ************
    [END DATA]

    Compare the factual content of the submitted answer with the expert answer. Ignore any differences in style, grammar, or punctuation.
    The submitted answer may either be a subset or superset of the expert answer, or it may conflict with it. Determine which case applies. Answer the question by selecting one of the following options:
    (A) The submitted answer contains all the same details as the expert answer.
    (B) The answers differ, but these differences don't matter from the perspective of factuality.
  choice_strings: AB
  input_outputs:
    input: completion
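
For context, each line of inputs.jsonl follows the usual model-graded sample shape, with an input chat prompt and an ideal reference answer (the content below is invented for illustration):

{"input": [{"role": "user", "content": "Which company is mentioned in the sentence 'Acme shipped the parts on Friday'?"}], "ideal": "Acme"}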

I run the eval with:

oaieval --registry_path=evals/registry/ gpt-4o-mini custom_eval.dev.match-v1

When running the eval using a reasoning model, I encounter the following error:
openai.NotFoundError: Error code: 404 - {'error': {'message': 'This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?', 'type': 'invalid_request_error', 'param': 'model', 'code': None}}
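
For what it's worth, the 404 is reproducible outside of oaieval: chat-only models, including the o-series reasoning models, reject the legacy v1/completions endpoint, so the traceback suggests oaieval is routing the model through the completions API instead of chat completions. A minimal sketch with the official openai Python client (assumes OPENAI_API_KEY is set; o3-mini is just an example reasoning model):

import openai

client = openai.OpenAI()

try:
    # Chat-only models reject the legacy completions endpoint.
    client.completions.create(model="o3-mini", prompt="Say hello.")
except openai.NotFoundError as exc:
    print(exc)  # "This is a chat model and not supported in the v1/completions endpoint..."

# The same model works on the chat completions endpoint:
resp = client.chat.completions.create(
    model="o3-mini",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)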

To Reproduce

oaieval --registry_path=evals/registry/ gpt-4o-mini "any eval template"
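
My guess at the cause (an assumption from the error message, not confirmed against the evals source): oaieval decides between the chat and legacy completions endpoints from a fixed list of known chat model names, and newer model names fall through to the v1/completions path. A hypothetical sketch of that kind of dispatch; the prefix list is invented for illustration:

# Hypothetical name-based endpoint dispatch; the prefixes are placeholders,
# not the actual contents of the evals registry.
CHAT_MODEL_PREFIXES = ("gpt-3.5-turbo", "gpt-4")

def is_chat_model(model_name: str) -> bool:
    # Any name that matches no known prefix would fall through to v1/completions.
    return model_name.startswith(CHAT_MODEL_PREFIXES)

print(is_chat_model("gpt-4-turbo"))  # True  -> v1/chat/completions
print(is_chat_model("o3-mini"))      # False -> v1/completions -> the 404 above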

Code snippets

OS

Ubuntu 22.04

Python version

3.12

Library version

oaieval==1.0.6

kedhar-kb-ta added the bug label on Mar 5, 2025