I have tried asking for help in the community on discord or discussions and have not received a response.
I have tried searching the documentation and have not found an answer.
What Model are you using?
gpt-3.5-turbo
gpt-4-turbo
gpt-4
Other (please specify) Gemini
Describe the bug
Gemini function calling doesn't preserve the key order of the schema you provide, so a field such as the final answer can be generated before the reasoning field it is supposed to follow. This significantly reduces performance on tasks that depend on chain-of-thought reasoning.
For example, for a sample of 200 questions from GSM8K, you get this performance difference:
GEMINI_TOOL - Mean: 39.00% CI: 32.22% - 45.78%
GEMINI_JSON - Mean: 94.50% CI: 91.33% - 97.67%
You can verify this by extracting the text generated in _raw_response.
To Reproduce
Here's a notebook you can use to reproduce the result: https://github.com/dylanjcastillo/blog/blob/main/_extras/gemini-structured-outputs/gemini-structured-outputs-benchmarks-instructor.ipynb
I wrote a more detailed analysis here: http://dylancastillo.co/posts/gemini-structured-outputs.html
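For a quick check of the key order without running the full benchmark, a minimal sketch like the one below could be applied to the raw generated text. It assumes the raw output is a single JSON object, and the field names reasoning and answer are hypothetical placeholders for whatever the schema defines. How the text is pulled out of _raw_response depends on the client and is not shown here.

```python
import json


def key_order(raw_text: str) -> list:
    """Return the top-level keys of a JSON object in the order they appear in the text."""
    # json.loads builds a dict, and dicts keep insertion order (Python 3.7+),
    # so the key order reflects the order in which the model emitted the fields.
    return list(json.loads(raw_text).keys())


def reasoning_comes_first(
    raw_text: str,
    reasoning_key: str = "reasoning",  # hypothetical field name
    answer_key: str = "answer",        # hypothetical field name
) -> bool:
    """True if the reasoning field was generated before the answer field."""
    keys = key_order(raw_text)
    return keys.index(reasoning_key) < keys.index(answer_key)


# Hand-written examples standing in for raw model output (not real responses).
cot_first = '{"reasoning": "3 + 4 = 7", "answer": 7}'
answer_first = '{"answer": 7, "reasoning": "3 + 4 = 7"}'

print(key_order(cot_first), reasoning_comes_first(cot_first))
print(key_order(answer_first), reasoning_comes_first(answer_first))
```

If the second pattern shows up in the tool-calling output, the model has committed to an answer before writing any reasoning, which would be consistent with the accuracy gap reported above.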
Expected behavior
Given that this is a fairly typical use case, I'd suggest making GEMINI_JSON the default approach and warning users about this issue.
Screenshots
If applicable, add screenshots to help explain your problem.