Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for the GenAI Sdk #1393

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Adding support for the GenAI Sdk #1393

wants to merge 7 commits into from

Conversation

ivanleomk
Copy link
Collaborator

@ivanleomk ivanleomk commented Mar 11, 2025

This PR adds support for the GenAI SDK with instructor through a new from_genai method.

Will stack 1-2 more PRs on this to support jinja templating and streaming


Important

Adds support for GenAI SDK with a new from_genai method, handling GENAI_TOOLS mode, and includes tests and dependency updates.

  • Behavior:
    • Adds from_genai method in client_genai.py to support GenAI SDK with AsyncInstructor and Instructor based on use_async flag.
    • Supports GENAI_TOOLS mode in function_calls.py, process_response.py, and reask.py.
  • Modes:
    • Adds GENAI_TOOLS to Mode enum in mode.py.
  • Utilities:
    • Adds extract_genai_system_message and convert_to_genai_messages in utils.py for message handling.
  • Dependencies:
    • Adds google-genai to pyproject.toml.
  • Tests:
    • Adds tests for GenAI SDK in tests/llm/test_genai/ directory.

This description was created by Ellipsis for 4c28430. It will automatically update as commits are pushed.

Copy link

cloudflare-workers-and-pages bot commented Mar 11, 2025

Deploying instructor-py with  Cloudflare Pages  Cloudflare Pages

Latest commit: 4c28430
Status: ✅  Deploy successful!
Preview URL: https://ef1f22aa.instructor-py.pages.dev
Branch Preview URL: https://add-supp-genai-sdk.instructor-py.pages.dev

View logs

@github-actions github-actions bot added dependencies Pull requests that update a dependency file enhancement New feature or request python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files. labels Mar 11, 2025
@ivanleomk
Copy link
Collaborator Author

Vertex tests are ok

tests/llm/test_vertexai/test_format.py ....                                                  [ 14%]
tests/llm/test_vertexai/test_message_parser.py ....                                          [ 28%]
tests/llm/test_vertexai/test_modes.py ......                                                 [ 50%]
tests/llm/test_vertexai/test_retries.py ....                                                 [ 64%]
tests/llm/test_vertexai/test_simple_types.py ......                                          [ 85%]
tests/llm/test_vertexai/test_stream.py ....                                                  [100%]

Google.Generative-AI sdk is ok too

tests/llm/test_gemini/evals/test_classification_enums.py ..........                          [ 15%]
tests/llm/test_gemini/evals/test_classification_literals.py ..........                       [ 31%]
tests/llm/test_gemini/evals/test_entities.py ..                                              [ 34%]
tests/llm/test_gemini/evals/test_extract_users.py ......                                     [ 44%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ......                                [ 53%]
tests/llm/test_gemini/test_format.py ....                                                    [ 60%]
tests/llm/test_gemini/test_list_content.py .                                                 [ 61%]
tests/llm/test_gemini/test_modes.py ....                                                     [ 68%]
tests/llm/test_gemini/test_multimodal_content.py ..                                          [ 71%]
tests/llm/test_gemini/test_patch.py ....                                                     [ 77%]
tests/llm/test_gemini/test_retries.py ....                                                   [ 84%]
tests/llm/test_gemini/test_roles.py .                                                        [ 85%]
tests/llm/test_gemini/test_simple_types.py ...                                               [ 90%]
tests/llm/test_gemini/test_stream.py ......                                                  [100%]

@ivanleomk ivanleomk marked this pull request as ready for review March 11, 2025 21:42
@ivanleomk ivanleomk requested a review from jxnl March 11, 2025 21:42
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 4c28430 in 3 minutes and 37 seconds

More details
  • Looked at 1189 lines of code in 14 files
  • Skipped 0 files when reviewing.
  • Skipped posting 13 drafted comments based on config settings.
1. instructor/process_response.py:523
  • Draft comment:
    The use of map_to_gemini_function_schema in handle_genai_tools may be confusing. Consider defining a dedicated mapper (e.g., map_to_genai_function_schema) if the schema for GENAI differs from GEMINI.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85%
    None
2. instructor/utils.py:889
  • Draft comment:
    When concatenating system messages in extract_genai_system_message, consider stripping extra trailing newlines to avoid unintended formatting in the final system message.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 85%
    None
3. instructor/client_genai.py:21
  • Draft comment:
    Consider using an f-string or a clearer error message in the assert checking the 'mode' value so that the allowed value is properly interpolated.
  • Reason this comment was not posted:
    Marked as duplicate.
4. instructor/function_calls.py:253
  • Draft comment:
    Add a custom error message to the isinstance assert for 'completion' to enhance debuggability if the type check fails.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. instructor/utils.py:888
  • Draft comment:
    Assertions in convert_to_genai_messages lack descriptive error messages. Add custom messages for checks like 'assert "role" in message' and 'assert "content" in message' to make failures clearer.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. instructor/reask.py:370
  • Draft comment:
    Before directly accessing nested elements like response.candidates[0].content.parts[0], consider validating the structure to prevent unexpected IndexError in edge cases.
  • Reason this comment was not posted:
    Confidence changes required: 66% <= threshold 85%
    None
7. pyproject.toml:71
  • Draft comment:
    Duplicate dependency 'google-genai' is declared in multiple sections. Please remove the redundancy to avoid confusion.
  • Reason this comment was not posted:
    Comment was on unchanged code.
8. instructor/function_calls.py:107
  • Draft comment:
    Minor typographical error in the docstring: "Its important to add a docstring..." should be "It's important to add a docstring..." to maintain proper grammar.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. instructor/process_response.py:107
  • Draft comment:
    Typo: The debug log says 'Returning takes from IterableBase', but it should probably read 'Returning tasks from IterableBase' since the code iterates over model.tasks.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. instructor/process_response.py:381
  • Draft comment:
    Typo: The variable name 'implict_forced_tool_message' appears to be misspelled. Consider renaming it to 'implicit_forced_tool_message' for clarity.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. instructor/reask.py:27
  • Draft comment:
    Typo in the assertion message: 'Response must be a Anthropic Message' should be corrected to 'Response must be an Anthropic Message'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. instructor/utils.py:899
  • Draft comment:
    Typo found in the docstring: 'explicit system messsage' should be corrected to 'explicit system message'.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 85%
    The comment is about a real typo in new code that was added in this PR. While docstring typos are minor, they are still worth fixing for code quality. The comment is clear, actionable, and even provides a suggestion for the fix. The typo is in newly added code, not existing code.
    Docstring typos are very minor issues. Is this really important enough to flag in a code review?
    While minor, docstrings are part of the code's documentation and should be professional. The fix is trivial and will improve code quality. Since this is new code being added, now is the right time to fix it.
    Keep the comment. The typo is in new code, the fix is clear and simple, and maintaining quality documentation is important.
13. pyproject.toml:74
  • Draft comment:
    The dependency name 'google-genai' on line 74 seems inconsistent with 'google-generativeai' used elsewhere in the file. Please verify if this is a typographical error and correct it for consistency.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 85%
    The comment points out a real inconsistency. The same package appears to be referenced in two different ways: 'google-generativeai' and 'google-genai'. These are actually different packages - google-generativeai is the official Google package name, while google-genai appears to be different. This could cause confusion or issues. The comment is about a changed line and raises a valid concern.
    I might be wrong about these being the same package - they could be two different packages with different purposes. Without checking the actual packages on PyPI, I can't be 100% certain.
    Even if they are different packages, having such similar names in the same project is confusing and worth bringing up. The comment appropriately asks for verification rather than assuming they should be the same.
    Keep the comment as it raises a valid concern about potentially confusing dependency names that needs clarification.

Workflow ID: wflow_0dRIzlqwTmTrlhZV


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@@ -106,3 +106,8 @@
from .client_perplexity import from_perplexity

__all__ += ["from_perplexity"]

if importlib.util.find_spec("google") is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider checking for 'google.genai' via importlib.util.find_spec('google.genai') instead of 'google' to ensure the correct dependency is detected.

Suggested change
if importlib.util.find_spec("google") is not None:
if importlib.util.find_spec("google.genai") is not None:

) -> instructor.Instructor | instructor.AsyncInstructor:
assert mode in {
instructor.Mode.GENAI_TOOLS,
}, "Mode must be one of {instructor.Mode.GENAI_TOOLS}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve the error message for the mode assertion by using an f-string to dynamically include allowed mode(s) instead of a hardcoded literal.

Suggested change
}, "Mode must be one of {instructor.Mode.GENAI_TOOLS}"
}, f"Mode must be one of {instructor.Mode.GENAI_TOOLS}"

@overload
def from_genai(
client: Client,
mode: instructor.Mode = instructor.Mode.GEMINI_JSON,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent default mode values: the async overload uses instructor.Mode.GEMINI_JSON while the sync overload and implementation default to instructor.Mode.GENAI_TOOLS. Please align these defaults or document the difference.

Suggested change
mode: instructor.Mode = instructor.Mode.GEMINI_JSON,
mode: instructor.Mode = instructor.Mode.GENAI_TOOLS,

@@ -135,3 +136,4 @@ cohere = ["cohere<6.0.0,>=5.1.8"]
cerebras_cloud_sdk = ["cerebras-cloud-sdk<2.0.0,>=1.5.0"]
fireworks-ai = ["fireworks-ai<1.0.0,>=0.15.4"]
writer = ["writer-sdk<2.0.0,>=1.2.0"]
google-genai=["google-genai>=1.5.0"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency key google-genai on line 139 appears to be inconsistent with the google-generativeai key used in the [project.optional-dependencies] section. Please check and correct if this is a typo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file enhancement New feature or request python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant