Adding support for the GenAI Sdk #1393

ivanleomk · 2025-03-11T19:45:19Z

This PR adds support for the GenAI SDK with instructor through a new from_genai method.

Will stack 1-2 more PRs on this to support jinja templating and streaming

Important

Adds support for GenAI SDK with a new from_genai method, handling GENAI_TOOLS mode, and includes tests and dependency updates.

Behavior:
- Adds from_genai method in client_genai.py to support GenAI SDK with AsyncInstructor and Instructor based on use_async flag.
- Supports GENAI_TOOLS mode in function_calls.py, process_response.py, and reask.py.
Modes:
- Adds GENAI_TOOLS to Mode enum in mode.py.
Utilities:
- Adds extract_genai_system_message and convert_to_genai_messages in utils.py for message handling.
Dependencies:
- Adds google-genai to pyproject.toml.
Tests:
- Adds tests for GenAI SDK in tests/llm/test_genai/ directory.

^{This description was created by}^{for 4c28430. It will automatically update as commits are pushed.}

cloudflare-workers-and-pages · 2025-03-11T19:46:06Z

Deploying instructor-py with Cloudflare Pages

Latest commit:	`4c28430`
Status:	✅ Deploy successful!
Preview URL:	https://ef1f22aa.instructor-py.pages.dev
Branch Preview URL:	https://add-supp-genai-sdk.instructor-py.pages.dev

View logs

ivanleomk · 2025-03-11T20:04:22Z

Vertex tests are ok

tests/llm/test_vertexai/test_format.py ....                                                  [ 14%]
tests/llm/test_vertexai/test_message_parser.py ....                                          [ 28%]
tests/llm/test_vertexai/test_modes.py ......                                                 [ 50%]
tests/llm/test_vertexai/test_retries.py ....                                                 [ 64%]
tests/llm/test_vertexai/test_simple_types.py ......                                          [ 85%]
tests/llm/test_vertexai/test_stream.py ....                                                  [100%]

Google.Generative-AI sdk is ok too

tests/llm/test_gemini/evals/test_classification_enums.py ..........                          [ 15%]
tests/llm/test_gemini/evals/test_classification_literals.py ..........                       [ 31%]
tests/llm/test_gemini/evals/test_entities.py ..                                              [ 34%]
tests/llm/test_gemini/evals/test_extract_users.py ......                                     [ 44%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ......                                [ 53%]
tests/llm/test_gemini/test_format.py ....                                                    [ 60%]
tests/llm/test_gemini/test_list_content.py .                                                 [ 61%]
tests/llm/test_gemini/test_modes.py ....                                                     [ 68%]
tests/llm/test_gemini/test_multimodal_content.py ..                                          [ 71%]
tests/llm/test_gemini/test_patch.py ....                                                     [ 77%]
tests/llm/test_gemini/test_retries.py ....                                                   [ 84%]
tests/llm/test_gemini/test_roles.py .                                                        [ 85%]
tests/llm/test_gemini/test_simple_types.py ...                                               [ 90%]
tests/llm/test_gemini/test_stream.py ......                                                  [100%]

ellipsis-dev

❌ Changes requested. Reviewed everything up to 4c28430 in 3 minutes and 37 seconds

More details

Looked at 1189 lines of code in 14 files
Skipped 0 files when reviewing.
Skipped posting 13 drafted comments based on config settings.

1. instructor/process_response.py:523

Draft comment:
The use of map_to_gemini_function_schema in handle_genai_tools may be confusing. Consider defining a dedicated mapper (e.g., map_to_genai_function_schema) if the schema for GENAI differs from GEMINI.
Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85%
None

2. instructor/utils.py:889

Draft comment:
When concatenating system messages in extract_genai_system_message, consider stripping extra trailing newlines to avoid unintended formatting in the final system message.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 85%
None

3. instructor/client_genai.py:21

Draft comment:
Consider using an f-string or a clearer error message in the assert checking the 'mode' value so that the allowed value is properly interpolated.
Reason this comment was not posted:
Marked as duplicate.

4. instructor/function_calls.py:253

Draft comment:
Add a custom error message to the isinstance assert for 'completion' to enhance debuggability if the type check fails.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

5. instructor/utils.py:888

Draft comment:
Assertions in convert_to_genai_messages lack descriptive error messages. Add custom messages for checks like 'assert "role" in message' and 'assert "content" in message' to make failures clearer.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

6. instructor/reask.py:370

Draft comment:
Before directly accessing nested elements like response.candidates[0].content.parts[0], consider validating the structure to prevent unexpected IndexError in edge cases.
Reason this comment was not posted:
Confidence changes required: 66% <= threshold 85%
None

7. pyproject.toml:71

Draft comment:
Duplicate dependency 'google-genai' is declared in multiple sections. Please remove the redundancy to avoid confusion.
Reason this comment was not posted:
Comment was on unchanged code.

8. instructor/function_calls.py:107

Draft comment:
Minor typographical error in the docstring: "Its important to add a docstring..." should be "It's important to add a docstring..." to maintain proper grammar.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

9. instructor/process_response.py:107

Draft comment:
Typo: The debug log says 'Returning takes from IterableBase', but it should probably read 'Returning tasks from IterableBase' since the code iterates over model.tasks.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

10. instructor/process_response.py:381

Draft comment:
Typo: The variable name 'implict_forced_tool_message' appears to be misspelled. Consider renaming it to 'implicit_forced_tool_message' for clarity.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

11. instructor/reask.py:27

Draft comment:
Typo in the assertion message: 'Response must be a Anthropic Message' should be corrected to 'Response must be an Anthropic Message'.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

12. instructor/utils.py:899

Draft comment:
Typo found in the docstring: 'explicit system messsage' should be corrected to 'explicit system message'.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 85%
The comment is about a real typo in new code that was added in this PR. While docstring typos are minor, they are still worth fixing for code quality. The comment is clear, actionable, and even provides a suggestion for the fix. The typo is in newly added code, not existing code.
Docstring typos are very minor issues. Is this really important enough to flag in a code review?
While minor, docstrings are part of the code's documentation and should be professional. The fix is trivial and will improve code quality. Since this is new code being added, now is the right time to fix it.
Keep the comment. The typo is in new code, the fix is clear and simple, and maintaining quality documentation is important.

13. pyproject.toml:74

Draft comment:
The dependency name 'google-genai' on line 74 seems inconsistent with 'google-generativeai' used elsewhere in the file. Please verify if this is a typographical error and correct it for consistency.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 85%
The comment points out a real inconsistency. The same package appears to be referenced in two different ways: 'google-generativeai' and 'google-genai'. These are actually different packages - google-generativeai is the official Google package name, while google-genai appears to be different. This could cause confusion or issues. The comment is about a changed line and raises a valid concern.
I might be wrong about these being the same package - they could be two different packages with different purposes. Without checking the actual packages on PyPI, I can't be 100% certain.
Even if they are different packages, having such similar names in the same project is confusing and worth bringing up. The comment appropriately asks for verification rather than assuming they should be the same.
Keep the comment as it raises a valid concern about potentially confusing dependency names that needs clarification.

Workflow ID: wflow_0dRIzlqwTmTrlhZV

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2025-03-11T21:46:35Z

instructor/__init__.py

@@ -106,3 +106,8 @@
    from .client_perplexity import from_perplexity

    __all__ += ["from_perplexity"]
+
+if importlib.util.find_spec("google") is not None:


Consider checking for 'google.genai' via importlib.util.find_spec('google.genai') instead of 'google' to ensure the correct dependency is detected.

Suggested change

if importlib.util.find_spec("google") is not None:

if importlib.util.find_spec("google.genai") is not None:

ellipsis-dev · 2025-03-11T21:46:35Z

instructor/client_genai.py

+) -> instructor.Instructor | instructor.AsyncInstructor:
+    assert mode in {
+        instructor.Mode.GENAI_TOOLS,
+    }, "Mode must be one of {instructor.Mode.GENAI_TOOLS}"


Improve the error message for the mode assertion by using an f-string to dynamically include allowed mode(s) instead of a hardcoded literal.

Suggested change

}, "Mode must be one of {instructor.Mode.GENAI_TOOLS}"

}, f"Mode must be one of {instructor.Mode.GENAI_TOOLS}"

ellipsis-dev · 2025-03-11T21:46:35Z

instructor/client_genai.py

+@overload
+def from_genai(
+    client: Client,
+    mode: instructor.Mode = instructor.Mode.GEMINI_JSON,


Inconsistent default mode values: the async overload uses instructor.Mode.GEMINI_JSON while the sync overload and implementation default to instructor.Mode.GENAI_TOOLS. Please align these defaults or document the difference.

Suggested change

mode: instructor.Mode = instructor.Mode.GEMINI_JSON,

mode: instructor.Mode = instructor.Mode.GENAI_TOOLS,

ellipsis-dev · 2025-03-11T21:46:36Z

pyproject.toml

@@ -135,3 +136,4 @@ cohere = ["cohere<6.0.0,>=5.1.8"]
 cerebras_cloud_sdk = ["cerebras-cloud-sdk<2.0.0,>=1.5.0"]
 fireworks-ai = ["fireworks-ai<1.0.0,>=0.15.4"]
 writer = ["writer-sdk<2.0.0,>=1.2.0"]
+google-genai=["google-genai>=1.5.0"]


The dependency key google-genai on line 139 appears to be inconsistent with the google-generativeai key used in the [project.optional-dependencies] section. Please check and correct if this is a typo.

ivanleomk added 4 commits March 11, 2025 17:13

feat: update pyproject.toml

0be816a

fix: figured out how to handle the function calling

ebceff3

fix: added support for retries and openai messages

b2f83f0

fix: add support for system prompt and add tests to verify its working

c24d523

github-actions bot added dependencies Pull requests that update a dependency file enhancement New feature or request python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files. labels Mar 11, 2025

ivanleomk added 3 commits March 11, 2025 20:46

fix: add support for test format

75d192e

fix: fixed pyright errors

3ad58c7

fix: remove unused import

4c28430

ivanleomk marked this pull request as ready for review March 11, 2025 21:42

ivanleomk requested a review from jxnl March 11, 2025 21:42

ellipsis-dev bot reviewed Mar 11, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding support for the GenAI Sdk #1393

Adding support for the GenAI Sdk #1393

ivanleomk commented Mar 11, 2025 •

edited by ellipsis-dev bot

Loading

cloudflare-workers-and-pages bot commented Mar 11, 2025 •

edited

Loading

ivanleomk commented Mar 11, 2025

ellipsis-dev bot left a comment

ellipsis-dev bot Mar 11, 2025

ellipsis-dev bot Mar 11, 2025

ellipsis-dev bot Mar 11, 2025

ellipsis-dev bot Mar 11, 2025

	if importlib.util.find_spec("google") is not None:
	if importlib.util.find_spec("google.genai") is not None:

	}, "Mode must be one of {instructor.Mode.GENAI_TOOLS}"
	}, f"Mode must be one of {instructor.Mode.GENAI_TOOLS}"

	mode: instructor.Mode = instructor.Mode.GEMINI_JSON,
	mode: instructor.Mode = instructor.Mode.GENAI_TOOLS,

Adding support for the GenAI Sdk #1393

Are you sure you want to change the base?

Adding support for the GenAI Sdk #1393

Conversation

ivanleomk commented Mar 11, 2025 • edited by ellipsis-dev bot Loading

cloudflare-workers-and-pages bot commented Mar 11, 2025 • edited Loading

Deploying instructor-py with Cloudflare Pages

ivanleomk commented Mar 11, 2025

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Mar 11, 2025

Choose a reason for hiding this comment

ellipsis-dev bot Mar 11, 2025

Choose a reason for hiding this comment

ellipsis-dev bot Mar 11, 2025

Choose a reason for hiding this comment

ellipsis-dev bot Mar 11, 2025

Choose a reason for hiding this comment

ivanleomk commented Mar 11, 2025 •

edited by ellipsis-dev bot

Loading

cloudflare-workers-and-pages bot commented Mar 11, 2025 •

edited

Loading