feat: Adding watsonx support in Haystack #1949
Conversation
Hey @divyaruhil thanks for working on this integration! Here are some initial comments I have before doing an in-depth review:
Resolved (outdated) review thread on integrations/watsonx/src/haystack_integrations/components/__init__.py.
Thank you @sjrl for reviewing! Sure, I'll make the requested changes.
Resolved review thread on integrations/watsonx/src/haystack_integrations/components/embedders/watsonx/__init__.py.
@divyaruhil thanks for making the changes so far! I'll also go ahead and do a deeper-dive review of the actual components themselves this week.
Hey @divyaruhil I realize there are quite a few comments. I'd also be happy to push the changes myself if you are willing to give me write access to your branch. Let me know!
Hi @sjrl, thank you so much for reviewing! I know it's quite a large PR, and I really appreciate your time. This is my first big contribution, so I'm still learning a lot as I go, which is probably why there are quite a few mistakes. I've already enabled "Allow edits by maintainers," so feel free to push any changes directly to the branch!
Yeah for sure! Add a new line for this integration here: https://github.com/divyaruhil/haystack-core-integrations/blob/45a180bd2071550b222e35939903d3617b94036f/README.md?plain=1#L62 I'd suggest using the Bedrock entry (near the top) as an example to follow.
Review comment on the `__init__` signature:

Repeating this just in case it may be missed.
For all components, let's enforce keyword arguments by making the following change:

```diff
  self,
+ *,
  model: str = "ibm/slate-30m-english-rtrvr",
```
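For illustration, a minimal sketch of a component with the keyword-only star applied. The class name matches the PR, but the surrounding details (the `WATSONX_API_KEY` env var, the stubbed `run` body) are hypothetical placeholders, not the PR's actual signature:

```python
from haystack import component
from haystack.utils import Secret


@component
class WatsonxTextEmbedder:
    def __init__(
        self,
        *,  # the bare star makes every following parameter keyword-only
        model: str = "ibm/slate-30m-english-rtrvr",
        api_key: Secret = Secret.from_env_var("WATSONX_API_KEY"),  # hypothetical env var name
    ):
        self.model = model
        self.api_key = api_key

    @component.output_types(embedding=list[float])
    def run(self, text: str):
        # The real component would call the watsonx.ai SDK here; omitted in this sketch.
        return {"embedding": []}


WatsonxTextEmbedder(model="ibm/slate-30m-english-rtrvr")  # OK: keyword argument
# WatsonxTextEmbedder("ibm/slate-30m-english-rtrvr")      # would raise TypeError: positional
```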
Resolved (outdated) review thread on ...tions/watsonx/src/haystack_integrations/components/generators/watsonx/chat/chat_generator.py.
Review comment on the following code:

```python
deserialize_secrets_inplace(data["init_parameters"], keys=["api_key"])
return default_from_dict(cls, data)


@component.output_types(replies=list[str], meta=list[dict[str, Any]], chunks=list[StreamingChunk])
```

Ahh, we don't typically return the chunks as part of the response. We usually only create them internally and pass them to a `streaming_callback` function. You can see an example of that in our `OpenAIChatGenerator._handle_stream_response`, where one of its args is `callback: SyncStreamingCallbackT`. You can find the function here.
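A rough sketch of that pattern: `raw_chunks` and its dict shape are hypothetical stand-ins for the watsonx stream, and `SyncStreamingCallbackT` is defined locally as a stand-in for Haystack's alias; only `StreamingChunk` and `ChatMessage` are used as Haystack defines them.

```python
from typing import Any, Callable, Iterable

from haystack.dataclasses import ChatMessage, StreamingChunk

# Local stand-in for Haystack's SyncStreamingCallbackT alias.
SyncStreamingCallbackT = Callable[[StreamingChunk], None]


def _handle_stream_response(
    raw_chunks: Iterable[dict[str, Any]],  # hypothetical shape of the watsonx stream
    callback: SyncStreamingCallbackT,
) -> ChatMessage:
    """Create StreamingChunks internally, hand each to the callback, return one reply."""
    parts: list[str] = []
    for raw in raw_chunks:
        chunk = StreamingChunk(content=raw.get("generated_text", ""))
        callback(chunk)  # chunks are surfaced through the callback...
        parts.append(chunk.content)
    # ...and only the assembled reply is returned, so run() never outputs chunks.
    return ChatMessage.from_assistant("".join(parts))
```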
Review comment on the following code:

```python
@component
class WatsonxGenerator:
    """
    Generates text using IBM's watsonx.ai foundational models.
```

Sorry to have not brought this up earlier, but it would be great if you could actually have `WatsonxGenerator` inherit from `WatsonxChatGenerator` and simply overwrite the relevant methods (e.g. `run` and `run_async`) to work with our standard `Generator.run` inputs, which are:

```python
@component.output_types(replies=List[str], meta=List[Dict[str, Any]])
def run(
    self,
    prompt: str,
    system_prompt: Optional[str] = None,
    streaming_callback: Optional[StreamingCallbackT] = None,
    generation_kwargs: Optional[Dict[str, Any]] = None,
):
```
This is a pattern we are planning to follow for our other chat generators but haven't had a chance to do yet. But hopefully this should help reduce duplicate code between the two components.
Let me know if you need any more clarification!
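A hedged sketch of that inheritance pattern, assuming the parent's `run` accepts `messages` plus the streaming/generation kwargs and returns `{"replies": [ChatMessage, ...]}`. The import path follows the PR's directory layout but is an assumption, `StreamingCallbackT` is a local stand-in, and the method body is illustrative rather than the final implementation:

```python
from typing import Any, Callable, Dict, List, Optional

from haystack import component
from haystack.dataclasses import ChatMessage, StreamingChunk

# Assumed import path, following the PR's directory layout.
from haystack_integrations.components.generators.watsonx.chat.chat_generator import (
    WatsonxChatGenerator,
)

# Local stand-in for Haystack's streaming-callback type alias.
StreamingCallbackT = Callable[[StreamingChunk], None]


@component
class WatsonxGenerator(WatsonxChatGenerator):  # reuse the chat component's plumbing
    @component.output_types(replies=List[str], meta=List[Dict[str, Any]])
    def run(
        self,
        prompt: str,
        system_prompt: Optional[str] = None,
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[Dict[str, Any]] = None,
    ):
        # Wrap the plain-text inputs as chat messages and delegate to the parent.
        messages = []
        if system_prompt:
            messages.append(ChatMessage.from_system(system_prompt))
        messages.append(ChatMessage.from_user(prompt))
        result = super().run(
            messages=messages,
            streaming_callback=streaming_callback,
            generation_kwargs=generation_kwargs,
        )
        # Flatten the ChatMessage replies into the standard Generator output shape.
        replies = [msg.text or "" for msg in result["replies"]]
        meta = [msg.meta for msg in result["replies"]]
        return {"replies": replies, "meta": meta}
```

The same wrapping would apply to `run_async`, awaiting `super().run_async(...)` instead.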
Co-authored-by: Sebastian Husch Lee <[email protected]>
Hi @sjrl, after committing the suggested changes, some tests are now failing. Should I revert those changes? Also, just a quick heads-up: it might take me a little while to thoughtfully work through all the requested updates, but I'm on it!
Hey @divyaruhil, my colleague @anakin87 and I decided to go through the pyproject.toml and the GitHub workflow to make sure they matched our other integrations. Currently everything is passing, which is great! So I think it's safe to leave those files alone now; please go ahead with the other requested updates!
Related Issues
Proposed Changes:
Add WatsonxGenerator and WatsonxChatGenerator components to wrap IBM watsonx.ai’s text- and chat-generation APIs (supporting streaming, custom models, and generation parameters).
Add WatsonxTextEmbedder and WatsonxDocumentEmbedder components that support embedding use cases for both text and documents.
Ensure all components follow the existing Haystack interfaces and patterns for Generator and Embedder.
Include test files accordingly.
How did you test it?
- Text generation works with WatsonxGenerator
- Chat generation works with WatsonxChatGenerator
- Embeddings are correctly generated for both text and documents using the respective embedder components (a rough usage sketch follows below)
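As a hedged illustration of the kind of manual check described above: the import path follows the PR's layout but is an assumption, `WATSONX_API_KEY` is a hypothetical env var name, and any other required init parameters (e.g. a project id) are omitted.

```python
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

# Assumed import path, following the PR's directory layout.
from haystack_integrations.components.generators.watsonx.chat.chat_generator import (
    WatsonxChatGenerator,
)

# Running this requires valid watsonx.ai credentials in the environment.
generator = WatsonxChatGenerator(api_key=Secret.from_env_var("WATSONX_API_KEY"))
result = generator.run(messages=[ChatMessage.from_user("What is watsonx.ai?")])
print(result["replies"][0].text)
```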
Checklist
- The PR title uses one of the conventional commit types: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.