v2.5.0-rc2

Pre-release
@github-actions github-actions released this 02 Sep 15:22
· 141 commits to main since this release

Release Notes

Upgrade Notes

  • Remove ChatMessage.to_openai_format method. Use haystack.components.generators.openai_utils._convert_message_to_openai_format instead.
  • Remove unused debug parameter from Pipeline.run method.
  • Remove the deprecated SentenceWindowRetrieval component. Use SentenceWindowRetriever instead.

New Features

  • Add an unsafe argument to ConditionalRouter and OutputAdapter that enables behaviour which could lead to remote code execution. Unsafe behaviour is disabled by default; the user must explicitly set unsafe to True. When unsafe is True, user types like ChatMessage, Document, and Answer can be used as output types. We recommend enabling unsafe behaviour only when the source of the Jinja templates is trusted. For more information, see the documentation for ConditionalRouter and OutputAdapter.

Enhancement Notes

  • Add the min_top_k parameter to TopPSampler. It sets the minimum number of documents to return when the top-p sampling algorithm selects fewer documents; in that case, the documents with the next highest scores are added to the selection. This is useful when a set number of documents must always be passed on, while still letting the top-p algorithm decide, based on document score, whether more documents should be sent.
  • Introduce a utility function to deserialize a generic Document Store from the init_parameters of a serialized component.
  • Refactor deserialize_document_store_in_init_parameters so that the new function name indicates that the operation occurs in place, with no return value.
  • The SentenceWindowRetriever now has an extra output key containing all the documents belonging to the context window.
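
The min_top_k behaviour described above can be illustrated with a minimal standard-library sketch. It is not the TopPSampler implementation: the function name top_p_select is hypothetical, and the cutoff rule (softmax over the scores, then keep the smallest prefix whose cumulative probability reaches top_p) is one common formulation of top-p selection that is assumed here for illustration.

```python
import math
from typing import List, Optional


def top_p_select(scores: List[float], top_p: float, min_top_k: Optional[int] = None) -> List[int]:
    """Return the indices of the selected items, highest score first.

    Hypothetical helper for illustration only; not part of Haystack.
    """
    if not scores:
        return []

    # Softmax over the raw scores (shifted by the max for numerical stability).
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank item indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Assumed cutoff rule: keep items until the cumulative probability reaches top_p.
    selected = []
    cumulative = 0.0
    for i in order:
        selected.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # min_top_k backfill: if too few items survived, take the next highest-scored ones.
    if min_top_k is not None and len(selected) < min_top_k:
        selected = order[:min_top_k]
    return selected


scores = [10.0, 1.0, 0.5, 0.1]
print(top_p_select(scores, top_p=0.9))               # [0]
print(top_p_select(scores, top_p=0.9, min_top_k=3))  # [0, 1, 2]
```

In this sketch a single dominant score absorbs almost all of the probability mass, so top-p alone keeps one item; min_top_k=3 then backfills with the next two highest-scored items. When top-p already selects at least min_top_k items, the backfill step changes nothing.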

Deprecation Notes

  • SentenceWindowRetrieval is deprecated and will be removed in a future release. Use SentenceWindowRetriever instead.
  • The default model for the OpenAIGenerator and OpenAIChatGenerator, 'gpt-3.5-turbo', will be replaced by 'gpt-4o-mini'.

Bug Fixes

  • Fixed an issue where page breaks were not being extracted from DOCX files.
  • Use a forward reference for the Paragraph class in the DOCXToDocument converter to prevent import errors.
  • The metadata produced by the DOCXToDocument component is now JSON serializable. Previously, it contained datetime objects automatically extracted from DOCX files, which are not JSON serializable. These datetime objects are now converted to strings.
  • Starting from haystack-ai==2.4.0, Haystack is compatible with sentence-transformers>=3.0.0; earlier versions of sentence-transformers are not supported. We are updating the test dependency and the LazyImport messages to reflect that.
  • For components that support multiple Document Stores, prioritize using the specific from_dict class method for deserialization when available. Otherwise, fall back to the generic default_from_dict method. This impacts the following generic components: CacheChecker, DocumentWriter, FilterRetriever, and SentenceWindowRetriever.
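
The deserialization priority described in the last item can be sketched in plain Python. This is an illustrative stand-in rather than Haystack's code: deserialize_store, PlainStore, and CustomStore are hypothetical names, and the local default_from_dict is a simplified placeholder for the generic fallback mentioned above (assumed here to pass init_parameters straight to the constructor).

```python
from typing import Any, Dict


def default_from_dict(cls, data: Dict[str, Any]):
    """Simplified stand-in for the generic fallback: call the constructor with init_parameters."""
    return cls(**data.get("init_parameters", {}))


def deserialize_store(cls, data: Dict[str, Any]):
    """Hypothetical helper: prefer the class's own from_dict, else use the generic fallback."""
    if hasattr(cls, "from_dict"):
        return cls.from_dict(data)
    return default_from_dict(cls, data)


class PlainStore:
    """Hypothetical store with no custom deserialization logic."""

    def __init__(self, index: str = "default"):
        self.index = index


class CustomStore:
    """Hypothetical store that must rebuild a client object when deserialized."""

    def __init__(self, client: str):
        self.client = client

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "CustomStore":
        # Custom logic that the generic fallback could not know about.
        params = data["init_parameters"]
        return cls(client=f"client-for:{params['url']}")


plain = deserialize_store(PlainStore, {"init_parameters": {"index": "docs"}})
custom = deserialize_store(CustomStore, {"init_parameters": {"url": "http://localhost"}})
print(plain.index)    # docs
print(custom.client)  # client-for:http://localhost
```

Checking for a class-specific from_dict first lets a store with non-trivial construction logic restore itself correctly, while simpler stores still work through the generic path.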