fix(ollama): handle incomplete JSON chunks in stream #995
Addresses one of the issues raised in #686
Problem
The Ollama LLM returns the final chunk of a streamed response in multiple parts when the chunk is too long, so the individual deliveries are not complete JSON and fail to parse. I have only noticed this behaviour on the final chunk; I'm not sure whether it happens on other chunks as well.
Solution
Improve the `json_responses_chunk_handler` to gracefully handle cases where a JSON chunk is split across buffer boundaries. If a chunk does not end with `}`, it is considered incomplete and buffered until the next chunk arrives. This prevents JSON parsing errors and ensures all responses are processed correctly.

Part of this solution is taken from this diff: https://github.com/patterns-ai-core/langchainrb/pull/644/files#diff-746ba2cd57580e32b0f013cbe3c8eaf8f1621e112c89f3af07983321dd6846dbL143-L148
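Below is a minimal sketch of the buffering idea, not the exact code in this PR; the proc arity and argument names are assumptions about how the handler is invoked by the HTTP client:

```ruby
require "json"

# Buffering variant of the handler: raw chunks are accumulated until the
# buffered data ends with "}", i.e. until it plausibly contains only complete
# newline-delimited JSON objects, and only then parsed line by line.
def json_responses_chunk_handler(&block)
  buffer = +""

  proc do |chunk, _bytes, _env|
    buffer << chunk

    # An incomplete trailing chunk is kept in the buffer for the next call.
    next unless buffer.rstrip.end_with?("}")

    buffer.each_line do |line|
      next if line.strip.empty?

      block.call(JSON.parse(line))
    end
    buffer = +""
  end
end

# Hypothetical usage: the first delivery is an incomplete JSON fragment and is
# only buffered; the second completes the object, which is then parsed and yielded.
handler = json_responses_chunk_handler { |parsed| p parsed }
handler.call('{"response":"Hel')   # buffered, nothing yielded
handler.call("lo\"}\n")            # buffer now ends with "}", object is parsed
```

The trailing-`}` check is a heuristic rather than a full JSON validator, but it fits this stream because Ollama emits newline-delimited JSON objects, each of which closes with `}`.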