Handle chunked multibyte characters #25
Merged
In some large messages (in my case, roughly 8 MB strings containing very large JSON objects) I was running into `JSON.parse` failures on the server side.

Before this fix, if the split between chunks of a message fell in the middle of a two-byte character, the first byte would be decoded on its own by `data.toString()` and converted into a � character. The second byte, arriving in the next chunk, would also be converted into a � character. That threw the string's length off by one, and since this library relies on the content-length prefix at the beginning of the message, it would effectively drop the last character, which is often a `}` or `]`. `JSON.parse` would then fail because of the missing closing bracket.

Node's built-in `StringDecoder` module exists specifically to ensure that decoded strings never contain incomplete multibyte characters. When it is asked to decode a buffer that ends with an incomplete character, it sets the trailing byte or two aside and holds onto them until the next call.
I believe this will address the concerns in #11.