fixChangeFeedHangWhenUsingStaleContainerRid #43114
Open
+334
−16
Issue:
When querying changeFeed with an invalid continuation token (in this case the container was recreated, so the continuationToken carries a stale containerRid), the SDK returns incorrect results or hangs.
Root Cause:
Container recreated with the same RU or the same feedRanges -> the SDK reuses the token (LSN) to resume reading from the new container, which can cause missing data.
Container recreated with a different RU or different feedRanges -> the feedRange in the continuationToken spans multiple partitions of the new container, and the SDK hangs (due to endless retries).
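The two cases can be illustrated with a small sketch. This is not SDK code: the integer ranges, the `overlapping_partitions` helper, and the partition layouts are all hypothetical, chosen only to show how one stale feedRange can map to a single partition in one layout and to several in another.

```python
from typing import List, Tuple

# Hypothetical half-open [min, max) partition key range, modelled with integers.
Range = Tuple[int, int]


def overlapping_partitions(token_range: Range, partitions: List[Range]) -> List[Range]:
    """Return the partitions whose range intersects the token's feedRange."""
    lo, hi = token_range
    return [(p_lo, p_hi) for p_lo, p_hi in partitions if p_lo < hi and lo < p_hi]


# feedRange stored in a continuation token issued against the old container.
stale_token_range = (0, 100)

# Recreated container with the same layout: the range maps to one partition,
# so the stale LSN would be reused against unrelated data.
same_layout = [(0, 100)]

# Recreated container with a different layout: the range spans two partitions.
different_layout = [(0, 50), (50, 100)]

print(len(overlapping_partitions(stale_token_range, same_layout)))
print(len(overlapping_partitions(stale_token_range, different_layout)))
```

In the first layout nothing signals that the token belongs to another container, which matches the missing-data case; in the second, the token no longer addresses a single partition, which matches the case described as leading to endless retries.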
Fixes:
Set the x-ms-cosmos-intended-collection-rid header on changeFeed requests, so that 400/1024 is eventually bubbled up to the customer.
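A minimal model of the intended behavior, assuming the service compares the header value with the current container's rid. The `Response` type and `handle_change_feed_request` function are hypothetical and exist only for illustration; the header name and the 400/1024 status pair come from the description above.

```python
from dataclasses import dataclass

INTENDED_COLLECTION_RID_HEADER = "x-ms-cosmos-intended-collection-rid"
BAD_REQUEST = 400
INCORRECT_CONTAINER_RID = 1024  # sub-status paired with 400 in this PR


@dataclass
class Response:
    status: int
    sub_status: int = 0


def handle_change_feed_request(headers: dict, current_container_rid: str) -> Response:
    """Hypothetical check: reject a request that targets a different container rid."""
    intended = headers.get(INTENDED_COLLECTION_RID_HEADER)
    if intended is not None and intended != current_container_rid:
        return Response(BAD_REQUEST, INCORRECT_CONTAINER_RID)
    return Response(200)


# Token issued against the old container, request sent after recreation.
stale = handle_change_feed_request(
    {INTENDED_COLLECTION_RID_HEADER: "oldRid"}, current_container_rid="newRid"
)
print(stale.status, stale.sub_status)

# Without the header the mismatch goes undetected.
unchecked = handle_change_feed_request({}, current_container_rid="newRid")
print(unchecked.status)
```

With the header present, the stale token fails fast with an explicit error instead of silently resuming or retrying.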