kv-cells : fix tracking of seq_pos during cache reuse #14339
Merged
fix #14298
We tracked the sequence positions present in the KV cells using an `std::set`, on the assumption that each position would be present at most once. However, during cache reuse, we perform the following sequence of operations:

llama.cpp/tools/server/server.cpp
Lines 3224 to 3228 in 439d562

I.e. we remove a chunk of positions in `[p0, p0 + match)` and then move another chunk `[p1, p1 + match)` into its place by adding `shift = p0 - p1` to all positions in that chunk. During this "movement", depending on the order in which the shift is applied, a position can temporarily occur more than once, which breaks the assumption behind `std::set`. To fix that, we replace `std::set` with `std::map` and count the occurrences of each position, erasing an element when its count reaches 0.