-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: race when bumping items while loading a snapshot #4564
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: kostas <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I misunderstand something but why do we need to orcherstrate SetLoadInProgress on all the shards? Can we do it locally on each shard? i.e. independently for each shard?
Good question! Because master and replica might have different number of proactors. So imagine the following case: Flow 1 -> Sets the shard's 1 loading state to true. Flow 1 had only one item, However, saying this, I think there is a better solution! If we call |
i did not analyse the code but maybe using a |
Signed-off-by: kostas <[email protected]>
@@ -735,4 +735,19 @@ TEST_F(RdbTest, HugeKeyIssue4497) { | |||
EXPECT_EQ(Run({"flushall"}), "OK"); | |||
} | |||
|
|||
TEST_F(RdbTest, HugeKeyIssue4554) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails without my changes.
IMO, I wanted to reproduce via a replication test, because only replication is used on the issue. However, it proved to be a little more difficult than expected for reasons hard for me to explain in a few lines. I am fairly positive that this PR will address all the problems but I would like to have a binary asap to sent and verify that we did see the same/similar case on the issue as here.
Now that we have a test I will polish. Let me think and I will get back to you soon |
@romange sequence numbers worked like charm! No more extra dispatches :) |
@@ -2554,6 +2562,8 @@ void RdbLoader::LoadItemsBuffer(DbIndex db_ind, const ItemsBuf& ib) { | |||
DbContext db_cntx{&namespaces->GetDefaultNamespace(), db_ind, GetCurrentTimeMs()}; | |||
DbSlice& db_slice = db_cntx.GetDbSlice(es->shard_id()); | |||
|
|||
DCHECK(!db_slice.IsCacheMode()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe CHECK
instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need
@@ -283,7 +283,6 @@ DbSlice::DbSlice(uint32_t index, bool cache_mode, EngineShard* owner) | |||
cache_mode_(cache_mode), | |||
owner_(owner), | |||
client_tracking_map_(owner->memory_resource()) { | |||
load_in_progress_ = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove the CHECK at db_slice.cc:783] Check failed: fetched_items_.empty()
and replace it with DFATAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I forgot this, you mentioned in it in the standup
src/server/db_slice.h
Outdated
@@ -598,6 +601,7 @@ class DbSlice { | |||
size_t soft_budget_limit_ = 0; | |||
size_t table_memory_ = 0; | |||
uint64_t entries_count_ = 0; | |||
size_t load_in_progress_ = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use size_t for sizes, unsigned for counters, also maybe change the name to load_ref_cnt_
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure no issue with that! Out of curiosity why unsigned for counter ? (size_t is also unsigned 😄 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it's very subjective size_t
and uint64_t are the same type but size_t tells me it's sizes, lengths etc.
counters - are usually uintxxx, that's how I try to declare types (proper readability aspect)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I will keep it in mind/ apply the notation. Cheers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Signed-off-by: kostas <[email protected]>
src/server/db_slice.cc
Outdated
@@ -794,7 +794,8 @@ void DbSlice::FlushDbIndexes(const std::vector<DbIndex>& indexes) { | |||
std::swap(db_arr_[index]->trans_locks, flush_db_arr[index]->trans_locks); | |||
} | |||
|
|||
CHECK(fetched_items_.empty()); | |||
LOG_IF(DFATAL, fetched_items_.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since now we don't crash in release I will ask for the logs (just to make sure that we don't have any gaps)
The original issue was submitted in #4497 and we supplied a fix in #4507. However, the fix ignored that the
RdbLoader::Load()
function is run per flow/shard thread and the "poison pill" of updating the loading state at the end ofRdbLoader::Load()
introduced a race condition:Any flow
F
that finished loading its own snapshot first (relatively to the rest of the flows) will callSetLoadInProgress(false)
on ALL shard threads. The consequence of that is that other flows are not yet done (their respective RdbLoader::Load()` is still processing) and next time the use the db slice API will start Bumping up items because now load in progress is false.The fix is to update the state after all shard flows are done and similarly to update all shard flow
before
we start theLoad()
which shall provide a consistent state/view among all shard threads.Should resolve #4554
P.s. we might be able to simplify the new db slice state via the global loading state. That's something I will need to follow but I won't do this as part of this PR.