Skip to content

Introduces a separate thread in the UserManager. #21857

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: devel
Choose a base branch
from

Conversation

johann-listunov
Copy link
Contributor

@johann-listunov johann-listunov commented Jul 15, 2025

Add a separate thread to the UserManger to handle the load of the user-cache asynchronously.
This pre-loads the user-cache as fast as possible (as soon as the global version changes)
Also, compared to the old version, this allows the read-only operation to work on the user-cache
for longer and not block each other while trying to load a new version. On a high load environment this also
means a more responsive UserManger.

Scope & Purpose

  • 💩 Bugfix
  • 🍕 New feature
  • 🔥 Performance improvement
  • 🔨 Refactoring/simplification

Checklist

  • Tests
    • Regression tests
    • C++ Unit tests
    • integration tests
    • resilience tests
  • 📖 CHANGELOG entry made
  • 📚 documentation written (release notes, API changes, ...)
  • Backports
    • Backport for 3.12.0: (Please link PR)
    • Backport for 3.11: (Please link PR)
    • Backport for 3.10: (Please link PR)

Related Information

(Please reference tickets / specification / other PRs etc)

  • Docs PR:
  • Enterprise PR:
  • GitHub issue / Jira ticket:
  • Design document:

@cla-bot cla-bot bot added the cla-signed label Jul 15, 2025
Comment on lines 183 to 194
void auth::UserManager::loadFromDB() {
TRI_ASSERT(ServerState::instance()->isSingleServerOrCoordinator());

if (_internalVersion.load(std::memory_order_acquire) == globalVersion()) {
void auth::UserManager::loadUserCacheAndStartUpdateThread() noexcept {
if (_usersInitialized.load(std::memory_order_relaxed) == true) {
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this method might be called multiple times, but not concurrently? Otherwise this would not work.

Copy link
Contributor

@jvolmer jvolmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already discussed some stuff about trying to get rid of the failure points and possibly waiting for a version change in the tests, so here just some small comments I noticed.

Copy link
Contributor

@jvolmer jvolmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments. My main question is why you need the additional failure points or if we could get rid of them.

while (_internalVersion.load(std::memory_order_acquire) < globalVersion()) {
_internalVersion.wait(versionBeforeReload);
}
TRI_IF_FAILURE("UserManager::FailReload") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this failure point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted a way to really tell if the reload did anything. So use this in a test to verify internal state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't quite understand what the goal of this (in context of the test using the failure point) is. The test checks that this throws an exception, which is done if internalVersionWasUpdated && versionsAreEqual is true. internalVersionWasUpdated must always be true, so it basically checks versionsAreEqual.

In the context of the test, I think we assume noone else does any changes to either the users collection or the global version. In which case the previous loop also guarantees that. I think the way it currently is, the test with the failure point does not give any more value than such an assertion:

TRI_ASSERT(versionAfterReload <= _internalVersion.load(std::memory_order_relaxed));

or

TRI_ASSERT(versionAfterReload <= internalVersion);

, which is just the inverted loop condition, and thus quite obviously correct.

}
// Internally it will do globalReload, setGlobalVersion, and blocks
// until the internal thread synchronizes the versions (internal and global).
auto const internalVersionBeforeReload = um->internalVersion();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we do an assertion here on the internal version, just to make more visible that the internal version changes after triggerCacheRevalidation?

Comment on lines 404 to 410
void auth::UserManager::triggerCacheRevalidation() {
uint64_t versionToReloadTo = globalVersion() + 1;
uint64_t const versionBeforeReload = globalVersion();
triggerGlobalReload();
setGlobalVersion(versionToReloadTo);
while (_internalVersion.load() < versionToReloadTo) {
triggerLocalReload();
_internalVersion.wait(0);
setGlobalVersion(versionBeforeReload + 1);
while (_internalVersion.load(std::memory_order_acquire) < globalVersion()) {
_internalVersion.wait(versionBeforeReload);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code warrants a comment that describes exactly what guarantees this gives, and why, by describing the order of events this imposes.

goedderz and others added 2 commits August 6, 2025 15:21
Changes the sleep in the update to be interruptible.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants