Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add clustering_max_lag to /metrics #805

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

viktorerlingsson
Copy link
Member

WHAT is this pull request doing?

Adds clustering_max_lag to prometheus metrics

HOW can this pull request be tested?

curl http://guest:guest@localhost:15672/metrics

@viktorerlingsson viktorerlingsson requested a review from a team as a code owner October 11, 2024 13:53
writer.write({name: "clustering_max_lag",
value: Config.instance.clustering_max_lag,
type: "gauge",
help: "Max unsynced replicated messages"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT?

Suggested change
help: "Max unsynced replicated messages"})
help: "Maximum allowed unsynced replicated messages"})

Also, here it says "actions"

property clustering_max_lag = 8192 # number of clustering actions
, not sure if we want to be consistent :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied the text from here https://github.com/cloudamqp/lavinmq/blob/main/src/lavinmq/config.cr#L136 🙈
But yeah, we should be consistent!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe easiest to update the comment then, hehe :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah :)
Hmm, this all might be a little confusing. clustering_max_lag is the maximum number of actions that a follower can lag behind, but the lag we show in GUI/metrics is how many actual bytes the follower is lagging behind. So the numbers don't really have much to do with each other... 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it can be confusing. Could we name it something like max_allowed_cluster_actions, and let lag be lag_bytes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it might be reasonable to rename at least one of them.
Maybe rename
lag -> lag_in_bytes,
max_lag -> max_lag_in_actions,
and add lag_in_actions 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's PRs for the changes mentioned in this thread
#810
#811

Co-authored-by: Patrik Ragnarsson <[email protected]>
@viktorerlingsson viktorerlingsson marked this pull request as draft November 21, 2024 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants