producer/consumer metrics using redpanda_kafka_request_bytes_total are wrong #40

Open
hcoyote opened this issue Sep 26, 2024 · 2 comments

Comments

@hcoyote
Contributor

hcoyote commented Sep 26, 2024

This metric miscalculates usage while a learner event is in progress (decommission, node add, etc.). We should instead use the metrics below to determine on-the-wire produce and consume throughput for the cluster.

The queries should be updated to:

Producer traffic:

sum(rate(redpanda_rpc_received_bytes{redpanda_server="kafka", redpanda_id="$redpanda_id"}[5m])) by (cluster)

Consumer traffic:

sum(rate(redpanda_rpc_sent_bytes{redpanda_server="kafka", redpanda_id="$redpanda_id"}[5m])) by (cluster)

Adjust the labels accordingly to fit the observability repo dashboards.
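
As a rough illustration of adjusting labels, the two queries above can be generated from one template so the matchers and the grouping label are easy to swap per dashboard. This is a minimal Python sketch, not something from the observability repo: the helper name `throughput_query` and its parameters are hypothetical, and only the metric names and labels quoted in this issue are taken as given.

```python
def throughput_query(metric, matchers, group_by="cluster", window="5m"):
    """Render a sum(rate(...)) PromQL expression as a string.

    Hypothetical helper for illustration; it only formats text and does
    not validate the metric or labels against a real Prometheus server.
    """
    # Build the label selector, e.g. redpanda_server="kafka", ...
    selector = ", ".join(f'{name}="{value}"' for name, value in matchers.items())
    return f"sum(rate({metric}{{{selector}}}[{window}])) by ({group_by})"


# Matchers as written in this issue; "$redpanda_id" is a dashboard variable.
KAFKA_MATCHERS = {"redpanda_server": "kafka", "redpanda_id": "$redpanda_id"}

producer_query = throughput_query("redpanda_rpc_received_bytes", KAFKA_MATCHERS)
consumer_query = throughput_query("redpanda_rpc_sent_bytes", KAFKA_MATCHERS)

print(producer_query)
print(consumer_query)
```

With the defaults, this reproduces the producer and consumer queries exactly as written above; passing a different `group_by` or `matchers` dict is where dashboard-specific labels would go.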

@bpraseed

bpraseed commented Oct 9, 2024

@hcoyote - does this fix the sharechat issue where they are seeing replication traffic on the consumer side?

@pmw-rp
Contributor

pmw-rp commented Nov 8, 2024

Neither redpanda_rpc_received_bytes nor redpanda_rpc_sent_bytes includes topic-level detail. In contrast, redpanda_kafka_request_bytes_total does provide topic-level detail (using the redpanda_topic label).

Whether or not that matters is down to the use case. In this example, I wouldn't say we can use one in place of the other.
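
To make the trade-off concrete, a per-topic breakdown can only be expressed with the metric that carries the redpanda_topic label. Below is a minimal Python sketch that formats such a query. The function name `per_topic_query` is hypothetical, and the `redpanda_request` matcher is an assumption about how that metric separates produce from consume traffic, so it should be checked against the actual metrics endpoint before use.

```python
def per_topic_query(request_type, window="5m"):
    """Format a per-topic throughput query as a PromQL string.

    Hypothetical helper. The redpanda_request label and its values are
    assumptions; redpanda_topic is the label named in this thread.
    """
    return (
        "sum(rate(redpanda_kafka_request_bytes_total"
        f'{{redpanda_request="{request_type}"}}[{window}])) by (redpanda_topic)'
    )


print(per_topic_query("produce"))
```

Grouping the redpanda_rpc_* metrics by redpanda_topic would not give a per-topic view, since they lack that label. A dashboard that needs both cluster-level wire traffic and a topic breakdown would likely have to keep both kinds of query.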
