Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ObservationConventions for Grpc can cause exception in PrometheusMeterRegistry #5609

Open
rafal-dudek opened this issue Oct 28, 2024 · 1 comment

Comments

@rafal-dudek
Copy link
Contributor

Describe the bug
DefaultGrpcServerObservationConvention and DefaultGrpcClientObservationConvention does not set 'grpc.status_code' when status code is missing (e.g. at the start of the call).

Micrometer Observation is invoking "addLowCardinalityKeyValues" at the start of the observation and at the end of the observation. At the start, value for status code is null.
In such case, the label is not added: https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/grpc/DefaultGrpcServerObservationConvention.java#L51

If the call is interrupted (e.g. cancelled) and does not finish with the status code, the metric is collected with 4 labels, e.g:
[tag(error=none),tag(rpc.method=hello),tag(rpc.service=Greeter),tag(rpc.type=UNARY)]

When the next call is finished, the metric is collected with 5 labels e.g.:
[tag(error=none),tag(grpc.status_code=OK),tag(rpc.method=hello),tag(rpc.service=Greeter),tag(rpc.type=UNARY)]

For many MeterRegistries this is not a problem, but when using PrometheusMeterRegistry it creates exception:

Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter named 'grpc.server' containing tag keys [error, rpc.method, rpc.service, rpc.type]. The meter you are attempting to register has keys [error, grpc.status_code, rpc.method, rpc.service, rpc.type].

https://github.com/micrometer-metrics/micrometer/blob/main/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheusmetrics/PrometheusMeterRegistry.java#L583

Different components are reporting "UNKNOWN" value for such cases, instead of omitting label e.g.:
https://github.com/spring-projects/spring-framework/blob/main/spring-web/src/main/java/org/springframework/http/server/observation/DefaultServerRequestObservationConvention.java#L125
https://github.com/spring-projects/spring-framework/blob/main/spring-web/src/main/java/org/springframework/http/server/reactive/observation/DefaultServerRequestObservationConvention.java#L126

Environment

  • Micrometer version: 1.13.6
  • Micrometer registry: prometheus
  • OS: Windows
  • Java version: 21

To Reproduce
How to reproduce the bug:
Create request to GrpcService and cancel it before it is finished (artificial Sleep in the Service may be required), then create request and let it be completed.

GreeterGrpc.GreeterFutureStub stub = GreeterGrpc.newFutureStub(inProcChannel);
stub.hello(helloRequest).cancel(true);
stub.hello(helloRequest).get();

Set Debug breakpoint here: https://github.com/micrometer-metrics/micrometer/blob/main/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheusmetrics/PrometheusMeterRegistry.java#L583
or add MeterRegistrationFailedListener to see the exception.

Expected behavior
DefaultGrpcServerObservationConvention and DefaultGrpcClientObservationConvention should not cause error in PrometheusMeterRegistry in any scenario.

@ttddyy
Copy link
Contributor

ttddyy commented Oct 29, 2024

I think part of the PR(#3512) fixes this problem.
It always sets the status code to UNKNOWN when the value is not available.

@shakuzen @jonatan-ivanov
Let me know how you’d like to proceed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants