-
Notifications
You must be signed in to change notification settings - Fork 835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(scheduler): update otel-collector-config after jaeger endpoint does not supported #5170
Conversation
Hi @ducminhle. Thanks for your PR. I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ducminhle many thanks for your contributions, this looks good in general. I left one comment on whether we still need to expose port 14250
?
This PR covers docker compose installations, however the same issue you are fixing exists potentially with installations in k8s, e.g. check: tracing/k8s/otel-collector.yaml
. If you are willing to fix it as well in k8s that would be great, but happy to leave it to a separate contribution.
Please also outline the tests that you have done and any screenshots you might have. e.g. to make sure that traces appear in jeager
ui after this change.
ff8a241
to
1b10b4c
Compare
Thank you, @sakoush. I have solved your comment. Here is some screenshot, please let me know if you need more information.
In tracing/k8s/Makefile, I checked that we still use jaeger-operator v1.33.0 (I'm not sure this version is removed jaeger). My laptop does not have enough resources to run k8s, I will check it later and create new PR if we need to update. |
@cin-otis: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository. |
/ok-to-test |
@ducminhle: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository. |
- fixes dataflow engine logging errors by specifying the otel exporter protocol the updates are needed because the opentelemetry-java-instrumentation library requires a http(s) URI for the otel collector endpoint, regardless of the actual protocol. As the default for us is grpc, explicitly set the OTEL_EXPORTER_OTLP_PROTOCOL environment variable on dataflow pods - fixes otel-collector config to remove deprecated jagger exporter (jagger now supports otel directly). add the OTEL_EXPORTED_OTLP_PROTOCOL key to the seldon-tracing configMap and update the operator, crds and helm charts to support getting the value for this key from tracing config, similarly to how OTEL_EXPORTER_OTLP_PROTOCOL is fetched - update versions used by ansible for jagger and opentelemetry-operator - port 14250 no longer needs to be exposed under any config - fix dependency ordering for dataflow/gradle. previous ordering caused kafka-streams not to be able to find the slf4j logging provider this lead to logs produced by kafka-streams not being recorded Fixes # Internal issue references: #INFRA-568 Jagger latest is crashing #INFRA-464 Otel is not able to parse its config (deprecated exporters) Public issues: otel-collector-1 container not starting: #5189 related PR with partial otel functionality: #5170
I believe this is now solved via #5291, which includes changes required for things to work on k8s, as well as in local deployments. Many thanks for your contribution, and for bringing the issue to our attention. |
What this PR does / why we need it:
With existing config: otel-collector-config.yaml, I got an error when try to run the
make deploy-local
,otel-collector
container crashed with logs:I found the problem:
opentelemetry-collector-contrib
has removed jaeger and jaegerthrifthttp exporters from this commit: open-telemetry/opentelemetry-collector-contrib#26546. We need to update the endpoint to fix this issue (open-telemetry/opentelemetry-collector-contrib#15291 (comment))Which issue(s) this PR fixes:
Special notes for your reviewer: