Traces sampler also total transactions #438

Open
ezioda004 opened this issue Nov 13, 2024 · 8 comments

Comments

@ezioda004

ezioda004 commented Nov 13, 2024

Hello,

I have custom OTel-based instrumentation which I was testing out with ElasticNodeSDK. I can see that with a 1% sample rate, the total spans and total transaction count metrics are both being sampled.
I'm expecting only traces to be sampled, not the transaction count.

Any way to fix this?

Below were the config:

OTEL_TRACES_SAMPLER="traceidratio"
OTEL_TRACES_SAMPLER_ARG="0.01"

I'm using Elastic Observability Cloud for APM.
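
For context, a ratio sampler of this kind makes its decision from the trace ID alone, so with a ratio of 0.01 roughly 1 in 100 traces is kept, and anything counted from exported spans shrinks by the same factor. The sketch below is a simplified, dependency-free model of that idea. The names `makeRatioSampler` and `randomTraceId` are made up for illustration, and this is not the SDK's actual code.

```javascript
const crypto = require('crypto');

// Hypothetical helper: a random 32-hex-character trace ID.
function randomTraceId() {
  return crypto.randomBytes(16).toString('hex');
}

// Simplified model of a trace-ID ratio sampler: map the trace ID to a
// number in [0, 2^32) and keep the trace if it falls below ratio * 2^32.
function makeRatioSampler(ratio) {
  const upperBound = Math.floor(ratio * 0x100000000);
  return function shouldSample(traceId) {
    // Use the first 8 hex characters as a 32-bit unsigned integer.
    const value = parseInt(traceId.slice(0, 8), 16);
    return value < upperBound;
  };
}

const sampleOnePercent = makeRatioSampler(0.01);

// The decision is deterministic for a given trace ID.
const id = randomTraceId();
console.log(sampleOnePercent(id) === sampleOnePercent(id)); // true

// Over many traces, about 1% are kept.
const total = 100000;
let kept = 0;
for (let i = 0; i < total; i++) {
  if (sampleOnePercent(randomTraceId())) kept++;
}
console.log(`kept ${kept} of ${total} traces`);
```

Because the kept count is only a fraction of the real traffic, any total derived purely from exported spans will look about 100 times smaller than the true request count at this ratio.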

@david-luna
Member

Hi @ezioda004,

Thanks for using our Elastic Distribution of OpenTelemetry for Node.js :)

OpenTelemetry does not have the concept of a transaction and only exports Spans when instrumenting a service. What corresponds to a transaction is the Span of the incoming HTTP request. The Elastic Stack is smart enough to find which spans are transactions and shows them in your Kibana service detail view.

Samplers work on every Span, including the ones that correspond to a transaction. That's why you see transactions being sampled as well. I need to double check, but I think the transaction count is calculated using a query to Elasticsearch.

It's worth mentioning that this configuration samples spans regardless of whether the parent span was sampled out, which may not be what you want. With this config a sampled span (a transaction one) may have some of its child spans sampled out, resulting in gaps in the traces you look at in Kibana. If you want to collect all spans from a sampled root span (aka transaction) you can set OTEL_TRACES_SAMPLER="parentbased_traceidratio" so child spans of a sampled one are always sampled as well. With this sampler configuration you will get behaviour similar to what transactionSampleRate gives in elastic-apm-node.
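
To illustrate the parent-based idea, here is a minimal, dependency-free sketch: a child span follows its parent's decision, and only root spans are handed to the wrapped sampler. All names (`makeParentBasedSampler`, `rootSampler`, the `{ sampled }` parent shape) are assumptions made for this example. In `@opentelemetry/sdk-trace-base` the corresponding building blocks are `ParentBasedSampler` and `TraceIdRatioBasedSampler`, whose real logic has more cases than shown here.

```javascript
// Simplified model of a parent-based sampler (illustrative names only).
// `parent` is undefined for a root span, otherwise { sampled: boolean }.
function makeParentBasedSampler(rootSampler) {
  return function shouldSample(traceId, parent) {
    if (parent) {
      // Child span: follow whatever was decided for the parent.
      return parent.sampled;
    }
    // Root span: delegate to the wrapped sampler.
    return rootSampler(traceId);
  };
}

// Stand-in root sampler for the demo: keeps trace IDs starting with "00".
const rootSampler = (traceId) => traceId.startsWith('00');
const parentBased = makeParentBasedSampler(rootSampler);

console.log(parentBased('00ab', undefined));          // true  (root, kept)
console.log(parentBased('ffab', undefined));          // false (root, dropped)
console.log(parentBased('ffab', { sampled: true }));  // true  (follows parent)
console.log(parentBased('00ab', { sampled: false })); // false (follows parent)
```

With this shape a trace is either kept whole or dropped whole, which is why the gaps disappear. It does not change how many root spans are exported.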

@ezioda004
Author

Thanks for the quick response @david-luna.

From the elastic docs:

Regardless of the sampling decision, all traces retain transaction and error data. This means the following data will always accurately reflect all of your application’s requests, regardless of the configured sampling rate:

  • Transaction duration and transactions per minute
  • Transaction breakdown metrics
  • Errors, error occurrence, and error rate

Which does not seem to be the case here. I tried parentbased_traceidratio, but still the total requests, TPM, etc are also sampled.

@david-luna
Member

Hi @ezioda004

This was also dropped for APM agents sending data to APM server with version > v8.0. You can check it in the specs: https://github.com/elastic/apm/blob/main/specs/agents/tracing-sampling.md#non-sampled-transactions

It may be possible to implement your own sampler and pass it to ElasticNodeSDK configuration but I need to investigate further.

@ezioda004
Author

ezioda004 commented Nov 15, 2024

Sure @david-luna

I guess what I'm looking for is a way to sample transactions/spans at the application level but still send the complete transaction/span metrics which are mentioned here.

Currently, sending 100% sampled spans cuts my application's throughput in half in terms of TPM. So I want a config like 10% sampled spans and 100% endpoint metrics as a balance.

@david-luna
Member

@ezioda004

I've dug a bit into Samplers and I cannot see a clear way of getting this behavior. However, I think you may achieve what you want with a custom implementation by:

  • Using the AlwaysOn sampler
  • Having a middleware in your app which decides to suppress tracing based on your criteria (env var, etc.)

This way you will always have the root span of the incoming HTTP request and:

  • no child spans if you decide to suppress tracing
  • all child spans otherwise

This is a very simple example so you get the idea.

const { context } = require('@opentelemetry/api');
const { suppressTracing } = require('@opentelemetry/core');

// Utility function: wraps a handler so that spans started inside it
// are suppressed when the request is not chosen for sampling.
function toSampledFn (fn) {
  return function (...args) {
    // Placeholder: replace with your own sampling logic (env var, ratio, etc.)
    const shouldSample = Math.random() < 0.1;

    if (shouldSample) {
      return fn.apply(this, args);
    }

    // Run the handler in a copy of the active context with tracing suppressed
    return context.with(suppressTracing(context.active()), () => fn.apply(this, args));
  };
}

// in your code
app.use(toSampledFn(function (req, res, next) {
  // Your app logic here
}));

@ezioda004
Author

Hi @david-luna

Thanks for this, I'll try this.
This will do custom sampling, which should work for sampling out spans. I'm not clear on how this will ensure that transaction-related metrics will be counted and sent to Elastic APM. Could you help me understand that?

@david-luna
Member

Could you help me understand that?

Usually, by the time your request handler kicks in, @opentelemetry/instrumentation-http has already started a Span for the incoming request. Whatever operations you do inside the handler will run within the context of that root Span. The line

return context.with(suppressTracing(context.active()), () => fn.apply(self, arguments));

tells the OpenTelemetry API to run the target function with a given context. In this case it is the same active context, modified so that NoopSpans are produced if any instrumentation or user code uses the API to start a new Span.

Then at export time the root span is sent and all child NoopSpans are dropped, so you end up with only the span corresponding to a transaction.
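
The mechanism described above can be pictured with a toy, dependency-free model: a flag stored in the context decides whether a newly started span records anything. Everything here (`SUPPRESS_KEY`, `suppress`, `startSpan`, the span names) is hypothetical and only mirrors the idea, not the real OpenTelemetry implementation.

```javascript
// Toy model of context-based suppression (not the real OpenTelemetry code).
const SUPPRESS_KEY = Symbol('suppress-tracing');

// Contexts are treated as immutable: setting a value returns a new object.
function suppress(ctx) {
  return { ...ctx, [SUPPRESS_KEY]: true };
}

function isSuppressed(ctx) {
  return ctx[SUPPRESS_KEY] === true;
}

// Names of spans that would be handed to the exporter.
const exported = [];

// Returns a recording span, or a no-op span when tracing is suppressed.
function startSpan(name, ctx) {
  if (isSuppressed(ctx)) {
    return { name, recording: false, end() {} };
  }
  return { name, recording: true, end() { exported.push(name); } };
}

// The root span is started before the handler runs, in a normal context.
const rootCtx = {};
const root = startSpan('GET /users', rootCtx);

// The handler runs with a suppressed context, so child spans are no-ops.
const handlerCtx = suppress(rootCtx);
const child = startSpan('db.query', handlerCtx);
child.end();
root.end();

console.log(exported); // [ 'GET /users' ]
```

The original context is left untouched, which is why the root span still records and is exported while everything started inside the handler is not.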

@ezioda004
Author

ezioda004 commented Dec 9, 2024

Hi @david-luna

I think I should've been clearer earlier, but my HTTP server instrumentation is independent of node:http and node:net; it's for a C++ HTTP server which has bindings in Node.js.

Is there any way to get calculated metrics like total transactions etc. via head-based sampling with OTel-based instrumentation in Elastic APM?

It seems Kibana APM configuration is not supported yet.
https://www.elastic.co/guide/en/observability/8.16/apm-agent-configuration.html
