TraceQL: Fix lookups .attr != nil with special columns #4864

stoewer · 2025-03-17T05:14:02Z

What this PR does:
Improves attribute existence checks using { .attr != nil }. Queries involving well-known attributes and attributes stored in dedicated columns currently don't check generic attribute columns for attributes with the same name but a different type.

For example, the query { span.http.status_code != nil } will not match spans where http.status_code exists as a string (e.g. "200").

Which issue(s) this PR fixes:
Fixes #2968

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

joe-elliott · 2025-03-18T12:10:00Z

tempodb/encoding/vparquet4/block_traceql.go

@@ -1981,6 +1981,9 @@ func createSpanIterator(makeIter makeIterFn, innerIterators []parquetquery.Itera
 			if cond.Op == traceql.OpNone {
 				addPredicate(entry.columnPath, nil) // No filtering
 				columnSelectAs[entry.columnPath] = cond.Attribute.Name
+
+				// Look up generic columns as there may be attributes with the same name but of different types
+				genericConditions = append(genericConditions, cond)


what are the conditions where OpNone is passed? { span.foo != nil } {span.foo == span.bar}, others ?

Also, the attribute conditions are joined in differently depending on whether allConditions is true or not. Does this work in all cases?

what are the conditions where OpNone is passed? { span.foo != nil } {span.foo == span.bar}, others ?

The only other one is that comes to my mind is { span.foo }

Also, the attribute conditions are joined in differently depending on whether allConditions is true or not. Does this work in all cases?

Good call. Queries with allConditions=true were actually not working correctly. I now added a commit that disables allConditions when this kind of lookup occurs. Now it works. Do you think this is OK, or should we try to keep the allConditions optimization for this edge case?

others that came to mind:

{} | select(span.foo) // i think? { span.foo + 3 > 2 } // interesting that we could solve expressions like this for span.foo for increased perf

This change is quite similar to this one: #4391

Also Grafana has requested that we consider an "all types" mode where we attempt to coalesce data types for all comparisons in a query. This is a growing class of problems that we need a way to solve efficiently.

Do you think this is OK, or should we try to keep the allConditions optimization for this edge case?

I'm surprised our tests didn't catch that bug. Can you run some benches to get a sense of the regression? I think that will help us decide if it's worth optimizing.

I'm surprised our tests didn't catch that bug

I think two things happened here:

We didn't have good test cases for existence checks in TestBackendBlockSearchTraceQL (I now added them here)

In the new test TestBackendBlockSearchTraceQLNilSpecialColumns I focused on the actual edge case where we have a special column and a colliding attribute with the same name in the generic attrs, but didn't check for cases without this kind of collision

Yes the 40x increase in allocs looks wild. I'll have a closer look at the benchmark 👍

There is a good explanation for the 40x increase: the test case with the increase has the following query:

{ span.http.status_code.type = "string" && span.http.status_code != nil }

The test data I generated contains spans where http.status_code has string values. The above query filters those spans and then looks for the existence of of http.status_code, which matches all those strings. When benchmarking the same query on main, the iterators don't look at the generic attribute column at all and therefore the query does not match any spans.

In short: there were more allocations because there were much more matches

I ran a second round of benchmarks with the same data but replaced the following test cases:

{"spanWellKnownAndNotNilMatch", `{ span.net.host.name = "auth-service.local" && span.http.status_code != nil }`}, {"spanWellKnownOrNotNilMatch", `{ span.net.host.name = "auth-service.local" || span.http.status_code != nil }`},

Now the benchmark on the main branch also finds spans with span.http.status_code. Running the benchmark on this branch still matches more spans within the generic columns. Therefore the benchmarks still show an increase in allocations and sec/op, but on a much smaller scale:

Benchmark results

❯ benchstat bench-nil-special-cols-02-main.txt bench-nil-special-cols-02-fix.txt goos: linux goarch: amd64 pkg: github.com/grafana/tempo/tempodb/encoding/vparquet4 cpu: AMD Ryzen 7 PRO 6850U with Radeon Graphics │ bench-nil-special-cols-02-main.txt │ bench-nil-special-cols-02-fix.txt │ │ sec/op │ sec/op vs base │ BackendBlockTraceQL/spanAttrValMatch-16 684.0m ± 1% 670.4m ± 1% -1.98% (p=0.000 n=8) BackendBlockTraceQL/spanAttrValNoMatch-16 10.07m ± 1% 10.16m ± 2% ~ (p=0.083 n=8) BackendBlockTraceQL/spanIntrinsicMatch-16 336.7m ± 1% 337.2m ± 2% ~ (p=0.959 n=8) BackendBlockTraceQL/spanIntrinsicNoMatch-16 1.208m ± 2% 1.227m ± 1% ~ (p=0.130 n=8) BackendBlockTraceQL/spanAttrAndMatch-16 718.1m ± 1% 715.8m ± 1% ~ (p=0.234 n=8) BackendBlockTraceQL/spanAttrAndNoMatch-16 1.226m ± 2% 1.208m ± 1% -1.44% (p=0.000 n=8) BackendBlockTraceQL/spanAttrOrMatch-16 1.033 ± 1% 1.038 ± 2% ~ (p=0.328 n=8) BackendBlockTraceQL/spanAttrOrNoMatch-16 847.4m ± 0% 844.5m ± 1% ~ (p=0.065 n=8) BackendBlockTraceQL/spanAttrNotNilMatch-16 2.536 ± 1% 2.535 ± 1% ~ (p=0.574 n=8) BackendBlockTraceQL/spanAttrNotNilNoMatch-16 2.106m ± 2% 2.070m ± 1% ~ (p=0.083 n=8) BackendBlockTraceQL/spanWellKnownNotNilMatch-16 1.994 ± 1% 3.216 ± 1% +61.24% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedNotNilMatch-16 1.997 ± 1% 3.205 ± 1% +60.49% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownAndNotNilMatch-16 1.271 ± 1% 3.050 ± 1% +139.99% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownOrNotNilMatch-16 2.894 ± 1% 3.782 ± 1% +30.67% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedAndNotNilMatch-16 726.1m ± 1% 1566.0m ± 1% +115.66% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedOrNotNilMatch-16 1.841 ± 0% 2.726 ± 1% +48.06% (p=0.000 n=8) geomean 249.5m 305.3m +22.34% │ bench-nil-special-cols-02-main.txt │ bench-nil-special-cols-02-fix.txt │ │ B/s │ B/s vs base │ BackendBlockTraceQL/spanAttrValMatch-16 115.4Mi ± 0% 117.7Mi ± 1% +2.02% (p=0.000 n=8) BackendBlockTraceQL/spanAttrValNoMatch-16 786.4Mi ± 1% 779.6Mi ± 6% ~ (p=0.083 n=8) BackendBlockTraceQL/spanIntrinsicMatch-16 84.55Mi ± 1% 84.43Mi ± 2% ~ (p=0.959 n=8) BackendBlockTraceQL/spanIntrinsicNoMatch-16 983.9Mi ± 3% 969.0Mi ± 1% ~ (p=0.130 n=8) BackendBlockTraceQL/spanAttrAndMatch-16 110.8Mi ± 1% 111.1Mi ± 1% ~ (p=0.234 n=8) BackendBlockTraceQL/spanAttrAndNoMatch-16 584.6Mi ± 2% 593.1Mi ± 1% +1.46% (p=0.000 n=8) BackendBlockTraceQL/spanAttrOrMatch-16 77.30Mi ± 2% 76.89Mi ± 3% ~ (p=0.328 n=8) BackendBlockTraceQL/spanAttrOrNoMatch-16 94.19Mi ± 1% 94.51Mi ± 1% ~ (p=0.065 n=8) BackendBlockTraceQL/spanAttrNotNilMatch-16 39.84Mi ± 1% 39.85Mi ± 1% ~ (p=0.556 n=8) BackendBlockTraceQL/spanAttrNotNilNoMatch-16 986.7Mi ± 2% 1004.0Mi ± 3% ~ (p=0.083 n=8) BackendBlockTraceQL/spanWellKnownNotNilMatch-16 14.46Mi ± 1% 31.95Mi ± 1% +121.01% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedNotNilMatch-16 14.43Mi ± 1% 32.06Mi ± 1% +122.10% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownAndNotNilMatch-16 63.24Mi ± 1% 33.69Mi ± 1% -46.73% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownOrNotNilMatch-16 27.87Mi ± 1% 27.17Mi ± 1% -2.50% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedAndNotNilMatch-16 109.57Mi ± 1% 65.09Mi ± 3% -40.59% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedOrNotNilMatch-16 43.35Mi ± 2% 37.39Mi ± 1% -13.75% (p=0.000 n=8) geomean 105.7Mi 107.7Mi +1.86% │ bench-nil-special-cols-02-main.txt │ bench-nil-special-cols-02-fix.txt │ │ MB_io/op │ MB_io/op vs base │ BackendBlockTraceQL/spanAttrValMatch-16 82.75 ± 0% 82.75 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrValNoMatch-16 8.308 ± 0% 8.308 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanIntrinsicMatch-16 29.85 ± 0% 29.85 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanIntrinsicNoMatch-16 1.246 ± 0% 1.246 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrAndMatch-16 83.42 ± 0% 83.42 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrAndNoMatch-16 751.3m ± 0% 751.3m ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrOrMatch-16 83.69 ± 0% 83.69 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrOrNoMatch-16 83.69 ± 0% 83.69 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrNotNilMatch-16 105.9 ± 0% 105.9 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanAttrNotNilNoMatch-16 2.179 ± 0% 2.179 ± 0% ~ (p=1.000 n=8) ¹ BackendBlockTraceQL/spanWellKnownNotNilMatch-16 30.23 ± 0% 107.80 ± 0% +256.60% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedNotNilMatch-16 30.23 ± 0% 107.80 ± 0% +256.60% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownAndNotNilMatch-16 84.29 ± 0% 107.80 ± 0% +27.89% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownOrNotNilMatch-16 84.57 ± 0% 107.80 ± 0% +27.47% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedAndNotNilMatch-16 83.42 ± 0% 106.90 ± 0% +28.15% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedOrNotNilMatch-16 83.69 ± 0% 106.90 ± 0% +27.73% (p=0.000 n=8) geomean 27.67 34.48 +24.64% ¹ all samples are equal │ bench-nil-special-cols-02-main.txt │ bench-nil-special-cols-02-fix.txt │ │ B/op │ B/op vs base │ BackendBlockTraceQL/spanAttrValMatch-16 154.6Mi ± 1% 154.5Mi ± 1% ~ (p=0.878 n=8) BackendBlockTraceQL/spanAttrValNoMatch-16 8.555Mi ± 2% 8.555Mi ± 4% ~ (p=0.878 n=8) BackendBlockTraceQL/spanIntrinsicMatch-16 134.5Mi ± 1% 133.9Mi ± 0% -0.49% (p=0.050 n=8) BackendBlockTraceQL/spanIntrinsicNoMatch-16 876.5Ki ± 5% 883.1Ki ± 4% ~ (p=0.382 n=8) BackendBlockTraceQL/spanAttrAndMatch-16 156.9Mi ± 3% 156.1Mi ± 3% ~ (p=0.328 n=8) BackendBlockTraceQL/spanAttrAndNoMatch-16 818.5Ki ± 3% 826.0Ki ± 2% ~ (p=0.505 n=8) BackendBlockTraceQL/spanAttrOrMatch-16 285.4Mi ± 1% 285.4Mi ± 1% ~ (p=0.878 n=8) BackendBlockTraceQL/spanAttrOrNoMatch-16 155.6Mi ± 1% 155.7Mi ± 1% ~ (p=0.798 n=8) BackendBlockTraceQL/spanAttrNotNilMatch-16 948.1Mi ± 1% 948.3Mi ± 1% ~ (p=0.645 n=8) BackendBlockTraceQL/spanAttrNotNilNoMatch-16 1.418Mi ± 3% 1.412Mi ± 2% ~ (p=0.878 n=8) BackendBlockTraceQL/spanWellKnownNotNilMatch-16 1.380Gi ± 0% 1.573Gi ± 0% +14.03% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedNotNilMatch-16 1.380Gi ± 0% 1.571Gi ± 0% +13.85% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownAndNotNilMatch-16 428.6Mi ± 0% 1047.7Mi ± 0% +144.44% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownOrNotNilMatch-16 1.646Gi ± 0% 1.724Gi ± 0% +4.74% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedAndNotNilMatch-16 156.7Mi ± 1% 165.5Mi ± 2% +5.56% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedOrNotNilMatch-16 809.3Mi ± 0% 937.5Mi ± 0% +15.84% (p=0.000 n=8) geomean 104.7Mi 114.3Mi +9.17% │ bench-nil-special-cols-02-main.txt │ bench-nil-special-cols-02-fix.txt │ │ allocs/op │ allocs/op vs base │ BackendBlockTraceQL/spanAttrValMatch-16 1.699M ± 0% 1.699M ± 0% +0.00% (p=0.019 n=8) BackendBlockTraceQL/spanAttrValNoMatch-16 10.97k ± 0% 10.97k ± 0% ~ (p=0.545 n=8) BackendBlockTraceQL/spanIntrinsicMatch-16 1.826M ± 0% 1.826M ± 0% ~ (p=0.896 n=8) BackendBlockTraceQL/spanIntrinsicNoMatch-16 10.81k ± 0% 10.81k ± 0% ~ (p=0.631 n=8) BackendBlockTraceQL/spanAttrAndMatch-16 1.832M ± 0% 1.832M ± 0% ~ (p=0.700 n=8) BackendBlockTraceQL/spanAttrAndNoMatch-16 10.92k ± 0% 10.92k ± 0% ~ (p=0.608 n=8) BackendBlockTraceQL/spanAttrOrMatch-16 3.324M ± 0% 3.324M ± 0% ~ (p=0.798 n=8) BackendBlockTraceQL/spanAttrOrNoMatch-16 1.722M ± 0% 1.722M ± 0% ~ (p=0.554 n=8) BackendBlockTraceQL/spanAttrNotNilMatch-16 11.03M ± 0% 11.03M ± 0% ~ (p=0.940 n=8) BackendBlockTraceQL/spanAttrNotNilNoMatch-16 10.91k ± 0% 10.91k ± 0% ~ (p=0.345 n=8) BackendBlockTraceQL/spanWellKnownNotNilMatch-16 14.76M ± 0% 16.11M ± 0% +9.16% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedNotNilMatch-16 14.76M ± 0% 16.11M ± 0% +9.16% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownAndNotNilMatch-16 6.083M ± 0% 11.937M ± 0% +96.25% (p=0.000 n=8) BackendBlockTraceQL/spanWellKnownOrNotNilMatch-16 16.93M ± 0% 18.33M ± 0% +8.25% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedAndNotNilMatch-16 1.832M ± 0% 2.191M ± 0% +19.61% (p=0.000 n=8) BackendBlockTraceQL/spanDedicatedOrNotNilMatch-16 9.547M ± 0% 11.225M ± 0% +17.58% (p=0.000 n=8) geomean 1.044M 1.130M +8.26%

Is this worth it? Should we wait for a new block format to fix this?

It's similar to the struggles we had in the int vs float comparisons. It's so much more costly to scan the main attributes column and you lose all the benefits of having the well known/dedicated column.

@joe-elliott Good point. I agree that scanning the generic attributes column is more costly and cancels out some of the benefits of well-known/dedicated columns. For most queries like { span.http.status_code != nil }, the overhead isn't too large if there are no matching values in the generic attribute columns.

It still might make sense to shelve this for now. The new block format will likely provide a cleaner fix, so revisiting this later seems like the better approach.

stoewer added 6 commits March 17, 2025 13:42

Test attr != nil for special columns

50e3175

Fix attr!=nil for special span columns

a068772

Test attr != nil for special resource columns

87e5f31

Fix attr!=nil for special resource columns

16e59b7

Improve comment

43786a5

Test cases for unscoped attributes

6e7b5b2

stoewer changed the title ~~Fix lookups .attr != nil with special columns~~ TraceQL: Fix lookups .attr != nil with special columns Mar 17, 2025

stoewer added 3 commits March 17, 2025 15:15

CHANGELOG.md

cec2da9

Range loop

bb0724c

Fix failing test

80a3129

stoewer marked this pull request as ready for review March 17, 2025 10:33

stoewer requested review from joe-elliott, mdisibio, mapno, yvrhdn, zalegrala, electron0zero, ie-pham, javiermolinar and carles-grafana as code owners March 17, 2025 10:33

joe-elliott reviewed Mar 18, 2025

View reviewed changes

stoewer added 2 commits March 20, 2025 19:07

Disable allConditions when using OpNone on special columns

a2beeee

More test cases for existence checks with != nil

0760253

stoewer mentioned this pull request Mar 21, 2025

TraceQL: make comparison to nil symmetric #4869

Merged

3 tasks

stoewer added 3 commits March 25, 2025 17:40

Merge remote-tracking branch 'upstream/main' into nil-special-columns

a28f7ec

Consolidate test cases using != nil

4ff608b

Reformat test case

824f3a7

stoewer mentioned this pull request Mar 26, 2025

TraceQL: support mixed-type attribute querying (int/float) #4391

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TraceQL: Fix lookups .attr != nil with special columns #4864

TraceQL: Fix lookups .attr != nil with special columns #4864

stoewer commented Mar 17, 2025 •

edited

Loading

joe-elliott Mar 18, 2025

stoewer Mar 18, 2025

stoewer Mar 20, 2025

joe-elliott Mar 20, 2025

stoewer Mar 21, 2025

stoewer Mar 24, 2025

stoewer Mar 25, 2025 •

edited

Loading

stoewer Mar 25, 2025 •

edited

Loading

joe-elliott Mar 25, 2025

stoewer Mar 26, 2025 •

edited

Loading

TraceQL: Fix lookups .attr != nil with special columns #4864

Are you sure you want to change the base?

TraceQL: Fix lookups .attr != nil with special columns #4864

Conversation

stoewer commented Mar 17, 2025 • edited Loading

joe-elliott Mar 18, 2025

Choose a reason for hiding this comment

stoewer Mar 18, 2025

Choose a reason for hiding this comment

stoewer Mar 20, 2025

Choose a reason for hiding this comment

joe-elliott Mar 20, 2025

Choose a reason for hiding this comment

stoewer Mar 21, 2025

Choose a reason for hiding this comment

stoewer Mar 24, 2025

Choose a reason for hiding this comment

stoewer Mar 25, 2025 • edited Loading

Choose a reason for hiding this comment

stoewer Mar 25, 2025 • edited Loading

Choose a reason for hiding this comment

joe-elliott Mar 25, 2025

Choose a reason for hiding this comment

stoewer Mar 26, 2025 • edited Loading

Choose a reason for hiding this comment

stoewer commented Mar 17, 2025 •

edited

Loading

stoewer Mar 25, 2025 •

edited

Loading

stoewer Mar 25, 2025 •

edited

Loading

stoewer Mar 26, 2025 •

edited

Loading