Skip to content

Commit

Permalink
Add support for multi-tenant queries in streaming search (grafana#3262)
Browse files Browse the repository at this point in the history
Add multi-tenant query support in streaming search endpoints.

follow up on grafana#3087
  • Loading branch information
electron0zero authored Jan 25, 2024
1 parent 865383b commit 96c4e72
Show file tree
Hide file tree
Showing 18 changed files with 734 additions and 188 deletions.
1 change: 0 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -18,4 +18,3 @@
/tempodb/encoding/benchmark_block
private-key.key
integration/e2e/e2e_integration_test[0-9]*
integration/e2e/metrics_*_dump.txt
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
## main / unreleased

* [FEATURE] Add support for multi-tenant queries in streaming search [#3262](https://github.com/grafana/tempo/pull/3262) (@electron0zero)
* [FEATURE] TraceQL metrics queries [#3227](https://github.com/grafana/tempo/pull/3227) [#3252](https://github.com/grafana/tempo/pull/3252) [#3258](https://github.com/grafana/tempo/pull/3258) (@mdisibio @zalegrala)
* [FEATURE] Add support for multi-tenant queries. [#3087](https://github.com/grafana/tempo/pull/3087) (@electron0zero)
* [BUGFIX] Fix parsing of span.resource.xyz attributes in TraceQL. [#3284](https://github.com/grafana/tempo/pull/3284) (@mghildiy)
Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -164,7 +164,7 @@ lint:

.PHONY: docker-component # Not intended to be used directly
docker-component: check-component exe
docker build -t grafana/$(COMPONENT) --load --build-arg=TARGETARCH=$(GOARCH) -f ./cmd/$(COMPONENT)/Dockerfile .
docker build -t grafana/$(COMPONENT) --build-arg=TARGETARCH=$(GOARCH) -f ./cmd/$(COMPONENT)/Dockerfile .
docker tag grafana/$(COMPONENT) $(COMPONENT)

.PHONY: docker-component-debug
Expand Down
48 changes: 48 additions & 0 deletions example/docker-compose/multi-tenant/docker-compose.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
version: "3"
services:
tempo:
image: grafana/tempo:latest
environment:
- GRPC_TRACE=api
command: [ "-config.file=/etc/tempo.yaml" ]
volumes:
- ./tempo.yaml:/etc/tempo.yaml
- ./tempo-data:/tmp/tempo
ports:
- "14268:14268" # jaeger ingest
- "3200:3200" # tempo
- "9095:9095" # tempo grpc
- "4317:4317" # otlp grpc
- "4318:4318" # otlp http
- "9411:9411" # zipkin

k6-tracing:
image: ghcr.io/grafana/xk6-client-tracing:latest
environment:
- ENDPOINT=tempo:4317
- TEMPO_X_SCOPE_ORGID=test
restart: always
depends_on:
- tempo

k6-tracing-2:
image: ghcr.io/grafana/xk6-client-tracing:latest
environment:
- ENDPOINT=tempo:4317
- TEMPO_X_SCOPE_ORGID=test2
restart: always
depends_on:
- tempo

grafana:
image: grafana/grafana:main
volumes:
- ./grafana-datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
environment:
- GF_DEFAULT_APP_MODE=development
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
- GF_AUTH_DISABLE_LOGIN_FORM=true
- GF_FEATURE_TOGGLES_ENABLE=traceqlEditor traceQLStreaming metricsSummary
ports:
- "3000:3000"
33 changes: 33 additions & 0 deletions example/docker-compose/multi-tenant/grafana-datasources.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
apiVersion: 1

datasources:
- name: Prometheus
type: prometheus
uid: prometheus
access: proxy
orgId: 1
url: http://prometheus:9090
basicAuth: false
isDefault: false
version: 1
editable: false
jsonData:
httpMethod: GET
- name: Tempo
type: tempo
access: proxy
orgId: 1
url: http://tempo:3200
basicAuth: false
isDefault: true
version: 1
editable: false
apiVersion: 1
uid: tempo
jsonData:
httpMethod: GET
serviceMap:
datasourceUid: prometheus
httpHeaderName1: 'X-Scope-Orgid'
secureJsonData:
httpHeaderValue1: 'test|test2'
65 changes: 65 additions & 0 deletions example/docker-compose/multi-tenant/readme.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
## Local Storage
In this example all data is stored locally in the `tempo-data` folder. Local storage is fine for experimenting with Tempo
or when using the single binary, but does not work in a distributed/microservices scenario.

1. First start up the local stack.

```console
$ docker-compose up -d
Starting multi-tenant_grafana_1 ... done
Starting multi-tenant_tempo_1 ... done
Starting multi-tenant_k6-tracing-2_1 ... done
Starting multi-tenant_k6-tracing_1 ... done
```

At this point, the following containers should be spun up -

```console
$ docker-compose ps
Name Command State Ports
------------------------------------------------------------------------------------------------------------------------------------------------------------
multi-tenant_grafana_1 /run.sh Up 0.0.0.0:3000->3000/tcp,:::3000->3000/tcp
multi-tenant_k6-tracing-2_1 /k6-tracing run /example-s ... Up
multi-tenant_k6-tracing_1 /k6-tracing run /example-s ... Up
multi-tenant_tempo_1 /tempo -config.file=/etc/t ... Up 0.0.0.0:14268->14268/tcp,:::14268->14268/tcp, 0.0.0.0:3200->3200/tcp,:::3200->3200/tcp, 0.0.0.0:4317->4317/tcp,:::4317->4317/tcp,
0.0.0.0:4318->4318/tcp,:::4318->4318/tcp, 0.0.0.0:9095->9095/tcp,:::9095->9095/tcp, 0.0.0.0:9411->9411/tcp,:::9411->9411/tcp


```

2. If you're interested you can see the wal/blocks as they are being created.

```console
$ ls tempo-data/
```

3. Navigate to [Grafana](http://localhost:3000/explore) select the Tempo data source and use the "Search"
tab to find traces. Also notice that you can query Tempo metrics from the Prometheus data source setup in
Grafana.

4. Tail logs of a container (eg: tempo)
```bash
$ docker logs multi-tenant_tempo_1 -f
```

5. To stop the setup use -

```console
docker-compose down -v
```

## streaming and multi-tenant search

- needs `traceQLStreaming` feature flag set in Grafana, see `docker-compose.yaml`
- needs `stream_over_http_enabled: true`, `multitenancy_enabled: true`,
and `query_frontend.multi_tenant_queries_enabled: true` in the tempo config file, see `tempo.yaml`

You can use Grafana or tempo-cli to make a query.

**grpc streaming query using tempo-cli**
- `$ tempo-cli query api search "0.0.0.0:3200" --use-grpc --limit 10000 "{}" "2023-12-05T08:11:18Z" "2023-12-05T08:12:18Z" --org-id="test"`

**multi-tenant streaming queries using tempo-cli**
- pass multiple tenant ids with `|` like this `--org-id="test|test2"`

example: `$ ./bin/linux/tempo-cli-amd64 query api search "0.0.0.0:3200" --use-grpc --limit 10000 "{ true } >> { true }" "2024-01-15T11:00:00Z" "2024-01-19T12:30:00Z" --org-id="test|test2"`
50 changes: 50 additions & 0 deletions example/docker-compose/multi-tenant/tempo.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
target: all
multitenancy_enabled: true
stream_over_http_enabled: true
server:
http_listen_port: 3200
log_level: info

query_frontend:
multi_tenant_queries_enabled: true
search:
duration_slo: 5s
throughput_bytes_slo: 1.073741824e+09
trace_by_id:
duration_slo: 5s

distributor:
receivers: # this configuration will listen on all ports and protocols that tempo is capable of.
jaeger: # the receives all come from the OpenTelemetry collector. more configuration information can
protocols: # be found there: https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver
thrift_http: #
grpc: # for a production deployment you should only enable the receivers you need!
thrift_binary:
thrift_compact:
zipkin:
otlp:
protocols:
http:
grpc:
opencensus:

ingester:
max_block_duration: 5m # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally

compactor:
compaction:
block_retention: 1h # overall Tempo trace retention. set for demo purposes

metrics_generator:

storage:
trace:
backend: local # backend configuration to use
wal:
path: /tmp/tempo/wal # where to store the the wal locally
local:
path: /tmp/tempo/blocks

overrides:
defaults:
metrics_generator:
1 change: 1 addition & 0 deletions integration/e2e/config-multi-tenant-local.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ metrics_generator:

storage:
trace:
blocklist_poll: 1s
backend: local
local:
path: /var/tempo
Expand Down
2 changes: 1 addition & 1 deletion integration/e2e/e2e_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -131,7 +131,7 @@ func TestAllInOne(t *testing.T) {
grpcClient, err := util.NewSearchGRPCClient(context.Background(), tempo.Endpoint(3200))
require.NoError(t, err)

util.SearchStreamAndAssertTrace(t, grpcClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())
util.SearchStreamAndAssertTrace(t, context.Background(), grpcClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())
})
}
}
Expand Down
2 changes: 1 addition & 1 deletion integration/e2e/encodings_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -121,7 +121,7 @@ func TestEncodings(t *testing.T) {
// search the backend. this works b/c we're passing a start/end AND setting query ingesters within min/max to 0
integration.SearchAndAssertTraceBackend(t, apiClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())
// find the trace with streaming. using the http server b/c that's what Grafana will do
integration.SearchStreamAndAssertTrace(t, grpcClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())
integration.SearchStreamAndAssertTrace(t, context.Background(), grpcClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())
}
})
}
Expand Down
2 changes: 1 addition & 1 deletion integration/e2e/https_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -82,5 +82,5 @@ func TestHTTPS(t *testing.T) {
require.NoError(t, err)

now := time.Now()
util.SearchStreamAndAssertTrace(t, grpcClient, info, now.Add(-time.Hour).Unix(), now.Add(time.Hour).Unix())
util.SearchStreamAndAssertTrace(t, context.Background(), grpcClient, info, now.Add(-time.Hour).Unix(), now.Add(time.Hour).Unix())
}
21 changes: 8 additions & 13 deletions integration/e2e/multi_tenant_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -205,31 +205,26 @@ func TestMultiTenantSearch(t *testing.T) {
assertRequestCountMetric(t, tempo, rt.route, rt.reqCount)
}

// test all the unsupported endpoints
now := time.Now()
_, msErr := apiClient.MetricsSummary("{}", "name", 0, 0)

// test streaming search over grpc
grpcCtx := user.InjectOrgID(context.Background(), tc.tenant)
grpcCtx, err = user.InjectIntoGRPCRequest(grpcCtx)
require.NoError(t, err)

grpcClient, err := util.NewSearchGRPCClient(grpcCtx, tempo.Endpoint(3200))
require.NoError(t, err)
grpcResp, err := grpcClient.Search(grpcCtx, &tempopb.SearchRequest{
Query: "{}", Start: uint32(now.Add(-20 * time.Minute).Unix()), End: uint32(now.Unix()),
})
require.NoError(t, err)
// actual error comes in resp, need to call Recv to get it.
_, grpcErr := grpcResp.Recv()

time.Sleep(2 * time.Second) // ensure that blocklist poller has built the blocklist
now := time.Now()
util.SearchStreamAndAssertTrace(t, grpcCtx, grpcClient, info, now.Add(-5*time.Minute).Unix(), now.Add(5*time.Minute).Unix())
assertRequestCountMetric(t, tempo, "/tempopb.StreamingQuerier/Search", 1)

// test unsupported endpoint
_, msErr := apiClient.MetricsSummary("{}", "name", 0, 0)
if tc.tenantSize > 1 {
// we expect error in case of multi-tenant request for unsupported endpoints
// error for multi-tenant request for unsupported endpoints
require.Error(t, msErr)
require.Error(t, grpcErr)
} else {
require.NoError(t, msErr)
require.NoError(t, grpcErr)
}
})
}
Expand Down
6 changes: 4 additions & 2 deletions integration/util.go
Original file line number Diff line number Diff line change
Expand Up @@ -365,14 +365,16 @@ func SearchTraceQLAndAssertTrace(t *testing.T, client *httpclient.Client, info *
require.True(t, traceIDInResults(t, info.HexID(), resp))
}

func SearchStreamAndAssertTrace(t *testing.T, client tempopb.StreamingQuerierClient, info *tempoUtil.TraceInfo, start, end int64) {
// SearchStreamAndAssertTrace will search and assert that the trace is present in the streamed results.
// nolint: revive
func SearchStreamAndAssertTrace(t *testing.T, ctx context.Context, client tempopb.StreamingQuerierClient, info *tempoUtil.TraceInfo, start, end int64) {
expected, err := info.ConstructTraceFromEpoch()
require.NoError(t, err)

attr := tempoUtil.RandomAttrFromTrace(expected)
query := fmt.Sprintf(`{ .%s = "%s"}`, attr.GetKey(), attr.GetValue().GetStringValue())

resp, err := client.Search(context.Background(), &tempopb.SearchRequest{
resp, err := client.Search(ctx, &tempopb.SearchRequest{
Query: query,
Start: uint32(start),
End: uint32(end),
Expand Down
27 changes: 27 additions & 0 deletions modules/frontend/combiner/noop.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
package combiner

import (
"io"

"github.com/grafana/tempo/pkg/tempopb"
)

var _ Combiner = (*genericCombiner[*tempopb.SearchResponse])(nil)

// NewNoOp returns a combiner that doesn't is a no op, and doesn't combine.
// It is used in search streaming, in search streaming keeps trek of search progress
// and combines result on its own from the multi-tenant search.
func NewNoOp() Combiner {
return &genericCombiner[*tempopb.SearchResponse]{
code: 200,
final: &tempopb.SearchResponse{Metrics: &tempopb.SearchMetrics{}},
combine: func(body io.ReadCloser, final *tempopb.SearchResponse) error {
// no op
return nil
},
result: func(response *tempopb.SearchResponse) (string, error) {
// no op
return "", nil
},
}
}
Loading

0 comments on commit 96c4e72

Please sign in to comment.