Tracing: frontend downstream span missing #6635

Open
rodrigo14miguel opened this issue Aug 23, 2023 · 0 comments · May be fixed by #8251

Thanos, Prometheus and Golang version used:
Thanos 0.31.0
Grafana Tempo 2.1.1
Otel Collector 0.81.0

Object Storage Provider:
AWS S3

What happened:
The Frontend does not emit a span of kind "client" for the requests it sends to the downstream URL.
The Frontend looks up the fifocache and finds 0 keys, then calls a "retry" routine to fetch the data from the downstream URL (see attached image).
There is a span of kind "server" for the http url "/api/v1/query_range" on the Querier side (the downstream) but there is no matching "client" span from the Frontend side.
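
To make the symptom concrete, here is a minimal sketch of a check that flags "server" spans whose parent is not a "client" span. The `Span` struct, its field names, and the three-span example trace are assumptions made for illustration; they are not the export format of Tempo or the OTel Collector, and the real trace may be shaped differently.

```go
package main

import "fmt"

// Span holds the few fields this check needs. The field names are
// illustrative and not tied to any tracing backend's export format.
type Span struct {
	ID       string
	ParentID string // empty for the root span of a trace
	Service  string
	Kind     string // "server", "client" or "internal"
	Name     string
}

// serverSpansWithoutClientParent returns every non-root server span whose
// parent is either absent from the input or not of kind "client".
func serverSpansWithoutClientParent(spans []Span) []Span {
	byID := make(map[string]Span, len(spans))
	for _, s := range spans {
		byID[s.ID] = s
	}
	var out []Span
	for _, s := range spans {
		if s.Kind != "server" || s.ParentID == "" {
			continue // root server spans have no caller inside the trace
		}
		parent, ok := byID[s.ParentID]
		if !ok || parent.Kind != "client" {
			out = append(out, s)
		}
	}
	return out
}

// exampleTrace is assumed data that mimics the report: the Querier's
// server span hangs directly off an internal Frontend span.
func exampleTrace() []Span {
	return []Span{
		{ID: "1", Service: "frontend", Kind: "server", Name: "/api/v1/query_range"},
		{ID: "2", ParentID: "1", Service: "frontend", Kind: "internal", Name: "retry"},
		{ID: "3", ParentID: "2", Service: "querier", Kind: "server", Name: "/api/v1/query_range"},
	}
}

func main() {
	for _, s := range serverSpansWithoutClientParent(exampleTrace()) {
		fmt.Printf("server span %q in %s has no client parent\n", s.Name, s.Service)
	}
}
```

With the assumed trace above, only the Querier's server span is flagged, because its parent is an internal Frontend span rather than a client span.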

What you expected to happen:
Expected a "client" kind span from the Frontend when a request is made to the downstream URL, something like an "HTTP Outgoing Request".

How to reproduce it (as minimally and precisely as possible):
Using the following architecture:

graph LR
A[Grafana] --http--> B(Frontend)
B --http--> C(Global Query)
N --grpc--> F
C --grpc--> N(Query pool)

N --grpc--> I
N --grpc--> J
D(Compactor) --http--> E{S3}
F(Storegateway Pool) --http--> E

I(SidecarX/PrometheusX) --http--> E
J(Receiver) --http--> E
M(PrometheusY) --http--> J
subgraph No grpc inbound
  M
end
subgraph Data at rest
  E
end

Frontend uses local memory for cache.

All Thanos components send tracing data to an OTel Collector that relays to an all-in-one Grafana Tempo instance (no special config).

Thanos tracing config:

type: OTLP
config:
  client_type: "grpc"
  insecure: true
  endpoint: "tempo.distributor.url:port"

Trace "ServiceName" is set via "OTEL_SERVICE_NAME" environment variable.

Full logs to relevant components:

Anything else we need to know:
For my architecture, I use a global Querier. I've tried pointing the Frontend downstream directly to a Querier from the Query pool (to bypass the global Querier), but it behaves the same.
image

Foxlik added a commit to Foxlik/thanos that referenced this issue May 15, 2025
Fix thanos-io#6635 by adding a client span to queryrange roundTripper Do method.
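
The general idea of wrapping an outgoing HTTP call in a client span can be sketched as below. This is not the code from the referenced commit: `Span`, `Recorder`, `clientSpanRoundTripper`, `callDownstream`, and the `querier.example` host are hypothetical stand-ins, built on the standard library only so the example runs without a tracing SDK. The downstream is faked in-process.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Span is a hypothetical, minimal stand-in for a tracing library's span type.
type Span struct {
	Name   string
	Kind   string // "client" for outgoing calls, "server" for incoming ones
	Status int    // HTTP status of the response; 0 if the round trip failed
	Err    error
}

// Recorder is a hypothetical in-memory sink standing in for a tracer/exporter.
type Recorder struct {
	mu    sync.Mutex
	spans []Span
}

// Record stores a finished span.
func (r *Recorder) Record(s Span) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.spans = append(r.spans, s)
}

// Spans returns a copy of the spans recorded so far.
func (r *Recorder) Spans() []Span {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]Span(nil), r.spans...)
}

// clientSpanRoundTripper wraps another http.RoundTripper and records one
// span of kind "client" for every outgoing request.
type clientSpanRoundTripper struct {
	next http.RoundTripper
	rec  *Recorder
}

func (t clientSpanRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	span := Span{Name: "HTTP " + req.Method + " " + req.URL.Path, Kind: "client"}
	resp, err := t.next.RoundTrip(req)
	if err != nil {
		span.Err = err
	} else {
		span.Status = resp.StatusCode
	}
	t.rec.Record(span)
	return resp, err
}

// roundTripperFunc adapts a plain function to http.RoundTripper.
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(req *http.Request) (*http.Response, error) {
	return f(req)
}

// callDownstream sends one query_range request through the wrapped transport.
// The downstream is faked in-process; the host name is a placeholder.
func callDownstream(rec *Recorder) (int, error) {
	fakeQuerier := roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
	})
	rt := clientSpanRoundTripper{next: fakeQuerier, rec: rec}

	req, err := http.NewRequest(http.MethodGet, "http://querier.example/api/v1/query_range", nil)
	if err != nil {
		return 0, err
	}
	resp, err := rt.RoundTrip(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	rec := &Recorder{}
	status, err := callDownstream(rec)
	fmt.Println("downstream status:", status, "error:", err)
	for _, s := range rec.Spans() {
		fmt.Printf("span kind=%s name=%q status=%d\n", s.Kind, s.Name, s.Status)
	}
}
```

In a real tracer the wrapper would also need to propagate the new span's context on the outgoing request, so that the Querier's "server" span is recorded as a child of the "client" span; that step is left out of this sketch.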