fix: propagate trace context end-to-end for agent Services #1297
syn-zhu wants to merge 3 commits into kagent-dev:main
Pull request overview
This PR updates the Agent manifest translation so that the Kubernetes Service created for each Agent CR sets an explicit appProtocol, enabling AgentGateway’s A2A plugin to discover and route directly to agent Services (preserving HTTP headers for distributed tracing).
Changes:
- Set `spec.ports[0].appProtocol: kgateway.dev/a2a` on the per-Agent Service port.
Hey there, thanks for the PR, this is a great idea! You will need to update the goldens as well as sign your commits for us to merge this.
Thanks! Just saw this now, but it looks like the issue was already fixed by opspawn@d9f2a3a :) Gonna just close this PR, thanks!
Oops @EItanya, I realized the commit I linked wasn't actually a merged commit, but rather a branch. I've updated and reopened my PR to address the things you mentioned. Please lmk if there's anything else!
13799c1 to c2c5902
c2c5902 to 256bbda
ffe2d68 to f965911
Hi @syn-zhu, could you please test this locally on your end first? As it's Claude-generated code, a brief manual validation, such as posting before/after screenshots from a tracing tool, would be the minimum step to ensure it's ready for contribution.
End-to-End Test Results: Trace Propagation
Tested on a live EKS cluster running kagent v0.7.13 with Langfuse (OTLP-backed) as the trace backend. Each test sends an A2A
Setup
Stage 0: Baseline (upstream v0.7.13, no patches)
Problem: Agent uses the default
Stage 1: Commit 1 only (Python SDK: W3C propagator + AioHttpClientInstrumentor)
Sent request directly to agent pod (bypassing controller) to isolate the Python SDK changes.
Result:
Stage 2: Both commits (Python SDK + Go controller trace propagation)
A2A request sent through the controller (the production path).
Before (upstream controller, patched agent image)
Problem: The upstream controller strips
After (patched controller + patched agent)
Result: The patched controller extracts
Summary
f965911 to 3d2c566
updated
Two changes to enable end-to-end W3C TraceContext propagation:

1. Add AppProtocol "kgateway.dev/a2a" to agent Service port so AgentGateway can discover agent Services directly via kgateway protocol matching, rather than proxying through the controller. Update all golden test outputs to include the new appProtocol field.
2. Set up W3C TraceContext propagator in the Python agent SDK tracing configuration so agent pods correctly extract incoming traceparent headers and propagate them on outgoing requests.

Fixes kagent-dev#1295

Signed-off-by: Simon Zhu <[email protected]>
…t pods

The A2A server deserializes incoming HTTP requests into JSON-RPC params, discarding the original HTTP headers. When the controller forwards requests to agent pods via the A2A client, trace context headers (traceparent, tracestate) are lost, breaking distributed tracing.

Fix: capture W3C trace context headers from the incoming request into the Go context in the A2A auth middleware, then inject them into outgoing requests in the A2ARequestHandler. This closes the gap between the A2A server (which strips headers) and the A2A client (which constructs new HTTP requests).

Also update the agent_with_passthrough golden test (added in kagent-dev#1327) to include the appProtocol field.

Signed-off-by: Simon Zhu <[email protected]>
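The capture-then-inject pattern described in this commit message can be sketched roughly as follows. The actual change lives in the Go controller; this Python version only illustrates the idea, with a `contextvars.ContextVar` standing in for values stored on a Go `context.Context`. The function names `capture_trace_headers` and `inject_trace_headers` are hypothetical and do not appear in the PR.

```python
import contextvars

# W3C Trace Context header names that must survive the controller hop.
TRACE_HEADERS = ("traceparent", "tracestate")

# Hypothetical per-request storage (assumption: stands in for the Go context).
_trace_headers = contextvars.ContextVar("trace_headers", default=None)


def capture_trace_headers(incoming_headers):
    """Store trace headers from an incoming request; header names are case-insensitive."""
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    captured = {name: lowered[name] for name in TRACE_HEADERS if name in lowered}
    _trace_headers.set(captured)
    return captured


def inject_trace_headers(outgoing_headers):
    """Return a copy of the outgoing headers with any captured trace headers added."""
    merged = dict(outgoing_headers)
    merged.update(_trace_headers.get() or {})
    return merged
```

In this sketch the middleware would call `capture_trace_headers` before the request body is turned into JSON-RPC params, and the client side would call `inject_trace_headers` when it builds the new HTTP request to the agent pod.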
3d2c566 to 32027e9
krisztianfekete left a comment
This looks mostly good, but can you please look at the two comments I've just added?
    logging.info("Enabling tracing")
    # Set up W3C TraceContext propagator so incoming traceparent headers
    # are extracted and outgoing requests carry them forward.
    set_global_textmap(CompositeHTTPPropagator([TraceContextTextMapPropagator()]))
Are you sure this is necessary? If it is (but I don't think it is), we should at least preserve the existing propagators that this overrides.
These are unrelated to auth. Can we move these into internal/ or somewhere tracing-specific?







Summary
Three fixes to enable end-to-end W3C TraceContext propagation across the controller→agent boundary:

1. AppProtocol on agent Services: Set `appProtocol: kgateway.dev/a2a` on the Service port created for each Agent CR so AgentGateway's A2A plugin can discover agent Services directly via protocol matching, rather than proxying through the kagent controller (which drops HTTP headers including `traceparent`).
2. W3C TraceContext propagator in Python SDK: Configure the W3C TraceContext propagator in `kagent-core` tracing setup so agent pods correctly extract incoming `traceparent` headers and propagate them on outgoing requests.
3. Trace header propagation in Go controller: The A2A server deserializes incoming HTTP requests into JSON-RPC params, discarding the original HTTP headers. When the controller forwards requests to agent pods via the A2A client, `traceparent`/`tracestate` are lost. Fix: capture W3C trace context headers from the incoming request into the Go context in the A2A auth middleware (`A2AAuthenticator.Wrap`), then inject them into outgoing requests in `A2ARequestHandler`.

All golden test outputs have been updated to include the new `appProtocol` field, including `agent_with_passthrough` (added in #1327).

Incorporates changes from opspawn@d9f2a3a.
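As a rough illustration of the first fix, the check below looks for the expected `appProtocol` on every port of a Service manifest. It is a sketch, not the project's golden-test code: the helper name `ports_missing_app_protocol` is hypothetical, and the example manifest (name `my-agent`, port name `http`, port `8080`) is made up for illustration.

```python
import json

EXPECTED_APP_PROTOCOL = "kgateway.dev/a2a"


def ports_missing_app_protocol(service_manifest):
    """Return the names of Service ports that lack the expected appProtocol."""
    ports = service_manifest.get("spec", {}).get("ports", [])
    return [
        port.get("name", str(port.get("port")))
        for port in ports
        if port.get("appProtocol") != EXPECTED_APP_PROTOCOL
    ]


# Illustrative manifest only; the names and port number are assumptions.
example_service = json.loads("""
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {"name": "my-agent"},
  "spec": {
    "ports": [
      {"name": "http", "port": 8080, "appProtocol": "kgateway.dev/a2a"}
    ]
  }
}
""")
```

Calling `ports_missing_app_protocol(example_service)` returns an empty list when every port carries the field.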
Test plan
- `testdata/outputs/*.json` files include `appProtocol: "kgateway.dev/a2a"` on Service ports
- `go test ./internal/httpserver/auth/...` passes
- `kubectl get svc <agent> -o jsonpath='{.spec.ports[0].appProtocol}'` returns `kgateway.dev/a2a`
- `traceparent` header through gateway → controller → agent pod, verify trace ID is preserved end-to-end

Closes #1295
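For the last item, "preserved end-to-end" means the trace ID field of `traceparent` stays the same at every hop, while the parent (span) ID is expected to change. A minimal sketch of that comparison, assuming version-00-shaped headers of the form `version-traceid-parentid-flags`, is below; the helper names are hypothetical and the sample IDs are the illustrative values used in the W3C Trace Context specification.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def trace_id_of(traceparent):
    """Extract the trace ID from a traceparent header value, or raise ValueError."""
    match = _TRACEPARENT_RE.match(traceparent.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {traceparent!r}")
    _version, trace_id, parent_id, _flags = match.groups()
    # All-zero trace or parent IDs are invalid per the W3C Trace Context spec.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError(f"invalid all-zero ID in traceparent: {traceparent!r}")
    return trace_id


def same_trace(sent, received):
    """True if two traceparent values belong to the same trace."""
    return trace_id_of(sent) == trace_id_of(received)
```

Comparing the header sent to the gateway with the one observed at the agent pod using `same_trace` gives a quick pass/fail signal before looking at the trace backend.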
🤖 Generated with Claude Code