fix(voice): propagate caller's OTel context through AgentSession#1244
fix(voice): propagate caller's OTel context through AgentSession#1244mrniket wants to merge 1 commit intolivekit:mainfrom
Conversation
Honour otelContext.active() when starting the agent_session span and when building rootSpanContext, so spans the caller created before session.start() become parents of LiveKit spans. Stop calling tracerProvider.register() in setupCloudTracer to avoid replacing the global AsyncLocalStorageContextManager. Fixes livekit#924
|
| }); | ||
| tracerProvider.register(); | ||
|
|
||
| // Don't call tracerProvider.register() — it would replace the global context manager. |
There was a problem hiding this comment.
🔴 Removing tracerProvider.register() from setupCloudTracer breaks async context propagation for cloud traces
The removal of tracerProvider.register() at agents/src/telemetry/traces.ts:261 means no AsyncHooksContextManager is set up for LiveKit Cloud users. NodeTracerProvider.register() was the only call that replaced the default NoopContextManager with a real context manager capable of propagating context across async boundaries.
Without a real context manager, otelContext.active() always returns ROOT_CONTEXT and otelContext.with() doesn't propagate context to async continuations. This breaks span parent-child hierarchies throughout the pipeline:
Affected call sites that rely on implicit context propagation
agents/src/voice/generation.ts:541:const currentContext = otelContext.active()— intended to captureagent_turncontext forllm_node, but now getsROOT_CONTEXTagents/src/voice/generation.ts:707: Same issue fortts_nodespansagents/src/voice/generation.ts:1061-1063:function_toolspans have no explicit context, become orphaned root spansagents/src/voice/agent_session.ts:841:otelContext.with(this.rootSpanContext, runWithContext)— asyncrunWithContextwon't see the session contextagents/src/voice/agent_session.ts:1109:otelContext.with(this.rootSpanContext, async () => ...)— same issue incloseImpl
In the typical cloud flow, setupCloudTracer (called from job.ts:385 via initRecording) is the only place that previously set up the context manager. Users don't register their own provider in this path. The result is that cloud trace dashboards will show flat, unrelated root spans instead of properly nested trace trees.
Prompt for agents
The removal of tracerProvider.register() was intended to avoid replacing a user's existing global context manager. However, it also prevents setting up ANY context manager when no user provider is registered (the common cloud case). The fix should conditionally set up a context manager when one isn't already active, without replacing the global tracer provider. One approach: check if a real context manager is already registered (e.g., by testing otelContext.active() behavior or checking a flag), and if not, explicitly create and enable an AsyncLocalStorageContextManager via context.setGlobalContextManager() without calling the full register(). Another approach: call register() only if no global tracer provider is already set. The key constraint is: don't clobber an existing user-provided context manager, but DO set one up if none exists. Relevant files: agents/src/telemetry/traces.ts (setupCloudTracer function, around line 257-262).
Was this helpful? React with 👍 or 👎 to provide feedback.
| resource, | ||
| spanProcessors: [new MetadataSpanProcessor(metadata), new BatchSpanProcessor(spanExporter)], | ||
| }); | ||
| tracerProvider.register(); |
There was a problem hiding this comment.
I don't think we should delete this line; otherwise the cloud tracing will no longer work.
Summary
its
agent_sessionspan, so spans the caller started beforesession.start()become proper parents. Fixes feat: Add trace context propagation API for Node.js SDK (parity with Python) #924.
setupCloudTracerclarifying thattracerProvider.register()uses set-once semantics in the OTel API — it won't clobber a user's existing
context manager or tracer provider if they've already called
NodeSDK.start().Test plan
agents/src/telemetry/traces.test.tsverifying thatDynamicTracer.startSpaninherits the active OTel context when noexplicit context is passed — the contract the fix depends on.
pnpm test,pnpm lint,pnpm format:checkall pass.