What problem are you trying to solve?
Remote agents run in a separate service and return their terminal result through a durable workflow callback. The caller does not consume the remote event stream, so the remote model spans and token usage never appear in caller-side traces. As a result, caller-side observability cannot attribute input, output, and cache-read tokens to the invoked remote agent.
Proposed solution
Include optional, best-effort token totals in successful session callbacks. When the caller receives valid usage, emit a local invoke_agent span with the remote agent name and gen_ai usage attributes before resuming the pending result hook.
Keep the field optional so callers and remote agents can roll out independently. Usage collection, parsing, and span emission should never make an otherwise successful callback fail.
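A minimal sketch of the caller-side handling described above, written in Python for illustration only. Every name here is an assumption rather than part of an existing API: the `usage` payload field and its `inputTokens`, `outputTokens`, and `cacheReadTokens` keys, the `handle_callback` function, and the injected `emit_span` and `resume_hook` callables. The attribute names follow the OpenTelemetry GenAI semantic conventions as best understood; the cache-read attribute name in particular should be checked against the convention version in use.

```python
from typing import Any, Callable, Dict, Mapping, Optional

# Hypothetical payload keys mapped to span attribute names (assumed, not confirmed).
USAGE_KEYS = {
    "inputTokens": "gen_ai.usage.input_tokens",
    "outputTokens": "gen_ai.usage.output_tokens",
    "cacheReadTokens": "gen_ai.usage.cache_read.input_tokens",
}


def parse_usage(payload: Mapping[str, Any]) -> Optional[Dict[str, int]]:
    """Return validated token totals, or None when usage is absent or malformed."""
    usage = payload.get("usage")
    if not isinstance(usage, Mapping):
        return None
    attrs: Dict[str, int] = {}
    for key, attr in USAGE_KEYS.items():
        value = usage.get(key)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
            attrs[attr] = value
    return attrs or None


def handle_callback(
    payload: Mapping[str, Any],
    agent_name: str,
    emit_span: Callable[[str, Dict[str, Any]], None],
    resume_hook: Callable[[Any], None],
) -> None:
    """Emit a local invoke_agent span if usage is valid, then resume the hook."""
    try:
        usage = parse_usage(payload)
        if usage is not None:
            emit_span(
                f"invoke_agent {agent_name}",
                {
                    "gen_ai.operation.name": "invoke_agent",
                    "gen_ai.agent.name": agent_name,
                    **usage,
                },
            )
    except Exception:
        # Best-effort: parsing or span emission must never fail the callback.
        pass
    # The pending result hook is always resumed, with or without usage.
    resume_hook(payload.get("result"))
```

Because `emit_span` is injected, the same handler could be wired to a real tracer or to a list in tests. A callback from an older remote agent that omits `usage` takes the same path as one with malformed usage: no span is emitted and the hook still resumes.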
Alternatives considered
Relying only on observability from the remote deployment does not attribute usage to the calling application and requires manual cross-service correlation.