Skip to content

[RFC] RL-Insight Trajectory Collection via Gateway Instrumentation #71

Description

@tardis-key

Feature proposal

https://github.com/verl-project/rl-insight

Instrument GatewaySession.run_generation to collect per-turn input/output records (prompt messages + decoded assistant response + metadata), and expose a get_summary() interface so rl-insight can pull structured session data for custom visualization and offline debugging.

Motivation and use case

The Gateway already sees the full generation context — run_generation receives prompt messages and returns a decoded assistant_msg. But this data is discarded during trajectory materialization: only token IDs, masks, and logprobs survive into Trajectory. This makes debugging token-blind (can't inspect what the model actually said without offline decode) and blocks external observability tools like rl-insight from building response-aware dashboards.

Related area

uni_agent/gateway/gateway.py (_handle_chat_completionssession.run_generation), uni_agent/gateway/session/session.py (GatewaySession, TrajectoryBuffer), uni_agent/framework/framework.py (session lifecycle).

Known gap

  1. TrajectoryBuffer / Trajectory carry only token IDs — no raw response text. The summary interface holds raw text in parallel; adding response_text to Trajectory itself is a longer-term follow-up.
  2. Consider providing a summary/info aggregation interface, serving both uni-agent's own debugging needs and external tools like rl-insight that can directly fetch data for visualization.

@zackcxb @wuxibin89 @mengchengTang

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions