[RFC] RL-Insight Trajectory Collection via Gateway Instrumentation

### Feature proposal

https://github.com/verl-project/rl-insight

Instrument `GatewaySession.run_generation` to collect per-turn input/output records (prompt messages + decoded assistant response + metadata), and expose a `get_summary()` interface so rl-insight can pull structured session data for custom visualization and offline debugging.

### Motivation and use case

The Gateway already sees the full generation context — `run_generation` receives prompt messages and returns a decoded `assistant_msg`. But this data is discarded during trajectory materialization: only token IDs, masks, and logprobs survive into `Trajectory`. This makes debugging token-blind (can't inspect what the model actually said without offline decode) and blocks external observability tools like rl-insight from building response-aware dashboards.


### Related area

`uni_agent/gateway/gateway.py` (`_handle_chat_completions` → `session.run_generation`), `uni_agent/gateway/session/session.py` (`GatewaySession`, `TrajectoryBuffer`), `uni_agent/framework/framework.py` (session lifecycle).



### Known gap

1. `TrajectoryBuffer` / `Trajectory` carry only token IDs — no raw response text. The summary interface holds raw text in parallel; adding `response_text` to `Trajectory` itself is a longer-term follow-up.
2.  Consider providing a summary/info aggregation interface, serving both uni-agent's own debugging needs and external tools like rl-insight that can directly fetch data for visualization.

@zackcxb @wuxibin89  @mengchengtang

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RFC] RL-Insight Trajectory Collection via Gateway Instrumentation #71

Feature proposal

Motivation and use case

Related area

Known gap

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[RFC] RL-Insight Trajectory Collection via Gateway Instrumentation #71

Description

Feature proposal

Motivation and use case

Related area

Known gap

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions