Add agn replay for recorded tool actions #8

@manojsinghnegiwd

Summary

Add an agn replay command that reruns a previously recorded AGN run by replaying the concrete tool actions it performed, such as write_file, read_file, shell commands, and patches. Replay should be based on what AGN actually did, not on the original chat/messages.

This lets the exact actions from a prior run be applied again deterministically.

Task

Implement replay support with entry points such as:

agn replay --environment <environment-id>
agn replay --conversation <conversation-id>

The implementation should support recording enough structured execution data during an AGN run to later replay the meaningful side-effecting actions. At minimum, the replay data should capture the ordered tool/action calls, relevant inputs, outputs or status where needed, and enough metadata to identify and validate the target workspace/environment.
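As a starting point for discussion, the recorded data described above could be sketched as an ordered list of action records plus a small header identifying the run. This is only an illustration: the names `RecordedAction`, `RunRecord`, `to_jsonl`, and `load_jsonl`, the field names, and the JSON Lines layout are all assumptions and not part of AGN today.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Any, Dict, List, Optional


@dataclass
class RecordedAction:
    """One tool call captured during a run (hypothetical schema)."""
    seq: int                       # position in the original run, preserves order
    tool: str                      # e.g. "write_file", "read_file", "shell"
    inputs: Dict[str, Any]         # arguments the tool was called with
    status: str = "ok"             # outcome observed during the original run
    output: Optional[str] = None   # captured output, where it is needed later


@dataclass
class RunRecord:
    """Header metadata plus the ordered actions of one run (hypothetical)."""
    conversation_id: str
    environment_id: Optional[str]
    workspace_root: str
    actions: List[RecordedAction] = field(default_factory=list)

    def to_jsonl(self) -> str:
        # First line is the header, then one line per action in seq order.
        header = {
            "type": "header",
            "conversation_id": self.conversation_id,
            "environment_id": self.environment_id,
            "workspace_root": self.workspace_root,
        }
        lines = [json.dumps(header)]
        for action in sorted(self.actions, key=lambda a: a.seq):
            lines.append(json.dumps({"type": "action", **asdict(action)}))
        return "\n".join(lines)


def load_jsonl(text: str) -> RunRecord:
    """Rebuild a RunRecord from the JSON Lines text produced by to_jsonl."""
    rows = [json.loads(line) for line in text.splitlines() if line.strip()]
    header = rows[0]
    record = RunRecord(
        header["conversation_id"],
        header["environment_id"],
        header["workspace_root"],
    )
    for row in rows[1:]:
        row.pop("type")
        record.actions.append(RecordedAction(**row))
    return record
```

An append-only, line-per-action layout would let AGN write each record as the tool call completes, so a crashed run still leaves a usable prefix. Whether that matters here is a design choice for the implementation.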

Use Cases

  1. Repeat the same run: Re-run the same sequence of actions as often as needed without re-prompting the model or relying on message history.
  2. Apply environment-tested changes: Run AGN in a disposable environment, inspect/approve the resulting behavior, then replay the recorded actions against another target such as the real working directory.

Acceptance Criteria

  • AGN records the concrete actions/tool calls performed during a run in a replayable format.
  • agn replay --conversation <conversation-id> replays actions from a recorded conversation/run.
  • agn replay --environment <environment-id> replays actions associated with a disposable environment run.
  • Replay applies the recorded actions in the original order.
  • Replay does not depend on the original prompt/messages to regenerate behavior.
  • Replay has clear behavior for reads, writes, patches, shell commands, failures, and mismatched preconditions.
  • Replay reports which actions were replayed and whether each succeeded or failed.
  • Documentation explains the two main workflows: deterministic repeat runs and applying changes that were first tested in an isolated environment.
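The ordering and reporting criteria above could look roughly like the loop below. It is a minimal sketch under several assumptions: actions are plain dicts with `seq`, `tool`, and `inputs` keys, only `write_file` and `read_file` are handled, and replay stops at the first failure. The handler names, the `StepResult` type, and the stop-on-failure policy are hypothetical, and the failure policy in particular is still an open question.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List


@dataclass
class StepResult:
    seq: int
    tool: str
    ok: bool
    detail: str


def _write_file(root: Path, inputs: dict) -> str:
    # Recreate the file exactly as recorded, relative to the target root.
    target = root / inputs["path"]
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(inputs["content"])
    return f"wrote {inputs['path']}"


def _read_file(root: Path, inputs: dict) -> str:
    # Reading is replayed only to confirm the file is present and readable.
    (root / inputs["path"]).read_text()
    return f"read {inputs['path']}"


HANDLERS: Dict[str, Callable[[Path, dict], str]] = {
    "write_file": _write_file,
    "read_file": _read_file,
}


def replay(actions: List[dict], root: Path) -> List[StepResult]:
    """Apply recorded actions in seq order; stop at the first failure."""
    results: List[StepResult] = []
    for action in sorted(actions, key=lambda a: a["seq"]):
        seq, tool = action["seq"], action["tool"]
        handler = HANDLERS.get(tool)
        if handler is None:
            results.append(StepResult(seq, tool, False, "no handler for tool"))
            break
        try:
            detail = handler(root, action["inputs"])
        except (OSError, KeyError) as exc:
            results.append(StepResult(seq, tool, False, str(exc)))
            break
        results.append(StepResult(seq, tool, True, detail))
    return results


def print_report(results: List[StepResult]) -> None:
    # One line per replayed action, in the order it was applied.
    for r in results:
        mark = "ok" if r.ok else "FAILED"
        print(f"[{r.seq}] {r.tool}: {mark} ({r.detail})")
```

Nothing in this loop touches the original prompt or messages; the recorded actions are its only input.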

Open Questions

  • Should replay be fully automatic, interactive/confirming each side-effecting step, or support both modes?
  • Should read_file and other non-mutating actions be replayed, skipped, or validated against expected prior outputs?
  • How should shell commands be classified and guarded, especially destructive or environment-specific commands?
  • Should replay validate file preconditions/hashes before applying writes or patches?
  • What should happen if the target workspace differs from the original environment?
  • Should the CLI option be named --environment rather than --environement?
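To make the precondition question concrete, one possible answer is to record a content hash of each file before the original write and refuse to replay the write if the target no longer matches. The sketch below assumes SHA-256 and a skip-on-mismatch policy; `sha256_of` and `guarded_write` are hypothetical helper names, and none of this is a decided design.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> Optional[str]:
    """Return the hex SHA-256 of a file, or None if it does not exist."""
    if not path.exists():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def guarded_write(path: Path, content: str, expected_hash: Optional[str]) -> str:
    """Write content only if the file's current state matches the recording.

    expected_hash is the hash recorded before the original write;
    None means the file did not exist at that point.
    """
    if sha256_of(path) != expected_hash:
        # Assumed policy: skip rather than overwrite a diverged target.
        return "skipped: precondition mismatch"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return "applied"
```

An interactive mode could turn the skipped case into a confirmation prompt rather than a hard stop, which ties into the first open question above.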

References

  • Support AGN disposable Docker environments #7 — Describes AGN disposable Docker environments. Replay should integrate with this workflow by allowing a run performed safely in an isolated environment to be replayed/applied later outside that environment after inspection.
