Motivation
Testing the v1 implementation phases means re-running the demo over and over (proposals/v1-implementation-plan.md → Demo-run protocol per phase). Today the only reset is a full teardown (tilth reset → workspace.reset_session_state: git worktree remove --force + git branch -D + rm -rf sessions/<id>/), so each test cycle forces a fresh tilth prep-feature — an interactive interview (TTYFrontend ask_user), the slowest and least-automatable part of the loop.
When we're not testing the seed workflow — only the worker/evaluator/ledger/case mechanics of a given phase — the seed is reusable. We want to re-run tilth run against the same committed seed without re-interviewing.
Proposal
tilth reset --to-seed [<session_id>] — a tilth testing/research flag. It rewinds time to the instant prep-feature finished: the seed is intact, and every trace of post-seed activity is gone from both the code and the logs, as if the run never happened.
This is deliberately destructive and lossy — that's the feature, not a flaw. We are not preserving the prior run for comparison (if you want that, run separate sessions). The point is a pristine slate so the next run's logs and worktree carry zero noise from the last attempt. A [y/N] confirm is the safety gate.
Anchor: the seed_committed event records the seed commit (payload.sha + branch) — a clean HEAD on session/<id> before any task work. That sha is the rewind target.
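A minimal sketch of how the anchor lookup might work, assuming events.jsonl holds one JSON object per line and that the event name lives under a "type" key. That key name and the helper find_seed_anchor are hypothetical; only payload.sha and branch come from the description above.

```python
import json
from typing import Optional


def find_seed_anchor(event_lines: list[str]) -> Optional[dict]:
    """Return the sha and branch of the first seed_committed event, or None."""
    for raw in event_lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines in the log
        event = json.loads(raw)
        # Assumption: the event name is stored under "type".
        if event.get("type") == "seed_committed":
            payload = event.get("payload", {})
            return {"sha": payload.get("sha"), "branch": payload.get("branch")}
    return None  # session never finished prep: no rewind target
```

Returning None rather than raising keeps the "no rewind target" case explicit for the caller, which can then turn it into the clean error described under the caveats.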
Rewind actions:
- Worktree + branch: git reset --hard <seed_sha> + git clean -fd. Drops all post-seed commits (per-task commits and any FAILED commit, loop.py:1275/:1287) and untracked task files. The orphaned commits become unreachable and are left for git to garbage-collect; we keep no tip tag (no trace, by design).
- prd.json: reset every task status back to pending.
- checkpoint.json: status → prepared, clear last-completed-task, reset tokens_used to its post-interview value (the run's spend is erased).
- events.jsonl: truncate to the seed boundary. Keep only the prep events up to and including seed_committed (session_start[phase=prep-feature], session_prepared, seed_committed); drop every event after. No archive.
- Delete outright: ledger/, progress.txt, proposed-learnings.md, summary.json, chat.html (rebuilt on the next run).
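The rewind actions above can be sketched as three small pure functions. This is an illustration under stated assumptions, not the actual implementation: the "type" key on events, the top-level "tasks" list in prd.json, and all three function names are hypothetical. The git commands are returned as argument lists rather than executed, so the sketch has no side effects.

```python
import json


def truncate_to_seed(event_lines: list[str]) -> list[str]:
    """Keep events up to and including seed_committed; drop everything after."""
    kept = []
    for raw in event_lines:
        if not raw.strip():
            continue
        kept.append(raw)
        # Assumption: the event name is stored under "type".
        if json.loads(raw).get("type") == "seed_committed":
            return kept
    raise ValueError("no seed_committed event: nothing to rewind to")


def reset_task_statuses(prd: dict) -> dict:
    """Return a copy of the PRD with every task status set back to pending."""
    # Assumption: prd.json has a top-level "tasks" list of dicts.
    tasks = [{**task, "status": "pending"} for task in prd.get("tasks", [])]
    return {**prd, "tasks": tasks}


def git_rewind_commands(seed_sha: str) -> list[list[str]]:
    """Commands to run inside the session worktree, in order."""
    return [
        ["git", "reset", "--hard", seed_sha],  # drop post-seed commits
        ["git", "clean", "-fd"],  # drop untracked task files and directories
    ]
```

A caller would run each command with subprocess.run(cmd, cwd=worktree, check=True) and write the truncated events and reset PRD back to disk only after the git step succeeds, so a failed reset does not leave logs and code out of step.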
Postcondition: sessions/<id>/ is byte-for-byte equivalent to its state the moment prep-feature returned (modulo timestamps) — tilth run against it starts as a clean first attempt.
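One way to spot-check part of this postcondition is to assert that none of the deleted artifacts survive in the session directory. The sketch below is hypothetical (the helper name leftover_artifacts is not from the codebase) and covers only the "delete outright" list, not the git or events.jsonl state.

```python
from pathlib import Path

# Names taken from the "Delete outright" list in this proposal.
POST_SEED_ARTIFACTS = (
    "ledger",
    "progress.txt",
    "proposed-learnings.md",
    "summary.json",
    "chat.html",
)


def leftover_artifacts(session_dir: Path) -> list[str]:
    """Return the post-seed artifacts that still exist under session_dir."""
    return [name for name in POST_SEED_ARTIFACTS if (session_dir / name).exists()]
```

An empty list means this part of the rewind held; anything else names the files or directories that leaked through.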
Caveats to encode
- Destructive, irreversible. The confirm prompt must say so plainly: the prior run's code and logs are unrecoverable. (Contrast with full tilth reset, which is also destructive but at least obvious; --to-seed looks gentler, so the warning matters more.)
- Phase-boundary compatibility. Per the plan's "no backwards-compat across phase boundaries," a seed written under one phase's seeder may not be re-runnable under a later phase if the prd.json/seed shape changed. --to-seed is for iterating within a phase. If the loaded seed fails a shape check, error out and point at full reset + prep-feature.
- Requires a seed_committed event. A session that never finished prep has no rewind target → clean error, no-op.
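The three caveats could be encoded as a confirmation gate plus a preflight check, roughly as follows. Everything here is an assumption made for illustration: the warning wording, the RewindError type, the preflight helper, and the required_keys used for the shape check (the real seed shape is not specified in this proposal).

```python
from typing import Optional

# Hypothetical wording; the requirement is only that it states irreversibility plainly.
WARNING_TEMPLATE = (
    "tilth reset --to-seed will permanently erase all post-seed commits, events, "
    "and run artifacts for session {session_id}. This cannot be undone. "
    "Continue? [y/N] "
)


class RewindError(Exception):
    """Raised when the rewind must abort without touching anything."""


def is_confirmed(answer: str) -> bool:
    """[y/N] semantics: only an explicit yes proceeds; empty input means No."""
    return answer.strip().lower() in {"y", "yes"}


def preflight(seed_sha: Optional[str], prd: dict, required_keys=("tasks",)) -> None:
    """Abort before any destructive step if the session cannot be rewound."""
    if seed_sha is None:
        raise RewindError(
            "no seed_committed event: session never finished prep-feature, "
            "nothing to rewind (no changes made)"
        )
    # Assumption: a minimal shape check is "these top-level keys exist".
    missing = [key for key in required_keys if key not in prd]
    if missing:
        raise RewindError(
            f"seed shape check failed (missing {missing}): "
            "run a full tilth reset and tilth prep-feature instead"
        )
```

Running preflight before the prompt means the user is only asked to confirm a rewind that can actually proceed, and both failure paths leave the session untouched.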
Out of scope
- pyproject.toml pollution (separate; Research: Docker Sandboxes (sbx) as opt-in process isolation #13).
Related
- proposals/v1-implementation-plan.md: Demo-run protocol per phase (this directly speeds that loop)
- workspace.reset_session_state, loop._do_reset, ws.commit_task, the seed_committed / commit events