feat(agent): track fs_write file modifications across runs (#113) #116
When a TENEX agent writes a file via `fs_write`, the EmitHook now snapshots the written content (SHA-256 + up to 50 KB of bytes) into the conversation DB, keyed by (conversation_id, agent_pubkey, file_path) with last-write-wins upsert. Capture is gated on the `fs_write` success result so blocked and failed writes never create a bogus baseline.

On a later run of the same agent in the same conversation, bootstrap re-reads each snapshot from the absolute path the writing run resolved and, when the content changed externally, appends a `<system-reminder type="file-modifications">` block to the system prompt. The absolute path is used because the working dir is not stable across runs: the supervisor moves a conversation into a git worktree on first file mutation.

UTF-8 files produce an inline unified diff (similar crate) when <= 8 KB; binary, oversized, or large-diff cases fall back to a size/line summary.

- tenex-conversations: schema v3 `agent_file_snapshots` table, `FileSnapshot`/`NewFileSnapshot` models, `record_file_snapshot` / `get_file_snapshots_for_agent` store methods, upsert integration test.
- tenex-agent: `file_modifications` module (FileSnapshotWriter + reminder renderer), hook capture point, bootstrap reminder injection, similar dep.
- probe: `file-modification-tracking` e2e scenario + verdicts (two runs in one conversation; second run's system prompt must carry the diff).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
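The capture rule in the commit message above (success-gated, last-write-wins, 50 KB cap) can be illustrated with a minimal in-memory sketch. This is not the PR's code: `SnapshotStore`, `Snapshot`, and `changed_externally` are hypothetical names, a `HashMap` stands in for the conversation DB, and the standard library's `DefaultHasher` stands in for SHA-256 so that the sketch has no dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Cap on stored bytes, matching the 50 KB limit in the commit message.
const MAX_SNAPSHOT_BYTES: usize = 50 * 1024;

/// Hypothetical key: (conversation_id, agent_pubkey, file_path).
type SnapshotKey = (String, String, String);

#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    /// Stand-in for the SHA-256 digest (assumption: the real code uses SHA-256).
    digest: u64,
    /// `None` when the content exceeds the cap, i.e. a hash-only snapshot.
    content: Option<Vec<u8>>,
}

/// Non-cryptographic stand-in hash; deterministic for equal inputs.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// In-memory stand-in for the `agent_file_snapshots` table.
#[derive(Default)]
struct SnapshotStore {
    rows: HashMap<SnapshotKey, Snapshot>,
}

impl SnapshotStore {
    /// Records a snapshot only when the tool result reports success.
    /// A later write with the same key replaces the earlier row.
    fn record(&mut self, key: SnapshotKey, tool_result: &str, written: &[u8]) -> bool {
        if !tool_result.starts_with("Successfully wrote") {
            return false; // blocked or failed write: no baseline is created
        }
        let content = (written.len() <= MAX_SNAPSHOT_BYTES).then(|| written.to_vec());
        self.rows.insert(key, Snapshot { digest: digest(written), content });
        true
    }

    /// `None` when no snapshot exists; otherwise whether the bytes on disk
    /// differ from the last recorded write.
    fn changed_externally(&self, key: &SnapshotKey, on_disk: &[u8]) -> Option<bool> {
        self.rows.get(key).map(|snapshot| snapshot.digest != digest(on_disk))
    }
}
```

Hashing both the recorded and the re-read bytes with the same function keeps the comparison meaningful even for hash-only snapshots, where no content is available to diff.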
Code Review: feat(agent): track fs_write file modifications across runs (#113)
Overview
This PR implements file-modification tracking so that when an agent writes a file via …
Must Fix
1. …
Addresses review feedback on #116.

- Deduplicate path resolution: `file_modifications` now reuses `crate::tools::fs::resolve_path` instead of carrying local copies of `resolve_path` / `expand_env_vars`.
- `render_reminder` logs and bails on real DB errors instead of swallowing them via `.ok()?`, so an open/query failure is no longer indistinguishable from "no snapshots".
- `render_file_block` distinguishes a deleted file (reported as an external modification) and an unreadable file (logged) from an unchanged file, instead of collapsing all three to `None`.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
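The distinction drawn for `render_file_block` above can be sketched as an explicit enum instead of an `Option`. This is an illustration under assumptions: `FileState` and `classify` are hypothetical names, and the sketch compares raw bytes where the PR describes comparing SHA-256 hashes.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical outcome of re-reading a snapshotted file.
#[derive(Debug, PartialEq)]
enum FileState {
    /// Content on disk matches the baseline: nothing to report.
    Unchanged,
    /// Content differs: carries the current bytes for diffing.
    Modified(Vec<u8>),
    /// File no longer exists: reported as an external modification.
    Deleted,
    /// Any other I/O failure: should be logged, not silently dropped.
    Unreadable(String),
}

/// Classifies the file at `path` against the bytes the agent last wrote.
fn classify(path: &Path, baseline: &[u8]) -> FileState {
    match fs::read(path) {
        Ok(bytes) if bytes.as_slice() == baseline => FileState::Unchanged,
        Ok(bytes) => FileState::Modified(bytes),
        Err(err) if err.kind() == ErrorKind::NotFound => FileState::Deleted,
        Err(err) => FileState::Unreadable(err.to_string()),
    }
}
```

Keeping `Deleted` and `Unreadable` as separate variants forces every caller to decide what to do with each, which is the property the review feedback asked for.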
…ios pass (#120)

Four fixes to the e2e probe scenarios added in #116 and #117:

- file-modification-tracking: pm agent must be 'generalist' (not orchestrator) so workspace fs tools are available; orchestrators are workspace-restricted.
- file-modification-tracking: cassette reordered so the more-specific second-run entry (with `fileModificationSecondRequest` in history) is checked before the first-run entry, preventing false matches on the shared conversation history.
- file-modification-tracking: scenario driver switches from `waitForObservedEvent` (relay push, blocked by relay ACL deferral) to `waitForStoredMessage` (DB poll).
- hooks-pre-tool: same relay ACL fix; final completion is now detected via DB.
- verdicts: `requestDebug` uses Rust `{:?}` escaping; quote checks updated to match the escaped form (`type=\"file-modifications\"`).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
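The verdict fix above follows from how Rust's `{:?}` formatter renders strings: embedded double quotes come out backslash-escaped, so a check for the raw attribute no longer matches. A small sketch, with hypothetical helper names, of the effect:

```rust
/// Renders a string the way `{:?}` does: wrapped in quotes, with inner
/// double quotes and backslashes escaped.
fn debug_escaped(text: &str) -> String {
    format!("{:?}", text)
}

/// Hypothetical verdict check: looks for the reminder tag in its escaped
/// form, as it would appear inside debug-formatted output.
fn mentions_file_modifications(request_debug: &str) -> bool {
    request_debug.contains(r#"type=\"file-modifications\""#)
}
```

A check written against the unescaped form `type="file-modifications"` would fail on the same input, because a backslash now sits between `=` and the quote.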
Summary
Implements issue #113: TENEX agents now detect when a file they wrote was modified externally between runs of the same agent in the same conversation.
- On `fs_write`, the `EmitHook` snapshots the written content (SHA-256 hash + up to 50 KB of bytes) into the conversation DB, keyed by `(conversation_id, agent_pubkey, file_path)` with last-write-wins upsert.
- Capture is gated on the `fs_write` success result ("Successfully wrote …"), so blocked/failed writes never create a phantom baseline.
- On a later run of the same agent in the same conversation, bootstrap re-reads each snapshot and appends a `<system-reminder type="file-modifications">` block to the system prompt for any externally-modified file.
- UTF-8 files produce an inline unified diff (`similar` crate) when the rendered diff is ≤ 8 KB; binary content, oversized files (> 50 KB, stored hash-only), or oversized diffs fall back to a size/line-count summary.

Design note: `resolved_path`
The working directory is not stable across runs: the supervisor moves a conversation into a dedicated git worktree on first file mutation, so the run that writes a file and a later run that reads it can have different working dirs. The snapshot therefore stores both `file_path` (relative, as passed to `fs_write`; used for display) and `resolved_path` (absolute, used for identity/comparison). The reader re-reads `resolved_path`, so an unchanged file never falsely reports as modified.
Changes
`tenex-conversations`
- Schema v3 `agent_file_snapshots` table (+ lookup index, `UNIQUE(conversation_id, agent_pubkey, file_path)`), `EXPECTED_SCHEMA_VERSION` → 3.
- `FileSnapshot` / `NewFileSnapshot` models; `record_file_snapshot` (upsert) and `get_file_snapshots_for_agent` store methods.
- `AGENTS.md` public-API list updated.
`tenex-agent`
- `file_modifications` module: `FileSnapshotWriter` (capture) + `render_reminder` (diff/summary). Env-var + path resolution mirrors `tools::fs::resolve_path`; symmetric SHA-256 hashing on both sides.
- Hook capture point in `on_tool_result` (above the MCP-error early return), gated on `tool_name == "fs_write"` and the success prefix.
- Bootstrap reminder injection into `system_prompt` alongside the active-tools / shell-task reminders.
- `similar = "2"` dependency.
Probe
- `file-modification-tracking` e2e scenario (`tenex-runtime-probe-scenarios.ts`): run 1 writes `probe-file.txt`; the probe externally overwrites it; run 2 is triggered in the same conversation (threaded via root `e`-tag).
- Verdicts (`tenex-runtime-probe-verdicts.ts`): first run wrote the file; second run's `requestDebug` carries the `file-modifications` reminder for `probe-file.txt`; the diff shows `-original` / `+modified`.
Testing
- `cargo test -p tenex-conversations`: all pass (incl. new `file_snapshot_upsert_is_last_write_wins`).
- `cargo test -p tenex-agent`: all pass, 0 failures (incl. 4 new `file_modifications` tests).
- … (`bun build`).
- The e2e probe was not run here: it requires a relay (`TENEX_RELAY_BIN` / launcher), which is not available in this sandbox. The scenario, mock cassette, and verdict logic follow the established `conversation-reminders` (two publishes threaded into one conversation) and `root-agents-md` (system-prompt inspection via `requestDebug`) patterns. A maintainer with a relay should run `bun scripts/tenex-runtime-probe.ts file-modification-tracking` to confirm end-to-end before merge.

Closes #113
🤖 Generated with Claude Code
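The rendering fallback described in the summary (inline diff only for UTF-8 content whose rendered diff is at most 8 KB, otherwise a size/line-count summary) can be sketched as follows. The names `Rendering` and `choose_rendering` are hypothetical, and the diff renderer is passed in as a closure so the sketch does not depend on the `similar` crate; the summary wording is also an assumption. The hash-only case for files over 50 KB is not modelled here.

```rust
/// Largest rendered diff that is inlined, per the 8 KB limit in the summary.
const MAX_INLINE_DIFF_BYTES: usize = 8 * 1024;

/// Hypothetical result of rendering one externally-modified file.
#[derive(Debug, PartialEq)]
enum Rendering {
    /// Diff text small enough to inline in the reminder.
    InlineDiff(String),
    /// Size/line-count summary used for binary or oversized cases.
    Summary(String),
}

/// Chooses between an inline diff and a summary. `render_diff` is any
/// function that turns two texts into diff output.
fn choose_rendering(
    old: &[u8],
    new: &[u8],
    render_diff: impl Fn(&str, &str) -> String,
) -> Rendering {
    if let (Ok(old_text), Ok(new_text)) = (std::str::from_utf8(old), std::str::from_utf8(new)) {
        let diff = render_diff(old_text, new_text);
        if diff.len() <= MAX_INLINE_DIFF_BYTES {
            return Rendering::InlineDiff(diff);
        }
        // Valid text, but the diff is too large to inline.
        return Rendering::Summary(format!(
            "{} -> {} bytes, {} -> {} lines",
            old.len(),
            new.len(),
            old_text.lines().count(),
            new_text.lines().count()
        ));
    }
    // At least one side is not valid UTF-8: treat as binary.
    Rendering::Summary(format!("binary content, {} -> {} bytes", old.len(), new.len()))
}
```

Measuring the rendered diff rather than the file sizes means a one-line change in a large text file can still be shown inline, while a full rewrite of a small file may not be.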