feat(agent): track fs_write file modifications across runs (#113) by pablof7z · Pull Request #116 · tenex-chat/tenex

pablof7z · 2026-05-30T14:23:27Z

Summary

Implements issue #113: TENEX agents now detect when a file they wrote was modified externally between runs of the same agent in the same conversation.

When an agent writes a file via fs_write, the EmitHook snapshots the written content (SHA-256 hash + up to 50 KB of bytes) into the conversation DB, keyed by (conversation_id, agent_pubkey, file_path) with last-write-wins upsert.
Capture is gated on the fs_write success result ("Successfully wrote …"), so blocked/failed writes never create a phantom baseline.
On a later run of the same agent in the same conversation, bootstrap re-reads each snapshot, compares against current on-disk state, and appends a <system-reminder type="file-modifications"> block to the system prompt for any externally-modified file.
UTF-8 files produce an inline unified diff (similar crate) when the rendered diff is ≤ 8 KB; binary content, oversized files (> 50 KB, stored hash-only), or oversized diffs fall back to a size/line-count summary.

Design note: `resolved_path`

The working directory is not stable across runs — the supervisor moves a conversation into a dedicated git worktree on first file mutation, so the run that writes a file and a later run that reads it can have different working dirs. The snapshot therefore stores both file_path (relative, as passed to fs_write — used for display) and resolved_path (absolute, used for identity/comparison). The reader re-reads resolved_path, so an unchanged file never falsely reports as modified.

Changes

tenex-conversations

Schema v3: agent_file_snapshots table (+ lookup index, UNIQUE(conversation_id, agent_pubkey, file_path)), EXPECTED_SCHEMA_VERSION → 3.
FileSnapshot / NewFileSnapshot models; record_file_snapshot (upsert) and get_file_snapshots_for_agent store methods.
Integration test proving upsert is last-write-wins and rows are scoped per agent/path.
AGENTS.md public-API list updated.

tenex-agent

New file_modifications module: FileSnapshotWriter (capture) + render_reminder (diff/summary). Env-var + path resolution mirrors tools::fs::resolve_path; symmetric SHA-256 hashing on both sides.
Hook capture point in on_tool_result (above the MCP-error early return), gated on tool_name == "fs_write" and the success prefix.
Bootstrap appends the reminder to system_prompt alongside the active-tools / shell-task reminders.
similar = "2" dependency.
4 unit tests: skip-on-failure, no-warn-when-unchanged, inline-diff-on-change, summary-on-oversized.

Probe

file-modification-tracking e2e scenario (tenex-runtime-probe-scenarios.ts): run 1 writes probe-file.txt; the probe externally overwrites it; run 2 is triggered in the same conversation (threaded via root e-tag).
Verdicts (tenex-runtime-probe-verdicts.ts): first run wrote the file; second run's requestDebug carries the file-modifications reminder for probe-file.txt; the diff shows -original / +modified.

Testing

cargo test -p tenex-conversations — all pass (incl. new file_snapshot_upsert_is_last_write_wins).
cargo test -p tenex-agent — all pass, 0 failures (incl. 4 new file_modifications tests).
Both agent binaries build clean; the probe TS files bundle without errors (bun build).
The e2e probe scenario and verdicts are included but were NOT executed in this environment — running the probe requires a TENEX relay binary (TENEX_RELAY_BIN / launcher), which is not available in this sandbox. The scenario, mock cassette, and verdict logic follow the established conversation-reminders (two publishes threaded into one conversation) and root-agents-md (system-prompt inspection via requestDebug) patterns. A maintainer with a relay should run bun scripts/tenex-runtime-probe.ts file-modification-tracking to confirm end-to-end before merge.

Closes #113

🤖 Generated with Claude Code

When a TENEX agent writes a file via `fs_write`, the EmitHook now snapshots the written content (SHA-256 + up to 50 KB of bytes) into the conversation DB, keyed by (conversation_id, agent_pubkey, file_path) with last-write-wins upsert. Capture is gated on the `fs_write` success result so blocked and failed writes never create a bogus baseline. On a later run of the same agent in the same conversation, bootstrap re-reads each snapshot from the absolute path the writing run resolved (working dir is not stable across runs — the supervisor moves a conversation into a git worktree on first file mutation) and, when the content changed externally, appends a `<system-reminder type="file-modifications">` block to the system prompt. UTF-8 files produce an inline unified diff (similar crate) when <= 8 KB; binary, oversized, or large-diff cases fall back to a size/line summary. - tenex-conversations: schema v3 `agent_file_snapshots` table, `FileSnapshot`/`NewFileSnapshot` models, `record_file_snapshot` / `get_file_snapshots_for_agent` store methods, upsert integration test. - tenex-agent: `file_modifications` module (FileSnapshotWriter + reminder renderer), hook capture point, bootstrap reminder injection, similar dep. - probe: `file-modification-tracking` e2e scenario + verdicts (two runs in one conversation; second run's system prompt must carry the diff). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude · 2026-05-30T14:26:17Z

Code Review: feat(agent): track fs_write file modifications across runs (#113)

Overview

This PR implements file-modification tracking so that when an agent writes a file via fs_write, subsequent runs of the same agent in the same conversation are notified of any external edits. The design is solid and well-scoped. Issues below range from hard violations of the project's stated conventions to genuine behavioral bugs.

Must Fix

1. `resolve_path` and `expand_env_vars` are duplicated from `tools::fs`

crates/tenex-agent/src/file_modifications.rs lines 376–426 contain resolve_path and expand_env_vars that the comments explicitly acknowledge mirror the same functions in tools::fs. These two files live in the same crate. The right fix is to make those functions pub(crate) in tools/fs.rs and import them here. Copying them creates a maintenance split where a future change to path-resolution logic in fs_write won't be reflected in the snapshot side, silently breaking the symmetry the feature depends on.

CLAUDE.md's mandatory pre-coding checklist: "Does similar code already exist? (Search before creating)" — the author even named this explicitly ("Mirrors…"), so the answer was yes.

2. `render_reminder` silently swallows errors

// file_modifications.rs lines 252–255
let store = ConversationStore::open(db_path).ok()?;
let snapshots = store
    .get_file_snapshots_for_agent(conversation_id, agent_pubkey)
    .ok()?;

Both .ok()? calls convert a real error into None, indistinguishable from "no snapshots exist." If the DB is corrupted or the path is wrong, the agent boots silently with no reminder and no log entry. CLAUDE.md specifically calls out the anti-pattern of "silently turning an error into 'not found'." At minimum, a tracing::warn! before each ? is needed so the system is diagnosable in production.

3. `render_file_block` conflates "file unchanged" with "file unreadable"

// file_modifications.rs line 286
let current = std::fs::read(&resolved).ok()?;

Returning None here means both "file is unchanged" and "file could not be read" produce the same outcome: the agent is told nothing. A deleted file, a permissions error, or a broken symlink all silently disappear. CLAUDE.md's anti-pattern section: "Sentinel Values That Mask Failure." An unreadable snapshot file is itself an external modification (or a sign of a larger problem) and should be surfaced — at minimum as a tracing::warn!.

Should Fix

4. Comment style violates CLAUDE.md

The module-level docblock in file_modifications.rs is 13 lines across multiple paragraphs. CLAUDE.md: "Never write multi-paragraph docstrings or multi-line comment blocks — one short line max."

Individually, nearly every private function has a doc-comment explaining what it does:

/// SHA-256 hex of `bytes`.
fn hash_bytes(bytes: &[u8]) -> String { … }

/// Parse the `path` field from raw `fs_write` argument JSON.
fn parse_write_path(args: &str) -> Option<String> { … }

/// Expand `$VAR` and `${VAR}` using `std::env::var`.
fn expand_env_vars(input: &str) -> String { … }

The function names already say exactly that. CLAUDE.md: "Don't explain WHAT the code does, since well-named identifiers already do that." Remove or drastically shorten these. The non-obvious WHY comments (e.g., why content_bytes is None for files > 50 KB, why capture is gated on the success prefix) are valuable and should stay.

The section banners added to store.rs (// ====…====) have the same problem — they label sections by what they contain rather than explaining a non-obvious invariant.

Informational

5. PR description is inaccurate about `resolved_path`

The "Design note: resolved_path" section says:

The snapshot therefore stores both file_path (relative) and resolved_path (absolute, used for identity/comparison). The reader re-reads resolved_path…

But the schema (MIGRATION_V3), the FileSnapshot model, and record_file_snapshot don't have a resolved_path column or field. The code re-resolves file_path against working_dir at read time, which is the right approach — the description is just wrong. Worth correcting to avoid confusing future readers.

6. Working-directory stability across runs worth confirming

FileSnapshotWriter captures working_dir at construction time (bootstrap). If the supervisor moves the conversation into a git worktree mid-run in response to the first fs_write, the writer holds the pre-worktree working_dir and will re-read from the wrong location. The PR description acknowledges the instability but says resolved_path (which isn't stored) handles it. Since the code instead re-resolves at read time using the then-current working_dir, the correctness depends on both runs seeing the same working_dir. Worth an explicit assertion or comment confirming that the worktree is always established before the agent bootstrap runs.

What's Done Well

The schema migration is clean and forward-only; the FOREIGN KEY … ON DELETE CASCADE is the right call.
Gating capture on FS_WRITE_SUCCESS_PREFIX is correct — blocked and failed writes must not create phantom baselines.
The MAX_SNAPSHOT_BYTES / MAX_INLINE_DIFF_BYTES two-tier design (inline diff vs. summary) is well-reasoned.
The integration test (file_snapshot_upsert_is_last_write_wins) is thorough: it exercises upsert, per-agent isolation, and the NULL-bytes path for oversized files.
The unit tests for the modification module cover the four meaningful cases cleanly.
AGENTS.md is updated.

Blocking: items 1–3 (DRY violation, silent error swallowing × 2).
Non-blocking but should address before merge: items 4–5.

Addresses review feedback on #116. - Deduplicate path resolution: file_modifications now reuses crate::tools::fs::resolve_path instead of carrying local copies of resolve_path / expand_env_vars. - render_reminder logs and bails on real DB errors instead of swallowing them via .ok()?, so an open/query failure is no longer indistinguishable from "no snapshots". - render_file_block distinguishes a deleted file (reported as an external modification) and an unreadable file (logged) from an unchanged file, instead of collapsing all three to None. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

…ios pass (#120) Four fixes to the e2e probe scenarios added in #116 and #117: - file-modification-tracking: pm agent must be 'generalist' (not orchestrator) so workspace fs tools are available; orchestrators are workspace-restricted. - file-modification-tracking: cassette reordered so the more-specific second-run entry (with fileModificationSecondRequest in history) is checked before the first-run entry, preventing false matches on the shared conversation history. - file-modification-tracking: scenario driver switches from waitForObservedEvent (relay push, blocked by relay ACL deferral) to waitForStoredMessage (DB poll). - hooks-pre-tool: same relay ACL fix — final completion now detected via DB. - verdicts: requestDebug uses Rust {:?} escaping; quote checks updated to match the escaped form (type=\"file-modifications\"). Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

pablof7z merged commit 7c5f1db into master May 30, 2026
1 check passed

pablof7z mentioned this pull request May 30, 2026

fix(agent): address review feedback on file modification tracking #118

Merged

pablof7z mentioned this pull request May 30, 2026

fix(probe): make file-modification-tracking and hooks-pre-tool scenarios pass #120

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(agent): track fs_write file modifications across runs (#113)#116

feat(agent): track fs_write file modifications across runs (#113)#116
pablof7z merged 1 commit into
masterfrom
feat/issue-113-file-modification-tracking

pablof7z commented May 30, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

pablof7z commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Design note: resolved_path

Changes

Testing

Uh oh!

claude Bot commented May 30, 2026

Code Review: feat(agent): track fs_write file modifications across runs (#113)

Overview

Must Fix

1. resolve_path and expand_env_vars are duplicated from tools::fs

2. render_reminder silently swallows errors

3. render_file_block conflates "file unchanged" with "file unreadable"

Should Fix

4. Comment style violates CLAUDE.md

Informational

5. PR description is inaccurate about resolved_path

6. Working-directory stability across runs worth confirming

What's Done Well

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

pablof7z commented May 30, 2026 •

edited

Loading

Design note: `resolved_path`

1. `resolve_path` and `expand_env_vars` are duplicated from `tools::fs`

2. `render_reminder` silently swallows errors

3. `render_file_block` conflates "file unchanged" with "file unreadable"

5. PR description is inaccurate about `resolved_path`