feat(tools): read_file slice/gutter + new grep_file (PR-R1)#20

Merged
rtpa25 merged 7 commits into feat/sync-agent-design from feat/sync-r1-read-primitives on May 18, 2026

Conversation

rtpa25 (Owner) commented on May 18, 2026

Summary

  • Extends read_file with offset/limit (1-based line slicing) plus a line-number gutter on output. Default no-args behavior unchanged except for the gutter (still truncates at 64KB with truncated:true).
  • Adds grep_file: server-side regex over R2 text files. Scope by single path or recursive prefix. Returns matching lines + N context lines. Caps at 100 matches by default. Binary files skipped automatically.
  • Sets touchesFS: false on read_file (small bug fix — read-only tools shouldn't open a per-call worktree).
  • One-sentence prompt update naming grep_file and teaching the truncated:true → switch to grep pattern.
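The slicing and gutter behavior described above could look roughly like the sketch below. This is not the PR's code: `sliceWithGutter`, the `SliceResult` shape, and the gutter format (right-aligned line number plus a tab) are hypothetical, and the sketch assumes the 64KB cap applies only when neither `offset` nor `limit` is passed.

```typescript
// Hypothetical sketch of read_file's slice + gutter logic (names are illustrative).
interface SliceResult {
  text: string;        // gutter-prefixed lines joined with "\n"
  truncated: boolean;  // true only when the no-args 64KB cap cut the output short
  totalLines: number;
}

const MAX_BYTES = 64 * 1024; // 64KB cap mentioned in the summary

function sliceWithGutter(content: string, offset?: number, limit?: number): SliceResult {
  const lines = content.split("\n");
  // A trailing newline would otherwise produce a phantom empty last line.
  if (lines.length > 1 && lines[lines.length - 1] === "") lines.pop();

  const explicit = offset !== undefined || limit !== undefined;
  const start = Math.max(1, offset ?? 1); // offset is 1-based
  const end = limit !== undefined
    ? Math.min(lines.length, start + limit - 1)
    : lines.length;

  const width = String(end).length; // pad so the gutter stays aligned
  const encoder = new TextEncoder();
  const out: string[] = [];
  let bytes = 0;
  let truncated = false;

  for (let n = start; n <= end; n++) {
    const row = `${String(n).padStart(width)}\t${lines[n - 1]}`;
    const rowBytes = encoder.encode(row).length + 1; // +1 for the newline
    // Assumption: explicit slices are returned whole; only the default read is capped.
    if (!explicit && bytes + rowBytes > MAX_BYTES) {
      truncated = true;
      break;
    }
    out.push(row);
    bytes += rowBytes;
  }
  return { text: out.join("\n"), truncated, totalLines: lines.length };
}
```

Keeping the original line numbers in the gutter means a later `offset` can be read straight off an earlier, truncated response.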

Stacked on #19

Base = feat/sync-agent-design because the plan references the spec doc on that branch. Retarget to main once #19 merges.

Spec reference

  • docs/superpowers/specs/2026-05-17-sync-agent-design.md §3.1.1 + §6 PR-R1
  • Independent prerequisite for the sync-agent stack (F1/F2/F3) — unblocks NDJSON-mirror design assumptions but has no compile-time dependency on them.

Test plan

  • Seed a small md + a 5000-line NDJSON in r2://my-files/
  • read_file on the NDJSON returns 64KB truncated head with line gutter + size
  • read_file with {offset: 2500, limit: 11} returns the explicit slice, no truncation
  • grep_file on path for a single id returns exactly one match with context
  • grep_file on prefix matches across the small file
  • grep_file with max_matches: 50 returns 50 matches + truncated: true
  • grep_file with pattern: "[unclosed" returns {ok:false, error:"bad_pattern"}
  • Binary file → {ok:false, error:"binary_not_supported"}
  • Cleanup deletes succeed
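The grep_file cases in this plan (single match with context, `max_matches` truncation, `bad_pattern`) can be illustrated with a minimal in-memory sketch. `grepLines`, its option names, and the result shape are assumptions made for illustration; the real tool reads from R2 and may differ in both naming and output format.

```typescript
// Hypothetical sketch of grep_file's per-file matching (no R2 access here).
interface GrepMatch {
  line: number;      // 1-based line number of the match
  text: string;
  before: string[];  // up to contextLines lines preceding the match
  after: string[];   // up to contextLines lines following the match
}

type GrepResult =
  | { ok: true; matches: GrepMatch[]; truncated: boolean }
  | { ok: false; error: "bad_pattern" };

function grepLines(
  content: string,
  pattern: string,
  opts: { flags?: string; contextLines?: number; maxMatches?: number } = {},
): GrepResult {
  const { flags = "", contextLines = 0, maxMatches = 100 } = opts;
  let re: RegExp;
  try {
    // Drop g/y so RegExp.test() keeps no lastIndex state between lines.
    re = new RegExp(pattern, flags.replace(/[gy]/g, ""));
  } catch {
    // Invalid pattern (or invalid flags) is reported, not thrown.
    return { ok: false, error: "bad_pattern" };
  }

  const lines = content.split("\n");
  const matches: GrepMatch[] = [];
  let truncated = false;

  for (let i = 0; i < lines.length; i++) {
    if (!re.test(lines[i])) continue;
    if (matches.length >= maxMatches) {
      truncated = true; // at least one more match exists beyond the cap
      break;
    }
    matches.push({
      line: i + 1,
      text: lines[i],
      before: lines.slice(Math.max(0, i - contextLines), i),
      after: lines.slice(i + 1, i + 1 + contextLines),
    });
  }
  return { ok: true, matches, truncated };
}
```

In this sketch `truncated` is set only when a further match exists past the cap, so a file with exactly `maxMatches` hits reports `truncated: false`.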

🤖 Generated with Claude Code

rtpa25 and others added 7 commits May 18, 2026 11:57
Bite-sized plan covering:
- read_file: add offset/limit (1-based line slicing), line-number gutter
  on output, touchesFS:false. Keeps existing 64KB truncate semantics for
  the no-args case.
- grep_file: new server-side regex tool over R2 text files. path | prefix
  scope, regex_flags, context_lines, max_matches. Binary files skipped.
- One-sentence main-agent system prompt update.
- 9-step agent-driven smoke (no unit tests per memory).
- PR raised against main, independent of the sync-agent spec PR.

Decisions locked pre-plan: keep truncate semantics (not error-on-overflow),
line gutter on output, regex string for grep pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smoke caught read_file refusing seed-sample.ndjson as binary. Root cause:
.ndjson wasn't in the EXT_CONTENT_TYPE map, so the path-extension sniff
fell back to application/octet-stream — and per commit-driver.ts:637
the materialize step always re-sniffs from path on canonical writes,
so custom contentTypes passed at writeFile() time don't survive the
git-pipeline commit anyway. The only viable fix is the extension map.

- env-fs.ts: map ndjson + jsonl → application/x-ndjson.
- fs-tools.ts: extend TEXT_CT_RE to accept application/x-ndjson and
  application/x-jsonl so read_file / grep_file recognize them as text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
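
A rough sketch of the extension-map fix described in the commit above. `EXT_CONTENT_TYPE` and `TEXT_CT_RE` are named in the commit, but their actual contents are not shown, so the entries and the regex below are assumptions; `sniffContentType` is a hypothetical stand-in for the path-extension sniff.

```typescript
// Assumed contents: only the ndjson/jsonl entries are confirmed by the commit message.
const EXT_CONTENT_TYPE = new Map<string, string>([
  ["md", "text/markdown"],
  ["json", "application/json"],
  ["ndjson", "application/x-ndjson"],
  ["jsonl", "application/x-ndjson"],
]);

// Assumed shape of the text gate: any text/* plus a few JSON-flavored application types.
const TEXT_CT_RE = /^(?:text\/|application\/(?:json|x-ndjson|x-jsonl)(?:;|$))/;

// Hypothetical path-extension sniff with the octet-stream fallback the commit mentions.
function sniffContentType(path: string): string {
  const base = path.slice(path.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  const ext = dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
  return EXT_CONTENT_TYPE.get(ext) ?? "application/octet-stream";
}
```

With the two new entries, `seed-sample.ndjson` sniffs as `application/x-ndjson` and passes the text gate; without them it falls through to `application/octet-stream` and is treated as binary.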
Smoke caught a second issue: even after fixing the EXT_CONTENT_TYPE map,
canonical R2 holds the old "application/octet-stream" httpMetadata for
files written before the fix. materializeMainToCanonical only writes
canonical when the git blob sha changes — same-content rewrites don't
refresh stored metadata. So a fresh sniff returns x-ndjson, but the
stored type stays binary, and the tool's gate rejects.

Adds `resolveTextContentType(stored, path)`: trust stored if it's already
text-shaped; otherwise re-sniff from the path extension and prefer that
when it lands in TEXT_CT_RE. Applied to read_file, edit_file, and
grep_file (both single-path and prefix-loop branches). The error path
still surfaces the stored type, so the error the agent sees matches
what stat() reports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
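
The fallback described in the commit above might be sketched as follows. Only the name `resolveTextContentType(stored, path)` and its trust-then-re-sniff order come from the commit message; the `string | null` return type, the regex, and the tiny `sniffFromPath` stand-in are assumptions.

```typescript
// Assumed text gate; the real TEXT_CT_RE may accept more types.
const TEXT_CT_RE = /^(?:text\/|application\/(?:json|x-ndjson|x-jsonl)(?:;|$))/;

// Minimal stand-in for the real path-extension sniff (hypothetical).
function sniffFromPath(path: string): string {
  const ext = path.slice(path.lastIndexOf(".") + 1).toLowerCase();
  if (ext === "ndjson" || ext === "jsonl") return "application/x-ndjson";
  if (ext === "md") return "text/markdown";
  return "application/octet-stream";
}

// Returns a text content type to use, or null when the file should be treated as binary.
function resolveTextContentType(stored: string | undefined, path: string): string | null {
  // 1. Trust stored metadata when it is already text-shaped.
  if (stored !== undefined && TEXT_CT_RE.test(stored)) return stored;
  // 2. Otherwise re-sniff from the extension and accept it only if it is text.
  const sniffed = sniffFromPath(path);
  return TEXT_CT_RE.test(sniffed) ? sniffed : null;
}
```

For a file written before the fix, `resolveTextContentType("application/octet-stream", "seed-sample.ndjson")` yields `"application/x-ndjson"` in this sketch, so stale stored metadata no longer blocks the read.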
@rtpa25 rtpa25 merged commit 1bcf7e7 into feat/sync-agent-design May 18, 2026
3 checks passed