Skip to content

docs(sync-agent): design spec for dynamic per-integration sync engines#19

Open
rtpa25 wants to merge 14 commits into
mainfrom
feat/sync-agent-design
Open

docs(sync-agent): design spec for dynamic per-integration sync engines#19
rtpa25 wants to merge 14 commits into
mainfrom
feat/sync-agent-design

Conversation

@rtpa25

@rtpa25 rtpa25 commented May 17, 2026

Copy link
Copy Markdown
Owner

Summary

Master design doc for the sync-agent feature — a 3-PR stack that turns Pipedream integration connections into self-installing sync engines + skill files, so the main agent can answer questions about a user's third-party data with zero re-exploration on every thread.

  • Trigger model: Cloudflare Queue (integration-sync) fires on every connection; consumer dispatches to kernel.runSyncAgent(integration, account_id). No slash-command trigger. Idempotency keyed on skill-md existence in R2.
  • Two-branch agent: Phase-1 triage decides sync-engine flavor (file systems, DBs, ticket trackers, calendars, wikis — gets full handler + R2 mirror + facet) or api-only flavor (LinkedIn / MailChimp / WhatsApp / Twilio — gets just a skill md that documents exec_code API patterns and exits).
  • Per-sync DO facets for handler isolation (sauna parity). MirrorIO + HttpGateway WorkerEntrypoints inject scoped capabilities; facet's own SQLite carries agent_dedup/raw_events/cursor state via agent-authored schema_ddl.
  • Mirror = structural index, not content (Prime Directive, §1.1). Mirror tops out at ~5MB even on huge integrations; content fetched on demand via exec_code.
  • Central alarm scheduler in the supervisor — facets validated empirically (2026-05-17) to NOT support setAlarm on Cloudflare's runtime. Supervisor's alarm() body iterates active syncs, RPCs into facets for renewWebhook / pollOnce.

PR stack downstream of this one

  • PR-F1 — Foundation primitives (sync_engine + sync_errors schema, facet wrapper, Worker Loader sandbox, HttpGateway/MirrorIO WorkerEntrypoints, webhook receiver route, central alarm scheduler skeleton, /sync status + /sync delete, hand-written Dropbox fixture). No agent yet.
  • PR-F2 — Sync agent + Dropbox e2e (sync-agent system prompt, curated tool surface, queue consumer + /integrations/connected API route, runSubAgent extraction from spawn_sub_agent).
  • PR-F3 — Polling fallback + research wave + additional integrations (Notion / Linear / GCal smoke targets).

Test plan

  • Skim the spec end-to-end for coherence + remaining open questions
  • Confirm the 3-PR slicing matches your appetite (or push back)
  • Confirm decisions table (#0–fix(cli): green bullet on done, bright bullet on input-streaming #12) captures every load-bearing call
  • Decide on the test worker cleanup (cf-facet-alarm-test.pandaronit25.workers.dev is still deployed; wrangler delete cf-facet-alarm-test to remove)
  • Greenlight the PR-F1 plan kickoff (separate doc, written when we start that PR)

🤖 Generated with Claude Code

rtpa25 and others added 11 commits May 17, 2026 16:18
Master spec for a 3-PR stack (F1 foundation primitives, F2 sync agent +
Dropbox e2e, F3 polling + additional integrations). Covers: queue-triggered
sync agent runs on every Pipedream connection; agent triages into a full
sync-engine branch (file systems, databases, ticket trackers) or an
api-only skill-writing branch (write-only APIs, broadcasters); per-sync
Cloudflare DO facets for handler isolation with R2 json mirror as the
main-agent-facing index layer and facet SQLite for handler-internal state
(dedup, raw_events for probe loop). Central alarm scheduler in the
supervisor — facet setAlarm empirically validated as unsupported by
Cloudflare's runtime (cf-facet-alarm-test, 2026-05-17). Mirror = index,
not content (Prime Directive) — agents enumerate structural primitives
only; content stays at the source and is fetched on demand via exec_code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n_in_sync

Per sauna pattern (src/agent/tools/write/run-in-sync.ts), backfill is not a
persistent handler export. It runs as one-shot agent-authored ES-module code
through a separate Worker Loader bundle, sharing the run_in_sync exec runner
that also handles diagnostics, data repair, and probe-loop triggers.

- §3.6: removed backfill from required exports; added explanatory note
- §3.7: removed backfill from wrapper RPC list
- §3.9: create_sync_engine no longer awaits handler.backfill
- §3.10.3: split install (Phase 5) from backfill (Phase 5.5 via run_in_sync)
- §3.10.4: create_sync_engine + run_in_sync descriptions updated
- §4.1: trace shows run_in_sync exec bundle instead of facet.backfill()
- §5.1: 'Backfill throws' reframed as run_in_sync return-value failure
- §6 PR-F1: added exec-runner.ts + dropbox-backfill-exec.js fixture

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User feedback: forcing one struct.json per sync over-constrains integrations
that benefit from multi-file layouts (Gmail: labels.json + cursors.json +
threads/by-label/*.json; Drive: per-folder index). Agent should design the
layout; the supervisor just provides scoped FS primitives.

Replaces MirrorIO (readMirror/writeMirror unknown-typed) with MirrorFS
exposing read/write/list/delete over R2 keys relative to a per-sync prefix.

- §3.3: mirror_path column → mirror_prefix (defaults integrations/<slug>/)
- §3.4: rewritten — agent designs file layout; Dropbox/Gmail/Airtable examples
- §3.5: Dropbox skill example updated (mirror_prefix + Mirror layout section)
- §3.6: HandlerCtx.readMirror/writeMirror → ctx.mirror.{read,write,list,delete}
- §3.7: MirrorIO → MirrorFS; capabilities pass-through carries mirrorPrefix
- §3.10.4: read_mirror takes a path; add list_mirror; run_in_sync FacetBridge
  exposes __read/__write/__list/__delete instead of __readMirror/__writeMirror
- §3.10.4: create_sync_engine/delete_sync use mirror_prefix; delete sweeps
  every R2 key under the prefix (batch delete)
- §4.1/§4.2 traces: handler/exec code uses ctx.mirror.read("tree.json") +
  facet.__write("tree.json", ...) shape
- §4.3 main-agent trace: read_file integrations/dropbox/tree.json
- §6 PR-F1: mirror-io.ts → mirror-fs.ts; exec-runner FacetBridge updated
- §10 decisions: row 6 reframed (cursor location is agent's call); added
  row 6c (mirror is prefix-scoped KV space, not single file — rationale)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e placement

User feedback: spec was over-prescribing where cursors live (mirror-first
framing). Cleaner pattern: agent declares `webhook_config` table in
schema_ddl (channel_id, expires_at, history_id) and UPDATEs that row on
every webhook — faster and simpler than read-parse-mutate-write on a JSON
file. Bigger principle: primitives don't teach behavior, prompts do.

- New §1.2: 'Primitives stay neutral; the system prompt teaches behavior'.
  Codifies that placement (cursor, layout, schema, surface choice) is
  always the agent's call; the runtime imposes no defaults.
- §3.4: rewrote cursor-placement paragraph to list three peer options
  (mirror-embedded / sidecar file / SQLite row); SQLite is the preferred
  default for high-churn or multi-field webhook state.
- §3.4 Gmail example: shows `webhook_config` SQL table as the cursor home
  (single-row UPDATE per webhook), keeps mirror for grep-friendly index;
  notes the cursors.json alternative as equally valid for lower-volume
  integrations.
- §3.6 'Why this is enough for Gmail': updated to SQL cursor path with
  R2/SQL op counts.
- §3.10.3 Phase 2 schema_ddl example: replaces generic `cursors` example
  with both `webhook_config` (Gmail shape) and `cursors` (Drive shape).
- §3.10.3 Phase 4: new paragraph teaching the SQL-vs-mirror placement
  rule of thumb (main agent reads it → mirror; only handler reads it →
  SQL; when in doubt → SQL).
- §10 decisions: row 6 reframed (SQLite is the preferred default for
  webhook state); new row 13 codifying the neutral-primitives principle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t in consumer

User correction: sync agent is just an ai-sdk streamText invocation living
inside the queue consumer. No thread, no run row, no message persistence,
no streaming UI, no tool_call audit. User never sees it. Side effects
(skill md, sync_engine row, mirror files, registered webhook) ARE the
output. Kernel methods exist purely as RPC backends for tools that need
to touch the DO sqlite — no runSyncAgent RPC, no system thread, no
runSubAgent extraction.

- §1 goal: reframed flow — consumer calls runSyncAgent({env, integration,
  accountId, kernelStub}) directly (not a DO RPC method)
- §3.10: rewrote — sync agent is a plain streamText invocation in the
  queue consumer; included pseudocode skeleton showing tools wired via
  closure on kernelStub; explained why no thread/run/audit needed and
  debuggability via wrangler tail
- §3.10.2: queue ack semantics updated; debugging via wrangler tail +
  sync_errors instead of system-thread browsing
- §3.10.4: tool surface clarified — plain tool() from "ai" (no
  wrappedTool); buildSyncAgentTools takes {env, integration, accountId,
  kernelStub}; manage_tasks removed (no per-thread tasks table to write to)
- §4.1: end-to-end trace rewritten — Consumer invokes runSyncAgent
  directly; tool execute fns RPC into kernelStub; no Kernel DO orchestrates
  the agent run, only services the tool calls
- §6 PR-F2: added sync-agent.ts (runSyncAgent free function); dropped
  run-sub-agent.ts extraction + spawn-sub-agent refactor + system-thread
  bootstrap from kernel.ts; renamed Kernel methods to the tool-backend
  RPCs that actually need to exist (skillExists, writeSkill,
  createSyncEngine, runInSync, etc.)
- §10 decisions: rewrote rows 8 & 9 — no thread/run machinery; no
  runSubAgent extraction. Rationale: spawn_sub_agent's persistence
  machinery exists because sub-agent output is visible to the user;
  sync agent isn't, so all of that infrastructure is irrelevant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ror layout

Mirror-as-index only works if the main agent can read mirror files without
dragging the whole object into its context window. Adds two main-agent
tools and threads matching guidance through the sync agent's design rules.

§3.1.1 (new): Main-agent R2 read primitives
- read_file extended with offset/limit (1-based lines) + 64KB soft cap on
  full reads. Cap-violation error points at grep_file / paginated reads.
- grep_file (new): server-side regex over R2 files with path or
  prefix+glob scope. Plain JS RegExp (no WASM ripgrep). Returns
  {file, line, match, before[], after[]} records, capped at 100 matches.
- Both wrappedTool({touchesFS:false}) per the read-only-tools convention.
- Argued out: two tools beat one composite (divergent return shapes,
  arg surfaces, failure hints; Claude Code precedent).

§3.4: mirror layout gets a budget rule
- Aim for ≤64KB per file so read_file is one-shot.
- For high-cardinality indexes: NDJSON (line-grep-friendly,
  line-offset-readable) or sharding (per-axis files + a top-level
  index.json). Mix is fine (Gmail = labels.json + per-label NDJSON).
- Document the layout in the skill md so the main agent picks
  read_file vs grep_file correctly.

§3.4 Dropbox canonical example switched to NDJSON+meta:
- tree.ndjson (one entry per line) + meta.json (cursor + counts).
- §3.5.1 skill md example, §3.10.3 Phase 2 schema design, §4.1 trace
  (run_in_sync code), §4.2 webhook delivery handler, §4.3 main agent
  query (grep_file instead of read_file), §6 PR-F1 fixture line, §6
  PR-F2 smoke checklist — all updated to match.

§6 PR slicing:
- New PR-R1 (Main-agent R2 read primitives) — independent prerequisite,
  no compile-time dependency on F1/F2/F3 but unblocks their design
  assumptions. Files modified: read-file.ts. Files created: grep-file.ts.
  4 smoke checks.

§10 decisions:
- Row 6: updated wording (meta.json/tree.ndjson pattern; "never embed a
  churning cursor in a multi-MB index file").
- Row 14 (new): read_file + grep_file, not composite — divergent shapes
  + Claude Code precedent.
- Row 15 (new): 64KB cap + NDJSON/sharding preference; guardrail with
  hint, not hard refusal (offset/limit bypasses).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bite-sized plan covering:
- read_file: add offset/limit (1-based line slicing), line-number gutter
  on output, touchesFS:false. Keeps existing 64KB truncate semantics for
  the no-args case.
- grep_file: new server-side regex tool over R2 text files. path | prefix
  scope, regex_flags, context_lines, max_matches. Binary files skipped.
- One-sentence main-agent system prompt update.
- 9-step agent-driven smoke (no unit tests per memory).
- PR raised against main, independent of the sync-agent spec PR.

Decisions locked pre-plan: keep truncate semantics (not error-on-overflow),
line gutter on output, regex string for grep pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rtpa25 and others added 3 commits May 18, 2026 14:33
Smoke caught read_file refusing seed-sample.ndjson as binary. Root cause:
.ndjson wasn't in the EXT_CONTENT_TYPE map, so the path-extension sniff
fell back to application/octet-stream — and per commit-driver.ts:637
the materialize step always re-sniffs from path on canonical writes,
so custom contentTypes passed at writeFile() time don't survive the
git-pipeline commit anyway. The only viable fix is the extension map.

- env-fs.ts: map ndjson + jsonl → application/x-ndjson.
- fs-tools.ts: extend TEXT_CT_RE to accept application/x-ndjson and
  application/x-jsonl so read_file / grep_file recognize them as text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smoke caught a second issue: even after fixing the EXT_CONTENT_TYPE map,
canonical R2 holds the old "application/octet-stream" httpMetadata for
files written before the fix. materializeMainToCanonical only writes
canonical when the git blob sha changes — same-content rewrites don't
refresh stored metadata. So a fresh sniff returns x-ndjson, but the
stored type stays binary, and the tool's gate rejects.

Adds `resolveTextContentType(stored, path)`: trust stored if it's already
text-shaped; otherwise re-sniff from the path extension and prefer that
when it lands in TEXT_CT_RE. Applied to read_file, edit_file, and
grep_file (both single-path and prefix-loop branches). The error path
still surfaces the stored type so the agent's reasoning matches what
the agent sees in stat() output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(tools): read_file slice/gutter + new grep_file (PR-R1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant