feat: context engine + session tools (PR-B)#15
Merged
Read-side consumer of PR-A's vector ingestion pipeline. Adds three new context-message blocks (<memory> Sauna-style hybrid injection, <sessions_recent> SQL-only thread list, <current_date> cache-friendly day formatter) and three agent tools (sessions_search wrapping Kernel.searchMessages, session_info as a new DO method, get_time pure-sync).

System prompt grows <sessions> + <datetime> blocks and a one-line edit to <memory_and_skills>. Cache breakpoint b2 placed at end of <current_date> (the last context block), covering the whole context-message stack; this keeps the prompt cache warm across all turns in a thread.

Final scope decided through structured brainstorm; sessions_fragments auto-injection dropped (agent uses sessions_search on demand instead). Single get_time tool over Dimension-style 6-tool surface; TUI sends X-Client-Timezone header so the kernel can render dates in user's local TZ.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three corrections caught while drafting the implementation plan:
1. TUI dispatches chat-request via `useAgent` (WebSocket from `agents/react`),
sending a CF_AGENT_USE_CHAT_REQUEST frame at apps/cli/src/app.tsx:391.
X-Client-Timezone HTTP header is impossible — clientTimezone goes as a
field in init.body alongside `message` and `attachments`. handlers.ts
reads it via JSON.parse(msg.init.body).
2. Thread schema column is `lastMessageAt` (threads.ts:15), not
`lastActivityAt`. Updated SQL + return shapes accordingly. NULLS LAST
handles threads that have no messages yet.
3. PerTurnContext (wrapped-tool.ts:27) verified — adding `kernel: Kernel`
to that type plus `buildTools` signature so sessions_search /
session_info execute() can RPC into DO methods directly. `this` is
available at the buildTools call site in turn.ts:201.
§16 verification hooks now closed:
- ~~EnvFS.readFile shape~~ — throws EnvFSError("not_found"), returns FileBytes
- ~~thread.lastActivityAt~~ — column is lastMessageAt
- ~~buildTools deps shape~~ — PerTurnContext at wrapped-tool.ts:27
Only smoke #12 (b2 cache empirical verification) remains as a runtime check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
11-task bite-sized plan for context engine + session tools:
1. clientTimezone plumbing (kernel)
2. clientTimezone in CF_AGENT_USE_CHAT_REQUEST body (TUI)
3. <current_date> block
4. <memory> block (Sauna hybrid)
5. <sessions_recent> block
6. Wire 3 blocks + cache breakpoint b2
7. get_time tool
8. Kernel.getSessionInfo() DO method
9. sessions_search + session_info tools + PerTurnContext.kernel
10. System prompt edits (<sessions>, <datetime>, <memory_and_skills>)
11. Smoke test the 15 checks from spec §14

Self-reviewed against spec sections 6-14; all coverage accounted for. Type-consistency verified across clientTimezone, PerTurnContext.kernel, HybridHit/getSessionInfo return shapes.

Task ordering: 1→2→3-5 (parallelizable) →6→7→8→9→10→11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nChatTurn

Parses optional clientTimezone field from CF_AGENT_USE_CHAT_REQUEST init.body, validates as IANA via Intl.DateTimeFormat try/catch, falls back to UTC on missing or malformed. Threads through to buildContextMessages so the new <current_date> block (Task 6) and any TZ-aware formatting in <sessions_recent> (Task 5) can render in user-local time.

Mid-flow paths (tool approval, run resume, retry) pass UTC for now, because the TUI doesn't restate the TZ on those frames. TODO: stash on WS state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
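The parse, validate, and fall-back-to-UTC flow described in this commit can be sketched roughly as below. This is a minimal illustration, not the code in handlers.ts: the helper names `isValidIanaTimezone` and `resolveClientTimezone` are hypothetical, and the body is assumed to arrive as a JSON string.

```typescript
// Hypothetical helpers; names and shapes are illustrative only.
function isValidIanaTimezone(tz: string): boolean {
  try {
    // The constructor throws a RangeError for zones the runtime does not know.
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

function resolveClientTimezone(initBody: string | undefined): string {
  if (!initBody) return "UTC";
  let parsed: unknown;
  try {
    parsed = JSON.parse(initBody);
  } catch {
    return "UTC"; // malformed JSON body
  }
  if (typeof parsed !== "object" || parsed === null) return "UTC";
  const tz = (parsed as Record<string, unknown>).clientTimezone;
  // Missing, non-string, empty, or unknown zones all fall back to UTC.
  return typeof tz === "string" && tz.length > 0 && isValidIanaTimezone(tz)
    ? tz
    : "UTC";
}
```

Every failure mode collapses to the same `"UTC"` result, so downstream formatters never need to handle an invalid zone.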
Computed at submit time via Intl.DateTimeFormat().resolvedOptions().timeZone. Read by handlers.ts and threaded into the <current_date> context block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
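On the client side, the idea is roughly the following. The `buildChatRequestBody` helper and the `ChatRequestBody` interface are hypothetical; only the `message`, `attachments`, and `clientTimezone` field names and the `Intl` call come from the commits above.

```typescript
// Hypothetical body builder; the real TUI code in app.tsx may differ.
interface ChatRequestBody {
  message: string;
  attachments: unknown[];
  clientTimezone: string;
}

function buildChatRequestBody(message: string, attachments: unknown[] = []): string {
  // Resolved at submit time so a changed system timezone is picked up per request.
  const clientTimezone = Intl.DateTimeFormat().resolvedOptions().timeZone;
  const body: ChatRequestBody = { message, attachments, clientTimezone };
  return JSON.stringify(body);
}
```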
Pure sync renderer, ~10 lines. Day + date in client's IANA timezone. Stable across all turns within a day; cache breakpoint b2 lives on this block (wired in Task 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
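A renderer of this kind might look like the sketch below. The function name, the `timezone` attribute, and the `Weekday, YYYY-MM-DD` layout are assumptions for illustration; the commit only states that the block carries day + date in the client's zone.

```typescript
// Hypothetical renderer; tag attributes and date layout are assumed.
function renderCurrentDate(now: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    weekday: "long",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(now);
  const pick = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? "";
  // No time-of-day component, so the string only changes at local midnight.
  return `<current_date timezone="${timeZone}">${pick("weekday")}, ${pick("year")}-${pick("month")}-${pick("day")}</current_date>`;
}
```

Because the output omits hours and minutes, two turns on the same local day produce byte-identical text, which is what lets a cache breakpoint sit on this block.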
Reads 7 canonical memory files from r2://memory/ in parallel. Each file: full content if <= 200 lines and <= 15 KB, else index_entry (frontmatter + section headings only). Missing files render as role="not_created" placeholders; read errors render as role="read_failed" with the message. Block always ships; agent always has read_file fallback.

No DO-instance caching for v1 (~80ms per turn for the 7 R2 reads is fine). Wired into context-messages in Task 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
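The full-versus-index decision can be sketched as follows. This is an illustration under stated assumptions: 15 KB is taken to mean 15 × 1024 UTF-8 bytes, frontmatter is assumed to be a leading block delimited by `---` lines, section headings are assumed to be Markdown `#` headings, and the `kind` field and function names are hypothetical.

```typescript
const MAX_LINES = 200;
const MAX_BYTES = 15 * 1024; // assumption: KB means 1024 bytes

type MemoryRender =
  | { kind: "full"; body: string }
  | { kind: "index_entry"; body: string };

// Keeps a leading frontmatter block plus every Markdown heading line.
function toIndexEntry(lines: string[]): string {
  const kept: string[] = [];
  let bodyStart = 0;
  if (lines[0]?.trim() === "---") {
    const close = lines.findIndex((line, idx) => idx > 0 && line.trim() === "---");
    if (close > 0) {
      kept.push(...lines.slice(0, close + 1));
      bodyStart = close + 1;
    }
  }
  for (const line of lines.slice(bodyStart)) {
    if (/^#{1,6}\s/.test(line)) kept.push(line);
  }
  return kept.join("\n");
}

function renderMemoryFile(content: string): MemoryRender {
  const lines = content.split("\n");
  const bytes = new TextEncoder().encode(content).length;
  // Both limits must hold for the file to be inlined in full.
  if (lines.length <= MAX_LINES && bytes <= MAX_BYTES) {
    return { kind: "full", body: content };
  }
  return { kind: "index_entry", body: toIndexEntry(lines) };
}
```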
10 most recent threads excluding the current one, ordered by thread.last_message_at DESC NULLS LAST. Each row: id, title (or first-user-message preview fallback if NULL), last_message_at formatted in client TZ, message count, last-user-message preview truncated to 80 chars. SQL failures render an error placeholder; empty result renders the "first thread" placeholder. Block always ships.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
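The selection and preview rules can be mirrored in memory as a rough sketch. The commit does this work in SQL; the `RecentThreadRow` shape, the helper names, the ellipsis on truncation, and the `(untitled)` fallback are all assumptions made for illustration.

```typescript
// Hypothetical row shape; the real query result may differ.
interface RecentThreadRow {
  id: string;
  title: string | null;
  firstUserMessage: string | null;
  lastMessageAt: number | null; // epoch ms, null when the thread has no messages
}

// Caps text at `max` characters; the trailing ellipsis is an assumed detail.
function truncatePreview(text: string, max: number = 80): string {
  return text.length <= max ? text : text.slice(0, max - 1) + "…";
}

// In-memory equivalent of ORDER BY last_message_at DESC NULLS LAST.
function byLastMessageDescNullsLast(a: RecentThreadRow, b: RecentThreadRow): number {
  if (a.lastMessageAt === null && b.lastMessageAt === null) return 0;
  if (a.lastMessageAt === null) return 1;
  if (b.lastMessageAt === null) return -1;
  return b.lastMessageAt - a.lastMessageAt;
}

function selectRecent(rows: RecentThreadRow[], currentThreadId: string, limit: number = 10): RecentThreadRow[] {
  return rows
    .filter((row) => row.id !== currentThreadId) // filter() copies, so sort() is safe
    .sort(byLastMessageDescNullsLast)
    .slice(0, limit);
}

function displayTitle(row: RecentThreadRow): string {
  return row.title ?? truncatePreview(row.firstUserMessage ?? "(untitled)");
}
```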
…ontext

Adds three new block builders to buildContextMessages' Promise.all, then appends in stable-most-to-volatile order at the end of the existing block stack.

Cache breakpoint b2 lives on the last contextMessage (<current_date>), caching the full context stack alongside the system-prompt b1. Within a thread, this prefix turns over only on memory writes (b2 busts once) or user-local midnight (date rollover, b2 busts once).

Per spec §9, b3 (compaction marker, conditional) and b4 (last conversation message) remain unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
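Placing a breakpoint on the last context message could look like the sketch below. The `ContextMessage` type and the helper name are hypothetical, and the `providerOptions.anthropic.cacheControl` shape is assumed to follow the AI SDK's Anthropic provider convention; the real wiring in buildContextMessages may differ.

```typescript
// Hypothetical message shape, defined locally so the sketch is self-contained.
interface ContextMessage {
  role: "user";
  content: string;
  providerOptions?: { anthropic?: { cacheControl?: { type: "ephemeral" } } };
}

// Returns a copy in which only the final message carries the cache marker,
// so everything up to and including it can be served from the prompt cache.
function withCacheBreakpointOnLast(messages: ContextMessage[]): ContextMessage[] {
  return messages.map((message, index): ContextMessage =>
    index === messages.length - 1
      ? { ...message, providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } } }
      : message
  );
}
```

Ordering the blocks from most stable to most volatile matters here: any change in an earlier block invalidates the cached prefix for everything after it.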
Pure sync. Validates IANA via Intl.DateTimeFormat try/catch, falls back
to invalid_timezone error envelope on bad input. Returns { iso, epochMs,
timezone, utcOffset, humanReadable, dayOfWeek } so the model has every
angle it needs for scheduling math, timezone conversion, freshness checks.
No I/O, no rate-limit, no auth — cheapest possible tool. Registered into
the tool map in Task 9.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
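A pure-sync clock with this return shape might be sketched as follows. Only the six field names and the `invalid_timezone` error come from the commit; the function names, the `message` field, the `±HH:MM` offset format, and the choice of a UTC `toISOString()` value for `iso` are assumptions.

```typescript
type GetTimeResult =
  | { iso: string; epochMs: number; timezone: string; utcOffset: string; humanReadable: string; dayOfWeek: string }
  | { error: "invalid_timezone"; message: string };

const pad2 = (n: number): string => (n < 10 ? "0" : "") + String(n);

// Offset = wall-clock time in the zone (read back as if it were UTC) minus the real instant.
function utcOffsetMinutes(date: Date, timeZone: string): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone, hour12: false,
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  }).formatToParts(date);
  const num = (type: string): number =>
    Number(parts.find((part) => part.type === type)?.value ?? "0");
  // Some engines print midnight as hour 24 when hour12 is false, hence % 24.
  const wallAsUtc = Date.UTC(num("year"), num("month") - 1, num("day"), num("hour") % 24, num("minute"), num("second"));
  const wholeSeconds = Math.floor(date.getTime() / 1000) * 1000;
  return Math.round((wallAsUtc - wholeSeconds) / 60000);
}

function getTime(timeZone: string = "UTC", now: Date = new Date()): GetTimeResult {
  let human: Intl.DateTimeFormat;
  try {
    human = new Intl.DateTimeFormat("en-US", {
      timeZone, weekday: "long", year: "numeric", month: "long", day: "numeric",
      hour: "2-digit", minute: "2-digit", timeZoneName: "short",
    });
  } catch {
    return { error: "invalid_timezone", message: `Unknown IANA timezone: ${timeZone}` };
  }
  const mins = utcOffsetMinutes(now, timeZone);
  const abs = Math.abs(mins);
  return {
    iso: now.toISOString(),
    epochMs: now.getTime(),
    timezone: timeZone,
    utcOffset: `${mins < 0 ? "-" : "+"}${pad2(Math.floor(abs / 60))}:${pad2(abs % 60)}`,
    humanReadable: human.format(now),
    dayOfWeek: new Intl.DateTimeFormat("en-US", { timeZone, weekday: "long" }).format(now),
  };
}
```

Accepting `now` as a parameter keeps the function deterministic under test while defaulting to the real clock in normal use.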
Returns thread metadata + paginated messages (limit + offset, newest-
first). Projects message rows to { id, role, createdAt, contentText }
(drops raw UIMessage parts — agent gets the text-only view via the
session_info tool).
Pure DO method; no side effects. Backs the session_info tool wired in
Task 9. Mirrors searchMessages' structural pattern from PR-A.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new agent-facing tools registered in the standard tool map:
- sessions_search wraps Kernel.searchMessages from PR-A; trims HybridHit to messageId + threadId + role + createdAt + contentText + rerankScore.
- session_info wraps Kernel.getSessionInfo; pagination via limit+offset, newest-first.
- get_time pure-sync IANA timezone-aware now-clock.

Adds kernel: Kernel to PerTurnContext so the two DO-method-wrapping tools can RPC directly via 'this' rather than env.KERNEL.get(...). Threaded from turn.ts:201.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
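The trimming step in sessions_search can be illustrated with a small projection. The `HybridHit` shape below is a hypothetical stand-in (the real type from PR-A is not shown here and presumably carries more fields); only the six retained field names come from the commit.

```typescript
// Hypothetical stand-in for PR-A's HybridHit; extra fields are modelled loosely.
interface HybridHit {
  messageId: string;
  threadId: string;
  role: string;
  createdAt: string;
  contentText: string;
  rerankScore: number;
  [extra: string]: unknown; // e.g. raw vector or keyword scores
}

type TrimmedHit = Pick<
  HybridHit,
  "messageId" | "threadId" | "role" | "createdAt" | "contentText" | "rerankScore"
>;

// Copies only the agent-facing fields, so internal scoring details never reach the model.
function trimHit(hit: HybridHit): TrimmedHit {
  return {
    messageId: hit.messageId,
    threadId: hit.threadId,
    role: hit.role,
    createdAt: hit.createdAt,
    contentText: hit.contentText,
    rerankScore: hit.rerankScore,
  };
}
```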
Adds two new XML-tagged blocks teaching the agent the read-side affordances:
- <sessions>: explains <sessions_recent> auto-injection vs. sessions_search tool, with the explicit 'don't ask user to repeat themselves' rule.
- <datetime>: explains <current_date> vs. get_time tool boundary.

Edits <memory_and_skills> opening to note that memory files are now auto-injected as <memory> block content (Task 4 / 6), not just listed as files for read_file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t rows)
Final-review nit: sessions_search, session_info, and get_time were using
raw tool() from 'ai', bypassing the tool_call audit table. Swap to
wrappedTool({ touchesFS: false }) matching the web-search precedent —
read-only tools that DO write audit rows for operator observability.
Also adds a one-line comment at kernel.ts:496 explaining the deliberate
timestamp_ms transform bypass via raw sql<number | null>.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rtpa25
commented
May 14, 2026
…dels

PR review nit: the inline return type for getSessionInfo redeclared fields that already live in the Drizzle-derived Thread / Message types exported from @agent-os/models. Replaced with Pick<Thread, ...> + Pick<Message, ...> intersections, so structural fields stay in lockstep with the schema and only the boundary transforms (Date -> ISO string, plus messageCount) need explicit declaration.

Picked Pick over Omit because we keep 2 of 7 Thread fields and 3 of 10 Message fields: Pick reads as the explicit subset; Omit with 5-7 names each would be louder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
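The Pick-plus-intersection pattern, together with the newest-first limit/offset projection, might look like this sketch. The `Thread` and `Message` interfaces are hypothetical stand-ins for the types in @agent-os/models, the choice of `id` and `title` as the two picked Thread fields is a guess, and pagination is done in memory here only for illustration.

```typescript
// Hypothetical stand-ins; the real Drizzle-derived types have more fields.
interface Thread { id: string; title: string | null; createdAt: Date; lastMessageAt: Date | null; }
interface Message { id: string; threadId: string; role: "user" | "assistant"; contentText: string; createdAt: Date; parts: unknown[]; }

// Structural fields come from the schema types; only boundary transforms are declared by hand.
type SessionMessage = Pick<Message, "id" | "role" | "contentText"> & { createdAt: string };
type SessionInfo = Pick<Thread, "id" | "title"> & {
  lastMessageAt: string | null;
  messageCount: number;
  messages: SessionMessage[];
};

function toSessionInfo(thread: Thread, all: Message[], limit: number, offset: number): SessionInfo {
  const page = [...all]
    .sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime()) // newest first
    .slice(offset, offset + limit);
  return {
    id: thread.id,
    title: thread.title,
    lastMessageAt: thread.lastMessageAt ? thread.lastMessageAt.toISOString() : null,
    messageCount: all.length,
    // Raw UIMessage parts are dropped; the caller sees the text-only view.
    messages: page.map((m) => ({
      id: m.id,
      role: m.role,
      contentText: m.contentText,
      createdAt: m.createdAt.toISOString(),
    })),
  };
}
```

If a column is renamed or retyped in the schema, the `Pick` references fail to compile, which is the lockstep property the commit is after.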
Implements the read-side context engine on top of PR-A's vector ingestion pipeline (#14).
What
Three new context-message blocks auto-injected per turn:
- `<memory>`: Sauna-style hybrid. Seven canonical memory files are inlined as full content under the 200-line / 15 KB threshold, else as an index entry (frontmatter + section headings).
- `<sessions_recent>`: the 10 most recent threads excluding the current one, with title (or first-user-message fallback), `last_message_at`, message count, and last-user-message preview.
- `<current_date>`: today + tz, stable across all turns in a day; cache breakpoint b2 lives here.

Three new agent tools:
- `sessions_search`: hybrid recall across all threads (wraps `Kernel.searchMessages` from PR-A; trims the return shape to `messageId` / `threadId` / `role` / `createdAt` / `contentText` / `rerankScore`).
- `session_info`: thread metadata + paginated messages (newest-first, `limit` + `offset`).
- `get_time`: IANA-aware now-clock (`Intl.DateTimeFormat` based, pure sync).

All three tools use `wrappedTool({ touchesFS: false })`: they write `tool_call` audit rows for operator observability, matching the `web-search` precedent.

System prompt: new `<sessions>` and `<datetime>` teaching blocks; edit to `<memory_and_skills>` noting auto-injection.

TUI: sends a `clientTimezone` field in the `CF_AGENT_USE_CHAT_REQUEST` body via `Intl.DateTimeFormat().resolvedOptions().timeZone`.

Cache architecture
Breakpoint b2 sits on the last context-message block (`<current_date>`). It busts once on memory writes (the next turn re-caches) and once per user-local midnight. Spec §9 has the full topology.
Process
- `docs/superpowers/specs/2026-05-14-context-engine-and-session-tools-design.md`
- `docs/superpowers/plans/2026-05-14-context-engine-and-session-tools.md` (11 tasks)

Smoke checklist (Task 11)
All 15 checks from spec §14 still to run post-deploy. Critical one: smoke #12, the cache b2 empirical verification. Confirm that turn 2's `cache_read_input_tokens` covers the full context stack, not just the system prompt.

🤖 Generated with Claude Code