Undo: store snapshots and add assistant__undo #26

Merged: oxyc merged 12 commits into main from undo-restore on May 27, 2026

oxyc (Member) commented May 26, 2026

Summary

The assistant side of undo. Companion to generoi/gds-mcp#19 (the snapshot/restore handlers).

  • Mutating abilities attach an _undo snapshot to their result. MessageLoop and the approval path peel it off before the result reaches the LLM, the SSE stream, or the saved conversation (it's internal and can be large), and store it in a new audit-log undo_state column. Size-capped (256KB → oversized objects simply aren't undoable); version-gated migration so existing installs get the column.
  • UndoToolProvider adds assistant__undo-list (recent undoable actions) and assistant__undo (revert one). The stored snapshot is loaded server-side and replayed via the gds-mcp/restore_snapshot filter — the snapshot never travels through the LLM. A user can only undo their own actions (which also scopes capability — an editor never sees an admin's form-edit entries).
  • Restores that can't be perfectly faithful (new id, spent DeepL credits) return caveats; the tool description tells the assistant to relay them.
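
The peel-off step in the first bullet can be pictured roughly as follows. This is a minimal sketch in JavaScript rather than the plugin's PHP. The helper name `splitUndoSnapshot`, the return shape, and the byte-counting approach are assumptions for illustration; only the `_undo` key and the 256KB cap come from the description above.

```javascript
// Hypothetical helper: separates the internal _undo snapshot from a tool result.
const MAX_UNDO_BYTES = 256 * 1024; // 256KB cap named in the PR description

function splitUndoSnapshot(result) {
  const { _undo: snapshot, ...visible } = result;
  if (snapshot === undefined) {
    return { visible, undoState: null };
  }
  const serialized = JSON.stringify(snapshot);
  // TextEncoder counts UTF-8 bytes rather than UTF-16 code units.
  const size = new TextEncoder().encode(serialized).length;
  // Oversized snapshots are dropped, so the action is simply not undoable.
  const undoState = size <= MAX_UNDO_BYTES ? serialized : null;
  return { visible, undoState };
}
```

Only `visible` would go on to the LLM, the SSE stream, and the saved conversation; `undoState` is what would land in the audit-log column.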

Tests

UndoToolProviderTest covers list / undo / caveats / ownership / nothing-to-undo with a stubbed restore filter; AuditLog gains undo_state, getReversible, getById, markUndone.

🤖 Generated with Claude Code

Mutating abilities (gds-mcp) attach an _undo snapshot to their result. MessageLoop and the approval path peel it off before the result reaches the LLM/UI and store it in a new audit-log undo_state column (size-capped, version-gated migration). The UndoToolProvider exposes assistant__undo-list and assistant__undo: a user can revert their own recent change, which loads the stored snapshot and replays it via the gds-mcp/restore_snapshot filter (the snapshot never travels through the LLM). Restores that recreate under a new id surface caveats for the assistant to relay.
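
As a rough illustration of the revert path, the sketch below models the ownership check and the caveat relay in JavaScript (the plugin itself is PHP). The row shape, the `undoAction` name, the error string, and the `restore` callback, which stands in for the gds-mcp/restore_snapshot filter, are all assumptions made for the example.

```javascript
// Hypothetical in-memory stand-in for audit-log rows:
// { id, userId, undone, undoState } where undoState is a JSON string or null.
function undoAction(rows, auditId, currentUserId, restore) {
  const row = rows.find((r) => r.id === auditId);
  // Only the user's own, not-yet-undone rows with a stored snapshot qualify.
  if (!row || row.userId !== currentUserId || row.undone || row.undoState === null) {
    return { ok: false, error: "nothing-to-undo" };
  }
  // The snapshot is loaded and replayed here, server-side; it never
  // passes through the model.
  const { caveats = [] } = restore(JSON.parse(row.undoState));
  row.undone = true;
  return { ok: true, caveats };
}
```

The caller would hand `caveats` back to the assistant so it can relay them to the user.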

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
oxyc and others added 11 commits May 26, 2026 14:34
- Render tool calls as real assistant-ui tool-call parts (ToolCallFallback)
  instead of inline markdown text. This is what lets us hang interactive UI
  (the Undo button) off a tool message; it also makes ToolCallFallback the
  single rendering path for live and restored tool calls.
- Per-tool Undo button: driven by the tool_result undoable/audit_id/undo_label
  signal (threaded to ToolCallFallback via UndoContext). Clicking POSTs to the
  new /undo endpoint (Api\UndoEndpoint, reusing UndoToolProvider), then shows
  "Undone" and surfaces any caveats. AuditLog::log() now returns the row id so
  the SSE event can carry it.
- Fix: the approval bar clears the whole batch on one click (return [] instead
  of dropping only the first), so it no longer lingers or triggers stray empty
  turns from re-clicks.
- Deterministic Playwright e2e (chat-undo.spec.js) covering tool-call component
  rendering and the undo flow; UndoEndpointTest covers the endpoint.
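
A minimal sketch of how the per-tool button could be derived from the tool_result signal. The field names `undoable`, `audit_id`, and `undo_label` come from the commit message; the function name, the returned shape, and the "Undo" fallback label are assumptions.

```javascript
// Hypothetical helper: decides whether a tool card gets an Undo button.
function undoButtonFor(toolResult) {
  if (!toolResult || !toolResult.undoable || toolResult.audit_id == null) {
    return null; // nothing to hang a button on
  }
  return {
    auditId: toolResult.audit_id, // would be sent to the /undo endpoint on click
    label: toolResult.undo_label || "Undo", // fallback label is an assumption
  };
}
```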

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ToolCallUI is not a valid MessagePrimitive.Content components key in assistant-ui 0.12 — tool calls render through tools.Fallback. With the wrong key the tool-call parts rendered nothing (the e2e caught this). Confirmed against ToolCallMessagePartProps in the dist types.
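
For orientation, the corrected shape looks roughly like this. It is a sketch of the object passed as the `components` prop, written as plain JavaScript with a placeholder component, and it follows the commit's description rather than a verified reading of the library; check it against the dist types for the installed version.

```javascript
// Placeholder standing in for the real ToolCallFallback component.
const ToolCallFallback = () => null;

// Shape of the components option for MessagePrimitive.Content:
// tool-call parts are routed through tools.Fallback.
const contentComponents = {
  tools: { Fallback: ToolCallFallback },
  // ToolCallUI: ToolCallFallback  <- the dead key; nothing reads it
};
```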

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…all card

- chat-approval.spec.js: approving a destructive action resolves it AND clears
  the bar in one click (guards the batch-clear fix); denying clears it without
  running the action.
- chat-history.spec.js: resuming a past conversation restores its messages and
  renders a stored tool_use as a tool-call component (guards the history path
  of the tool-call-parts refactor).
- Strengthen the existing "renders as structured card" test: it was a no-op
  (conditional on the card existing, which it never did with the dead ToolCallUI
  key) — now an unconditional assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… navigations

Keep the chat usable across full-page wp-admin navigations: the panel reopens
itself, a half-written composer draft is restored, and the active conversation
is reloaded so reopening lands on the same thread.

- Open state: AssistantModalPrimitive.Root is uncontrolled (Cmd+K and resume
  click the trigger), so seed it with defaultOpen from localStorage and record
  every toggle via onOpenChange.
- Draft: subscribe to the composer runtime — restore a saved draft on mount and
  mirror every change to localStorage; the subscription also captures the
  clear-on-send transition so the draft is dropped once sent.
- Active conversation: persist the conversation id on conversation_start (clear
  in newChat / when a restored thread is gone), and restore it on mount in
  app.jsx only when the panel was left open.

Adds chat-persistence.spec.js covering draft+open restore, conversation restore,
and that a never-opened panel stays closed after reload.
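
The open-state and draft handling described above can be sketched as a small persistence helper. The storage key names, the "1"/"0" encoding, and the `createPanelPersistence` function are hypothetical; the storage object is injected so the sketch runs outside a browser, where `window.localStorage` would be passed in.

```javascript
// Hypothetical storage keys; the real key names are not given in the commit.
const OPEN_KEY = "assistant:open";
const DRAFT_KEY = "assistant:draft";

function createPanelPersistence(storage) {
  return {
    // Seed for defaultOpen: a never-opened panel has no key and stays closed.
    defaultOpen: () => storage.getItem(OPEN_KEY) === "1",
    // Wired to onOpenChange so every toggle is recorded.
    onOpenChange: (open) => storage.setItem(OPEN_KEY, open ? "1" : "0"),
    // Read once on mount to restore a half-written draft.
    restoreDraft: () => storage.getItem(DRAFT_KEY) ?? "",
    // Mirrors composer text; the empty string after send drops the draft.
    onDraftChange: (text) =>
      text === "" ? storage.removeItem(DRAFT_KEY) : storage.setItem(DRAFT_KEY, text),
  };
}
```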

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The deny test was racing: denyToolCall sends a __tool_denied__ follow-up /chat
call, but the mock only special-cased __tool_approved__ and fell through to the
approval prompt for everything else — re-surfacing the bar, so toBeHidden raced
the reappearance. Add TOOL_DENIAL_RESOLVED (resolves the tool as denied, no new
approval) and route __tool_denied__ to it.
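
A sketch of the routing after the fix. Only `TOOL_DENIAL_RESOLVED` and the two sentinel messages appear in the commit; the other constant names and the `pickMockResponse` function are hypothetical stand-ins for the mock's fixtures.

```javascript
// Hypothetical fixture labels for the mocked /chat endpoint.
const APPROVAL_PROMPT = "APPROVAL_PROMPT";
const TOOL_APPROVAL_RESOLVED = "TOOL_APPROVAL_RESOLVED";
const TOOL_DENIAL_RESOLVED = "TOOL_DENIAL_RESOLVED";

function pickMockResponse(message) {
  if (message === "__tool_approved__") return TOOL_APPROVAL_RESOLVED;
  // Before the fix this case fell through to the approval prompt,
  // which re-surfaced the bar and raced the toBeHidden assertion.
  if (message === "__tool_denied__") return TOOL_DENIAL_RESOLVED;
  return APPROVAL_PROMPT;
}
```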

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A concurrent maybeInstall() (two requests both passing the version gate before
either committed) could insert the same bundled slug twice in the same second —
e.g. two published "report-bug" skills. installSkill() now collapses duplicate
slugs to the original (lowest id), removing only byte-identical extras so a
customized copy is never silently deleted, and reconciles again after an insert
so concurrent racers converge with no lock. VERSION bumped to run the cleanup.
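
The reconciliation rule can be sketched as a pure function, here in JavaScript rather than the plugin's PHP. The row shape and the name `duplicateIdsToRemove` are assumptions, and string equality stands in for the byte-identical comparison.

```javascript
// rows: [{ id, slug, content }]. Returns the ids that are safe to delete.
function duplicateIdsToRemove(rows, slug) {
  const matches = rows
    .filter((r) => r.slug === slug)
    .sort((a, b) => a.id - b.id);
  if (matches.length < 2) return [];
  // The lowest id is treated as the original and always kept.
  const [original, ...extras] = matches;
  // Only identical extras are removed; a customized copy is left alone.
  return extras.filter((r) => r.content === original.content).map((r) => r.id);
}
```

Because the function depends only on the current rows, every racer that runs it after its own insert arrives at the same answer, which is how convergence without a lock could work.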

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Export selected conversation(s) — including full messages, fetched per item so
the file is self-contained — as JSON. Single selection downloads
conversation-{uuid}.json; multiple downloads a combined file. Supports bulk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rendering fixes:
- Own each streaming turn's message by a stable id instead of "the last
  assistant message" / a mutable flag (those were read stale inside batched
  setMessages updaters), so the approval follow-up no longer overwrites the
  message holding the tool-call card. Approved actions keep their card, flip to
  Done, and surface Undo.
- Render approval-required tools as a compact tool-call card (via context, since
  assistant-ui doesn't report requires-action for external-store parts) rather
  than verbose "```json … Waiting for approval```" text.
- Suffix duplicate toolCallIds per message in convertMessage so a connector that
  reuses one id (seen with Gemini) can't crash assistant-ui with "Duplicate key
  … in tapResources"; attach each tool_result to the first unfilled card.
- Stop rendering the empty assistant stub (bubble + copy button) while a stream
  warms up; give every message a stable id to avoid remounts.
- Copy-message button serializes each tool call's full request + response from
  message data (not the collapsed DOM).
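
The duplicate-id handling could look something like the sketch below. The part shape, the `-1`, `-2` suffix format, and the `sourceId` field used to match results back to cards are assumptions; the commit only states that duplicates are suffixed and that each tool_result goes to the first unfilled card.

```javascript
// Makes tool-call ids unique within one message while remembering the
// id the connector originally sent.
function dedupeToolCallIds(parts) {
  const seen = new Map();
  return parts.map((part) => {
    const count = seen.get(part.toolCallId) ?? 0;
    seen.set(part.toolCallId, count + 1);
    // The first occurrence keeps its id; repeats get a numeric suffix.
    const toolCallId = count === 0 ? part.toolCallId : `${part.toolCallId}-${count}`;
    return { ...part, toolCallId, sourceId: part.toolCallId };
  });
}

// Attaches a result to the first card from that source id with no result yet.
function attachResult(parts, sourceId, result) {
  const card = parts.find((p) => p.sourceId === sourceId && p.result === undefined);
  if (card) card.result = result;
  return parts;
}
```

A plain numeric suffix could still collide with a genuine id such as `a-1`, so a real implementation may want a separator that connectors cannot produce.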

UI declutter:
- Top bar: Skills + New icons (now labelled), a "⋯" overflow menu holding Chat
  history / Edit system context / Export, and a Close (×) button.
- Slide-in panels (skills/history/context) get a header with a collapse control
  and are mutually exclusive (opening one closes the others).
- Subtle text-link styling for the per-action Undo button.

Tests: approval card + undo, duplicate-id no-crash, empty-bubble, overflow menu,
panel close + exclusivity, close button. Also fixes pre-existing restore mocks
to match full URLs (plain-permalink wp-env) and the deny assertion to be
count-based (the deny flow now yields multiple assistant messages).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…icons

- An approval-gated tool streams tool_use_start AND then tool_approval_required
  for the same id. Stop pushing a second card on approval — reuse the existing
  one (filling its args if tool_use_start carried none). Previously this left a
  duplicate card stuck on "Running" after approval. + e2e regression test.
- Usage bar shows just the token count; the cost moves into the title tooltip.
- Panel headers use a collapse chevron instead of a second × (less confusing
  next to the chat-close ×).
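
The card-reuse fix in the first bullet can be sketched as a reducer over stream events. The event types are the ones named above; the card shape, the status strings, and the `applyToolEvent` name are hypothetical.

```javascript
// Hypothetical reducer: folds one stream event into the list of tool cards.
function applyToolEvent(cards, event) {
  const existing = cards.find((c) => c.id === event.id);
  if (event.type === "tool_use_start" && !existing) {
    return [...cards, { id: event.id, args: event.args ?? null, status: "running" }];
  }
  if (event.type === "tool_approval_required") {
    if (!existing) {
      return [...cards, { id: event.id, args: event.args ?? null, status: "awaiting-approval" }];
    }
    // Reuse the card from tool_use_start, filling args only if it had none.
    return cards.map((c) =>
      c.id === event.id
        ? { ...c, args: c.args ?? event.args ?? null, status: "awaiting-approval" }
        : c
    );
  }
  return cards;
}
```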

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clicking Undo now leaves a trace instead of silently flipping the card:
- UndoEndpoint appends a "Reverted: ..." note to the originating conversation
  (so it shows in history and the model stays in sync next turn) and logs the
  undo to the audit trail. The LLM-driven undo tool is unchanged.
- The note renders as a centered system line; the frontend also drops it into
  the live thread immediately.
- ChatEndpoint merges consecutive same-role messages before the model call so a
  trailing user note can't break Anthropic role alternation (no-op otherwise).
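
The role-merge step can be sketched as follows, in JavaScript rather than the endpoint's PHP. It assumes string content and a blank-line separator, neither of which the commit specifies.

```javascript
// Collapses runs of same-role messages so roles strictly alternate.
function mergeConsecutiveRoles(messages) {
  const merged = [];
  for (const message of messages) {
    const last = merged[merged.length - 1];
    if (last && last.role === message.role) {
      // Assumption: content is a plain string joined with a blank line.
      last.content = `${last.content}\n\n${message.content}`;
    } else {
      merged.push({ ...message }); // copy so the stored transcript is untouched
    }
  }
  return merged;
}
```

When the input already alternates, the output has the same messages in the same order, which matches the "no-op otherwise" note.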

Tests: UndoEndpoint appends-note + audit-logged; ChatEndpoint role-merge;
chat-undo e2e asserts the system note appears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sages

Old messages showed no time after a refresh because the stored transcript
carried no per-message timing. Now ChatEndpoint stamps each message with a `ts`
(epoch ms) at persist — loaded messages keep their original ts, only the turn's
new messages get "now" — and loadConversation reads `ts` so the time renders on
reload. The provider never sees `ts`: MessageLoop strips messages to
{role, content} for its payload copy, leaving the persisted transcript intact.
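
A minimal sketch of the two halves, stamping at persist and stripping for the provider. The function names are hypothetical, and the real code is PHP; only the `ts` field, its epoch-millisecond unit, and the {role, content} payload shape come from the commit.

```javascript
// Stamps only messages that have no ts yet; loaded ones keep their original.
function stampMessages(messages, now = Date.now()) {
  return messages.map((m) => (m.ts === undefined ? { ...m, ts: now } : m));
}

// Builds the provider payload copy; the persisted transcript is not modified.
function toProviderPayload(messages) {
  return messages.map(({ role, content }) => ({ role, content }));
}
```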

Tests: MessageLoop strips ts from the provider payload but keeps it in the
returned transcript; chat-persistence e2e asserts timestamps render on restore.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
oxyc merged commit 98e456d into main on May 27, 2026
3 checks passed