Agent server audit fixes (+ bundled diary/desktop WIP) #58
Open
G9000 wants to merge 12 commits into
Agent server (audit-driven, primary work of this change):
- Reliability: fix zombie "running" runs on turn-setup/persist failure (evict orphaned user message, mark run failed); guard context-overflow retry against double tool execution; end-to-end run cancellation (early run commit, run_started event, WS cancel handler, race-safe cancel events); stream inactivity timeout; mask raw exception text to clients; log previously-silent excepts.
- Response quality: derive prompt budget from the model context window; boundary-aware block truncation; recency+heat blend in automatic retrieval; importance floor in heat scoring; cross-block dedup; embedding-free claim dedup for paraphrased facts.
- Performance: defer Soul Writer LLM work off the pre-turn path; reuse the turn's query embedding in the knowledge-graph block (drop blocking thread.join); background post-turn compaction; cache static identity blocks via the companion version counter, rebuild volatile/query blocks per turn; non-blocking mod-tools fetch with negative cache.
- Refactor/cleanup: add call_llm_for_json/llm_json helper and migrate 6 modules; delete dead predict_calibrate.py, streaming_utils.py, and consolidate_pending_ops(). Audit + status notes in docs/audits.
- Tests for all of the above; full server suite green (1452 passed).

Also bundled (pre-existing in-progress work, not part of the audit):
- Daily diary feature (server routes/schemas/services + migration + docs)
- Desktop today/mood panel, journal, appearance settings
- api-client updates

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
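The "stream inactivity timeout" item in the commit message above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `with_inactivity_timeout`, `StreamInactivityError`, and the `idle_seconds` parameter are hypothetical names, and the only assumption is that the model stream is an async iterator.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")


class StreamInactivityError(TimeoutError):
    """Hypothetical error: the upstream stream produced nothing for too long."""


async def with_inactivity_timeout(
    stream: AsyncIterator[T], idle_seconds: float
) -> AsyncIterator[T]:
    """Re-yield items from `stream`, failing if the gap between items exceeds `idle_seconds`."""
    iterator = stream.__aiter__()
    try:
        while True:
            try:
                # The timeout applies to each item separately, not to the whole stream.
                item = await asyncio.wait_for(iterator.__anext__(), timeout=idle_seconds)
            except StopAsyncIteration:
                return
            except asyncio.TimeoutError:
                raise StreamInactivityError(
                    f"no stream activity for {idle_seconds}s"
                ) from None
            yield item
    finally:
        # Close the upstream iterator on every exit path so it is not left dangling.
        aclose = getattr(iterator, "aclose", None)
        if aclose is not None:
            await aclose()
```

Wrapping each `__anext__()` call rather than the whole loop means a long but steadily producing stream is never cut off; only a stall trips the timeout. The `finally` block is there so that the wrapped generator is closed on timeout or early exit.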
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0e9a3795a3
… context
- tools.py/service.py: when anima-mod tools load via the background fetch (cold cache on the event loop at startup), the already-built runner was cached without them and never picked them up. Fire a callback on first successful background load that invalidates the runner so the next turn rebuilds with mod tools. (Codex P2)
- schemas/chat.py: TodayContext rejected any date != server's date.today(), so a client a calendar day ahead/behind a differently-zoned server got a 422 and chat broke. Accept server day +/- 1; still reject clearly stale dates and non-ISO input. (Codex P2)
- Tests for both.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
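The date-tolerance rule described for `TodayContext` (accept the server day +/- 1, reject stale dates and non-ISO input) could look roughly like this as a plain function. It is a sketch under assumptions: `validate_client_date`, `server_today`, and `tolerance_days` are illustrative names, and the real schema validator in `schemas/chat.py` may be structured differently.

```python
from __future__ import annotations

from datetime import date


def validate_client_date(
    raw: str,
    *,
    server_today: date | None = None,
    tolerance_days: int = 1,
) -> date:
    """Parse an ISO date and accept it only if it is within `tolerance_days` of the server's day."""
    today = server_today or date.today()
    try:
        client_day = date.fromisoformat(raw)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"not an ISO date: {raw!r}") from exc

    # A client in another time zone can legitimately be one calendar day
    # ahead of or behind the server, so compare with a tolerance.
    if abs((client_day - today).days) > tolerance_days:
        raise ValueError(
            f"date {client_day.isoformat()} is too far from server day {today.isoformat()}"
        )
    return client_day
```

Passing `server_today` explicitly keeps the function deterministic in tests; in production it would default to the server's current date.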
- Added `useBackground` hook to manage background configurations, including saving and resolving background URLs.
- Introduced `useTheme` hook to handle theme settings, allowing users to toggle between dark, light, and system themes.
- Created `BackgroundConfig` and `Theme` types for better type safety.
- Updated `AppearanceSettings` component to integrate background and theme management, replacing the previous banner functionality.
- Removed legacy banner handling code and associated preferences.
- Enhanced the theme management logic to respond to system theme changes.
- Added new icons for UI enhancements: `ChevronDownIcon` and `ChevronUpIcon`.
Summary
The primary work here is a three-phase audit-driven overhaul of the agent server (`apps/server/src/anima_server/services/agent/`), all tested. It also bundles pre-existing in-progress feature work (daily diary, desktop today/mood/journal, api-client) that was already uncommitted in the tree; per request, everything except personal interview notes is included.
Agent server changes (the audit work)
Reliability
- … `running` run that replays as unanswered history.
- … `run_started` event carries the run id; WebSocket cancel handler implemented; cancel events race-safe.
- … `except` blocks now log.
Response quality
Performance
- … `thread.join`); post-turn compaction moved to a background task; static identity blocks cached via the companion version counter while volatile/query-ranked blocks rebuild per turn; non-blocking mod-tools fetch with negative cache.
Refactor / cleanup
- `llm_json.py` (`call_llm_for_json` / `call_llm_for_text`) with 6 call sites migrated; deleted dead `predict_calibrate.py`, `streaming_utils.py`, and `consolidate_pending_ops()`.
- … `docs/audits/2026-06-11-agent-server-audit.md`.
An adversarial self-review of the diff caught and fixed three regressions before this PR (Stage-3 zombie run, stream-timeout generator leak, claim-dedup nondeterminism).
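For readers unfamiliar with the kind of helper that `llm_json.py` consolidates, here is a minimal sketch of the general pattern: call a model, strip an optional Markdown code fence from the reply, parse JSON, and fall back to a default on failure. The names `extract_json` and `call_for_json_sketch`, the injected `complete` callable, and the `default` parameter are all assumptions made for illustration; the actual `call_llm_for_json` signature is not shown in this PR description.

```python
import json
from typing import Any, Callable

FENCE = "`" * 3  # Markdown code-fence marker, built indirectly to keep this block fence-safe


def extract_json(text: str) -> Any:
    """Parse JSON from model output, tolerating a surrounding Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag such as "json").
        first_newline = cleaned.find("\n")
        cleaned = cleaned[first_newline + 1:] if first_newline != -1 else ""
        # Drop the closing fence if present.
        cleaned = cleaned.rstrip()
        if cleaned.endswith(FENCE):
            cleaned = cleaned[: -len(FENCE)]
    return json.loads(cleaned)


def call_for_json_sketch(
    complete: Callable[[str], str], prompt: str, default: Any = None
) -> Any:
    """Call a (hypothetical) text-completion function and return parsed JSON, or `default` on bad output."""
    try:
        return extract_json(complete(prompt))
    except json.JSONDecodeError:
        return default
```

Injecting `complete` keeps the parsing logic testable without a live model, which is one plausible reason to centralize this in a single helper rather than repeat it at each call site.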
Tests: full server suite green — 1452 passed, 1 skipped.
Bundled pre-existing WIP (not part of the audit, included on request)
🤖 Generated with Claude Code