Agent server audit fixes (+ bundled diary/desktop WIP) by G9000 · Pull Request #58 · G9000/animaOS

G9000 · 2026-06-17T03:41:31Z

Summary

The primary work here is a three-phase audit-driven overhaul of the agent server (apps/server/src/anima_server/services/agent/), all tested. It also bundles pre-existing in-progress feature work (daily diary, desktop today/mood/journal, api-client) that was already uncommitted in the tree — per request, everything except personal interview notes is included.

Agent server changes (the audit work)

Reliability

Turn-setup / result-persist failures now mark the run failed and evict the orphaned user message, instead of leaving a zombie "running" run that replays as unanswered history.
Context-overflow retry no longer re-executes already-run tools (only retries before any tool side effect).
End-to-end run cancellation: run row committed early so the cancel endpoint sees in-flight runs (also releases the thread-row lock held across the LLM call); run_started event carries the run id; WebSocket cancel handler implemented; cancel events race-safe.
Stream inactivity timeout so a stalled LLM stream can't pin the thread lock for ~10 min; raw exception text masked from clients; previously-silent except blocks now log.
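The stream inactivity timeout above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `with_inactivity_timeout` and `StreamInactivityError` are hypothetical names, and it assumes the LLM stream is exposed as an async iterator. The `finally` block closes the wrapped stream so a timed-out or abandoned generator is not left open.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")


class StreamInactivityError(TimeoutError):
    """Raised when no chunk arrives within the allowed idle window (hypothetical name)."""


async def with_inactivity_timeout(
    stream: AsyncIterator[T], idle_seconds: float
) -> AsyncIterator[T]:
    """Yield items from `stream`; fail if the gap between two items exceeds idle_seconds."""
    iterator = stream.__aiter__()
    try:
        while True:
            try:
                # The timeout applies to each chunk, not to the whole stream.
                item = await asyncio.wait_for(iterator.__anext__(), idle_seconds)
            except StopAsyncIteration:
                return
            except asyncio.TimeoutError:
                raise StreamInactivityError(
                    f"no chunk received for {idle_seconds}s"
                ) from None
            yield item
    finally:
        # Close the underlying stream so it is not leaked on timeout or early exit.
        aclose = getattr(iterator, "aclose", None)
        if aclose is not None:
            await aclose()
```

A caller would wrap the provider stream (`async for chunk in with_inactivity_timeout(llm_stream, 30.0): ...`) and treat `StreamInactivityError` as a failed run, which lets any lock held for the turn be released promptly.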

Response quality

Prompt budget derived from the model's context window (was a fixed 24k chars); boundary-aware truncation (no mid-fact cuts); recency+heat blend in automatic retrieval; importance floor in heat scoring so important memories don't decay into invisibility; cross-block dedup; embedding-free dedup for paraphrased free-form claims.
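As an illustration of boundary-aware truncation, here is a minimal sketch under stated assumptions: a memory block stores one fact per line, and the budget is given in characters. The function name `truncate_at_boundary` is hypothetical and the PR's actual implementation may differ.

```python
def truncate_at_boundary(text: str, max_chars: int) -> str:
    """Trim `text` to at most max_chars without cutting a line (fact) in half.

    Assumption: each line holds one fact. If even the first line does not fit,
    fall back to the last sentence end, and only then to a hard cut.
    """
    if max_chars <= 0:
        return ""
    if len(text) <= max_chars:
        return text

    kept = []
    used = 0
    for line in text.split("\n"):
        cost = len(line) + (1 if kept else 0)  # +1 for the joining newline
        if used + cost > max_chars:
            break
        kept.append(line)
        used += cost
    if kept:
        return "\n".join(kept)

    head = text[:max_chars]
    cut = head.rfind(". ")
    return head[: cut + 1] if cut != -1 else head
```

Dropping whole trailing lines wastes a little of the budget, but every fact that survives is complete.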

Performance

Soul Writer LLM work deferred off the pre-turn path (TTFT); knowledge-graph block reuses the turn's existing query embedding (drops a blocking thread.join); post-turn compaction moved to a background task; static identity blocks cached via the companion version counter while volatile/query-ranked blocks rebuild per turn; non-blocking mod-tools fetch with negative cache.
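The negative cache mentioned for the mod-tools fetch can be sketched like this. It is an assumption-laden illustration: `NegativeCache`, `load_tools`, and the retry window are hypothetical, and the sketch covers only the "remember a recent failure" part, not the background (non-blocking) scheduling.

```python
import time
from typing import Callable, List, Optional


class NegativeCache:
    """Remembers a recent failure so callers skip retries for a while (hypothetical)."""

    def __init__(
        self,
        retry_after_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._retry_after = retry_after_seconds
        self._clock = clock
        self._failed_at: Optional[float] = None

    def should_attempt(self) -> bool:
        if self._failed_at is None:
            return True
        return self._clock() - self._failed_at >= self._retry_after

    def record_failure(self) -> None:
        self._failed_at = self._clock()

    def record_success(self) -> None:
        self._failed_at = None


def load_tools(fetch: Callable[[], List[str]], cache: NegativeCache) -> List[str]:
    """Return fetched tools, or an empty list while a failure is still cached."""
    if not cache.should_attempt():
        return []  # recent failure: do not hit the unavailable service again
    try:
        tools = fetch()
    except Exception:
        cache.record_failure()
        return []
    cache.record_success()
    return tools
```

With this shape, a turn that arrives while the mod service is down pays for at most one failed fetch per retry window instead of one per turn.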

Refactor / cleanup

New llm_json.py (call_llm_for_json/call_llm_for_text) with 6 call sites migrated; deleted dead predict_calibrate.py, streaming_utils.py, and consolidate_pending_ops().
Findings, fixes, and a self-review pass documented in docs/audits/2026-06-11-agent-server-audit.md.

An adversarial self-review of the diff caught and fixed three regressions before this PR (Stage-3 zombie run, stream-timeout generator leak, claim-dedup nondeterminism).

Tests: full server suite green — 1452 passed, 1 skipped.

Bundled pre-existing WIP (not part of the audit, included on request)

Daily diary feature: server routes/schemas/services + alembic migration + design docs.
Desktop: today/mood panel, journal, appearance settings, nav.
api-client updates.

Note: my audit edits were layered on top of pre-existing uncommitted edits in the same agent files, so per-file diffs may include unrelated pre-existing changes. Excluded: INTERVIEW_*.md, interview_link.md.

🤖 Generated with Claude Code

Agent server (audit-driven, primary work of this change):
- Reliability: fix zombie "running" runs on turn-setup/persist failure (evict orphaned user message, mark run failed); guard context-overflow retry against double tool execution; end-to-end run cancellation (early run commit, run_started event, WS cancel handler, race-safe cancel events); stream inactivity timeout; mask raw exception text to clients; log previously-silent excepts.
- Response quality: derive prompt budget from the model context window; boundary-aware block truncation; recency+heat blend in automatic retrieval; importance floor in heat scoring; cross-block dedup; embedding-free claim dedup for paraphrased facts.
- Performance: defer Soul Writer LLM work off the pre-turn path; reuse the turn's query embedding in the knowledge-graph block (drop blocking thread.join); background post-turn compaction; cache static identity blocks via the companion version counter, rebuild volatile/query blocks per turn; non-blocking mod-tools fetch with negative cache.
- Refactor/cleanup: add call_llm_for_json/llm_json helper and migrate 6 modules; delete dead predict_calibrate.py, streaming_utils.py, and consolidate_pending_ops(). Audit + status notes in docs/audits.
- Tests for all of the above; full server suite green (1452 passed).

Also bundled (pre-existing in-progress work, not part of the audit):
- Daily diary feature (server routes/schemas/services + migration + docs)
- Desktop today/mood panel, journal, appearance settings
- api-client updates

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0e9a3795a3

… context
- tools.py/service.py: when anima-mod tools load via the background fetch (cold cache on the event loop at startup), the already-built runner was cached without them and never picked them up. Fire a callback on first successful background load that invalidates the runner so the next turn rebuilds with mod tools. (Codex P2)
- schemas/chat.py: TodayContext rejected any date != server's date.today(), so a client a calendar day ahead/behind a differently-zoned server got a 422 and chat broke. Accept server day +/- 1; still reject clearly stale dates and non-ISO input. (Codex P2)
- Tests for both.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
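The date tolerance described for TodayContext can be illustrated with a plain function. This is a sketch, not the schema validator in schemas/chat.py: `validate_client_date` and its parameters are hypothetical, and it relies only on the standard library.

```python
from datetime import date
from typing import Optional


def validate_client_date(
    raw: str,
    server_today: Optional[date] = None,
    tolerance_days: int = 1,
) -> date:
    """Accept a client date within +/- tolerance_days of the server's day.

    Raises ValueError for non-ISO input and for dates outside the window.
    """
    parsed = date.fromisoformat(raw)  # raises ValueError on non-ISO input
    today = server_today if server_today is not None else date.today()
    if abs((parsed - today).days) > tolerance_days:
        raise ValueError(
            f"date {parsed.isoformat()} is more than {tolerance_days} day(s) "
            f"from server date {today.isoformat()}"
        )
    return parsed
```

A one-day window covers clients whose local calendar day differs from the server's because of time zones, while a date several days off is still rejected as stale.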

- Added `useBackground` hook to manage background configurations, including saving and resolving background URLs.
- Introduced `useTheme` hook to handle theme settings, allowing users to toggle between dark, light, and system themes.
- Created `BackgroundConfig` and `Theme` types for better type safety.
- Updated `AppearanceSettings` component to integrate background and theme management, replacing the previous banner functionality.
- Removed legacy banner handling code and associated preferences.
- Enhanced the theme management logic to respond to system theme changes.
- Added new icons for UI enhancements: `ChevronDownIcon` and `ChevronUpIcon`.

G9000 and others added 10 commits May 30, 2026 17:15

docs: design today user context

7fbc528

docs: plan today user context

c83b812

feat: inject ephemeral today context

07f46d1

api-client: send today context

c9ce844

desktop: add today context input

891a164

desktop: suggest today context from check-in

dabac07

desktop: surface today context on dashboard

4a6bfd8

desktop: add expressive today mood picker

5448e3b

desktop: compact today mood panel

40f8129

chatgpt-codex-connector Bot reviewed Jun 17, 2026

Comment thread apps/server/src/anima_server/schemas/chat.py Outdated

Comment thread apps/server/src/anima_server/services/agent/tools.py

G9000 and others added 2 commits June 19, 2026 00:41

G9000 wants to merge 12 commits into main from agent-server-improvements
