agent: harden runtime reliability #56

Merged
G9000 merged 8 commits into main from fix-agent-runtime-reliability on May 21, 2026
G9000 commented May 18, 2026

Owner

Summary

  • Isolate companion conversation history by thread and reload runtime history through explicit thread IDs.
  • Serialize approval resume under the same per-thread lock as normal turns and preserve cancelled runs as terminal, including task cancellation cleanup.
  • Scope delegated action results to their originating connection, preserve OpenAI-compatible temperature through tool binding, and align prompt wording with strict tool schemas.
  • Add the completed runtime reliability plan and update the v2 memory recall scratchboard todo.
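The first bullet (thread-isolated history, reloaded through explicit thread IDs) can be sketched as below. This is a minimal illustration, not the code in companion.py: the class name `ThreadScopedHistory`, its methods, and the `max_messages` parameter are all hypothetical.

```python
from collections import deque


class ThreadScopedHistory:
    """Hypothetical store: one bounded message window per thread_id."""

    def __init__(self, max_messages: int = 50) -> None:
        self._max_messages = max_messages
        self._windows: dict[int, deque] = {}

    def append(self, thread_id: int, message: dict) -> None:
        # Each thread gets its own window, so threads never share history.
        window = self._windows.setdefault(
            thread_id, deque(maxlen=self._max_messages)
        )
        window.append(dict(message))

    def get(self, thread_id: int) -> list:
        # Return copies so callers cannot mutate the cached window.
        return [dict(m) for m in self._windows.get(thread_id, ())]

    def reload(self, thread_id: int, messages: list) -> None:
        # Replace only the named thread's window; other threads are untouched.
        self._windows[thread_id] = deque(
            (dict(m) for m in messages), maxlen=self._max_messages
        )
```

Returning copies from `get()` costs an allocation per read, but it keeps a caller that edits its snapshot from silently changing what the next turn sees.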

Test Plan

  • bun run test:server (1397 passed, 1 skipped)
  • bun run lint
  • bun run build
  • git diff --check

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e4091d16f5


Comment thread apps/server/src/anima_server/services/agent/service.py Outdated

Copilot AI left a comment

Contributor

Pull request overview

Hardens the agent runtime’s correctness under concurrency by tightening per-thread state isolation, ensuring approval re-entry and cancellation are serialized/terminal-safe, and scoping delegated tool results to the originating WebSocket connection. It also aligns prompts/tool binding behavior with strict tool schemas and preserves user-scoped DB routing in background consolidation/episode generation paths.

Changes:

  • Isolate companion conversation history per thread and refresh caches explicitly by thread_id.
  • Serialize approval resume under the per-thread lock; preserve cancellation as terminal (including task-cancel cleanup).
  • Scope delegated tool results by connection, preserve temperature through OpenAI-compatible tool binding, and remove schema-invalid “thinking argument” prompt wording.
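The serialization described in the second bullet can be illustrated with one `asyncio.Lock` per thread. This is a sketch under assumptions, not the code in service.py: `ThreadLocks`, `run_serialized`, and `demo` are hypothetical names, and the real turn and approval-resume bodies are replaced by log entries.

```python
import asyncio
from collections import defaultdict


class ThreadLocks:
    """Hypothetical registry handing out one asyncio.Lock per thread_id."""

    def __init__(self) -> None:
        self._locks = defaultdict(asyncio.Lock)

    def for_thread(self, thread_id: int) -> asyncio.Lock:
        return self._locks[thread_id]


async def run_serialized(locks: ThreadLocks, thread_id: int, label: str, log: list) -> None:
    # Normal turns and approval resumes take the same lock for a thread,
    # so their critical sections cannot interleave.
    async with locks.for_thread(thread_id):
        log.append(f"{label}:start")
        await asyncio.sleep(0)  # yield so another task could try to interleave
        log.append(f"{label}:end")


async def demo() -> list:
    locks = ThreadLocks()
    log: list = []
    await asyncio.gather(
        run_serialized(locks, 1, "turn", log),
        run_serialized(locks, 1, "approval_resume", log),
    )
    return log
```

Calling `asyncio.run(demo())` yields a log in which each `start` is immediately followed by its own `end`. Work on different thread IDs uses different locks and is not blocked.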

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scratchboard/v2-memory-recall-reliability/todo.md | Records Phase 4 runtime reliability work + validation gates. |
| docs/superpowers/plans/2026-05-18-agent-runtime-reliability.md | Adds the implementation plan covering AR-001..AR-006. |
| apps/server/tests/test_p5_transcript_archive.py | Adds coverage ensuring consolidation uses user-scoped soul DB factories. |
| apps/server/tests/test_inner_thought.py | Updates assertions to match prompt wording (no required thinking arg). |
| apps/server/tests/test_client_action_tools.py | Adds regression ensuring duplicate tool_call_ids are scoped by connection. |
| apps/server/tests/test_approval_reentry.py | Adds regression proving approval resume takes the per-thread lock. |
| apps/server/tests/test_agent_system_prompt.py | Asserts prompt no longer references a “thinking argument”. |
| apps/server/tests/test_agent_service.py | Adds regression tests for thread-scoped history + cancellation terminal behavior. |
| apps/server/tests/test_agent_openai_compatible_client.py | Adds regression ensuring bind_tools() preserves temperature. |
| apps/server/tests/test_agent_episodes.py | Adds regression ensuring episode generation uses user-scoped DB factory. |
| apps/server/src/anima_server/services/agent/templates/system_rules.md.j2 | Removes schema-invalid “thinking argument” requirement wording. |
| apps/server/src/anima_server/services/agent/templates/system_prompt.md.j2 | Rewords cognitive loop to avoid requiring schema-absent args. |
| apps/server/src/anima_server/services/agent/service.py | Adds thread lock around approval resume; thread-scoped cache refresh; cancellation cleanup on task cancel. |
| apps/server/src/anima_server/services/agent/persistence.py | Makes finalize_run() refuse to overwrite cancelled/failed runs. |
| apps/server/src/anima_server/services/agent/openai_compatible_client.py | Preserves temperature when binding tools. |
| apps/server/src/anima_server/services/agent/episodes.py | Switches to get_user_session_factory(user_id) for soul DB access. |
| apps/server/src/anima_server/services/agent/eager_consolidation.py | Uses user-scoped soul DB factories per thread in consolidation/sweeps. |
| apps/server/src/anima_server/services/agent/companion.py | Stores conversation windows per thread and returns defensive history copies. |
| apps/server/src/anima_server/services/agent/client_actions.py | Keys pending delegated tool calls by connection identity + call id; validates tool_name on resolve. |
| apps/server/src/anima_server/api/routes/ws.py | Resolves delegated tool results via the authenticated connection context. |
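The persistence.py entry ("refuse to overwrite cancelled/failed runs") amounts to a guard on terminal states. The sketch below shows the idea only. The real `finalize_run()` signature and run model are not shown in this PR page, so the `Run` dataclass, the status strings, and the boolean return value are assumptions.

```python
from dataclasses import dataclass

# Assumed status names; the actual values in persistence.py may differ.
TERMINAL_STATUSES = frozenset({"cancelled", "failed"})


@dataclass
class Run:
    """Hypothetical stand-in for the persisted run record."""

    id: int
    status: str = "running"


def finalize_run(run: Run, status: str = "completed") -> bool:
    """Mark the run finished unless it is already cancelled or failed.

    Returns True when the status was written, False when the write was refused.
    """
    if run.status in TERMINAL_STATUSES:
        return False
    run.status = status
    return True
```

With this guard, a turn that finishes after the user cancelled it cannot flip the run back to completed.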

Comment thread apps/server/src/anima_server/services/agent/companion.py
Comment thread apps/server/src/anima_server/services/agent/client_actions.py
Comment thread apps/server/src/anima_server/services/agent/persistence.py Outdated

Copilot AI left a comment

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

apps/server/src/anima_server/services/agent/client_actions.py:135

  • ClientActionRegistry now scopes pending delegated tool calls by (id(conn), tool_call_id), but tool_call_id can be deterministic (e.g., AgentRuntime generates synthetic-{tool}-{step_index}-{i}), so two concurrent runs on the same connection can still collide and overwrite each other’s pending entry. This can route a tool_result to the wrong awaiting turn. Consider namespacing the wire tool_call_id with run_id/thread_id (or generating a server-side UUID and mapping back) so it’s unique per connection even when models reuse IDs.
        if conn is None:
            raise DelegationTimeout(
                f"No connected client has registered action tool {tool_name!r}"
            )

        loop = asyncio.get_running_loop()
        future: asyncio.Future[DelegatedToolResult] = loop.create_future()
        pending_key = (id(conn), tool_call_id)
        self._pending[pending_key] = _PendingActionCall(
            connection=conn,
            future=future,
            tool_name=tool_name,
        )
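One way to read the suggestion above ("generating a server-side UUID and mapping back") is sketched here. It is an illustration, not the registry's actual code: `WireIdMapper`, `register`, and `resolve` are hypothetical names, and `run_id` is assumed to be a string.

```python
import uuid
from typing import Optional


class WireIdMapper:
    """Hypothetical helper: issue a unique wire ID for each delegated call."""

    def __init__(self) -> None:
        self._by_wire_id: dict = {}

    def register(self, run_id: str, tool_call_id: str) -> str:
        # The wire ID is random, so two runs that reuse the same
        # deterministic tool_call_id still get distinct pending entries.
        wire_id = uuid.uuid4().hex
        self._by_wire_id[wire_id] = (run_id, tool_call_id)
        return wire_id

    def resolve(self, wire_id: str) -> Optional[tuple]:
        # Pop so a duplicate or late tool_result cannot resolve twice.
        return self._by_wire_id.pop(wire_id, None)
```

The client would echo the wire ID back in its tool_result, and the server would map it to the originating run and model-visible call ID before resolving the future.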

Comment thread apps/server/src/anima_server/services/agent/companion.py

Copilot AI left a comment

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comment thread apps/server/src/anima_server/services/agent/service.py

Copilot AI left a comment

Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.

Comment thread apps/server/src/anima_server/services/agent/persistence.py Outdated
Comment thread apps/server/src/anima_server/services/agent/persistence.py
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI commented May 18, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • invalid
    • Triggering command: .venv/bin/pytest .venv/bin/pytest apps/server/tests -q /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-hist.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-huf_compress.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-zstd_compress.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-zstd_compress_literals.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-zstd_compress_sequences.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-zstd_compress_superblock.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out… /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-zstd_fast.o /home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/fb80479a5fb81f6a-zstd_lazy.o know… known-linux-gnu/-m64 known-linux-gnu/-I stable-x86_64-unzstd/lib/ S/target/releasear S/target/releasecqD ld.lld S/target/release/home/REDACTED/work/animaOS/animaOS/target/release/build/zstd-sys-496f82dd5cfa2083/out/44ff4c55aa9e5133-debug.o (dns block)
    • Triggering command: /home/REDACTED/work/animaOS/animaOS/.venv/bin/pytest pytest stable-x86_64-un-c stable-x86_64-un-- stable-x86_64-un"/home/REDACTED/work/animaOS/animaOS/.venv/lib/python3.12/site-packages/pgserver/pginstall/bin/postgres" --single -F -O -j -c search_path=pg_catalog -c exit_on_er--norc 1a.r… 496f82dd5cfa2083embed-bitcode=no S/target/release--allow=non_upper_case_globals /deps/rustce1Qka--cfg /deps/thiserror_iptables /deps/thiserror_-w /deps/thiserror_-t /deps/thiserror_security /dep… /deps/thiserror_OUTPUT kages/pgserver/p-d /deps/thiserror_168.63.129.16 /deps/thiserror_bash /deps/thiserror_--norc /deps/thiserror_--noprofile /deps/thiserror_owner (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI left a comment

Copilot reviewed 22 out of 22 changed files in this pull request and generated no new comments.

@G9000 G9000 merged commit 76d58e1 into main May 21, 2026
1 check passed
3 participants