fix: soul.md silently truncated to 2000 chars in agent system prompt by TatsuKo-Tsukimi · Pull Request #672 · dataelement/Clawith

TatsuKo-Tsukimi · 2026-06-11T13:29:39Z

Problem

build_agent_context() — the prompt assembler used by chat, heartbeat, A2A and task execution — reads the agent's soul with a 2000-character cap:

https://github.com/dataelement/Clawith/blob/30e8b774/backend/app/services/agent_context.py#L253

_read_file_safe (L15–23) silently truncates anything past max_chars, appending ...(truncated) with no log. So for any agent whose soul.md exceeds 2000 chars, everything after char 2000 — rules, boundaries, operational facts — never reaches the model, while the file, the DB and the UI all display the full soul. That makes the failure very hard to diagnose: the agent behaves correctly on the head of its soul and confidently denies facts stated in the tail.
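The failure mode can be reproduced with a minimal sketch. The helper below is a hypothetical stand-in for _read_file_safe, not the repository's code: its name, signature, and error handling are assumptions based only on the behavior described above. The soul size and fact position mirror the example in the Evidence section, and the URL is a placeholder.

```python
import tempfile
from pathlib import Path

TRUNCATION_MARKER = "...(truncated)"  # marker text as described in this PR


def read_file_safe_sketch(path: Path, max_chars: int = 2000) -> str:
    """Hypothetical stand-in for _read_file_safe: cut silently at max_chars."""
    try:
        content = path.read_text(encoding="utf-8")
    except OSError:
        return ""
    if len(content) > max_chars:
        # No log line is emitted here, which is what makes the loss invisible.
        return content[:max_chars] + TRUNCATION_MARKER
    return content


def demo() -> tuple[str, bool]:
    """Write a 5,927-char soul with a fact at char 5550, then read it with the old cap."""
    fact = "Service URL: https://example.invalid/service"  # placeholder fact
    soul = "x" * 5550 + fact
    soul = soul + "y" * (5927 - len(soul))
    with tempfile.TemporaryDirectory() as tmp:
        soul_path = Path(tmp) / "soul.md"
        soul_path.write_text(soul, encoding="utf-8")
        prompt_part = read_file_safe_sketch(soul_path, max_chars=2000)
    # The second value reports whether the fact survived into the prompt text.
    return prompt_part, fact in prompt_part
```

Calling demo() returns text that ends with the truncation marker and does not contain the fact, while the file on disk still holds the full soul.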

Evidence

On one of our deployments, 68 of 141 agents had souls over the cap (largest 12,089 chars — only the first 17% ever reached the model). The symptom that surfaced it: an agent whose soul explicitly lists a service URL kept answering "there is no URL I can share" — the URL section sat at char ~5550 of a 5,927-char soul, past the cut.
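A count like this can be gathered with a small audit script. This is a sketch that assumes each agent has its own directory containing a soul.md under a common root; that layout and the function name audit_souls are hypothetical, not Clawith's documented structure.

```python
from pathlib import Path


def audit_souls(agents_root: Path, cap: int = 2000) -> list[tuple[str, int, float]]:
    """Return (agent_dir_name, soul_length, fraction_reaching_model) for souls over cap."""
    over_cap = []
    for soul_path in sorted(agents_root.glob("*/soul.md")):
        length = len(soul_path.read_text(encoding="utf-8"))
        if length > cap:
            # With a hard cut at `cap`, only cap / length of the soul is kept.
            over_cap.append((soul_path.parent.name, length, cap / length))
    return over_cap
```

For the largest soul mentioned above, 2000 / 12,089 ≈ 0.165, which matches the roughly 17% figure.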

Fix

Raise the soul read cap to 30000 at the call site, with a comment explaining the asymmetry: soul is author-curated and bounded (only seeded or explicitly edited), unlike memory/relationships which grow unbounded at runtime and keep their small caps.

One line + comment. Agents with souls ≤2000 chars see a byte-identical prompt — zero behavior change for them.
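The no-op claim for short souls can be checked with a small sketch. The truncate helper and the constant names are hypothetical stand-ins; only the 2000 and 30000 values come from this PR.

```python
OLD_SOUL_CAP = 2000    # soul cap before this PR
NEW_SOUL_CAP = 30000   # soul cap after this PR


def truncate(text: str, max_chars: int) -> str:
    """Hypothetical stand-in for the truncation done by _read_file_safe."""
    if len(text) > max_chars:
        return text[:max_chars] + "...(truncated)"
    return text


short_soul = "a" * 1800
long_soul = "b" * 12089

# Short souls produce the same text under both caps.
short_unchanged = truncate(short_soul, OLD_SOUL_CAP) == truncate(short_soul, NEW_SOUL_CAP)
# Long souls were cut under the old cap and pass through whole under the new one.
long_was_cut = truncate(long_soul, OLD_SOUL_CAP) != long_soul
long_now_whole = truncate(long_soul, NEW_SOUL_CAP) == long_soul
```

All three flags evaluate to True: the change only affects souls that were already being cut.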

Verified on a live deployment: after this change, build_agent_context static prompts for previously-truncated agents contain their full souls (no ...(truncated) marker), and the misbehaving agents answer tail-section questions correctly.

Possible follow-ups (not in this PR)

Log a warning in _read_file_safe when truncation actually fires, so silent prompt loss is visible in logs.
Validate soul size at seed/edit time instead of truncating at read time.
Hoist the cap into a named constant (e.g. SOUL_CONTEXT_MAX_CHARS) and add a unit test asserting the soul read uses it while memory keeps its small cap.
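The first two follow-ups could look roughly like the sketch below. Nothing here exists in the codebase yet: the logger name, both function names, and the message wording are assumptions, and SOUL_CONTEXT_MAX_CHARS is only the constant name suggested above.

```python
import logging

logger = logging.getLogger("agent_context_sketch")  # hypothetical logger name

SOUL_CONTEXT_MAX_CHARS = 30000  # name suggested in this PR; not yet in the codebase


def truncate_with_warning(text: str, max_chars: int, label: str) -> str:
    """Truncate at max_chars, but log a warning whenever content is dropped."""
    if len(text) > max_chars:
        logger.warning(
            "%s truncated: %d chars kept, %d chars dropped",
            label, max_chars, len(text) - max_chars,
        )
        return text[:max_chars] + "...(truncated)"
    return text


def validate_soul(text: str, max_chars: int = SOUL_CONTEXT_MAX_CHARS) -> None:
    """Reject an oversized soul at seed/edit time instead of cutting it at read time."""
    if len(text) > max_chars:
        raise ValueError(f"soul is {len(text)} chars; the limit is {max_chars}")
```

Validating at write time turns silent prompt loss into an error the author sees immediately; the warning covers files that change by some other path.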

Unrelated pre-existing note: backend/tests/test_agent_context.py still asserts ## Focus injection, but that injection is commented out at main (agent_context.py#L676) — that test fails on main with or without this PR.

🤖 Generated with Claude Code

_read_file_safe silently truncates at max_chars (appending "...(truncated)" without logging), and build_agent_context read soul.md with a 2000-char cap. Any agent whose soul.md exceeds 2000 chars therefore ran with every tail section (rules, boundaries, operational facts) silently missing from its system prompt, while the file, DB, and UI all showed the full soul.

On one deployment, 68 of 141 agents exceeded the cap (largest 12k chars: only the first 17% ever reached the model), and agents confidently denied facts their souls plainly stated.

Soul is author-curated and bounded, so a generous 30000-char cap is safe. Memory and relationships keep their small caps because they grow unbounded at runtime.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

yaojin3616 merged commit 18cc83a into dataelement:main Jun 11, 2026
1 check failed

yaojin3616 merged 1 commit into dataelement:main from TatsuKo-Tsukimi:fix/agent-context-soul-truncation
