Summary
When a tool returns a large result, the agent runtime truncates it to a fixed character budget and discards the overflow. The truncated content is lost — the agent cannot page back to it. This degrades reliability on exactly the workflows that matter most in CLI-first/coding contexts: grepping a repo, reading a large file, capturing a build/test log, or a verbose API response.
The core SDK already defines the right abstraction for this — ArtifactStoreProtocol and ArtifactRef (with head / tail / grep / chunk / load) in praisonaiagents/context/artifacts.py — but there is no concrete implementation and it is not wired into the tool-execution truncation path. This issue is about finishing that loop.
Current behaviour
praisonaiagents/agent/tool_execution.py (~lines 309–345) truncates tool output to a head+tail preview and throws the middle away:
limit = getattr(self, 'tool_output_limit', 16000)
if len(result_str) > limit:
    tail_size = min(limit // 5, 2000)
    head = result_str[:limit - tail_size]
    tail = result_str[-tail_size:] if tail_size > 0 else ""
    truncated = f"{head}\n...[{len(result_str):,} chars, showing first/last portions]...\n{tail}"
The dropped content is unrecoverable; nothing is persisted and no reference is returned to the model.
- Built-in shell tooling caps output with a hardcoded byte limit (max_output_size, default ~10000, in praisonaiagents/tools/shell_tools.py), tail-biased, with no spill.
- Limits are hardcoded (16000 chars in the agent loop, ~10000 bytes in shell): there is no CLI flag, no YAML key, and no Config knob to tune them, and no line-based (max_lines) limit or head/tail direction control.
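To make the loss concrete, the head+tail slicing quoted above can be reproduced as a standalone function. This is a minimal sketch: `preview_truncate` is a hypothetical name, and only the slicing arithmetic is taken from the snippet; the surrounding runtime is not modelled.

```python
def preview_truncate(result_str: str, limit: int = 16000) -> str:
    """Standalone copy of the head+tail slicing quoted from tool_execution.py."""
    if len(result_str) <= limit:
        return result_str  # small outputs pass through untouched
    tail_size = min(limit // 5, 2000)
    head = result_str[:limit - tail_size]
    tail = result_str[-tail_size:] if tail_size > 0 else ""
    return f"{head}\n...[{len(result_str):,} chars, showing first/last portions]...\n{tail}"


# A 50,000-character output with a marker buried in the middle.
output = "a" * 25000 + "NEEDLE" + "b" * 24994
preview = preview_truncate(output)
dropped = len(output) - 16000  # characters that never reach the model
```

With the default limit, 14,000 head characters and 2,000 tail characters survive; the remaining 34,000, including the marker, are gone and cannot be requested again.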
ArtifactStoreProtocol, ArtifactRef, ArtifactMetadata, and GrepMatch are all defined in praisonaiagents/context/artifacts.py (the protocol exposes store, load, tail, head, grep, chunk, delete, list_artifacts), but:
- the only FileSystemArtifactStore reference is inside the module docstring (an illustration), not a real class;
- no concrete adapter exists in praisonaiagents or praisonai;
- nothing in the tool loop ever constructs an ArtifactRef or calls the store.
Desired behaviour
When a tool result exceeds the configured budget, the runtime should:
- Spill the full output to a content-addressed file via a concrete ArtifactStore.
- Return to the model a compact preview plus a retrievable ArtifactRef (using the existing ArtifactRef.to_inline()), so the model knows the full output exists and where it is.
- Expose lightweight follow-up operations the agent can call to page through the preserved output on demand: head, tail, grep, chunk, load (the protocol already specifies these).
- Make the budget configurable (per-run and per-tool) with max_bytes, max_lines, and head/tail direction, surfaced via Python config, YAML, and a CLI flag.
- Garbage-collect spilled artifacts after a retention window so the store does not grow unbounded.
This turns a lossy, silent truncation into a durable, browsable artifact — the agent retries less, wastes fewer tokens re-running tools, and large outputs stop blowing the context window.
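The spill-on-overflow behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the SDK's implementation: `SpilledRef` is a hypothetical stand-in for ArtifactRef (whose real fields and to_inline() format are not shown in this issue), and `spill_if_over_budget` is an illustrative name.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SpilledRef:
    """Hypothetical stand-in for ArtifactRef; the real class's fields may differ."""
    path: Path
    sha256: str
    size_chars: int

    def to_inline(self) -> str:
        # Illustrative format only, not the output of the real ArtifactRef.to_inline().
        return f"[artifact sha256={self.sha256[:12]} chars={self.size_chars:,} path={self.path}]"


def spill_if_over_budget(result: str, base_dir: Path, limit: int = 16000) -> str:
    """Return the result unchanged when it fits, else a preview plus a reference."""
    if len(result) <= limit:
        return result  # fast path: no hashing, no disk I/O
    data = result.encode("utf-8")
    digest = hashlib.sha256(data).hexdigest()
    base_dir.mkdir(parents=True, exist_ok=True)
    path = base_dir / f"{digest}.txt"
    if not path.exists():  # content-addressed: identical output is stored once
        path.write_bytes(data)
    ref = SpilledRef(path=path, sha256=digest, size_chars=len(result))
    tail_size = min(limit // 5, 2000)
    head = result[:limit - tail_size]
    tail = result[-tail_size:] if tail_size > 0 else ""
    return f"{head}\n...{ref.to_inline()}...\n{tail}"


base = Path(tempfile.mkdtemp())
big = "x" * 20000 + "NEEDLE" + "y" * 20000
seen_by_model = spill_if_over_budget(big, base)
stored = next(base.glob("*.txt")).read_text(encoding="utf-8")
```

The model still sees only a bounded preview, but the preview now names a file that holds the complete output, so the middle section stays reachable.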
Layer placement
- Primary layer: core (praisonaiagents)
- Why core: The truncation/spill happens inside the agent tool-execution loop and the built-in tools; the protocol it completes (ArtifactStoreProtocol) already lives in core. A default FileSystemArtifactStore adapter and the wiring belong next to the protocol so every entry point (Python, YAML, CLI) benefits uniformly.
- Why not wrapper: A wrapper would only cover the CLI path; the data-loss bug affects every Agent run regardless of entry point, so a wrapper-only fix leaves the SDK and YAML paths broken. (The wrapper still gets a thin surface; see secondary touch.)
- Why not tools: This is lifecycle behaviour of the tool-execution loop (how the runtime handles any tool's output), not a single agent-callable integration. Re-implementing it per tool in praisonai-tools would duplicate logic and miss MCP/plugin tools.
- Why not plugins: It is not a policy/guardrail hook; it is core runtime output handling that must run on the hot path by default, not an optional lifecycle plugin.
- Secondary touch (optional): the wrapper exposes CLI flags (e.g. --tool-output-max-bytes / --tool-output-max-lines) and config/YAML keys (tool_output.max_bytes, tool_output.max_lines, tool_output.direction, tool_output.retention_days); the existing OutputConfig/ExecutionConfig consolidation is the natural home for the Python surface.
- 3-way surface (CLI + YAML + Python): yes
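As an illustration of the Python surface, the four proposed keys could be carried by a small dataclass. The field names come from the keys listed above; everything else here is an assumption, including the default values, the `from_mapping` helper, the `exceeds` check, and the choice to count max_bytes in UTF-8 bytes.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ToolOutputConfig:
    """Hypothetical shape; field names follow the keys proposed in this issue."""
    max_bytes: Optional[int] = 16000   # assumed default; None disables the byte limit
    max_lines: Optional[int] = None    # None disables the line limit
    direction: str = "head"            # which end the preview favours: "head" or "tail"
    retention_days: int = 7            # assumed default; the issue does not fix one

    def __post_init__(self) -> None:
        if self.direction not in ("head", "tail"):
            raise ValueError(f"direction must be 'head' or 'tail', got {self.direction!r}")

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "ToolOutputConfig":
        """Build from the tool_output mapping of an already-parsed YAML file."""
        known = {"max_bytes", "max_lines", "direction", "retention_days"}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"unknown tool_output keys: {sorted(unknown)}")
        return cls(**raw)

    def exceeds(self, text: str) -> bool:
        """True when the text is over either configured limit."""
        if self.max_bytes is not None and len(text.encode("utf-8")) > self.max_bytes:
            return True
        if self.max_lines is not None and len(text.splitlines()) > self.max_lines:
            return True
        return False


cfg = ToolOutputConfig.from_mapping({"max_bytes": 1000, "max_lines": 50, "direction": "tail"})
```

Rejecting unknown keys up front is a design choice in this sketch: a misspelled YAML key fails loudly instead of silently leaving the default budget in place.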
Proposed approach
- Add a concrete FileSystemArtifactStore adapter in core implementing ArtifactStoreProtocol, persisting under the existing data dir (e.g. ~/.praisonai/artifacts/<agent_id>/<run_id>/), content-addressed by SHA256, with the metadata already modelled in ArtifactRef/ArtifactMetadata.
- In tool_execution.py, replace lossy truncation with: build the preview and store(...) the full result, then attach ArtifactRef.to_inline() to what the model sees. Keep the existing zero-overhead fast path for small outputs (no store call when under budget).
- Add agent-callable retrieval tools (artifact_head, artifact_tail, artifact_grep, artifact_chunk, artifact_load) thin-wrapping the store, registered lazily so they cost nothing until an overflow occurs.
- Introduce a ToolOutputConfig (or extend OutputConfig/ExecutionConfig) for max_bytes, max_lines, direction, retention_days; thread it through Python, YAML loader, and a CLI flag on praisonai run.
- Add a retention/GC sweep (age-based, mirroring the existing cache/checkpoint cleanup conventions) so artifacts are pruned.
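The retrieval tools in the list above could be thin, streaming reads over the spilled file. The sketch below assumes line-oriented text artifacts addressed by path; the tool names come from this issue, but the signatures, defaults, and return types are hypothetical and may not match what ArtifactStoreProtocol specifies.

```python
import re
import tempfile
from collections import deque
from itertools import islice
from pathlib import Path
from typing import List, Tuple


def artifact_head(path: Path, lines: int = 20) -> str:
    """First `lines` lines, read lazily so huge files are not loaded whole."""
    with path.open(encoding="utf-8") as fh:
        return "".join(islice(fh, lines))


def artifact_tail(path: Path, lines: int = 20) -> str:
    """Last `lines` lines, using a bounded deque as a sliding window."""
    with path.open(encoding="utf-8") as fh:
        return "".join(deque(fh, maxlen=lines))


def artifact_grep(path: Path, pattern: str, max_matches: int = 50) -> List[Tuple[int, str]]:
    """(1-based line number, line) pairs for lines matching a regex."""
    rx = re.compile(pattern)
    matches: List[Tuple[int, str]] = []
    with path.open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if rx.search(line):
                matches.append((lineno, line.rstrip("\n")))
                if len(matches) >= max_matches:
                    break  # keep the reply to the model bounded
    return matches


def artifact_chunk(path: Path, start_line: int, end_line: int) -> str:
    """Lines start_line..end_line inclusive, 1-based."""
    with path.open(encoding="utf-8") as fh:
        return "".join(islice(fh, start_line - 1, end_line))


# Demo on a 1,000-line log with a single failure buried at line 500.
log = Path(tempfile.mkdtemp()) / "build.log"
log.write_text("".join(
    f"line {i}: FAILED test_x\n" if i == 500 else f"line {i}: ok\n"
    for i in range(1, 1001)
), encoding="utf-8")
```

A typical follow-up would be one artifact_grep call to locate the failure, then artifact_chunk around the reported line number, which costs far fewer tokens than re-running the tool.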
Resolution sketch
- praisonaiagents/context/artifacts.py: add FileSystemArtifactStore(ArtifactStoreProtocol) (currently only a docstring stub).
- praisonaiagents/agent/tool_execution.py (~309–345): spill on overflow and return an ArtifactRef; preserve the fast path for small results.
- praisonaiagents/tools/: new lazy artifact_* retrieval tools backed by the store.
- praisonaiagents/config/: ToolOutputConfig (max_bytes, max_lines, direction, retention_days); follow the False=disabled / True=defaults / Config=custom consolidation pattern.
- praisonaiagents/paths.py: get_artifacts_dir() helper + GC hook.
- praisonai/praisonai/cli/commands/run.py: surface the budget flags; the YAML loader maps tool_output: keys.
- Backward compatible: defaults reproduce today's preview behaviour; spill and retrieval are additive with safe defaults (no API breakage).
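The GC hook mentioned for paths.py could be a plain age-based sweep. The following is a minimal sketch assuming file modification time is the retention clock; `sweep_artifacts` is a hypothetical name, and the existing cache/checkpoint cleanup conventions it is meant to mirror are not reproduced here.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def sweep_artifacts(root: Path, retention_days: float, now: Optional[float] = None) -> List[Path]:
    """Delete files under `root` whose mtime is older than the retention window."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    removed: List[Path] = []
    if not root.exists():
        return removed
    # Materialise the listing first so deletions do not disturb the traversal.
    for path in list(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


root = Path(tempfile.mkdtemp())
old, fresh = root / "old.txt", root / "run1" / "fresh.txt"
fresh.parent.mkdir()
old.write_text("stale")
fresh.write_text("recent")
ten_days_ago = time.time() - 10 * 86400
os.utime(old, (ten_days_ago, ten_days_ago))  # backdate the file's mtime
removed = sweep_artifacts(root, retention_days=7)
```

This version leaves empty run directories behind; whether to prune those as well is left open.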
Severity
High. This is silent data loss on the agent's hot path. It directly causes failed/looping tool use, wasted tokens, and context-window overflows in long CLI/coding sessions — the headline use case. The fix is well-scoped because the protocol and reference types already exist; only the implementation and wiring are missing.
Validation
- Real value: Production CLI/coding runs routinely produce tool outputs far above the 16000-char/10000-byte caps (repo greps, large file reads, build/test logs). Today that content is gone; the agent cannot recover it.
- Traced in code: lossy truncation at praisonaiagents/agent/tool_execution.py:~309–345; hardcoded shell cap in praisonaiagents/tools/shell_tools.py; complete-but-unimplemented ArtifactStoreProtocol/ArtifactRef in praisonaiagents/context/artifacts.py (the FileSystemArtifactStore mention is docstring-only; no concrete class, no callers).
- Reference pattern exists in the wild: mature terminal coding agents spill overflowing tool output to a temp file, hand the model a path/preview, let it page back via head/tail/grep, and GC after a retention window, which is exactly the API ArtifactStoreProtocol already specifies.
- Layer-validated: single primary layer (core); fix on the hot path benefits Python + YAML + CLI uniformly; wrapper/tools/plugins rejected for the reasons above.
- Aligned with design principles: protocol-first (implements an existing protocol), lazy (retrieval tools and store calls only engage on overflow), safe defaults (additive; current preview preserved), backward compatible.
Acceptance criteria
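- A concrete ArtifactStore adapter implementing the existing protocol.
- Tool output over the budget is spilled and returned to the model as a preview plus an ArtifactRef, with no loss of the full content.
- The agent can page through spilled output via head/tail/grep/chunk/load.
- The budget (max_bytes, max_lines, direction) is configurable via Python, YAML, and CLI.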