Retrievable tool-output overflow: implement and wire an ArtifactStore so large tool outputs are preserved, not discarded

## Summary

When a tool returns a large result, the agent runtime truncates it to a fixed character budget and **discards the overflow**. The truncated content is lost — the agent cannot page back to it. This degrades reliability on exactly the workflows that matter most in CLI-first/coding contexts: grepping a repo, reading a large file, capturing a build/test log, or a verbose API response.

The core SDK already defines the right abstraction for this — `ArtifactStoreProtocol` and `ArtifactRef` (with `head` / `tail` / `grep` / `chunk` / `load`) in `praisonaiagents/context/artifacts.py` — but there is **no concrete implementation** and it is **not wired** into the tool-execution truncation path. This issue is about finishing that loop.

## Current behaviour

- `praisonaiagents/agent/tool_execution.py` (~lines 309–345) truncates tool output to a head+tail preview and throws the middle away:
  ```python
  limit = getattr(self, 'tool_output_limit', 16000)
  if len(result_str) > limit:
      tail_size = min(limit // 5, 2000)
      head = result_str[:limit - tail_size]
      tail = result_str[-tail_size:] if tail_size > 0 else ""
      truncated = f"{head}\n...[{len(result_str):,} chars, showing first/last portions]...\n{tail}"
  ```
  The dropped content is unrecoverable; nothing is persisted and no reference is returned to the model.
- Built-in shell tooling caps output with a hardcoded byte limit (`max_output_size` default ~10000 in `praisonaiagents/tools/shell_tools.py`), tail-biased, with no spill.
- Limits are **hardcoded** (16000 chars in the agent loop, ~10000 bytes in shell) — there is no CLI flag, no YAML key, and no `Config` knob to tune them, and no line-based (`max_lines`) limit or head/tail direction control.
- `ArtifactStoreProtocol`, `ArtifactRef`, `ArtifactMetadata`, `GrepMatch` are all defined in `praisonaiagents/context/artifacts.py` (the protocol exposes `store`, `load`, `tail`, `head`, `grep`, `chunk`, `delete`, `list_artifacts`), but:
  - the only `FileSystemArtifactStore` reference is inside the module **docstring** (illustration), not a real class;
  - no concrete adapter exists in `praisonaiagents` or `praisonai`;
  - nothing in the tool loop ever constructs an `ArtifactRef` or calls the store.

## Desired behaviour

When a tool result exceeds the configured budget, the runtime should:

1. **Spill** the full output to a content-addressed file via a concrete `ArtifactStore`.
2. Return to the model a compact preview **plus** a retrievable `ArtifactRef` (using the existing `ArtifactRef.to_inline()`), so the model knows the full output exists and where it is.
3. Expose lightweight follow-up operations the agent can call to page through the preserved output on demand: `head`, `tail`, `grep`, `chunk`, `load` (the protocol already specifies these).
4. Make the budget **configurable** (per-run and per-tool) with `max_bytes`, `max_lines`, and head/tail `direction`, surfaced via Python config, YAML, and a CLI flag.
5. **Garbage-collect** spilled artifacts after a retention window so the store does not grow unbounded.

This turns a lossy, silent truncation into a durable, browsable artifact — the agent retries less, wastes fewer tokens re-running tools, and large outputs stop blowing the context window.

## Layer placement

- **Primary layer:** core (`praisonaiagents`)
- **Why not core → (this _is_ core):** The truncation/spill happens inside the agent tool-execution loop and the built-in tools; the protocol it completes (`ArtifactStoreProtocol`) already lives in core. A default `FileSystemArtifactStore` adapter + the wiring belong next to the protocol so every entry point (Python, YAML, CLI) benefits uniformly.
- **Why not wrapper:** Wrapper would only cover the CLI path; the data-loss bug affects every `Agent` run regardless of entry point, so a wrapper-only fix leaves the SDK and YAML paths broken. (Wrapper still gets a thin surface — see secondary touch.)
- **Why not tools:** This is lifecycle behaviour of the tool-execution loop (how the runtime handles any tool's output), not a single agent-callable integration. Re-implementing per tool in `praisonai-tools` would duplicate logic and miss MCP/plugin tools.
- **Why not plugins:** It is not a policy/guardrail hook — it is core runtime output handling that must run on the hot path by default, not an optional lifecycle plugin.
- **Secondary touch (optional):** wrapper exposes a CLI flag (e.g. `--tool-output-max-bytes` / `--tool-output-max-lines`) and config/YAML keys (`tool_output.max_bytes`, `tool_output.max_lines`, `tool_output.direction`, `tool_output.retention_days`); the existing `OutputConfig`/`ExecutionConfig` consolidation is the natural home for the Python surface.
- **3-way surface (CLI + YAML + Python):** yes

## Proposed approach

1. Add a concrete `FileSystemArtifactStore` adapter in core implementing `ArtifactStoreProtocol`, persisting under the existing data dir (e.g. `~/.praisonai/artifacts/<agent_id>/<run_id>/`), content-addressed by SHA256, with the metadata already modelled in `ArtifactRef`/`ArtifactMetadata`.
2. In `tool_execution.py`, replace lossy truncation with: build the preview **and** `store(...)` the full result, then attach `ArtifactRef.to_inline()` to what the model sees. Keep the existing zero-overhead fast path for small outputs (no store call when under budget).
3. Add agent-callable retrieval tools (`artifact_head`, `artifact_tail`, `artifact_grep`, `artifact_chunk`, `artifact_load`) thin-wrapping the store, registered lazily so they cost nothing until an overflow occurs.
4. Introduce a `ToolOutputConfig` (or extend `OutputConfig`/`ExecutionConfig`) for `max_bytes`, `max_lines`, `direction`, `retention_days`; thread it through Python, YAML loader, and a CLI flag on `praisonai run`.
5. Add a retention/GC sweep (age-based, mirroring the existing cache/checkpoint cleanup conventions) so artifacts are pruned.

## Resolution sketch

- `praisonaiagents/context/artifacts.py` — add `FileSystemArtifactStore(ArtifactStoreProtocol)` (currently only a docstring stub).
- `praisonaiagents/agent/tool_execution.py` (~309–345) — spill-on-overflow + return `ArtifactRef`; preserve fast path for small results.
- `praisonaiagents/tools/` — new lazy `artifact_*` retrieval tools backed by the store.
- `praisonaiagents/config/` — `ToolOutputConfig` (`max_bytes`, `max_lines`, `direction`, `retention_days`); follow the `False=disabled / True=defaults / Config=custom` consolidation pattern.
- `praisonaiagents/paths.py` — `get_artifacts_dir()` helper + GC hook.
- `praisonai/praisonai/cli/commands/run.py` — surface the budget flags; YAML loader maps `tool_output:` keys.
- Backward compatible: defaults reproduce today's preview behaviour; spill + retrieval are additive and opt-in-by-default-safe (no API breakage).

## Severity

High. This is silent data loss on the agent's hot path. It directly causes failed/looping tool use, wasted tokens, and context-window overflows in long CLI/coding sessions — the headline use case. The fix is well-scoped because the protocol and reference types already exist; only the implementation and wiring are missing.

## Validation

- **Real value:** Production CLI/coding runs routinely produce tool outputs far above the 16000-char/10000-byte caps (repo greps, large file reads, build/test logs). Today that content is gone; the agent cannot recover it.
- **Traced in code:** lossy truncation at `praisonaiagents/agent/tool_execution.py:~309–345`; hardcoded shell cap in `praisonaiagents/tools/shell_tools.py`; complete-but-unimplemented `ArtifactStoreProtocol`/`ArtifactRef` in `praisonaiagents/context/artifacts.py` (the `FileSystemArtifactStore` mention is docstring-only; no concrete class, no callers).
- **Reference pattern exists in the wild:** mature terminal coding agents spill overflowing tool output to a temp file, hand the model a path/preview, let it page back via head/tail/grep, and GC after a retention window — exactly the API `ArtifactStoreProtocol` already specifies.
- **Layer-validated:** single primary layer (core); fix on the hot path benefits Python + YAML + CLI uniformly; wrapper/tools/plugins rejected for the reasons above.
- **Aligned with design principles:** protocol-first (implements an existing protocol), lazy (retrieval tools and store calls only engage on overflow), safe defaults (additive; current preview preserved), backward compatible.

### Acceptance criteria
- [ ] Concrete `ArtifactStore` adapter implementing the existing protocol.
- [ ] Tool outputs over budget are spilled and surfaced to the model as a preview + `ArtifactRef`, with no loss of the full content.
- [ ] Agent can retrieve more via `head`/`tail`/`grep`/`chunk`/`load`.
- [ ] Budget (`max_bytes`, `max_lines`, `direction`) configurable via Python, YAML, and CLI.
- [ ] Age-based GC of stored artifacts.
- [ ] No regression in import time or the small-output fast path.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Retrievable tool-output overflow: implement and wire an ArtifactStore so large tool outputs are preserved, not discarded #2148

Summary

Current behaviour

Desired behaviour

Layer placement

Proposed approach

Resolution sketch

Severity

Validation

Acceptance criteria

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Retrievable tool-output overflow: implement and wire an ArtifactStore so large tool outputs are preserved, not discarded #2148

Description

Summary

Current behaviour

Desired behaviour

Layer placement

Proposed approach

Resolution sketch

Severity

Validation

Acceptance criteria

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions