diff --git a/README.md b/README.md
index 71d2d7a..7be3945 100644
--- a/README.md
+++ b/README.md
@@ -1,32 +1,19 @@
 # tracecraft
 
 [![PyPI](https://img.shields.io/pypi/v/tracecraft-ai)](https://pypi.org/project/tracecraft-ai/)
+[![Python](https://img.shields.io/pypi/pyversions/tracecraft-ai)](https://pypi.org/project/tracecraft-ai/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+[![Tests](https://github.com/Arrmlet/tracecraft/actions/workflows/test.yml/badge.svg)](https://github.com/Arrmlet/tracecraft/actions/workflows/test.yml)
 
-Persistent shared memory and coordination layer for AI agents. Any agent can store, share, and retrieve data from the same bucket — memory, messages, tasks, and artifacts. Works with any S3 or HuggingFace bucket.
+**Tracecraft is a CLI coordination layer for multi-agent AI systems** — shared **memory**, a **mailbox**, atomic task **claims**, **handoffs**, and **artifacts**, plus mirrored **session transcripts**, all stored as plain JSON in any **S3** or **HuggingFace** bucket. No server. No database. No SDK lock-in.
 
-```
-  Agent 1 (designer)                 Agent 2 (developer)
-  ┌──────────────────────┐           ┌──────────────────────┐
-  │ tracecraft claim      │           │ tracecraft wait-for   │
-  │   design              │           │   design              │
-  │                       │           │   ...waiting...       │
-  │ tracecraft complete   │  ──────>  │                       │
-  │   design --note "done"│           │ ✓ design complete     │
-  │                       │           │                       │
-  │                       │  <──────  │ tracecraft send       │
-  │                       │           │   designer "starting" │
-  └──────────────────────┘           └──────────────────────┘
-              \                  /
-               \                /
-            ┌──────────────────────┐
-            │  Any S3 bucket       │
-            │  (MinIO, AWS, R2,    │
-            │   HuggingFace)       │
-            └──────────────────────┘
-```
+<p align="center">
+  <img width="100%" alt="Two agents race for the same task; the second is atomically rejected — no server" src="docs/assets/tracecraft-claim-race.gif">
+</p>
+
+> Two agents, one bucket — they can't grab the same work, enforced by an S3 conditional write. No server, no lock service. All state is plain JSON you own; open it in the MinIO console or [HuggingFace Hub](https://huggingface.co/buckets/arrmlet/tracecraft-test) and watch it live.
 
-<img width="814" alt="tracecraft CLI" src="https://github.com/user-attachments/assets/8e0b7a71-45af-4df4-99a5-712481b19a85" />
+---
 
 ## Quick start
 
@@ -34,108 +21,145 @@ Persistent shared memory and coordination layer for AI agents. Any agent can sto
 pip install tracecraft-ai
 ```
 
-Start MinIO locally (or use AWS S3, Cloudflare R2, HuggingFace Buckets):
+The only infrastructure you need is a bucket. For local development, run MinIO (in production, point at AWS S3, Cloudflare R2, or HuggingFace Buckets instead):
+
 ```bash
-docker run -d -p 9000:9000 -e MINIO_ROOT_USER=admin -e MINIO_ROOT_PASSWORD=admin123456 minio/minio server /data
+docker run -d -p 9000:9000 \
+  -e MINIO_ROOT_USER=admin -e MINIO_ROOT_PASSWORD=admin123456 \
+  minio/minio server /data
 ```
 
-Initialize two agents:
+Register two agents against the same project:
+
 ```bash
 # Terminal 1
-tracecraft init --project myproject --agent designer \
+tracecraft init --project demo --agent designer \
   --endpoint http://localhost:9000 --bucket tracecraft \
   --access-key admin --secret-key admin123456
 
-# Terminal 2
-tracecraft init --project myproject --agent developer \
+# Terminal 2 — same flags, --agent developer
+tracecraft init --project demo --agent developer \
   --endpoint http://localhost:9000 --bucket tracecraft \
   --access-key admin --secret-key admin123456
 ```
 
-Now they can coordinate:
-```bash
-# Designer claims a task and shares state
+Now the core move — **two agents cannot grab the same work**, with no lock service and no server to run:
+
+```console
+# Terminal 1 — designer claims the task
 $ tracecraft claim design
 Claimed step design as designer
 
-$ tracecraft memory set design.status "complete"
-Set design.status = complete
+# Terminal 2 — developer tries the SAME task, atomically rejected (S3 If-None-Match)
+$ tracecraft claim design
+Error: Step design already claimed by designer
 
-$ tracecraft send developer "Design is ready"
-Sent to developer: Design is ready
+# designer finishes and leaves a handoff note for whoever picks up next
+$ tracecraft complete design --note "API in api.py, see memory key design.contract"
+Completed step design
 
-# Developer checks messages and picks it up
-$ tracecraft inbox
-[2026-03-24T14:00:00Z] (direct) designer: Design is ready
+# developer was blocked on it — now it unblocks
+$ tracecraft wait-for design
+All steps complete: design
+```
 
-$ tracecraft memory get design.status
-complete
+Every call is stateless. Everything you just did is stored as JSON files in the bucket; no coordination server stayed running, and there is nothing to tear down beyond the bucket itself.
 
-$ tracecraft claim implementation
-Claimed step implementation as developer
-```
+---
+
+## Agents talk to each other
+
+Beyond claiming work, agents coordinate by messaging through the bucket — direct messages and broadcasts, each one a JSON file in a per-agent mailbox.
 
-Everything is stored as JSON files in S3. No servers. No databases.
+<p align="center">
+  <img width="100%" alt="One agent sends a handoff, another reads its inbox and replies, then a broadcast to the team" src="docs/assets/tracecraft-messaging.gif">
+</p>
+
+```bash
+tracecraft send developer "contract is in memory key design.contract"
+tracecraft inbox                       # read your direct + broadcast messages
+tracecraft send _broadcast "v1 cut at 3pm, wrap your tasks"
+```
 
 ---
 
-## What agents get
+## Why tracecraft
 
-- **Shared memory** — `tracecraft memory set/get/list` — persistent key-value state any agent can read/write
-- **Messaging** — `tracecraft send/inbox` — direct messages or broadcast to all agents
-- **Task claiming** — `tracecraft claim/complete` — claim steps so agents don't collide
-- **Barriers** — `tracecraft wait-for step1 step2` — block until dependencies complete
-- **Handoffs** — `tracecraft complete step --note "context for next agent"`
-- **Artifacts** — `tracecraft artifact upload/download/list` — share files between agents
-- **Agent registry** — `tracecraft agents` — see who's online and what they're working on
+- **Atomic task claims** — two agents never grab the same work, enforced by S3 `If-None-Match` conditional puts, with no central coordinator.
+- **Coordinate across hosts** — the bucket *is* the coordinator, so agents on different machines or clouds work together by default — not just processes sharing one laptop.
+- **No server, no database** — every CLI call is stateless; all state is JSON in a bucket you already own.
+- **Any backend, zero lock-in** — AWS, Cloudflare R2, MinIO, Backblaze B2, Wasabi, SeaweedFS, and HuggingFace Buckets all work today.
+- **Harness-agnostic** — Claude Code, Codex, OpenClaw, Hermes, bash, Python, or anything that can run a shell command.
+- **Coordination + reasoning together** — the events *and* each agent's full session transcript live in one bucket, not two systems.
 
-Works with any process that can call a CLI — Claude Code, OpenClaw, Hermes Agent, Codex, bash scripts, Python, anything.
+> Frameworks like CrewAI and LangGraph own the agent loop; memory layers like Mem0 store one agent's recall; in-process coordination tools assume every agent shares one machine. Tracecraft owns neither the loop nor the model — just the shared bucket the agents coordinate *through* — so it works across hosts, across clouds, and with any harness, via a plain CLI.
 
 ---
 
-## Storage backends
+## Coordination + reasoning in one bucket
 
-No vendor lock-in. Bring your own S3:
+Most coordination tools store the *events* — who claimed what, who messaged whom. Tracecraft stores those **and** each agent's full reasoning, by mirroring coding-agent session transcripts into the same bucket. When a run goes sideways, one `tracecraft session show` gives you the handoffs **and** the chain of thought behind them — same place, same JSON, no second system to wire up.
 
 ```bash
-# Local development (recommended to start)
-tracecraft init --endpoint http://localhost:9000 ...    # MinIO
-tracecraft init --endpoint http://localhost:8333 ...    # SeaweedFS
+tracecraft session mirror --harness claude-code   # tail this session into the bucket
+tracecraft session show <id> --tail 50            # read coordination + reasoning together
+```
 
-# Production
-tracecraft init --endpoint https://s3.amazonaws.com ... # AWS S3
-tracecraft init --endpoint https://xxx.r2.cloudflarestorage.com ... # Cloudflare R2
+Works with **Claude Code, Codex, OpenClaw, and Hermes**. Source transcripts are never modified. Redaction of secret-shaped strings (AWS / Anthropic / OpenAI / HF / GitHub / Slack token patterns) is on by default, and redactions are counted in the session metadata.
 
-# HuggingFace Buckets (browsable on the Hub)
-pip install tracecraft-ai[huggingface]
-tracecraft init --backend hf --bucket username/my-bucket ...
-```
+Harness matrix, storage formats, and redaction details → **[docs/session-mirror.md](docs/session-mirror.md)**
 
 ---
 
 ## How it works
 
-All coordination state is JSON files in S3:
+Every agent action is a JSON file under `<bucket>/<project>/`:
 
 ```
-s3://bucket/project/
-  agents/designer.json          ← who's alive, what they're doing
-  memory/design/status.json     ← shared key-value state
-  messages/developer/1234.json  ← agent inboxes
-  steps/design/claim.json       ← who claimed what
-  steps/design/status.json      ← pending → in_progress → complete
-  steps/design/handoff.json     ← notes for the next agent
-  artifacts/design/mockup.html  ← shared files
+s3://bucket/demo/
+  agents/designer.json                       ← who's alive, what they're doing
+  memory/design/contract.json                ← shared key-value state
+  messages/developer/1738f3_designer.json    ← per-agent mailbox
+  steps/design/claim.json                    ← who claimed what (atomic)
+  steps/design/status.json                   ← pending → in_progress → complete
+  steps/design/handoff.json                  ← note for the next agent
+  artifacts/design/mockup.html               ← shared files
+  sessions/claude-code/<id>/part-00000-….jsonl  ← mirrored agent transcript
+  sessions/claude-code/<id>/meta.json            ← cumulative session metadata
 ```
 
-Any agent that can call `tracecraft` can participate. Any S3 browser (MinIO console, AWS console, HuggingFace Hub) lets you watch agents coordinate in real-time.
+Any process that can call `tracecraft` can participate. Any S3 browser (MinIO console, AWS console, HuggingFace Hub) lets you watch agents coordinate in real time. Atomicity details and the HuggingFace fallback are in **[docs/s3-architecture.md](docs/s3-architecture.md)**.
+
+---
+
+## Backends
+
+Bring your own bucket — no vendor lock-in:
+
+| Backend | `init` flag | Notes |
+|---|---|---|
+| MinIO | `--endpoint http://localhost:9000` | recommended for local dev |
+| SeaweedFS | `--endpoint http://localhost:8333` | self-hosted |
+| AWS S3 | `--endpoint https://s3.amazonaws.com` | |
+| Cloudflare R2 | `--endpoint https://<acct>.r2.cloudflarestorage.com` | zero egress fees |
+| Backblaze B2 / Wasabi | S3-compatible endpoint | |
+| HuggingFace Buckets | `--backend hf --bucket user/name` | browsable on the Hub; `pip install "tracecraft-ai[huggingface]"` |
 
 ---
 
-## CLI reference
+## Use cases
+
+- **Multi-agent coding** — run several Claude Code / Codex agents in parallel; they claim modules, share artifacts, wait at barriers, and hand off context instead of stepping on each other.
+- **Autonomous research** — agents claim experiments, share results via memory, and avoid duplicating work across a fleet.
+- **Pipelines** — lint → test → build → deploy as claimed steps; each stage waits for its dependencies.
+
+---
+
+<details>
+<summary><strong>Full CLI reference</strong></summary>
 
 ```bash
-tracecraft init                           # Configure S3 + project + agent
+tracecraft init                           # Configure backend + project + agent
 tracecraft agents                         # Who's online?
 
 tracecraft memory set <key> <value>       # Write (dots become path separators)
@@ -147,50 +171,37 @@ tracecraft send _broadcast <message>      # Broadcast to all
 tracecraft inbox                          # Read messages
 tracecraft inbox --delete                 # Read and clear
 
-tracecraft claim <step-id>                # Claim a step
-tracecraft complete <step-id> [--note X]  # Mark done + handoff
+tracecraft claim <step-id>                # Claim a step (atomic)
+tracecraft complete <step-id> [--note X]  # Mark done + handoff note
 tracecraft step-status <step-id>          # Check status
 tracecraft wait-for <step-ids...>         # Block until complete (default 300s timeout)
 
-tracecraft artifact upload <path> [--step id]   # Share a file
-tracecraft artifact download <name> [--step id] # Get a file
+tracecraft artifact upload <path> [--step id]    # Share a file
+tracecraft artifact download <name> [--step id]  # Get a file
 tracecraft artifact list [--step id]             # List files
+
+tracecraft session mirror --harness <name>       # Mirror a session into the bucket
+tracecraft session list                          # Browse mirrored sessions
+tracecraft session show <id> [--tail N]          # Inspect meta + transcript tail
+tracecraft session stop <id>                     # Clear local state, mark ended
 ```
 
-For multiple agents in the same directory, set identity via env var:
+Run multiple agents from one directory by overriding identity per call:
+
 ```bash
-TRACECRAFT_AGENT=designer tracecraft inbox
+TRACECRAFT_AGENT=designer  tracecraft inbox
 TRACECRAFT_AGENT=developer tracecraft inbox
 ```
 
----
-
-## Use cases
-
-**Multi-agent coding** — Run 4 Claude Code agents in worktrees. They claim modules, share artifacts, wait at barriers, hand off context.
-
-**Autonomous research** — Run hundreds of autoresearch experiments. Agents claim experiments, share results via memory, avoid duplicating work.
-
-**Collaborative knowledge bases** — Multiple agents build a wiki together. One processes papers, another writes summaries, a third checks consistency. All coordinated through shared memory and messaging.
-
-**CI/CD pipelines** — Lint → test → build → deploy as tracecraft steps. Each stage claims its step and waits for dependencies.
-
----
-
-## Example coordination
-
-Two Claude Code agents coordinating through tracecraft via HuggingFace Buckets:
-
-<img width="100%" alt="Two Claude Code agents coordinating through tracecraft" src="https://github.com/user-attachments/assets/c2103ff9-afa9-48e9-8aa9-4d4089a66b57" />
-
-> See full coordination data (agents, memory, messages, steps, artifacts) stored as JSON on the Hub:
-> [huggingface.co/buckets/arrmlet/tracecraft-test](https://huggingface.co/buckets/arrmlet/tracecraft-test)
+</details>
 
 ---
 
-## Works with
+## More
 
-Tested with Claude Code, OpenAI Codex, and Hermes Agent. Works with any agent or script that can run a shell command.
+- [docs/session-mirror.md](docs/session-mirror.md) — session mirroring: harnesses, formats, redaction
+- [docs/s3-architecture.md](docs/s3-architecture.md) — atomicity, key layout, HuggingFace fallback
+- [plans/](plans/) — roadmap, research, and known gaps
 
 ---
 
diff --git a/docs/CI_CD_GUIDE.md b/docs/CI_CD_GUIDE.md
deleted file mode 100644
index e9092bf..0000000
--- a/docs/CI_CD_GUIDE.md
+++ /dev/null
@@ -1,429 +0,0 @@
-# CI/CD for tracecraft — a learning doc
-
-This is a working-engineer's introduction to CI/CD, grounded in the two workflows tracecraft uses today (`.github/workflows/test.yml` and `release.yml`). By the end you'll understand every line of YAML in this repo, the trust model that lets GitHub publish to PyPI without a stored password, and how to debug failures.
-
-Written for someone who knows Python and git but hasn't owned a CI/CD pipeline before. Skip sections you already know.
-
----
-
-## Part 1 — The concepts, in 5 minutes
-
-### What CI/CD actually is
-
-Two related-but-distinct things:
-
-- **Continuous Integration (CI)** — every time someone pushes code or opens a PR, a fresh computer (a "runner") checks the code out, installs dependencies, and runs your tests. If anything is broken, you see it within seconds on the commit/PR. The point is to catch breakage *before* it merges to `main`, not after.
-
-- **Continuous Delivery (CD)** — when you mark a commit as a release (cut a tag, click "Publish release"), a fresh computer builds the shippable artifact (a Python wheel, a Docker image, a binary) and uploads it to wherever users get it (PyPI, Docker Hub, App Store). The point is to make releases boring and repeatable — no "did I remember to bump the version in both files?" mistakes.
-
-Sometimes people add **Continuous Deployment** (same acronym, different word) — automatically pushing every green commit to production. Tracecraft has no servers, so that doesn't apply here.
-
-### Why it exists
-
-Before CI/CD, releases were a checklist a human did by hand. Six steps, easy to skip one, easy to miss "this works on my machine" bugs. CI runs the checklist in a known-clean environment, every time, and fails loudly when it can't.
-
-The deeper point: **CI is the executable documentation of how your project works.** Someone reading your repo can look at `.github/workflows/test.yml` and learn "this is how you install and test this code." Conversely, if your CI passes on a fresh machine, you've proven the install instructions in your README actually work.
-
-### The GitHub Actions vocabulary
-
-GitHub Actions is one of many CI/CD systems. Others: CircleCI, GitLab CI, Jenkins, Travis CI. The concepts below are mostly universal; the keywords are GitHub-specific.
-
-- **Workflow** — one YAML file under `.github/workflows/`. One workflow = one purpose (run tests, publish release, run nightly job, etc.).
-- **Job** — a single unit of work inside a workflow. Jobs run in parallel unless you tell them to depend on each other. One job runs on one runner.
-- **Step** — a command inside a job. Steps run sequentially. If a step fails, the rest of the job stops.
-- **Runner** — the VM that executes the job. GitHub provides `ubuntu-latest`, `macos-latest`, `windows-latest`. You can also self-host runners.
-- **Trigger / `on:`** — what causes the workflow to fire. `push`, `pull_request`, `release`, `schedule` (cron), `workflow_dispatch` (manual button), and more.
-- **Matrix** — a single job that runs N times with different variable values (e.g., one per Python version). Saves duplication.
-- **Action** — a reusable building block, e.g. `actions/checkout@v4`. Other people's code you call from your workflow. Hosted on the GitHub Marketplace.
-- **Secrets / variables** — encrypted values stored on GitHub, available to workflows. Used for API tokens, etc. *We deliberately don't use stored secrets for PyPI — see Part 4.*
-- **Concurrency** — controls whether multiple runs of the same workflow can run at once. Useful to cancel old runs when you push twice in a row.
-- **Artifact** — files a job produces that you want to keep (build outputs, screenshots, coverage reports). Stored on GitHub for 90 days by default.
-
-### Cost
-
-For tracecraft (public repo): **free, unlimited**. GitHub gives unlimited Actions minutes to public repos. PyPI is always free for public packages.
-
-For private repos: free tier is 2,000 minutes/month, then ~$0.008/min on Linux. A 30-second run × 10 pushes/day × 30 days = 1.5 hours of CI/month — well under the free tier.
-
----
-
-## Part 2 — `test.yml` line by line
-
-Here's the actual file in this repo, with annotations.
-
-```yaml
-name: tests
-```
-The display name for the workflow in GitHub's UI. Shows up as "tests" on commits and PRs.
-
-```yaml
-on:
-  push:
-    branches: [main]
-  pull_request:
-    branches: [main]
-```
-**The trigger.** "Run this workflow when (a) someone pushes to `main`, or (b) someone opens/updates a PR targeting `main`." If we removed the `branches:` filter, the workflow would also run on pushes to feature branches — wasteful since the PR run already covers that.
-
-> *Note:* YAML 1.1 interprets the unquoted word `on` as the boolean `true` when parsed by some libraries. GitHub Actions handles this correctly. Just leave it as `on:` — no need to quote it.
-
-```yaml
-jobs:
-  pytest:
-```
-One job named `pytest`. The name appears as the status check on PRs (`pytest (3.10)`, `pytest (3.11)`, etc., because of the matrix below).
-
-```yaml
-    runs-on: ubuntu-latest
-```
-Use GitHub's latest Ubuntu runner. As of 2026 that's Ubuntu 24.04. Other choices: `ubuntu-22.04`, `macos-latest`, `windows-latest`. Ubuntu is the cheapest and fastest; we add macOS/Windows only when needed.
-
-```yaml
-    strategy:
-      fail-fast: false
-      matrix:
-        python-version: ["3.10", "3.11", "3.12", "3.13"]
-```
-The **matrix**. This single job definition gets expanded into 4 parallel runs, each with `${{ matrix.python-version }}` set to one of the listed versions. `fail-fast: false` means "if 3.10 fails, keep running 3.11/3.12/3.13 anyway" — useful because failures are often version-specific and you want to see them all.
-
-```yaml
-    steps:
-      - uses: actions/checkout@v4
-```
-**Step 1**: `actions/checkout@v4` is an official GitHub action that does `git clone` into the runner. The `@v4` is a version pin — major version 4. You should always pin actions; `@main` would mean "whatever they push" which can break you.
-
-```yaml
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-          cache: pip
-```
-**Step 2**: installs the matrix Python version. `cache: pip` tells the action to cache `~/.cache/pip` between runs — speeds up subsequent runs by 30-60s because dependencies don't redownload.
-
-`${{ matrix.python-version }}` is GitHub's expression syntax: it substitutes the current matrix value. So this step runs four times across the four matrix cells with `3.10`, `3.11`, `3.12`, `3.13`.
-
-```yaml
-      - name: Install package + dev extras
-        working-directory: sdk
-        run: |
-          python -m pip install --upgrade pip
-          pip install -e ".[dev,huggingface]"
-```
-**Step 3**: install tracecraft + its test/dev dependencies. The `|` lets you write multiline shell. `working-directory: sdk` means commands run from `sdk/`. `[dev,huggingface]` pulls the optional extras defined in `sdk/pyproject.toml`.
-
-```yaml
-      - name: Run tests
-        run: pytest sdk/tests/ -v
-```
-**Step 4**: actually run the tests. `working-directory` is back to repo root because we didn't specify one here. `-v` is verbose (one line per test). Exit code 0 = green check, non-zero = red X.
-
-That's the whole file. Less than 30 lines of YAML, and it gives you "the tests pass on 4 Python versions on Ubuntu" on every push.
-
-### What you'll see in the GitHub UI
-
-- On the commit list, a small ✓ or ✗ icon next to the commit hash.
-- On a PR, a "Checks" tab showing each matrix cell separately.
-- Click into a run to see logs per step.
-- The workflow file itself appears in the "Actions" tab.
-
----
-
-## Part 3 — `release.yml` line by line
-
-```yaml
-name: release
-
-on:
-  release:
-    types: [published]
-```
-Triggered by the `release.published` event, which fires when you click "Publish release" in the GitHub UI (or run `gh release create v0.2.0 ...`). NOT triggered by simply pushing a tag — there's a distinction. A tag is just a label on a commit; a "release" is a tag plus optional metadata (notes, attached binaries). We use the release event because it gives you a confirmation step before publication.
-
-```yaml
-jobs:
-  build-and-publish:
-    runs-on: ubuntu-latest
-    environment:
-      name: pypi
-      url: https://pypi.org/project/tracecraft-ai/
-```
-The job runs in an **environment** called `pypi`. Environments are a GitHub feature for adding extra protection around sensitive jobs:
-- You can require manual approval before the job runs.
-- You can restrict which branches can deploy to the environment.
-- The environment shows up in the GitHub UI with the URL above as a clickable link.
-
-For tracecraft, the environment also matches what we'll tell PyPI to trust (in Part 4).
-
-```yaml
-    permissions:
-      id-token: write  # required for PyPI trusted publishing
-      contents: read
-```
-**This is the magic.** GitHub Actions has a per-job permission model. By default, the `GITHUB_TOKEN` (auto-generated for each run) has read-only access. `id-token: write` is what lets the job request an **OIDC token** — a short-lived JWT signed by GitHub that proves to PyPI "yes, this is the genuine release.yml workflow on Arrmlet/tracecraft running right now."
-
-`contents: read` keeps the rest of the permissions minimal — we don't need to write to the repo, only read its files.
-
-```yaml
-    steps:
-      - uses: actions/checkout@v4
-        with:
-          ref: ${{ github.event.release.tag_name }}
-```
-Checkout the repo, but specifically at the tag of the release that triggered this. Without `ref:`, it would check out the default branch — which might be ahead of the tag if someone pushed to `main` after creating the release. Using the tag means the wheel we publish is exactly the code in the release.
-
-```yaml
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: "3.12"
-```
-Only one Python version needed for building — the wheel is pure Python (`py3-none-any.whl`), so any modern Python builds it correctly.
-
-```yaml
-      - name: Sync root README into sdk/ before build
-        run: cp README.md sdk/README.md
-```
-A tracecraft-specific quirk. The Python package source lives in `sdk/`, but the README we want users to see on PyPI lives at the repo root. We copy it into `sdk/` before build so setuptools picks it up.
-
-```yaml
-      - name: Build sdist + wheel
-        working-directory: sdk
-        run: |
-          python -m pip install --upgrade pip build
-          python -m build
-```
-Runs Python's modern build tool. Produces two files in `sdk/dist/`:
-- `tracecraft_ai-X.Y.Z.tar.gz` — the *source distribution* (sdist). What `pip` falls back to if no wheel is available.
-- `tracecraft_ai-X.Y.Z-py3-none-any.whl` — the *wheel*. Pre-built, no compilation needed on install.
-
-```yaml
-      - name: Verify artifacts
-        working-directory: sdk
-        run: |
-          pip install twine
-          twine check dist/*
-          python -m venv /tmp/verify
-          /tmp/verify/bin/pip install dist/*.whl
-          /tmp/verify/bin/tracecraft --version
-```
-Three sanity checks before publishing:
-1. `twine check` validates the wheel metadata (README rendering, classifiers, etc.).
-2. Install the wheel in a fresh venv — proves it's installable.
-3. Run `tracecraft --version` — proves the CLI entry point actually works.
-
-If any of these fail, the workflow stops here and doesn't publish a broken package.
-
-```yaml
-      - name: Publish to PyPI
-        uses: pypa/gh-action-pypi-publish@release/v1
-        with:
-          packages-dir: sdk/dist/
-```
-**The actual publish step.** This action sends the contents of `sdk/dist/` to PyPI using the OIDC token from earlier. Notice there is no `password:` or `username:` or `token:` field — that's the whole point of trusted publishing.
-
-This step fails until you configure PyPI to trust this workflow (Part 4).
-
----
-
-## Part 4 — PyPI Trusted Publishing setup
-
-This is the one-time browser configuration that unlocks `release.yml`. It's required because PyPI doesn't blindly accept uploads from any GitHub workflow — it needs to know which workflows you trust.
-
-### Background — why trusted publishing exists
-
-The old way: generate a PyPI API token, save it as a GitHub Secret, reference it in the workflow. Problems:
-- The token is long-lived. If your GitHub account is breached, the attacker has your PyPI publish access.
-- Hard to rotate; everyone forgets to.
-- One leak from any project = total PyPI account takeover.
-
-The new way (introduced 2023, mature in 2024-2025): **OIDC trusted publishing.** GitHub generates a short-lived token *per run*, signed by GitHub, that proves "this is genuinely the `release.yml` workflow in `Arrmlet/tracecraft` running right now, on a runner GitHub controls, for the tag `v0.2.0`." PyPI verifies that signature and accepts the upload.
-
-Properties:
-- The token is valid for ~10 minutes and only inside that specific job.
-- No secret stored anywhere — there's nothing to leak.
-- Scoped to one workflow file in one repo. An attacker would need to compromise GitHub itself.
-
-### Step-by-step setup
-
-You do this once, today. After that you never touch PyPI tokens for this project again.
-
-1. **Sign in to PyPI** at https://pypi.org/.
-
-2. **Go to project settings** — https://pypi.org/manage/project/tracecraft-ai/settings/publishing/
-
-   If that URL 404s, navigate manually: Account dropdown → "Your projects" → click `tracecraft-ai` → "Publishing" in the left sidebar.
-
-3. **Click "Add a new pending publisher"** or "Add a new trusted publisher."
-
-4. **Choose "GitHub" as the publisher.**
-
-5. **Fill in the form exactly:**
-   - **Owner:** `Arrmlet`
-   - **Repository name:** `tracecraft`
-   - **Workflow filename:** `release.yml` (just the filename, not the path)
-   - **Environment name:** `pypi`
-
-   The `Environment name` here MUST match the `environment.name:` in `release.yml` (which is `pypi`). Case matters.
-
-6. **Click "Add."**
-
-That's it. PyPI now trusts `release.yml`. The next time you create a GitHub Release, the workflow will run end-to-end and publish to PyPI without any prompt.
-
-### Verifying it works
-
-Don't ship a real release just to test. Instead, the first release that goes through the workflow IS the test. Recommended:
-
-1. Make a tiny code change (a comment, a typo fix in README).
-2. Bump version to `0.1.6` in `sdk/pyproject.toml` and `sdk/tracecraft/__init__.py`.
-3. Commit, push, tag, push tag.
-4. `gh release create v0.1.6 --title "v0.1.6 — CI/CD test" --notes "Testing trusted publishing"`
-5. Watch the workflow in the Actions tab. Should turn green in ~1 minute.
-6. Verify on PyPI: `pip install --upgrade tracecraft-ai` → version should be `0.1.6`.
-
-If step 5 fails at "Publish to PyPI" with a 403 — go back and check the publisher config matches exactly (owner case, workflow filename, environment name).
-
----
-
-## Part 5 — Reading the GitHub Actions UI
-
-When you push or create a PR, here's where the action is in the UI:
-
-### Per-commit status
-On the commit list, look for a circle/check/X next to the commit hash:
-- Yellow dot = running
-- Green check = all green
-- Red X = at least one job failed
-
-Hover or click for a summary. Click the icon to see the workflow details.
-
-### Per-PR status
-At the bottom of the PR, the "Checks" section shows each workflow. For a matrix workflow you'll see one row per matrix cell (`pytest (3.10)`, `pytest (3.11)`, etc.). Click "Details" to see logs.
-
-### Actions tab (`github.com/Arrmlet/tracecraft/actions`)
-The full history of all workflow runs. Filter by workflow on the left, by branch/event/status at the top. Click a run for detailed logs.
-
-### Inside a run
-You see the matrix cells (or single job) listed. Click one to expand the steps. Each step has its own logs and timing. Failed steps are highlighted red and auto-expand to show the error.
-
-### Re-running failed jobs
-If a run failed due to a flake (network blip, etc.), the "Re-run failed jobs" button at the top right re-runs only the failed cells. Re-runs preserve the commit SHA, so the new attempt is genuinely a do-over of the same code.
-
----
-
-## Part 6 — `gh` (GitHub CLI) for local interaction
-
-You don't have to use the browser UI. The `gh` CLI is faster:
-
-```bash
-# Watch the most recent run for the current branch
-gh run watch
-
-# List recent runs
-gh run list --limit 10
-
-# View the details of a specific run
-gh run view <run-id>
-
-# View just the failed step logs
-gh run view <run-id> --log-failed
-
-# Re-run a failed run
-gh run rerun <run-id>
-
-# Cancel a stuck run
-gh run cancel <run-id>
-
-# Create a release (this triggers release.yml)
-gh release create v0.2.0 --title "v0.2.0" --notes "..."
-
-# List releases
-gh release list
-
-# View one release
-gh release view v0.1.5
-```
-
----
-
-## Part 7 — Common failures and how to debug them
-
-### Test workflow goes red
-
-1. **Open the failed run** (Actions tab → click the red run).
-2. **Find the failing matrix cell.** Maybe only 3.10 failed — that narrows the cause to "Python 3.10 specific."
-3. **Expand "Run tests"** to see the pytest output. Same format as your local terminal.
-4. **Reproduce locally** with the same Python version: `pyenv install 3.10` → `pyenv local 3.10` → `pip install -e "sdk/[dev]"` → `pytest sdk/tests/`.
-5. **Fix and push again.** The workflow re-runs.
-
-### Workflow doesn't trigger at all
-
-- Check `on:` filters — pushing to a feature branch with `branches: [main]` only triggers on PR, not on direct push.
-- Check `.github/workflows/` path. Typos like `.github/workflow/` won't be picked up.
-- Check the workflow YAML is valid. GitHub UI will show a "Workflow invalid" error in the Actions tab.
-
-### Release workflow fails at "Publish to PyPI"
-
-- The error message is usually `403 Forbidden` or `Invalid or non-existent authentication information`.
-- Cause: trusted publishing config doesn't match the actual workflow run.
-- Fix: double-check the PyPI publishing form. Owner case-sensitive. Workflow filename is just `release.yml` (no path). Environment name `pypi` matches the `environment.name:` in the YAML.
-
-### "Resource not accessible by integration" error
-
-- Cause: missing `permissions:` in the YAML. The default permissions are read-only.
-- Fix: explicitly request what you need (`id-token: write`, `contents: write`, etc.) in the job.
-
-### Action versions deprecated
-
-You may see a banner: "Node.js 20 actions are deprecated." This is GitHub's runtime, not your code. Fix by bumping action versions (e.g., `actions/checkout@v4` → `actions/checkout@v5` when released). Non-urgent unless the runner refuses to execute the action.
-
-### Cached pip install picks up wrong package
-
-If you change `pyproject.toml` deps but CI still seems to install the old cached version, you usually don't need to do anything: `cache: pip` keys the cache on the lockfile/`pyproject.toml` hash, so a dependency change already invalidates it automatically (which is the point of `cache: pip`). If it somehow doesn't, force a refresh by changing the `cache:` config.
-
----
-
-## Part 8 — What we deliberately did NOT add (and why)
-
-These are common CI additions that aren't worth it for a small Python OSS project. Add them if you grow into the need; don't add them just because.
-
-| Feature | Why we skipped |
-|---|---|
-| Coverage reporting (codecov) | 12 tests at this scale tell you more than a coverage % does. Add when team-size justifies. |
-| Linting gate (ruff in CI) | Ruff is in `dev` extras; run locally. Blocking PRs on lint is friction for a solo maintainer. |
-| Pre-commit hooks | Local-only friction. Helpful with 3+ contributors; overkill solo. |
-| Dependabot / Renovate | Adds noise. Manual quarterly review of deps is fine at this scale. |
-| Branch protection rules | You're solo. Self-review is acceptable. Add when contributors arrive. |
-| Auto-version-bump (release-please, semantic-release) | Overengineering until 5+ releases/quarter. |
-| Windows / macOS runners | Add on first user bug report from those platforms. |
-| Nightly cron tests against real S3 | Premature; moto covers correctness, real S3 issues are rare. |
-| CodeQL security scanning | Free if you enable it. Useful eventually; not on the critical path. |
-| Slack / Discord notifications | The Actions email is enough until you have a team channel. |
-
-The principle: **CI complexity should match project stakes.** Right now tracecraft is small and the maintainer is one person; the two workflows we have are the right size. Re-evaluate when stakes change.
-
----
-
-## Part 9 — Where to learn more
-
-- **GitHub Actions docs** — https://docs.github.com/en/actions. The official reference. The "Quickstart" and "Workflow syntax" pages are the most useful.
-- **PyPI trusted publishing docs** — https://docs.pypi.org/trusted-publishers/
-- **awesome-actions** — https://github.com/sdras/awesome-actions. Curated list of useful actions.
-- **Anatomy of a workflow** — https://docs.github.com/en/actions/learn-github-actions/understanding-github-actions
-- **GitHub Actions security hardening** — https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions. Becomes relevant when you start using secrets, deployments, or third-party actions.
-
----
-
-## TL;DR — what you have now
-
-- **`test.yml`** runs the 12 tests on Python 3.10/3.11/3.12/3.13 on every push and PR. Free, ~30s, catches regressions before merge.
-- **`release.yml`** builds + publishes to PyPI on every GitHub Release. Requires one-time trusted publishing setup at https://pypi.org/manage/project/tracecraft-ai/settings/publishing/ (Owner: `Arrmlet`, Repo: `tracecraft`, Workflow: `release.yml`, Environment: `pypi`).
-- **No tokens stored anywhere.** OIDC-based trust.
-- **Free** for public repos.
-
-The next time you ship is:
-```
-# bump version in two files, commit
-gh release create v0.2.0 --title "..." --notes "..."
-# walk away; PyPI has it in ~60 seconds
-```
diff --git a/docs/assets/tracecraft-claim-race.gif b/docs/assets/tracecraft-claim-race.gif
new file mode 100644
index 0000000..febf41a
Binary files /dev/null and b/docs/assets/tracecraft-claim-race.gif differ
diff --git a/docs/assets/tracecraft-messaging.gif b/docs/assets/tracecraft-messaging.gif
new file mode 100644
index 0000000..449037e
Binary files /dev/null and b/docs/assets/tracecraft-messaging.gif differ
diff --git a/docs/session-mirror.md b/docs/session-mirror.md
new file mode 100644
index 0000000..9837ab5
--- /dev/null
+++ b/docs/session-mirror.md
@@ -0,0 +1,126 @@
+# Session mirror
+
+`tracecraft session mirror` copies a coding agent's session transcript into your
+bucket, alongside the coordination state (memory, messages, claims, artifacts)
+that tracecraft already stores under the same `<project>/` prefix. One bucket
+ends up holding the full record of a multi-agent run: every agent's reasoning
+**and** every message between them.
+
+Sessions are never modified at the source. The mirror is a read-only tail.
+
+## Supported harnesses
+
+| `--harness` | Source | Storage |
+|---|---|---|
+| `claude-code` | `~/.claude/projects/<encoded-cwd>/<id>.jsonl` | append-only JSONL |
+| `codex` | `~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl` | append-only JSONL |
+| `openclaw` | `<state>/agents/<agentId>/sessions/<id>.jsonl` | append-only JSONL |
+| `hermes` | `~/.hermes/state.db` (`messages` table) | SQLite (WAL) |
+
+All four expose the same interface to the mirror loop via the `Harness`
+protocol (`sdk/tracecraft/harness/base.py`). Adding a fifth harness is one
+file plus a `REGISTRY` entry.
+
+### Harness notes
+
+- **OpenClaw** state dir resolves `OPENCLAW_STATE_DIR` → `OPENCLAW_HOME` →
+  `~/.openclaw`. `--dev`/`--profile <name>` map to `~/.openclaw-dev` /
+  `~/.openclaw-<name>` — point `OPENCLAW_STATE_DIR` at those if you use them.
+  The mutable `sessions.json` index and `*.tmp` staging files are skipped.
+  Session ids are unique only within an `agentId`, so the mirrored id is
+  `<agentId>__<sessionId>`.
+- **Hermes** is SQLite, not a file. The adapter opens the DB **read-only**
+  (`mode=ro`, never `immutable`) so it is safe to run while Hermes is writing —
+  WAL mode allows concurrent readers. It reads new rows with
+  `WHERE id > :cursor ORDER BY id` (the same incremental pattern Hermes uses
+  internally) and synthesizes one JSON line per message. Multimodal `content`
+  stored with Hermes' `\x00json:` sentinel is decoded back to JSON.
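
A minimal sketch of the read-only incremental read described in the Hermes note, using only the standard library. The column names (`session_id`, `role`, `content`) and the helper `read_new_rows` are assumptions for illustration; the real Hermes schema and the adapter's actual code may differ.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def read_new_rows(db_path: Path, session_id: str, cursor: int) -> tuple[bytes, int]:
    """Return (jsonl_bytes, new_cursor) for rows with id > cursor."""
    # mode=ro opens the database read-only; uri=True enables the query string.
    conn = sqlite3.connect(f"{db_path.as_uri()}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT id, role, content FROM messages "
            "WHERE session_id = :sid AND id > :cursor ORDER BY id",
            {"sid": session_id, "cursor": cursor},
        ).fetchall()
    finally:
        conn.close()
    # Synthesize one JSON line per message row.
    lines = [json.dumps({"id": r[0], "role": r[1], "content": r[2]}) for r in rows]
    data = "".join(line + "\n" for line in lines).encode("utf-8")
    # Advance to the highest id actually consumed; stay put if nothing is new.
    new_cursor = rows[-1][0] if rows else cursor
    return data, new_cursor


# Demo against a throwaway database (assumed schema, not Hermes' real one).
demo_db = Path(tempfile.mkdtemp()) / "state.db"
writer = sqlite3.connect(demo_db)
writer.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "session_id TEXT, role TEXT, content TEXT)"
)
writer.executemany(
    "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
    [("s1", "user", "hi"), ("s1", "assistant", "hello")],
)
writer.commit()
writer.close()

first, cur = read_new_rows(demo_db, "s1", 0)
second, cur2 = read_new_rows(demo_db, "s1", cur)
```

Calling the helper a second time with the returned cursor yields no bytes and leaves the cursor unchanged, which is the property that makes repeated mirror runs safe.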
+
+## Commands
+
+```bash
+tracecraft session mirror --harness <name> [--session-id ID] [--cwd PATH]
+                          [--no-redact] [--min-bytes N]
+tracecraft session list [--harness NAME] [--limit N] [--sort-by recent|size]
+tracecraft session show <session-id> [--tail N]
+tracecraft session stop <session-id>
+```
+
+### mirror
+
+Single-shot. Reads everything new since the last run, redacts, uploads it as a
+new part, updates `meta.json`, and advances the cursor. Safe to run repeatedly
+(e.g. from a cron, a `SessionEnd` hook, or a `while sleep 5` loop).
+
+```bash
+# Auto-pick the most recent claude-code session for the current directory
+tracecraft session mirror --harness claude-code
+
+# Explicit session, codex
+tracecraft session mirror --harness codex --session-id abc123
+
+# Hermes (session id is the sessions.id TEXT value, e.g. 20260529_120000_abc123)
+tracecraft session mirror --harness hermes --session-id 20260529_120000_abc123
+```
+
+If `--session-id` is omitted, the most recently active session is chosen
+(for Hermes, the session owning the highest message id).
+
+### list / show / stop
+
+```bash
+tracecraft session list                       # every mirrored session
+tracecraft session show <id>                   # print meta.json
+tracecraft session show <id> --tail 50         # + last 50 lines of the transcript
+tracecraft session stop <id>                   # clear local state, mark ended_at
+```
+
+## Bucket layout
+
+Additive — does not touch existing coordination keys.
+
+```
+<bucket>/<project>/
+  agents/        memory/        messages/        steps/        artifacts/   ← coordination
+  sessions/
+    <harness>/
+      <session-id>/
+        part-00000-<uuid8>.jsonl   ← one per mirror flush, disjoint
+        part-00001-<uuid8>.jsonl
+        meta.json                  ← cumulative metadata + redaction counts
+```
+
+Parts are append-disjoint and reassemble byte-for-byte (file harnesses) or
+row-for-row (Hermes). The `<uuid8>` suffix makes concurrent flushes from
+different machines collision-safe; reassembly sorts by sequence number.
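
The reassembly rule above can be sketched as follows. This is an illustration, not the shipped implementation: it assumes the `<uuid8>` suffix is eight lowercase hex characters and breaks ties between equal sequence numbers by that suffix, and the helper names are hypothetical.

```python
import re

# Matches keys such as ".../part-00001-1a2b3c4d.jsonl" (suffix format assumed).
PART_RE = re.compile(r"part-(\d{5})-([0-9a-f]{8})\.jsonl$")


def order_parts(keys):
    """Return part keys sorted by sequence number, ignoring non-part keys."""
    parts = []
    for key in keys:
        match = PART_RE.search(key)
        if match:
            parts.append((int(match.group(1)), match.group(2), key))
    return [key for _, _, key in sorted(parts)]


def reassemble(objects):
    """Concatenate part bodies in order; `objects` maps key -> bytes."""
    return b"".join(objects[key] for key in order_parts(objects))


demo_objects = {
    "sessions/codex/abc/part-00001-bbbbbbbb.jsonl": b"two\n",
    "sessions/codex/abc/part-00000-aaaaaaaa.jsonl": b"one\n",
    "sessions/codex/abc/meta.json": b"{}",
}
transcript = reassemble(demo_objects)
```

Because `meta.json` does not match the part pattern, it is skipped rather than concatenated into the transcript.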
+
+## The cursor model
+
+The mirror tracks a per-session **cursor** in
+`~/.tracecraft/mirror-state/<session-id>.json`. The cursor is opaque:
+
+- file harnesses → a **byte offset**
+- Hermes → the highest **`messages.id`** (an AUTOINCREMENT rowid)
+
+`read_new(session, cursor)` returns `(new_bytes, new_cursor)` so advancement is
+race-free — the loop advances to exactly what it consumed, never to a
+separately-sampled size. Losing the state file is non-destructive: the next run
+re-derives the next part sequence number from a bucket LIST, and overlap is
+re-uploaded as a fresh part rather than clobbering existing ones.
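
For a file harness, the `(new_bytes, new_cursor)` contract can be sketched like this. It is a simplified stand-in that takes a path directly instead of a session object; the point is that the new cursor is derived from the bytes actually read, not from a separate `stat()` call.

```python
import tempfile
from pathlib import Path


def read_new(path: Path, cursor: int) -> tuple[bytes, int]:
    """Read everything after byte offset `cursor`; return (data, new_cursor)."""
    if cursor < 0:
        raise ValueError("cursor must be non-negative")
    with path.open("rb") as handle:
        handle.seek(cursor)
        data = handle.read()
    # Advance by exactly what was consumed, so a concurrent append
    # between read and bookkeeping is picked up by the next call.
    return data, cursor + len(data)


demo_file = Path(tempfile.mkdtemp()) / "s.jsonl"
demo_file.write_bytes(b"abc\n")
data1, cursor1 = read_new(demo_file, 0)
data2, cursor2 = read_new(demo_file, cursor1)
```

If the cursor were instead set from a file size sampled after the read, any bytes appended in between would be skipped silently.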
+
+## Redaction
+
+Redaction is **on by default** and runs before any bytes leave the machine. It
+is a regex denylist (`sdk/tracecraft/redact.py`) covering AWS, Anthropic,
+OpenAI, HuggingFace, GitHub, and Slack token shapes plus bearer tokens. Every
+match is **counted** in `meta.json` (`redaction_counts`), never silently
+dropped.
+
+```bash
+tracecraft session mirror --harness claude-code            # redaction on (default)
+tracecraft session mirror --harness claude-code --no-redact # raw, trusted buckets only
+```
+
+Redaction v0 catches well-known token shapes. It does **not** detect arbitrary
+secrets, custom internal token formats, or proprietary content. Treat it as a
+safety net, not a guarantee — and prefer a private bucket for session data.
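
To make the "counted, never silently dropped" behaviour concrete, here is a small sketch of a counting regex denylist. The three patterns, the `[REDACTED:<name>]` marker, and the `redact` helper are illustrative assumptions; the actual rules live in `sdk/tracecraft/redact.py` and may differ.

```python
import re
from collections import Counter

# Illustrative token shapes only; not the project's real pattern list.
PATTERNS = {
    "anthropic_key": re.compile(rb"sk-ant-[A-Za-z0-9_-]{20,}"),
    "hf_token": re.compile(rb"hf_[A-Za-z0-9]{30,}"),
    "github_token": re.compile(rb"gh[pousr]_[A-Za-z0-9]{30,}"),
}


def redact(data: bytes) -> tuple[bytes, dict]:
    """Replace each match with a marker and count matches per pattern."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        marker = b"[REDACTED:" + name.encode("ascii") + b"]"
        data, hits = pattern.subn(marker, data)
        if hits:
            counts[name] += hits
    return data, dict(counts)


sample = b'{"text": "key sk-ant-' + b"a" * 24 + b" and hf_" + b"B" * 32 + b'"}\n'
clean, redaction_counts = redact(sample)
```

The returned counts are what would be merged into `redaction_counts` in `meta.json`, so a reader can see that something was removed without seeing the secret itself.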
diff --git a/plans/TRACES_V1_PLAN.md b/plans/TRACES_V1_PLAN.md
new file mode 100644
index 0000000..86bb5da
--- /dev/null
+++ b/plans/TRACES_V1_PLAN.md
@@ -0,0 +1,435 @@
+# traces-v1 — Session Mirror & Replay
+
+**Branch:** `traces-v1`
+**Target release:** `0.2.0`
+**Estimated effort:** 12–14 working days
+**Status:** drafted 2026-05-20
+
+---
+
+## 1. Why this exists (the only thing that matters)
+
+The 2026-05 market scan (`plans/MARKET_REPORT_SESSIONS_2026_05.md`) found
+session-mirroring is **commodity**:
+
+- Anthropic ships **SessionStore** (Claude-Code-native, opaque cloud)
+- HuggingFace ships **Storage Buckets + Agent Trace Viewer** (HF-only)
+- **DataClaw** (2.1k★) and **claude-sync** (119★) already mirror local JSONL
+
+So copying any of them is a waste. Tracecraft's session mirror only earns
+its place if it does **three things none of them do**:
+
+1. **Cross-backend.** Any S3-compatible bucket (AWS, R2, MinIO, B2, Wasabi)
+   *and* HF Buckets. The user owns the data; we never see it.
+2. **Sessions + coordination in one bucket.** Tracecraft already stores
+   memory / mailbox / claims / artifacts under `<project>/`. Putting
+   harness sessions under the same `<project>/sessions/` namespace
+   means one bucket holds the *entire* multi-agent history.
+3. **Cross-harness replay.** Claude Code JSONL + Codex JSONL + tracecraft
+   coordination events merged into one timeline. This is the killer
+   demo: "watch four Claude Code agents coordinate, see each one's
+   reasoning, see the messages between them, in a single HTML."
+
+If at any point during implementation we feel pulled toward features
+that don't serve those three goals, stop and re-read this section.
+
+---
+
+## 2. Non-goals
+
+These look tempting and are deliberately excluded from `0.2.0`:
+
+- **Real-time UI.** Replay is a static HTML render of a finished bucket.
+  No live websocket, no dashboard server.
+- **LLM-based redaction.** Regex denylist v0 only; LLM redaction is a
+  later-tier item once we know the false-positive rate.
+- **Trace signing / SN13 submission.** That's `SN13_AGENT_TRACES_PITCH.md`
+  territory, separate 3-week de-risk plan.
+- **Anthropic SessionStore integration.** Their API, their schema,
+  their lock-in. We mirror the local JSONL — that's the open path.
+- **MCP server.** Already decided redundant given CLI + SKILL.md.
+- **Cursor / Cline / Aider support.** Claude Code + Codex first. Others
+  follow only if there's demand and a JSONL-equivalent format.
+- **TTL claims, heartbeat refresh, message-key collision.** These are
+  Tier 1 fixes from `RESEARCH_2026_05.md`. Bundle them in `0.2.1` if
+  traces-v1 didn't subsume the need.
+
+---
+
+## 3. Scope: nine deliverables
+
+| # | Deliverable | Approx LoC | Days |
+|---|-------------|-----------|------|
+| D1 | `tracecraft session mirror` (Claude Code) | 150 | 2 |
+| D2 | Claude Code plugin (`.claude-plugin/`) | 250 | 1 |
+| D3 | Codex variant | 80 | 1 |
+| D4 | `tracecraft session list / show` | 80 | 1 |
+| D5 | `tracecraft replay` (the killer demo) | 350 | 2 |
+| D6 | Redaction v0 (regex denylist) | 100 | 0.5 |
+| D7 | Tests (moto + golden JSONL fixtures) | 250 | 1.5 |
+| D8 | Docs (README + SKILL.md + plugin README) | — | 1 |
+| D9 | Launch artifact (4-agent demo recording) | — | 1 |
+
+Total: ~1,260 LoC, 11 working days + 1 day slack.
+
+---
+
+## 4. Bucket layout (additive — does not touch existing keys)
+
+```
+<bucket>/<project>/
+  …existing keys (agents/, memory/, messages/, steps/, artifacts/)…
+  sessions/
+    claude-code/
+      <session-id>.jsonl          ← raw JSONL stream (append-only)
+      <session-id>.meta.json      ← cwd, started_at, ended_at, agent_id,
+                                    line_count, redacted_count, schema_version
+    codex/
+      <session-id>.jsonl
+      <session-id>.meta.json
+    _index.json                   ← list of all sessions (rebuilt on each upload)
+```
+
+**Why a separate top-level `sessions/` instead of nesting under `agents/`:**
+sessions belong to a *harness instance*, not always to a registered tracecraft
+agent. A solo dev running Claude Code with no `tracecraft init agents/...` still
+benefits from the mirror. Linking to an `agent_id` is optional metadata.
+
+---
+
+## 5. D1 — `tracecraft session mirror` (the foundation)
+
+### Command
+```
+tracecraft session mirror [--harness claude-code|codex] [--session-id <id>]
+                          [--watch-dir <path>] [--batch-seconds 5]
+                          [--once] [--detach]
+```
+
+### Behaviour
+1. Auto-detect the active session if `--session-id` is omitted:
+   - **Claude Code:** glob `~/.claude/projects/<encoded-cwd>/*.jsonl`,
+     pick the one with the most recent `mtime`.
+   - **Codex:** glob `~/.codex/sessions/<YYYY>/<MM>/<DD>/rollout-*.jsonl`,
+     same heuristic.
+2. Tail the file, resuming from the byte offset stored in
+   `~/.tracecraft/mirror-state/<session-id>.json` (format in §5.2).
+3. Every `--batch-seconds` (default 5), flush the new bytes as a new
+   part object under `sessions/<harness>/<session-id>/`. S3 has no
+   native append, so each batch becomes its own object instead of
+   re-uploading a growing file (see §5.1).
+4. Update `<session-id>.meta.json` on every flush.
+5. Track the PID per session (the `pid` field of the state file in
+   §5.2, not a global pidfile) so the user can
+   `tracecraft session stop <session-id>` cleanly.
+6. `--detach` forks a background process (Unix `os.fork()`,
+   on Windows fall back to subprocess + log file).
+7. `--once` does a single sync and exits (good for cron / hooks).
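
The auto-detect heuristic in step 1 can be sketched as below. The helper name `pick_active_session` is hypothetical, and the sketch covers only the flat Claude Code directory; Codex would glob its dated subtree instead.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def pick_active_session(project_dir: Path) -> Optional[Path]:
    """Return the *.jsonl file with the most recent mtime, or None."""
    candidates = list(project_dir.glob("*.jsonl"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Demo: two session files with explicitly different modification times.
demo_dir = Path(tempfile.mkdtemp())
older = demo_dir / "sess-old.jsonl"
newer = demo_dir / "sess-new.jsonl"
older.write_text("{}\n")
newer.write_text("{}\n")
os.utime(older, (1_000_000, 1_000_000))
os.utime(newer, (2_000_000, 2_000_000))
active = pick_active_session(demo_dir)
```

Returning `None` for an empty directory lets the caller report "no session found" instead of raising from `max()`.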
+
+### 5.1 Append strategy on S3
+
+S3 has no `append`. Options considered:
+
+| Option | Pros | Cons | Verdict |
+|--------|------|------|---------|
+| Re-upload full file every batch | Trivial | Cost grows O(n²) for long sessions | ✗ |
+| One object per batch (`<sid>.<seq>.jsonl`) | Cheap, no read-back | Replay must list+merge | ✓ chosen |
+| S3 multipart upload kept open | True append-ish | Multipart sessions abort on agent crash | ✗ |
+
+**Chosen:** one object per batch. Final layout:
+```
+sessions/claude-code/<session-id>/
+  part-00000.jsonl
+  part-00001.jsonl
+  …
+  meta.json
+```
+Replay/show concatenates parts in order. `tracecraft session compact <sid>`
+(later) merges into one file for archival.
+
+Trade-off accepted: more list operations during replay. Cheap on S3
+($0.005 per 1000 LIST). For long sessions this is materially better.
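
A quick worked comparison of the first two options, as a sketch. The batch size (2,000 bytes) is an assumed figure for illustration; the batch count is one hour of flushes at the default 5-second interval.

```python
def bytes_uploaded_full_reupload(batches: int, batch_bytes: int) -> int:
    """Re-uploading the whole file each time sends 1 + 2 + ... + n batches."""
    return batch_bytes * batches * (batches + 1) // 2


def bytes_uploaded_per_part(batches: int, batch_bytes: int) -> int:
    """One object per batch sends each byte exactly once."""
    return batch_bytes * batches


BATCHES = 3600 // 5   # one hour at --batch-seconds 5
BATCH_BYTES = 2_000   # assumed average batch size

full = bytes_uploaded_full_reupload(BATCHES, BATCH_BYTES)
parts = bytes_uploaded_per_part(BATCHES, BATCH_BYTES)
ratio = full / parts
```

Under these assumptions the full re-upload sends about 519 MB against 1.44 MB for per-batch parts, and the ratio, (n + 1) / 2, keeps growing with session length.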
+
+### 5.2 State file format
+
+`~/.tracecraft/mirror-state/<session-id>.json`:
+```json
+{
+  "harness": "claude-code",
+  "session_id": "abc123",
+  "source_path": "/Users/x/.claude/projects/.../abc123.jsonl",
+  "bucket_prefix": "sessions/claude-code/abc123/",
+  "byte_offset": 142857,
+  "next_part_seq": 12,
+  "last_flush": "2026-05-20T10:15:00Z",
+  "pid": 4523
+}
+```
+
+### 5.3 Graceful shutdown
+- `SIGTERM` / `SIGINT` → flush pending buffer, write final meta, remove pid.
+- Crash → state file lets next `mirror` invocation resume from `byte_offset`.
+- Idempotency: if `part-<seq>.jsonl` already exists at the target key,
+  bump `next_part_seq` until empty slot found (defends against duplicate
+  uploads after partial crash).
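
The slot-bumping rule for idempotency can be sketched as follows. It assumes the set of existing keys has already been fetched with a bucket LIST; `next_free_seq` is a hypothetical helper name.

```python
def next_free_seq(existing_keys: set, prefix: str, start: int) -> int:
    """Return the first sequence number >= start whose part key is unused."""
    seq = start
    while f"{prefix}part-{seq:05d}.jsonl" in existing_keys:
        seq += 1
    return seq


demo_prefix = "sessions/claude-code/abc123/"
demo_existing = {
    demo_prefix + "part-00000.jsonl",
    demo_prefix + "part-00001.jsonl",
    demo_prefix + "meta.json",
}
# The state file claims seq 0 is next, but a crash left parts 0 and 1 behind.
recovered_seq = next_free_seq(demo_existing, demo_prefix, 0)
```

An existing part is never overwritten; the worst case after a crash is a duplicated span in a later part.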
+
+---
+
+## 6. D2 — Claude Code plugin
+
+### Why a plugin (vs a hook the user installs manually)
+The whole point is **zero-friction**. If the user has to edit JSON
+config files, we lose. `/plugin install tracecraft` should be the path.
+
+### Files in `plugins/claude-code/`
+```
+plugins/claude-code/
+  .claude-plugin/
+    plugin.json              ← name, version, hooks, commands
+  hooks/
+    session-start.sh         ← spawns `tracecraft session mirror --detach`
+    session-end.sh           ← `tracecraft session stop $CLAUDE_SESSION_ID`
+  skills/
+    tracecraft.md            ← SKILL.md so Claude inside Claude Code knows
+                               how to use tracecraft for coordination
+  commands/
+    tc-mirror.md             ← /tc-mirror slash command (start/stop/status)
+    tc-replay.md             ← /tc-replay slash command
+  README.md
+```
+
+### Submission target
+Anthropic's plugin marketplace + GitHub direct-install path
+(`/plugin install Arrmlet/tracecraft`).
+
+### Open question to resolve during impl
+Does `SessionStart` hook fire on `claude --resume`? If not, we also
+need a `UserPromptSubmit` hook with a "have we started mirroring?" guard.
+(Test on day 1 of D2; cheap to verify.)
+
+---
+
+## 7. D3 — Codex variant
+
+Codex CLI writes to `~/.codex/sessions/<YYYY>/<MM>/<DD>/rollout-*.jsonl`.
+Schema differs (it's not Claude-Code JSONL) but the *act of tailing* is
+identical. ~80 LoC (the D3 estimate in §3): just a new `Harness` adapter that knows the
+glob pattern and (optionally) translates entries to a normalized schema.
+
+For replay we'll keep entries in their native schema and let the
+renderer handle two harness types side-by-side. **No premature
+normalization** — if a third harness lands, then we extract a base.
+
+---
+
+## 8. D4 — `session list` / `session show`
+
+```
+tracecraft session list [--harness claude-code|codex] [--limit 20]
+tracecraft session show <session-id> [--tail 50]
+tracecraft session stop <session-id>
+```
+
+Reads `<project>/sessions/_index.json`. `_index.json` is rewritten on
+each meta update (write whole file — it's tiny; ~1 KB per 100 sessions).
+
+---
+
+## 9. D5 — `tracecraft replay` (the killer demo)
+
+This is where tracecraft stops looking like "yet another session
+mirror" and becomes a coordination viewer.
+
+### Command
+```
+tracecraft replay [--project <name>] [--out replay.html] [--open]
+                  [--since <iso>] [--until <iso>]
+```
+
+### What it does
+1. Pulls **all** of `<project>/`:
+   - `agents/*.json` (registered agents)
+   - `memory/*.json` (every memory write — but memory keys don't have
+     timestamps; we'll need to add `_updated_at` to memory writes —
+     small backwards-compatible change)
+   - `messages/**/*.json` (every message)
+   - `steps/**/*.json` (every claim/handoff/status)
+   - `sessions/**/part-*.jsonl` (every harness event)
+2. Builds a unified timeline (single sorted array of events,
+   each tagged with `event_type` and `agent_id`).
+3. Renders a **single self-contained HTML file** (no server) with:
+   - vertical timeline (newest at top or oldest at top, toggle)
+   - one swim-lane per agent
+   - colour-coding: coordination events (claim/message/memory) vs
+     harness events (tool-use, reasoning, file-edit)
+   - click any event → expand JSON
+   - filter by agent / event type / text search
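
Step 2, the unified timeline, can be sketched as a tag-and-sort. The field names `ts` (epoch seconds) and `source` are assumptions for illustration; real events would need their timestamps normalised to a common form first.

```python
def build_timeline(coordination_events, harness_events):
    """Merge both event streams into one list sorted by timestamp."""
    tagged = [dict(event, source="coordination") for event in coordination_events]
    tagged += [dict(event, source="harness") for event in harness_events]
    # sorted() is stable, so events with equal timestamps keep input order.
    return sorted(tagged, key=lambda event: event["ts"])


demo_coordination = [
    {"ts": 1.0, "agent_id": "designer", "event_type": "memory"},
    {"ts": 2.0, "agent_id": "designer", "event_type": "message"},
]
demo_harness = [
    {"ts": 3.0, "agent_id": "developer", "event_type": "tool-use"},
    {"ts": 1.5, "agent_id": "developer", "event_type": "reasoning"},
]
timeline = build_timeline(demo_coordination, demo_harness)
```

The resulting list is what would be serialised into the inlined `<script>` as the events array.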
+
+### Tech for the HTML
+- Pure HTML + vanilla JS embedded in one file. **No build step.**
+  React/Vite would be faster to write but harder to ship and harder
+  for users to inspect/trust.
+- One inlined `<script>` with the events array as JSON.
+- ~350 LoC including CSS.
+
+### Why this is the artifact
+When we show "four Claude Code agents coordinating on a real project,
+here's the HTML, every reasoning step + every message + every claim
+visible on one timeline" — *nobody else has that*. Anthropic's
+SessionStore can't see your other agents. HF's Trace Viewer is
+single-session. This is the differentiation made real.
+
+---
+
+## 10. D6 — Redaction v0
+
+Regex denylist applied **at flush time** (before bytes leave the machine):
+
+- AWS keys: `(?i)(aws_(access|secret)_(key|access_key_id)\s*[:=]\s*['"]?)[A-Za-z0-9/+=]{16,}`
+- Anthropic: `sk-ant-[A-Za-z0-9_-]{20,}`
+- OpenAI: `sk-[A-Za-z0-9]{20,}` (plus `sk-proj-`, `sk-svcacct-`)
+- HF: `hf_[A-Za-z0-9]{30,}`
+- GitHub: `ghp_[A-Za-z0-9]{30,}` `gho_` `ghu_` `ghs_` `ghr_`
+- Generic envvar leaks: lines matching `[A-Z_]+_TOKEN=` / `_KEY=` / `_SECRET=`
+- Bearer tokens: `Bearer [A-Za-z0-9_.-]{20,}`
+- Absolute home paths → `~`
+
+Each redaction is **counted, not silenced**: meta.json records
+`{"redactions": {"aws_key": 2, "anthropic_key": 1, ...}}` so users
+can audit. Add `--no-redact` for users who *want* raw (e.g., they
+control the bucket entirely and prefer full fidelity).
+
+False-positive escape valve: `.tracecraft-redact.yml` in cwd can
+add/remove patterns.
+
+---
+
+## 11. D7 — Tests
+
+Build on the `moto`-based test infra from Tier 0.
+
+**Core tests:**
+- `test_mirror_creates_part_objects` — write JSONL locally, run mirror
+  with `--once`, assert `part-00000.jsonl` exists with correct bytes.
+- `test_mirror_resumes_from_offset` — partial flush, re-run, verify
+  only new bytes go to `part-00001.jsonl`.
+- `test_mirror_idempotent_on_crash_recovery` — pre-create
+  `part-00000.jsonl`, verify mirror bumps to `part-00001.jsonl`.
+- `test_redaction_v0_catches_aws_key` — feed a JSONL line with an
+  AWS key, verify it's `[REDACTED:aws_key]` in the part object and
+  counted in meta.
+- `test_redaction_no_redact_flag_passes_through` — same line,
+  `--no-redact`, raw bytes preserved.
+- `test_session_list_returns_all_harnesses`
+- `test_session_show_concatenates_parts_in_order`
+- `test_replay_merges_coordination_and_harness_events_by_timestamp` —
+  the critical one. Seed bucket with a memory write at t=1, a message
+  at t=2, a harness JSONL line at t=3, verify the HTML output's
+  embedded JSON array contains all three in order.
+
+**Golden JSONL fixtures:** `sdk/tests/fixtures/claude-code-sample.jsonl`
+and `codex-sample.jsonl` — small, real-shape, scrubbed.
+
+---
+
+## 12. D8 — Docs
+
+- `README.md`: add a "Session mirror" section after the existing
+  coordination examples, with a 4-line quickstart.
+- `docs/session-mirror.md`: full reference (commands, flags,
+  state file format, redaction config).
+- `plugins/claude-code/README.md`: install instructions.
+- `CLAUDE.md`: add `sessions/` to the bucket-layout diagram.
+
+---
+
+## 13. D9 — Launch artifact
+
+Record (asciinema or screencap) a real demo:
+
+1. `tracecraft init` in a fresh dir.
+2. Spawn 4 Claude Code instances, each as a different agent.
+3. Give them a small shared task ("build a CLI that parses JSON")
+   — one claims `design`, one `impl`, one `tests`, one `docs`.
+4. Let them coordinate via tracecraft (mailbox + handoffs).
+5. Each Claude Code instance auto-mirrors its session via the plugin.
+6. Run `tracecraft replay` → open the HTML.
+7. Screenshot the swim-lane view showing all four agents'
+   reasoning + their cross-agent messages on one timeline.
+
+This is the asset that ships with the 0.2.0 release post.
+
+---
+
+## 14. Sequencing & dependencies
+
+```
+D1 (mirror core) ─┬─► D3 (codex) ─┐
+                  ├─► D6 (redact)  ├─► D7 (tests)
+                  └─► D4 (list/show)┘                          ─► D8 (docs) ─► D9 (demo) ─► release 0.2.0
+D1 ─► D2 (plugin)                  ─► D5 (replay, needs sessions in bucket)
+```
+
+D5 (replay) is the long pole and the differentiator. If we slip,
+slip everything else, not replay.
+
+---
+
+## 15. Risk register
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|------------|
+| Claude Code JSONL schema changes mid-build | Med | Med | Pin to current schema, golden fixture, version-detect via first-line metadata |
+| Anthropic launches matching coordination viewer | Low | High | Replay's value is *cross-harness*; their viewer will be Claude-only |
+| Plugin install UX requires Anthropic approval | Med | Med | Ship as `git clone` first, marketplace second |
+| Mirror crashes silently and user thinks they have a backup | Med | High | Loud failure: write `~/.tracecraft/mirror-errors.log`, surface in `tracecraft session list` |
+| Bucket cost surprise for heavy users | Low | Med | Doc the storage math, suggest lifecycle rules |
+| Redaction misses something and a secret lands in a public bucket | Low | Critical | Default-on redaction, count + show redactions in meta, doc the limits clearly |
+
+---
+
+## 16. Definition of done for 0.2.0
+
+- [ ] `tracecraft session mirror`, `list`, `show`, `stop` shipped
+- [ ] Claude Code plugin published to GitHub (`Arrmlet/tracecraft`)
+- [ ] Codex harness supported
+- [ ] `tracecraft replay` produces a self-contained HTML with
+      cross-agent + cross-harness timeline
+- [ ] Redaction v0 on by default, counted in meta, `.tracecraft-redact.yml`
+      escape valve documented
+- [ ] All Tier 0 tests still green; new tests for mirror + replay green
+- [ ] CI green on Python 3.10 / 3.11 / 3.12 / 3.13
+- [ ] Demo HTML committed to `examples/replay-demo.html`
+- [ ] PyPI 0.2.0 published via the trusted-publishing pipeline
+- [ ] Launch post drafted in `plans/LAUNCH_TWEET.md` (or sibling)
+
+---
+
+## 17. What we explicitly defer to 0.2.x / 0.3.0
+
+- Tier 1 fixes (TTL claims, heartbeat refresh, message-key collisions)
+- LLM-based redaction
+- Cursor / Cline / Aider harnesses
+- Live replay (websocket)
+- Trace signing (Ed25519) — prerequisite for SN13 pitch, separate plan
+- Memory `_updated_at` timestamps — *unless* replay needs it; if so,
+  pull forward into D5
+
+---
+
+## 18. Sign-off checklist (review before writing the first line of code)
+
+- [ ] Does every deliverable serve at least one of: cross-backend,
+      coordination+sessions in one place, cross-harness replay?
+- [ ] Have we re-read `plans/MARKET_REPORT_SESSIONS_2026_05.md`
+      and confirmed nothing here duplicates SessionStore / HF Viewer /
+      DataClaw / claude-sync?
+- [ ] Is anything here *not* required for the 4-agent launch demo?
+- [ ] If we shipped only D1+D2+D5, would the launch story still hold?
+      (If yes, that's our minimum lovable cut.)
+
+---
+
+**Next step:** review this plan; if it holds, start D1. The minimum
+lovable cut is D1 + D2 + D5. If time pressure hits, drop D3 (Codex),
+D4 (list/show CLI sugar), and D6 (redaction v0 default) from 0.2.0 and
+ship them in 0.2.1.
diff --git a/sdk/pyproject.toml b/sdk/pyproject.toml
index a078731..4f0be08 100644
--- a/sdk/pyproject.toml
+++ b/sdk/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "tracecraft-ai"
-version = "0.1.6"
+version = "0.2.0"
 description = "Coordination layer for multi-agent AI systems. Bring your own S3 / HuggingFace bucket; shared memory, mailbox, atomic task claims, handoffs, artifacts — no server, no database."
 readme = "README.md"
 license = {text = "MIT"}
diff --git a/sdk/tests/test_harness.py b/sdk/tests/test_harness.py
new file mode 100644
index 0000000..df58edb
--- /dev/null
+++ b/sdk/tests/test_harness.py
@@ -0,0 +1,566 @@
+"""Tests for the Harness adapter framework.
+
+Covers:
+  - Protocol conformance (registry + isinstance via @runtime_checkable)
+  - Claude Code: cwd encoding, discovery, active-session picking, tail semantics
+  - Codex: glob over YYYY/MM/DD tree, session id parsing from rollout filenames
+  - Append semantics shared by both (read_new_bytes from arbitrary offset)
+  - Bad-input handling (negative offset, missing directories)
+
+Run from repo root:
+    .venv-test/bin/pytest sdk/tests/test_harness.py -v
+"""
+
+from __future__ import annotations
+
+import os
+import sqlite3
+import time
+from pathlib import Path
+
+import pytest
+
+from tracecraft.harness import (
+    REGISTRY,
+    ClaudeCodeHarness,
+    CodexHarness,
+    HermesHarness,
+    OpenClawHarness,
+    get_harness,
+)
+from tracecraft.harness.base import Harness, Session
+from tracecraft.harness.claude_code import _encode_cwd
+
+
+# ---------- registry / protocol ----------
+
+
+def test_registry_lists_known_harnesses():
+    assert "claude-code" in REGISTRY
+    assert "codex" in REGISTRY
+    assert "openclaw" in REGISTRY
+    assert "hermes" in REGISTRY
+
+
+def test_get_harness_returns_instance():
+    h = get_harness("claude-code")
+    assert isinstance(h, ClaudeCodeHarness)
+    assert h.name == "claude-code"
+
+
+def test_get_harness_unknown_raises():
+    with pytest.raises(ValueError, match="unknown harness"):
+        get_harness("never-shipped")
+
+
+def test_adapters_satisfy_protocol():
+    # runtime_checkable Protocol — all adapters must structurally match
+    assert isinstance(ClaudeCodeHarness(), Harness)
+    assert isinstance(CodexHarness(), Harness)
+    assert isinstance(OpenClawHarness(), Harness)
+    assert isinstance(HermesHarness(), Harness)
+
+
+def test_read_new_returns_bytes_and_advanced_cursor():
+    # The race-free read_new contract: file harnesses return cursor+len(bytes).
+    import tempfile
+
+    with tempfile.TemporaryDirectory() as d:
+        p = Path(d) / "s.jsonl"
+        p.write_bytes(b"abc\n")
+        cc = ClaudeCodeHarness()
+        sess = Session(path=p, session_id="s")
+        data, new_cursor = cc.read_new(sess, 0)
+        assert data == b"abc\n"
+        assert new_cursor == 4
+        data2, new_cursor2 = cc.read_new(sess, new_cursor)
+        assert data2 == b""
+        assert new_cursor2 == 4
+
+
+# ---------- Claude Code ----------
+
+
+def test_claude_code_encode_cwd_matches_dotclaude_scheme(tmp_path):
+    # Claude Code encodes absolute paths by replacing separators with hyphens.
+    encoded = _encode_cwd(tmp_path)
+    expected = str(tmp_path.resolve()).replace(os.sep, "-")
+    assert encoded == expected
+    assert encoded.startswith("-")  # leading separator becomes leading hyphen
+
+
+def test_claude_code_discover_empty_when_no_project_dir(tmp_path):
+    cc = ClaudeCodeHarness(root=tmp_path / "projects")
+    assert cc.discover(tmp_path / "nonexistent-cwd") == []
+
+
+def test_claude_code_discover_finds_sessions(tmp_path):
+    cc_root = tmp_path / "projects"
+    cwd = tmp_path / "my-proj"
+    cwd.mkdir()
+    project_dir = cc_root / _encode_cwd(cwd)
+    project_dir.mkdir(parents=True)
+
+    (project_dir / "sess-aaa.jsonl").write_text('{"role":"user"}\n')
+    (project_dir / "sess-bbb.jsonl").write_text('{"role":"assistant"}\n')
+    # Unrelated file should be ignored.
+    (project_dir / "notes.txt").write_text("ignored")
+
+    cc = ClaudeCodeHarness(root=cc_root)
+    sessions = cc.discover(cwd)
+    ids = {s.session_id for s in sessions}
+    assert ids == {"sess-aaa", "sess-bbb"}
+    for s in sessions:
+        assert s.cwd == cwd
+        assert s.path.suffix == ".jsonl"
+
+
+def test_claude_code_active_session_picks_most_recent(tmp_path):
+    cc_root = tmp_path / "projects"
+    cwd = tmp_path / "proj"
+    cwd.mkdir()
+    pdir = cc_root / _encode_cwd(cwd)
+    pdir.mkdir(parents=True)
+
+    older = pdir / "sess-old.jsonl"
+    newer = pdir / "sess-new.jsonl"
+    older.write_text("old\n")
+    # the sleep alone is not enough on second-precision filesystems; os.utime below back-dates "older"
+    time.sleep(0.01)
+    newer.write_text("new\n")
+    os.utime(older, (time.time() - 100, time.time() - 100))
+
+    cc = ClaudeCodeHarness(root=cc_root)
+    active = cc.active_session(cwd)
+    assert active is not None
+    assert active.session_id == "sess-new"
+
+
+def test_claude_code_active_session_none_when_empty(tmp_path):
+    cc = ClaudeCodeHarness(root=tmp_path / "projects")
+    assert cc.active_session(tmp_path / "no-such-cwd") is None
+
+
+# ---------- Codex ----------
+
+
+def test_codex_discover_walks_date_tree(tmp_path):
+    root = tmp_path / "sessions"
+    day = root / "2026" / "05" / "21"
+    day.mkdir(parents=True)
+    (day / "rollout-2026-05-21T10-30-00-abc123.jsonl").write_text("{}\n")
+    (day / "rollout-2026-05-21T11-00-00-def456.jsonl").write_text("{}\n")
+    # noise files shouldn't be picked up
+    (day / "scratch.txt").write_text("nope")
+
+    cx = CodexHarness(root=root)
+    sessions = cx.discover(tmp_path)  # cwd ignored for codex
+    ids = {s.session_id for s in sessions}
+    assert ids == {"abc123", "def456"}
+
+
+def test_codex_discover_empty_when_no_root(tmp_path):
+    cx = CodexHarness(root=tmp_path / "does-not-exist")
+    assert cx.discover(tmp_path) == []
+
+
+def test_codex_active_session_picks_newest_across_days(tmp_path):
+    root = tmp_path / "sessions"
+    d1 = root / "2026" / "05" / "20"
+    d2 = root / "2026" / "05" / "21"
+    d1.mkdir(parents=True)
+    d2.mkdir(parents=True)
+    old = d1 / "rollout-2026-05-20T09-00-00-old111.jsonl"
+    new = d2 / "rollout-2026-05-21T09-00-00-new222.jsonl"
+    old.write_text("o\n")
+    time.sleep(0.01)
+    new.write_text("n\n")
+    os.utime(old, (time.time() - 100, time.time() - 100))
+
+    cx = CodexHarness(root=root)
+    active = cx.active_session(tmp_path)
+    assert active is not None
+    assert active.session_id == "new222"
+
+
+# ---------- append / tail semantics (the contract the mirror loop relies on) ----------
+
+
+@pytest.fixture
+def claude_code_with_session(tmp_path):
+    """A fully-wired Claude Code env with one session file."""
+    cc_root = tmp_path / "projects"
+    cwd = tmp_path / "proj"
+    cwd.mkdir()
+    pdir = cc_root / _encode_cwd(cwd)
+    pdir.mkdir(parents=True)
+    sess_path = pdir / "sess-tail.jsonl"
+    sess_path.write_bytes(b"")  # empty
+    cc = ClaudeCodeHarness(root=cc_root)
+    session = Session(path=sess_path, session_id="sess-tail", cwd=cwd)
+    return cc, session
+
+
+def test_read_new_bytes_returns_everything_from_zero(claude_code_with_session):
+    cc, session = claude_code_with_session
+    session.path.write_bytes(b'{"a":1}\n{"a":2}\n')
+
+    out = cc.read_new_bytes(session, 0)
+    assert out == b'{"a":1}\n{"a":2}\n'
+
+
+def test_read_new_bytes_returns_only_appended_bytes(claude_code_with_session):
+    cc, session = claude_code_with_session
+    session.path.write_bytes(b'{"a":1}\n')
+    first_size = cc.size(session)
+
+    # Append two more lines
+    with open(session.path, "ab") as f:
+        f.write(b'{"a":2}\n{"a":3}\n')
+
+    out = cc.read_new_bytes(session, first_size)
+    assert out == b'{"a":2}\n{"a":3}\n'
+
+
+def test_read_new_bytes_at_eof_returns_empty(claude_code_with_session):
+    cc, session = claude_code_with_session
+    session.path.write_bytes(b'{"a":1}\n')
+    out = cc.read_new_bytes(session, cc.size(session))
+    assert out == b""
+
+
+def test_read_new_bytes_offset_beyond_eof_returns_empty(claude_code_with_session):
+    cc, session = claude_code_with_session
+    session.path.write_bytes(b'{"a":1}\n')
+    # offset > size: seek past EOF, read returns b""
+    out = cc.read_new_bytes(session, 10_000)
+    assert out == b""
+
+
+def test_read_new_bytes_rejects_negative_offset(claude_code_with_session):
+    cc, session = claude_code_with_session
+    session.path.write_bytes(b"hello")
+    with pytest.raises(ValueError, match="non-negative"):
+        cc.read_new_bytes(session, -1)
+
+
+def test_codex_read_new_bytes_same_contract(tmp_path):
+    root = tmp_path / "sessions"
+    day = root / "2026" / "05" / "21"
+    day.mkdir(parents=True)
+    p = day / "rollout-2026-05-21T10-00-00-xyz.jsonl"
+    p.write_bytes(b"line-1\nline-2\n")
+
+    cx = CodexHarness(root=root)
+    sess = Session(path=p, session_id="xyz")
+
+    assert cx.read_new_bytes(sess, 0) == b"line-1\nline-2\n"
+    assert cx.read_new_bytes(sess, 7) == b"line-2\n"
+    assert cx.read_new_bytes(sess, cx.size(sess)) == b""
+
+
+# ---------- mirror-loop dry run (no S3 yet — just the read side) ----------
+
+
+def test_simulated_tail_produces_disjoint_parts(claude_code_with_session):
+    """Simulates what the mirror loop will do: tail in batches, each batch
+    becomes one part. Verifies parts are disjoint and concatenate back to
+    the source bytes exactly. This is the contract D5 (replay) depends on.
+    """
+    cc, session = claude_code_with_session
+    parts: list[bytes] = []
+    offset = 0
+
+    # Batch 1
+    session.path.write_bytes(b'{"step":1}\n')
+    new = cc.read_new_bytes(session, offset)
+    parts.append(new)
+    offset += len(new)
+
+    # Batch 2 (append two lines)
+    with open(session.path, "ab") as f:
+        f.write(b'{"step":2}\n{"step":3}\n')
+    new = cc.read_new_bytes(session, offset)
+    parts.append(new)
+    offset += len(new)
+
+    # Batch 3 (nothing happened)
+    new = cc.read_new_bytes(session, offset)
+    parts.append(new)  # will be b""
+    offset += len(new)
+
+    # Batch 4 (one more line)
+    with open(session.path, "ab") as f:
+        f.write(b'{"step":4}\n')
+    new = cc.read_new_bytes(session, offset)
+    parts.append(new)
+
+    full = session.path.read_bytes()
+    assert b"".join(parts) == full
+    # Empty batch is preserved as a zero-length part; the mirror loop will
+    # skip those before uploading. We just verify it doesn't lose bytes.
+    assert any(p == b"" for p in parts)
+
+
+# ---------- OpenClaw ----------
+
+
+def _make_openclaw_session(root: Path, agent_id: str, sid: str, body: bytes) -> Path:
+    sess_dir = root / agent_id / "sessions"
+    sess_dir.mkdir(parents=True, exist_ok=True)
+    p = sess_dir / f"{sid}.jsonl"
+    p.write_bytes(body)
+    return p
+
+
+def test_openclaw_discover_finds_sessions_across_agents(tmp_path):
+    root = tmp_path / "agents"
+    _make_openclaw_session(root, "main", "sess-aaa", b'{"type":"session","id":"sess-aaa"}\n')
+    _make_openclaw_session(root, "worker", "sess-bbb", b'{"type":"session","id":"sess-bbb"}\n')
+
+    oc = OpenClawHarness(root=root)
+    sessions = oc.discover(tmp_path)
+    ids = {s.session_id for s in sessions}
+    # stable id is <agentId>__<stem>
+    assert ids == {"main__sess-aaa", "worker__sess-bbb"}
+
+
+def test_openclaw_excludes_sessions_json_and_tmp(tmp_path):
+    root = tmp_path / "agents"
+    sess_dir = root / "main" / "sessions"
+    sess_dir.mkdir(parents=True)
+    (sess_dir / "real.jsonl").write_bytes(b'{"type":"session"}\n')
+    (sess_dir / "sessions.json").write_bytes(b'{"index":true}\n')
+    (sess_dir / "store.123.abc.tmp").write_bytes(b"half-written")
+
+    oc = OpenClawHarness(root=root)
+    ids = {s.session_id for s in oc.discover(tmp_path)}
+    assert ids == {"main__real"}
+
+
+def test_openclaw_topic_session_caught_by_glob(tmp_path):
+    root = tmp_path / "agents"
+    _make_openclaw_session(root, "main", "sess-x-topic-42", b'{"type":"session"}\n')
+    oc = OpenClawHarness(root=root)
+    ids = {s.session_id for s in oc.discover(tmp_path)}
+    assert ids == {"main__sess-x-topic-42"}
+
+
+def test_openclaw_active_session_picks_most_recent(tmp_path):
+    root = tmp_path / "agents"
+    old = _make_openclaw_session(root, "main", "old", b"o\n")
+    time.sleep(0.01)
+    _make_openclaw_session(root, "main", "new", b"n\n")
+    os.utime(old, (time.time() - 100, time.time() - 100))
+    oc = OpenClawHarness(root=root)
+    active = oc.active_session(tmp_path)
+    assert active is not None and active.session_id == "main__new"
+
+
+def test_openclaw_read_new_tail_semantics(tmp_path):
+    root = tmp_path / "agents"
+    p = _make_openclaw_session(root, "main", "s", b"line1\n")
+    oc = OpenClawHarness(root=root)
+    sess = Session(path=p, session_id="main__s")
+    data, cur = oc.read_new(sess, 0)
+    assert data == b"line1\n" and cur == 6
+    with open(p, "ab") as f:
+        f.write(b"line2\n")
+    data2, cur2 = oc.read_new(sess, cur)
+    assert data2 == b"line2\n" and cur2 == 12
+
+
+def test_openclaw_state_dir_env_override(tmp_path, monkeypatch):
+    custom = tmp_path / "custom-state"
+    monkeypatch.setenv("OPENCLAW_STATE_DIR", str(custom))
+    _make_openclaw_session(custom / "agents", "main", "s", b"x\n")
+    oc = OpenClawHarness()  # no explicit root → must resolve from env
+    ids = {s.session_id for s in oc.discover(tmp_path)}
+    assert ids == {"main__s"}
+
+
+def test_openclaw_empty_when_no_root(tmp_path):
+    oc = OpenClawHarness(root=tmp_path / "nope")
+    assert oc.discover(tmp_path) == []
+    assert oc.active_session(tmp_path) is None
+
+
+# ---------- Hermes (SQLite) ----------
+
+# Minimal subset of Hermes' real schema (verified against hermes_state.py).
+_HERMES_SCHEMA = """
+CREATE TABLE sessions (
+    id TEXT PRIMARY KEY,
+    source TEXT NOT NULL,
+    model TEXT,
+    started_at REAL NOT NULL,
+    ended_at REAL,
+    title TEXT
+);
+CREATE TABLE messages (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    session_id TEXT NOT NULL REFERENCES sessions(id),
+    role TEXT NOT NULL,
+    content TEXT,
+    tool_calls TEXT,
+    tool_name TEXT,
+    timestamp REAL NOT NULL,
+    token_count INTEGER
+);
+CREATE TABLE schema_version (version INTEGER NOT NULL);
+"""
+
+
+def _make_hermes_db(path: Path, sessions, messages):
+    """sessions: list[(id, source, model, started_at, title)];
+    messages: list[(session_id, role, content, timestamp)]."""
+    conn = sqlite3.connect(str(path))
+    conn.executescript(_HERMES_SCHEMA)
+    conn.execute("INSERT INTO schema_version (version) VALUES (13)")
+    for sid, source, model, started, title in sessions:
+        conn.execute(
+            "INSERT INTO sessions (id, source, model, started_at, title) VALUES (?,?,?,?,?)",
+            (sid, source, model, started, title),
+        )
+    for sess_id, role, content, ts in messages:
+        conn.execute(
+            "INSERT INTO messages (session_id, role, content, timestamp) VALUES (?,?,?,?)",
+            (sess_id, role, content, ts),
+        )
+    conn.commit()
+    conn.close()
+
+
+def test_hermes_discover_lists_sessions(tmp_path):
+    db = tmp_path / "state.db"
+    _make_hermes_db(
+        db,
+        sessions=[
+            ("20260529_010101_aaa111", "cli", "hermes-4", 100.0, "first"),
+            ("20260529_020202_bbb222", "gateway", "hermes-4", 200.0, "second"),
+        ],
+        messages=[],
+    )
+    h = HermesHarness(db_path=db)
+    sessions = h.discover(tmp_path)
+    ids = {s.session_id for s in sessions}
+    assert ids == {"20260529_010101_aaa111", "20260529_020202_bbb222"}
+    # all point at the same DB file
+    assert all(s.path == db for s in sessions)
+
+
+def test_hermes_read_new_synthesizes_jsonl_and_advances_rowid(tmp_path):
+    db = tmp_path / "state.db"
+    sid = "20260529_010101_aaa111"
+    _make_hermes_db(
+        db,
+        sessions=[(sid, "cli", "hermes-4", 100.0, "t")],
+        messages=[
+            (sid, "user", "hello", 100.1),
+            (sid, "assistant", "hi there", 100.2),
+        ],
+    )
+    h = HermesHarness(db_path=db)
+    sess = Session(path=db, session_id=sid)
+
+    # size() == max messages.id == 2
+    assert h.size(sess) == 2
+
+    data, cursor = h.read_new(sess, 0)
+    assert cursor == 2
+    lines = data.decode().strip().split("\n")
+    assert len(lines) == 2
+    import json as _json
+
+    first = _json.loads(lines[0])
+    assert first["role"] == "user"
+    assert first["content"] == "hello"
+    assert first["id"] == 1
+
+    # Reading again from the advanced cursor yields nothing.
+    data2, cursor2 = h.read_new(sess, cursor)
+    assert data2 == b"" and cursor2 == 2
+
+
+def test_hermes_read_new_only_new_rows(tmp_path):
+    db = tmp_path / "state.db"
+    sid = "20260529_010101_aaa111"
+    _make_hermes_db(
+        db,
+        sessions=[(sid, "cli", "hermes-4", 100.0, "t")],
+        messages=[(sid, "user", "one", 100.1)],
+    )
+    h = HermesHarness(db_path=db)
+    sess = Session(path=db, session_id=sid)
+    _, cursor = h.read_new(sess, 0)
+    assert cursor == 1
+
+    # Append a second message
+    conn = sqlite3.connect(str(db))
+    conn.execute(
+        "INSERT INTO messages (session_id, role, content, timestamp) VALUES (?,?,?,?)",
+        (sid, "assistant", "two", 100.2),
+    )
+    conn.commit()
+    conn.close()
+
+    data, cursor2 = h.read_new(sess, cursor)
+    assert cursor2 == 2
+    import json as _json
+
+    rows = [_json.loads(line) for line in data.decode().strip().split("\n")]
+    assert len(rows) == 1
+    assert rows[0]["content"] == "two"
+
+
+def test_hermes_decodes_multimodal_content_prefix(tmp_path):
+    db = tmp_path / "state.db"
+    sid = "20260529_010101_aaa111"
+    # Hermes stores multimodal content as '\x00json:<json>'
+    payload = '\x00json:[{"type":"text","text":"hi"}]'
+    _make_hermes_db(
+        db,
+        sessions=[(sid, "cli", "hermes-4", 100.0, "t")],
+        messages=[(sid, "user", payload, 100.1)],
+    )
+    h = HermesHarness(db_path=db)
+    sess = Session(path=db, session_id=sid)
+    data, _ = h.read_new(sess, 0)
+    import json as _json
+
+    row = _json.loads(data.decode().strip())
+    # content should be decoded back into a list, not the sentinel string
+    assert isinstance(row["content"], list)
+    assert row["content"][0]["text"] == "hi"
+
+
+def test_hermes_active_session_is_one_with_highest_message(tmp_path):
+    db = tmp_path / "state.db"
+    s1, s2 = "20260529_010101_aaa", "20260529_020202_bbb"
+    _make_hermes_db(
+        db,
+        sessions=[(s1, "cli", "m", 100.0, "a"), (s2, "cli", "m", 200.0, "b")],
+        messages=[(s1, "user", "x", 100.1), (s2, "user", "y", 200.1), (s1, "user", "z", 201.0)],
+    )
+    h = HermesHarness(db_path=db)
+    # s1 owns the highest message id (the last insert), so it's "active"
+    active = h.active_session(tmp_path)
+    assert active is not None and active.session_id == s1
+
+
+def test_hermes_missing_db_is_empty(tmp_path):
+    h = HermesHarness(db_path=tmp_path / "nope.db")
+    assert h.discover(tmp_path) == []
+    assert h.active_session(tmp_path) is None
+    sess = Session(path=tmp_path / "nope.db", session_id="x")
+    assert h.size(sess) == 0
+    assert h.read_new(sess, 0) == (b"", 0)
+
+
+def test_hermes_read_new_rejects_negative_cursor(tmp_path):
+    db = tmp_path / "state.db"
+    _make_hermes_db(db, sessions=[("s", "cli", "m", 1.0, "t")], messages=[])
+    h = HermesHarness(db_path=db)
+    sess = Session(path=db, session_id="s")
+    with pytest.raises(ValueError, match="non-negative"):
+        h.read_new(sess, -1)
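
The Hermes tests above exercise a row-id cursor rather than a byte offset. A minimal sketch of that contract follows, assuming a `messages` table shaped like the test schema. The function `read_new_rows` and the helper `demo` are hypothetical names for illustration and are not tracecraft's implementation.

```python
from __future__ import annotations

import json
import sqlite3


def read_new_rows(conn: sqlite3.Connection, session_id: str, cursor: int) -> tuple[bytes, int]:
    """Hypothetical stand-in: return rows with id > cursor as JSONL, plus the new cursor."""
    if cursor < 0:
        raise ValueError("cursor must be non-negative")
    rows = conn.execute(
        "SELECT id, role, content FROM messages "
        "WHERE session_id = ? AND id > ? ORDER BY id",
        (session_id, cursor),
    ).fetchall()
    if not rows:
        # Nothing new: hand back the cursor unchanged.
        return b"", cursor
    lines = [json.dumps({"id": r[0], "role": r[1], "content": r[2]}) for r in rows]
    # The new cursor is the highest row id that was actually emitted.
    return ("\n".join(lines) + "\n").encode(), rows[-1][0]


def demo():
    """Build a two-message in-memory DB and read it twice."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT)"
    )
    conn.execute("INSERT INTO messages (session_id, role, content) VALUES ('s', 'user', 'hello')")
    conn.execute("INSERT INTO messages (session_id, role, content) VALUES ('s', 'assistant', 'hi there')")
    first = read_new_rows(conn, "s", 0)
    again = read_new_rows(conn, "s", first[1])
    conn.close()
    return first, again
```

Returning the id of the last emitted row, instead of querying the maximum id separately, means a row inserted between the two queries cannot be skipped on the next call.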
diff --git a/sdk/tests/test_session_cli.py b/sdk/tests/test_session_cli.py
new file mode 100644
index 0000000..cd22230
--- /dev/null
+++ b/sdk/tests/test_session_cli.py
@@ -0,0 +1,315 @@
+"""End-to-end tests for `tracecraft session` CLI.
+
+Stack:
+  - moto's @mock_aws for the S3 backend (in-process, no network)
+  - tmp_path for a fake ~/.claude/projects/<encoded-cwd>/ tree
+  - monkeypatch on tracecraft.cli.session.STATE_DIR so state files don't pollute $HOME
+  - CliRunner to drive the actual CLI
+
+What we prove here:
+  1. mirror --once uploads the first part starting from offset 0
+  2. mirror --once again on the same session uploads ONLY the new bytes as part-00001
+  3. seq numbering survives state-file deletion (derived from bucket LIST)
+  4. redaction default-on: a planted AWS key disappears from the bucket part and shows in meta counts
+  5. --no-redact passes raw bytes through
+  6. session list shows the session after upload
+  7. session show <sid> --tail N concatenates parts and prints the right lines
+  8. session stop clears local state and marks ended_at in meta
+"""
+
+from __future__ import annotations
+
+import json
+import os
+from pathlib import Path
+
+import boto3
+import pytest
+from click.testing import CliRunner
+from moto import mock_aws
+
+import tracecraft.cli.session as session_mod
+from tracecraft.cli import cli
+from tracecraft.harness.claude_code import _encode_cwd
+
+
+BUCKET = "tc-session-test"
+PROJECT = "demo"
+
+
+# ---------- shared fixtures ----------
+
+
+@pytest.fixture
+def s3_env(monkeypatch):
+    monkeypatch.setenv("AWS_ACCESS_KEY_ID", "testing")
+    monkeypatch.setenv("AWS_SECRET_ACCESS_KEY", "testing")
+    monkeypatch.setenv("AWS_DEFAULT_REGION", "us-east-1")
+    with mock_aws():
+        boto3.client("s3").create_bucket(Bucket=BUCKET)
+        yield
+
+
+@pytest.fixture
+def cli_env(tmp_path, monkeypatch, s3_env):
+    """Wires up:
+      - state dir under tmp_path (isolated from real $HOME)
+      - tracecraft config pointing at moto bucket
+      - fake ~/.claude/projects tree at tmp_path/dot-claude
+      - a project cwd at tmp_path/proj with a starter JSONL session file
+    Returns: (runner, cwd, session_file, session_id)
+    """
+    # 1. state dir
+    state_dir = tmp_path / "mirror-state"
+    monkeypatch.setattr(session_mod, "STATE_DIR", state_dir)
+
+    # 2. tracecraft config -> point ~/.tracecraft/config.json at the moto bucket
+    fake_home = tmp_path / "fake-home"
+    (fake_home / ".tracecraft").mkdir(parents=True)
+    cfg = {
+        "backend": "s3",
+        "endpoint": None,
+        "bucket": BUCKET,
+        "project": PROJECT,
+        "agent_id": "tester",
+        "access_key": "testing",
+        "secret_key": "testing",
+    }
+    (fake_home / ".tracecraft" / "config.json").write_text(json.dumps(cfg))
+    monkeypatch.setenv("HOME", str(fake_home))
+    # tracecraft.config uses Path.home(), which honors $HOME on POSIX; good.
+
+    # 3. fake Claude Code session under cwd
+    dot_claude_root = tmp_path / "dot-claude" / "projects"
+    cwd = tmp_path / "proj"
+    cwd.mkdir()
+    pdir = dot_claude_root / _encode_cwd(cwd)
+    pdir.mkdir(parents=True)
+    session_file = pdir / "sess-abc12345.jsonl"
+    session_file.write_bytes(b"")
+
+    # 4. Point ClaudeCodeHarness root at our fake tree.
+    # The harness reads Path.home()/".claude"/"projects" by default — easier to
+    # monkeypatch the class default by replacing the registry entry with a factory.
+    from tracecraft.harness import REGISTRY
+    from tracecraft.harness.claude_code import ClaudeCodeHarness
+
+    original = REGISTRY["claude-code"]
+
+    def factory():
+        return ClaudeCodeHarness(root=dot_claude_root)
+
+    monkeypatch.setitem(REGISTRY, "claude-code", factory)
+    # get_harness instantiates with no args; we routed it through a callable
+    # that captures `root` via closure, so the protocol contract is preserved.
+
+    runner = CliRunner()
+    yield runner, cwd, session_file, "sess-abc12345"
+
+    # explicit restore; monkeypatch teardown already undoes env, STATE_DIR and the setitem above
+    REGISTRY["claude-code"] = original
+
+
+def _bucket_keys():
+    """Return all keys under PROJECT/ stripped of the project prefix."""
+    client = boto3.client("s3")
+    resp = client.list_objects_v2(Bucket=BUCKET, Prefix=f"{PROJECT}/")
+    return [
+        obj["Key"][len(PROJECT) + 1 :] for obj in resp.get("Contents", [])
+    ]
+
+
+def _get_meta(session_id):
+    client = boto3.client("s3")
+    key = f"{PROJECT}/sessions/claude-code/{session_id}/meta.json"
+    obj = client.get_object(Bucket=BUCKET, Key=key)
+    return json.loads(obj["Body"].read())
+
+
+# ---------- tests ----------
+
+
+def test_mirror_uploads_first_part(cli_env):
+    runner, cwd, sess, sid = cli_env
+    sess.write_bytes(b'{"role":"user","content":"hi"}\n')
+
+    r = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r.exit_code == 0, r.output
+    assert "part-00000-" in r.output
+
+    keys = _bucket_keys()
+    parts = [k for k in keys if "/part-" in k]
+    assert len(parts) == 1
+    assert parts[0].startswith(f"sessions/claude-code/{sid}/part-00000-")
+    assert f"sessions/claude-code/{sid}/meta.json" in keys
+
+
+def test_mirror_second_call_uploads_only_new_bytes(cli_env):
+    runner, cwd, sess, sid = cli_env
+    sess.write_bytes(b'{"role":"user","content":"first"}\n')
+    r1 = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r1.exit_code == 0, r1.output
+
+    # Append more
+    with open(sess, "ab") as f:
+        f.write(b'{"role":"assistant","content":"second"}\n')
+    r2 = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r2.exit_code == 0, r2.output
+    assert "part-00001-" in r2.output
+
+    keys = _bucket_keys()
+    parts = sorted(k for k in keys if "/part-" in k)
+    assert len(parts) == 2
+
+    # Verify the second part contains only the appended line
+    client = boto3.client("s3")
+    p1 = client.get_object(Bucket=BUCKET, Key=f"{PROJECT}/{parts[1]}")["Body"].read()
+    assert p1 == b'{"role":"assistant","content":"second"}\n'
+
+    meta = _get_meta(sid)
+    assert len(meta["parts"]) == 2
+    assert meta["total_source_bytes"] == sess.stat().st_size
+
+
+def test_mirror_skips_when_no_new_bytes(cli_env):
+    runner, cwd, sess, _sid = cli_env
+    sess.write_bytes(b'{"x":1}\n')
+    r1 = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r1.exit_code == 0
+    r2 = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r2.exit_code == 0
+    assert "nothing new" in r2.output
+
+
+def test_seq_derived_from_bucket_survives_state_loss(cli_env, tmp_path):
+    runner, cwd, sess, sid = cli_env
+    sess.write_bytes(b'{"a":1}\n')
+    runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+
+    # Nuke local state, simulating a user who wiped ~/.tracecraft or moved machines
+    for p in session_mod.STATE_DIR.iterdir():
+        p.unlink()
+
+    with open(sess, "ab") as f:
+        f.write(b'{"a":2}\n')
+    r = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r.exit_code == 0, r.output
+    # With no state file, offset resets to 0, so the "new" chunk is the WHOLE
+    # file. That goes up as part-00001 (next seq from bucket LIST), NOT part-00000.
+    assert "part-00001-" in r.output
+
+    keys = sorted(_bucket_keys())
+    parts = [k for k in keys if "/part-" in k]
+    assert len(parts) == 2
+    # Confirm both seqs are present and distinct
+    seqs = sorted(int(k.rsplit("/", 1)[-1].split("-")[1]) for k in parts)
+    assert seqs == [0, 1]
+
+
+def test_redaction_default_on_scrubs_aws_key(cli_env):
+    runner, cwd, sess, sid = cli_env
+    leak = b'{"tool":"bash","output":"export AWS_KEY=AKIAIOSFODNN7EXAMPLE\\n"}\n'
+    sess.write_bytes(leak)
+
+    r = runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    assert r.exit_code == 0, r.output
+
+    client = boto3.client("s3")
+    parts = [k for k in _bucket_keys() if "/part-" in k]
+    body = client.get_object(Bucket=BUCKET, Key=f"{PROJECT}/{parts[0]}")["Body"].read()
+    assert b"AKIAIOSFODNN7EXAMPLE" not in body
+    assert b"[REDACTED:aws_access_key]" in body
+
+    meta = _get_meta(sid)
+    assert meta["redaction_counts"].get("aws_access_key") == 1
+
+
+def test_no_redact_passes_raw_bytes(cli_env):
+    runner, cwd, sess, sid = cli_env
+    leak = b'{"k":"AKIAIOSFODNN7EXAMPLE"}\n'
+    sess.write_bytes(leak)
+
+    r = runner.invoke(
+        cli,
+        ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd), "--no-redact"],
+    )
+    assert r.exit_code == 0, r.output
+
+    client = boto3.client("s3")
+    parts = [k for k in _bucket_keys() if "/part-" in k]
+    body = client.get_object(Bucket=BUCKET, Key=f"{PROJECT}/{parts[0]}")["Body"].read()
+    assert b"AKIAIOSFODNN7EXAMPLE" in body
+    meta = _get_meta(sid)
+    assert meta["redaction_counts"] == {}
+
+
+def test_session_list_shows_uploaded_session(cli_env):
+    runner, cwd, sess, sid = cli_env
+    sess.write_bytes(b'{"x":1}\n')
+    runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+
+    r = runner.invoke(cli, ["session", "list"])
+    assert r.exit_code == 0, r.output
+    assert "claude-code" in r.output
+    assert sid[:8] in r.output
+
+
+def test_session_show_tails_concatenated_parts(cli_env):
+    runner, cwd, sess, sid = cli_env
+    sess.write_bytes(b'line1\n')
+    runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+    with open(sess, "ab") as f:
+        f.write(b"line2\nline3\n")
+    runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+
+    r = runner.invoke(cli, ["session", "show", sid, "--tail", "2"])
+    assert r.exit_code == 0, r.output
+    assert "--- tail ---" in r.output
+    assert "line2" in r.output
+    assert "line3" in r.output
+    assert "line1" not in r.output.split("--- tail ---")[1]  # not in the tail block
+
+
+def test_session_stop_clears_state_and_marks_ended(cli_env):
+    runner, cwd, sess, sid = cli_env
+    sess.write_bytes(b'{"x":1}\n')
+    runner.invoke(cli, ["session", "mirror", "--harness", "claude-code", "--cwd", str(cwd)])
+
+    state_files_before = list(session_mod.STATE_DIR.glob("*.json"))
+    assert state_files_before, "expected state file to exist after mirror"
+
+    r = runner.invoke(cli, ["session", "stop", sid])
+    assert r.exit_code == 0, r.output
+    assert "state_cleared=True" in r.output
+
+    state_files_after = list(session_mod.STATE_DIR.glob("*.json"))
+    assert not state_files_after
+
+    meta = _get_meta(sid)
+    assert meta.get("ended_at") is not None
+
+
+def test_mirror_unknown_session_id_errors_cleanly(cli_env):
+    runner, cwd, _sess, _sid = cli_env
+    r = runner.invoke(
+        cli,
+        [
+            "session",
+            "mirror",
+            "--harness",
+            "claude-code",
+            "--cwd",
+            str(cwd),
+            "--session-id",
+            "does-not-exist",
+        ],
+    )
+    assert r.exit_code != 0
+    assert "No claude-code session found" in r.output
+
+
+def test_show_unknown_session_errors_cleanly(cli_env):
+    runner, _cwd, _sess, _sid = cli_env
+    r = runner.invoke(cli, ["session", "show", "ghost-sid"])
+    assert r.exit_code != 0
+    assert "session not found" in r.output
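
The redaction tests above expect a planted AWS access key to be replaced by `[REDACTED:aws_access_key]` and counted in the session meta. A minimal sketch of that behavior follows. The name `redact_sketch`, the `PATTERNS` table, and the key regex are assumptions for illustration; the real `tracecraft.redact` module may use different patterns and signatures.

```python
from __future__ import annotations

import re

# Assumption: an AWS access key id is "AKIA" followed by 16 uppercase
# letters or digits. Real-world detectors usually cover more prefixes.
PATTERNS = {"aws_access_key": re.compile(rb"AKIA[0-9A-Z]{16}")}


def redact_sketch(data: bytes) -> tuple[bytes, dict[str, int]]:
    """Hypothetical redactor: replace matches and report how many were found per label."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        placeholder = b"[REDACTED:" + label.encode() + b"]"
        data, n = pattern.subn(placeholder, data)
        if n:
            counts[label] = counts.get(label, 0) + n
    return data, counts
```

Returning the counts alongside the scrubbed bytes lets a caller accumulate them across parts, which is the kind of total the `redaction_counts` assertions check.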
diff --git a/sdk/tracecraft/__init__.py b/sdk/tracecraft/__init__.py
index aab3481..ae1ec60 100644
--- a/sdk/tracecraft/__init__.py
+++ b/sdk/tracecraft/__init__.py
@@ -1,3 +1,3 @@
 """Tracecraft — coordination layer for multi-agent AI systems."""
 
-__version__ = "0.1.6"
+__version__ = "0.2.0"
diff --git a/sdk/tracecraft/cli/__init__.py b/sdk/tracecraft/cli/__init__.py
index 1725927..6cdf14f 100644
--- a/sdk/tracecraft/cli/__init__.py
+++ b/sdk/tracecraft/cli/__init__.py
@@ -9,6 +9,7 @@
 from tracecraft.cli.messages import send, inbox
 from tracecraft.cli.steps import claim, complete, step_status, wait_for
 from tracecraft.cli.artifacts import artifact
+from tracecraft.cli.session import session as session_group
 
 BANNER = """
 \033[36m  _                                  __ _
@@ -39,6 +40,7 @@ def cli(ctx):
         click.echo("    step-status    Check step progress")
         click.echo("    wait-for       Block until steps complete")
         click.echo("    artifact       Share files (upload/download/list)")
+        click.echo("    session        Mirror coding-agent traces (mirror/list/show/stop)")
         click.echo()
         click.echo("  \033[2mRun 'tracecraft <command> --help' for details.\033[0m")
         click.echo()
@@ -54,6 +56,7 @@ def cli(ctx):
 cli.add_command(step_status, "step-status")
 cli.add_command(wait_for, "wait-for")
 cli.add_command(artifact)
+cli.add_command(session_group)
 
 
 def main():
diff --git a/sdk/tracecraft/cli/session.py b/sdk/tracecraft/cli/session.py
new file mode 100644
index 0000000..3342cc9
--- /dev/null
+++ b/sdk/tracecraft/cli/session.py
@@ -0,0 +1,382 @@
+"""`tracecraft session` — mirror, list, show, stop.
+
+Commands:
+    mirror   Pull new bytes from a harness session into the bucket (one-shot).
+    list     Browse sessions in the bucket.
+    show     Inspect one session's meta + tail.
+    stop     Clear local state and mark ended_at in meta (placeholder; no daemon yet).
+
+Bucket layout (additive — does not touch existing tracecraft keys):
+
+    <project>/sessions/<harness>/<session-id>/
+        part-NNNNN-<uuid8>.jsonl   ← one per mirror flush, append-disjoint
+        meta.json                  ← cumulative metadata + redaction counts
+
+State files live under ~/.tracecraft/mirror-state/<sid>.json and store the
+byte offset into the source JSONL. Next-seq is derived from a bucket LIST
+on every call, so losing the state file is recoverable.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import re
+import tempfile
+import uuid
+from datetime import datetime, timezone
+from pathlib import Path
+
+import click
+
+from tracecraft.harness import REGISTRY, get_harness
+from tracecraft.redact import merge_counts, redact
+from tracecraft.store import get_store
+
+
+# Driven by the harness REGISTRY so adding an adapter auto-extends the CLI.
+HARNESS_CHOICES = sorted(REGISTRY)
+STATE_DIR = Path.home() / ".tracecraft" / "mirror-state"
+PART_RE = re.compile(r"part-(\d{5})-[a-f0-9]{8}\.jsonl$")
+
+
+# ---------- helpers ----------
+
+
+def _state_path(session_id: str) -> Path:
+    return STATE_DIR / f"{session_id}.json"
+
+
+def _load_state(session_id: str) -> dict:
+    p = _state_path(session_id)
+    if not p.exists():
+        return {}
+    try:
+        return json.loads(p.read_text())
+    except json.JSONDecodeError:
+        # Corrupt state file — treat as missing rather than crash.
+        return {}
+
+
+def _save_state(session_id: str, state: dict) -> None:
+    STATE_DIR.mkdir(parents=True, exist_ok=True)
+    _state_path(session_id).write_text(json.dumps(state, indent=2))
+
+
+def _session_prefix(harness_name: str, session_id: str) -> str:
+    return f"sessions/{harness_name}/{session_id}/"
+
+
+def _next_seq_for(store, harness_name: str, session_id: str) -> int:
+    """Find the next unused part-NNNNN seq by listing the bucket."""
+    prefix = _session_prefix(harness_name, session_id)
+    keys = store.list_keys(prefix)
+    seqs: list[int] = []
+    for k in keys:
+        name = k.rsplit("/", 1)[-1]
+        m = PART_RE.match(name)
+        if m:
+            seqs.append(int(m.group(1)))
+    return (max(seqs) + 1) if seqs else 0
+
+
+def _now_iso() -> str:
+    return datetime.now(timezone.utc).isoformat()
+
+
+# ---------- group ----------
+
+
+@click.group()
+def session():
+    """Mirror, browse, and inspect coding-agent sessions."""
+
+
+# ---------- mirror ----------
+
+
+@session.command("mirror")
+@click.option(
+    "--harness",
+    "harness_name",
+    required=True,
+    type=click.Choice(HARNESS_CHOICES),
+    help="Which coding agent's session format to read.",
+)
+@click.option(
+    "--session-id",
+    default=None,
+    help="Explicit session id. If omitted, picks the most recently modified session for --cwd.",
+)
+@click.option(
+    "--cwd",
+    "cwd_str",
+    default=None,
+    help="Project directory the session ran in (claude-code only). Defaults to $PWD.",
+)
+@click.option("--no-redact", is_flag=True, help="Skip redaction. Use only on fully-trusted buckets.")
+@click.option(
+    "--min-bytes",
+    default=1,
+    type=int,
+    show_default=True,
+    help="Skip upload if fewer than this many new bytes are available.",
+)
+def mirror(harness_name, session_id, cwd_str, no_redact, min_bytes):
+    """Pull new bytes from a harness session into the bucket (one-shot).
+
+    Reads from the last known cursor (or 0 on first run), applies regex
+    redaction unless --no-redact, uploads the chunk as a new part object, and
+    updates the session's meta.json. Idempotent and safe to re-run on a cron.
+    """
+    store, cfg = get_store()
+    harness = get_harness(harness_name)
+    cwd = Path(cwd_str).expanduser().resolve() if cwd_str else Path.cwd()
+
+    # 1. Find the session
+    if session_id:
+        candidates = [s for s in harness.discover(cwd) if s.session_id == session_id]
+        sess = candidates[0] if candidates else None
+    else:
+        sess = harness.active_session(cwd)
+
+    if sess is None:
+        raise click.ClickException(
+            f"No {harness_name} session found"
+            + (f" for id={session_id}" if session_id else f" in cwd={cwd}")
+        )
+
+    state = _load_state(sess.session_id)
+    # `cursor` is an opaque per-harness position: a byte offset for file-backed
+    # harnesses (claude-code, codex, openclaw), a rowid for SQLite (hermes).
+    # The mirror loop never assumes it equals a byte count.
+    cursor = state.get("cursor", 0)
+
+    # Cheap pre-check: is there plausibly anything new? size() is sampled, not
+    # authoritative — read_new() returns the real consumed cursor below.
+    cur_size = harness.size(sess)
+    if cur_size - cursor < min_bytes:
+        click.echo(
+            f"nothing new: session={sess.session_id} cursor={cursor:,} size={cur_size:,}"
+        )
+        return
+
+    # 2. Read everything new since `cursor`, race-free: read_new returns the
+    # bytes AND the exact cursor we consumed up to. For SQLite the bytes are
+    # synthesized JSONL of new rows; raw_len is byte length, not a cursor delta.
+    chunk, next_cursor = harness.read_new(sess, cursor)
+    raw_len = len(chunk)
+
+    # 3. Redact (default on)
+    if no_redact:
+        out_bytes, counts = chunk, {}
+    else:
+        out_bytes, counts = redact(chunk)
+
+    # 4. Upload as next part
+    seq = _next_seq_for(store, harness_name, sess.session_id)
+    uniq = uuid.uuid4().hex[:8]
+    part_key = f"{_session_prefix(harness_name, sess.session_id)}part-{seq:05d}-{uniq}.jsonl"
+
+    with tempfile.NamedTemporaryFile(delete=False, suffix=".jsonl") as tf:
+        tf.write(out_bytes)
+        tf_path = tf.name
+    try:
+        store.put_file(part_key, tf_path)
+    finally:
+        try:
+            os.unlink(tf_path)
+        except OSError:
+            pass
+
+    # 5. Update meta.json (cumulative)
+    meta_key = f"{_session_prefix(harness_name, sess.session_id)}meta.json"
+    existing = store.get_json(meta_key) or {}
+    parts_log = existing.get("parts", [])
+    parts_log.append(
+        {
+            "seq": seq,
+            "uuid": uniq,
+            "cursor_range": [cursor, next_cursor],
+            "source_bytes": raw_len,
+            "uploaded_bytes": len(out_bytes),
+            "redactions": counts,
+            "uploaded_at": _now_iso(),
+        }
+    )
+    meta = {
+        "schema_version": 1,
+        "harness": harness_name,
+        "session_id": sess.session_id,
+        "source_path": str(sess.path),
+        "cwd": str(sess.cwd) if sess.cwd else None,
+        "agent_id": cfg.get("agent_id"),
+        "started_at": existing.get("started_at", _now_iso()),
+        "last_uploaded_at": _now_iso(),
+        "ended_at": existing.get("ended_at"),
+        "total_source_bytes": existing.get("total_source_bytes", 0) + raw_len,
+        "total_uploaded_bytes": existing.get("total_uploaded_bytes", 0) + len(out_bytes),
+        "redaction_counts": merge_counts(existing.get("redaction_counts", {}), counts),
+        "parts": parts_log,
+    }
+    store.put_json(meta_key, meta)
+
+    # 6. Persist local state. Advance the cursor to the position we read up to
+    # (next_cursor), NOT cursor+raw_len — those differ for SQLite where the
+    # cursor is a rowid and raw_len is synthesized-JSONL byte length.
+    _save_state(
+        sess.session_id,
+        {
+            "harness": harness_name,
+            "session_id": sess.session_id,
+            "source_path": str(sess.path),
+            "cursor": next_cursor,
+            "last_uploaded_seq": seq,
+            "last_flush_at": _now_iso(),
+        },
+    )
+
+    click.echo(
+        f"uploaded part-{seq:05d}-{uniq}  "
+        f"source={raw_len:,}B  upload={len(out_bytes):,}B  "
+        f"redactions={counts or 'none'}"
+    )
+
+
+# ---------- list ----------
+
+
+@session.command("list")
+@click.option("--harness", "harness_filter", default=None, help="Filter by harness name.")
+@click.option("--limit", default=20, type=int, show_default=True, help="Max sessions to show.")
+@click.option(
+    "--sort-by",
+    type=click.Choice(["recent", "size"]),
+    default="recent",
+    show_default=True,
+)
+def list_(harness_filter, limit, sort_by):
+    """List sessions in the bucket."""
+    store, _ = get_store()
+    keys = store.list_keys("sessions/")
+    metas: list[dict] = []
+    for k in keys:
+        if not k.endswith("/meta.json"):
+            continue
+        meta = store.get_json(k)
+        if not meta:
+            continue
+        if harness_filter and meta.get("harness") != harness_filter:
+            continue
+        metas.append(meta)
+
+    if sort_by == "recent":
+        metas.sort(key=lambda m: m.get("last_uploaded_at", ""), reverse=True)
+    else:  # size
+        metas.sort(key=lambda m: m.get("total_uploaded_bytes", 0), reverse=True)
+
+    metas = metas[:limit]
+    if not metas:
+        click.echo("(no sessions)")
+        return
+
+    click.echo(f"{'HARNESS':<14} {'SESSION':<16} {'BYTES':>12} {'PARTS':>6} {'LAST UPLOAD':<25}")
+    click.echo("-" * 80)
+    for m in metas:
+        sid = m.get("session_id", "?")
+        short = sid[:8] + ("…" if len(sid) > 8 else "")
+        click.echo(
+            f"{m.get('harness','?'):<14} {short:<16} "
+            f"{m.get('total_uploaded_bytes',0):>12,} "
+            f"{len(m.get('parts', [])):>6} "
+            f"{m.get('last_uploaded_at','-')[:24]:<25}"
+        )
+
+
+# ---------- show ----------
+
+
+@session.command("show")
+@click.argument("session_id")
+@click.option(
+    "--tail",
+    default=0,
+    type=int,
+    help="If >0, also fetch parts and print the last N lines.",
+)
+def show(session_id, tail):
+    """Inspect one session's meta + optionally tail its parts."""
+    store, _ = get_store()
+
+    # Find which harness this session lives under (search every harness folder).
+    all_meta_keys = [k for k in store.list_keys("sessions/") if k.endswith(f"/{session_id}/meta.json")]
+    if not all_meta_keys:
+        raise click.ClickException(f"session not found: {session_id}")
+    meta_key = all_meta_keys[0]
+    meta = store.get_json(meta_key)
+    click.echo(json.dumps(meta, indent=2))
+
+    if tail <= 0:
+        return
+
+    # Fetch all parts (in seq order), concatenate, print last N lines.
+    prefix = meta_key[: -len("meta.json")]
+    part_keys = sorted(
+        k for k in store.list_keys(prefix) if PART_RE.search(k.rsplit("/", 1)[-1])
+    )
+    body = bytearray()
+    for k in part_keys:
+        with tempfile.NamedTemporaryFile(delete=False) as tf:
+            tmp = tf.name
+        try:
+            store.get_file(k, tmp)
+            body.extend(Path(tmp).read_bytes())
+        finally:
+            try:
+                os.unlink(tmp)
+            except OSError:
+                pass
+
+    lines = body.splitlines()
+    click.echo("\n--- tail ---")
+    for line in lines[-tail:]:
+        try:
+            click.echo(line.decode("utf-8", errors="replace"))
+        except Exception:
+            click.echo(repr(line))
+
+
+# ---------- stop ----------
+
+
+@session.command("stop")
+@click.argument("session_id")
+def stop(session_id):
+    """Clear local mirror state for a session and mark ended_at in meta.
+
+    This is a placeholder: when --detach lands later, this command will also
+    kill the background mirror process. For now it just resets local state
+    and records the end time.
+    """
+    state_file = _state_path(session_id)
+    had_state = state_file.exists()
+    if had_state:
+        state_file.unlink()
+
+    # Best-effort: mark ended_at in meta if a meta exists.
+    store, _ = get_store()
+    meta_keys = [
+        k for k in store.list_keys("sessions/") if k.endswith(f"/{session_id}/meta.json")
+    ]
+    marked = False
+    if meta_keys:
+        meta = store.get_json(meta_keys[0]) or {}
+        if meta and not meta.get("ended_at"):
+            meta["ended_at"] = _now_iso()
+            store.put_json(meta_keys[0], meta)
+            marked = True
+
+    click.echo(
+        f"stopped session={session_id}  "
+        f"state_cleared={had_state}  meta_marked_ended={marked}"
+    )
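The next-seq derivation in `_next_seq_for` can be sketched on its own, without a bucket. This is a minimal illustration, assuming the listing is already available as a plain list of key strings; `next_seq` and `example_keys` are hypothetical names used only here, and the keys are made up.

```python
import re

# Same object-name shape as the part files described in the module docstring:
# part-NNNNN-<uuid8>.jsonl
PART_RE = re.compile(r"part-(\d{5})-[a-f0-9]{8}\.jsonl$")


def next_seq(keys: list[str]) -> int:
    """Return the next unused part sequence number for a list of bucket keys."""
    seqs = []
    for key in keys:
        name = key.rsplit("/", 1)[-1]  # drop the prefix, keep the object name
        m = PART_RE.match(name)
        if m:
            seqs.append(int(m.group(1)))
    return max(seqs) + 1 if seqs else 0


# Hypothetical listing for one session prefix; meta.json does not match and is ignored.
example_keys = [
    "sessions/codex/abc/meta.json",
    "sessions/codex/abc/part-00000-1a2b3c4d.jsonl",
    "sessions/codex/abc/part-00001-deadbeef.jsonl",
]
print(next_seq(example_keys))
print(next_seq([]))
```

Because the sequence number is recomputed from the listing rather than read from local state, deleting the state file cannot cause a part-number collision; the random suffix covers two writers that pick the same number at once.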
diff --git a/sdk/tracecraft/harness/__init__.py b/sdk/tracecraft/harness/__init__.py
new file mode 100644
index 0000000..d28f357
--- /dev/null
+++ b/sdk/tracecraft/harness/__init__.py
@@ -0,0 +1,42 @@
+"""Harness adapters — each one knows how to find and read sessions from a
+specific coding agent (Claude Code, Codex, OpenClaw, Pi, OpenCode, Hermes, …).
+
+The base `Harness` protocol is intentionally tiny: discover sessions, pick the
+active one, report the current size, and return the new bytes since a known
+cursor. The mirror loop in `tracecraft session mirror` is harness-agnostic.
+
+Adding a new harness should be a single file under this package plus one entry
+in `REGISTRY` below.
+"""
+
+from .base import Harness, Session
+from .claude_code import ClaudeCodeHarness
+from .codex import CodexHarness
+from .hermes import HermesHarness
+from .openclaw import OpenClawHarness
+
+REGISTRY: dict[str, type[Harness]] = {
+    ClaudeCodeHarness.name: ClaudeCodeHarness,
+    CodexHarness.name: CodexHarness,
+    OpenClawHarness.name: OpenClawHarness,
+    HermesHarness.name: HermesHarness,
+}
+
+
+def get_harness(name: str) -> Harness:
+    if name not in REGISTRY:
+        known = ", ".join(sorted(REGISTRY)) or "(none registered)"
+        raise ValueError(f"unknown harness '{name}'. Known: {known}")
+    return REGISTRY[name]()
+
+
+__all__ = [
+    "Harness",
+    "Session",
+    "ClaudeCodeHarness",
+    "CodexHarness",
+    "OpenClawHarness",
+    "HermesHarness",
+    "REGISTRY",
+    "get_harness",
+]
diff --git a/sdk/tracecraft/harness/base.py b/sdk/tracecraft/harness/base.py
new file mode 100644
index 0000000..132478b
--- /dev/null
+++ b/sdk/tracecraft/harness/base.py
@@ -0,0 +1,98 @@
+"""Harness protocol — the only contract a new coding-agent adapter needs to meet."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Protocol, runtime_checkable
+
+
+@dataclass(frozen=True)
+class Session:
+    """A discovered session: where it lives and what we call it."""
+
+    path: Path
+    session_id: str
+    cwd: Path | None = None  # the project dir this session ran in, if knowable
+
+
+@runtime_checkable
+class Harness(Protocol):
+    """Minimum surface a harness adapter must expose.
+
+    Concrete harnesses are instantiated with no arguments; per-call context
+    (cwd, session, cursor) is passed in. State belongs to the mirror loop,
+    not the harness, so adapters stay stateless and easy to test.
+    """
+
+    name: str
+
+    def discover(self, cwd: Path) -> list[Session]:
+        """Return every session this harness knows about for the given cwd.
+
+        Implementations should be cheap to call repeatedly (the mirror loop
+        polls); avoid network and avoid loading file contents.
+        """
+        ...
+
+    def active_session(self, cwd: Path) -> Session | None:
+        """Return the most recently active session for cwd, or None."""
+        ...
+
+    def read_new(self, session: Session, cursor: int) -> tuple[bytes, int]:
+        """Return (new_jsonl_bytes, new_cursor) for everything after `cursor`.
+
+        `cursor` is an opaque per-harness position. For file-backed harnesses
+        it's a byte offset and the returned cursor is `offset + len(bytes)`.
+        For SQLite (Hermes) it's a rowid and the returned cursor is the max
+        rowid read; the bytes are synthesized JSONL of the new rows.
+
+        Returning the new cursor alongside the bytes makes advancement
+        race-free: the loop advances to exactly what was consumed, never to a
+        separately-sampled `size()` that may have moved between calls.
+        """
+        ...
+
+    def size(self, session: Session) -> int:
+        """Return the current end-of-stream cursor for `session`.
+
+        Used only for the cheap "is there anything new?" pre-check. The
+        authoritative advancement comes from read_new()'s returned cursor.
+        For file-backed harnesses this is `path.stat().st_size`; for SQLite
+        it's the current max rowid.
+        """
+        ...
+
+
+class FileTailHarness:
+    """Shared implementation for harnesses backed by an append-only file.
+
+    Concrete file harnesses (claude-code, codex, openclaw) inherit this and
+    define only `name`, `__init__`, and `discover` — the tail mechanics
+    (read_new / size / active_session) are identical and live here once.
+    Hermes does NOT inherit this; SQLite has different cursor semantics.
+    """
+
+    def discover(self, cwd: Path) -> list[Session]:  # pragma: no cover - overridden
+        raise NotImplementedError
+
+    def active_session(self, cwd: Path) -> Session | None:
+        sessions = self.discover(cwd)
+        if not sessions:
+            return None
+        return max(sessions, key=lambda s: s.path.stat().st_mtime)
+
+    def read_new(self, session: Session, cursor: int) -> tuple[bytes, int]:
+        data = self.read_new_bytes(session, cursor)
+        return data, cursor + len(data)
+
+    def read_new_bytes(self, session: Session, offset: int) -> bytes:
+        """Bytes-only tail from `offset` to EOF. Internal helper for read_new."""
+        if offset < 0:
+            raise ValueError(f"offset must be non-negative, got {offset}")
+        with open(session.path, "rb") as f:
+            f.seek(offset)
+            return f.read()
+
+    def size(self, session: Session) -> int:
+        return session.path.stat().st_size
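The `(bytes, new_cursor)` contract for file-backed harnesses can be tried in isolation. The following is a rough sketch, assuming a standalone `read_new` function that takes a path instead of a `Session`; the temporary file and the JSON lines are invented for the demonstration.

```python
import os
import tempfile


def read_new(path: str, cursor: int) -> tuple[bytes, int]:
    """Return (bytes after cursor, cursor + len(bytes)) for an append-only file."""
    if cursor < 0:
        raise ValueError(f"cursor must be non-negative, got {cursor}")
    with open(path, "rb") as f:
        f.seek(cursor)
        data = f.read()
    return data, cursor + len(data)


# Simulate a session file that grows between two mirror runs.
fd, demo_path = tempfile.mkstemp(suffix=".jsonl")
os.close(fd)
try:
    with open(demo_path, "ab") as f:
        f.write(b'{"n": 1}\n')
    first, cursor = read_new(demo_path, 0)

    with open(demo_path, "ab") as f:
        f.write(b'{"n": 2}\n')
    second, cursor = read_new(demo_path, cursor)

    # Nothing was appended, so the cursor must not move.
    third, cursor_after_idle = read_new(demo_path, cursor)
finally:
    os.unlink(demo_path)
```

The caller only ever advances to the cursor that came back with the bytes, so a write landing between a size check and the read cannot be skipped or double-counted.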
diff --git a/sdk/tracecraft/harness/claude_code.py b/sdk/tracecraft/harness/claude_code.py
new file mode 100644
index 0000000..bd3bbd1
--- /dev/null
+++ b/sdk/tracecraft/harness/claude_code.py
@@ -0,0 +1,45 @@
+"""Claude Code adapter.
+
+Claude Code persists every session under
+  ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl
+
+`<encoded-cwd>` replaces path separators with hyphens and prefixes a leading
+hyphen, e.g. `/Users/x/proj` -> `-Users-x-proj`. We mirror that encoding here
+so we can find the right project directory for the user's current cwd.
+"""
+
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+from .base import FileTailHarness, Session
+
+
+def _encode_cwd(cwd: Path) -> str:
+    """Encode an absolute path the way Claude Code does for its projects dir.
+
+    Uses the resolved absolute path with every path separator swapped for
+    `-`, so the leading separator becomes a leading `-` (`/foo/bar` -> `-foo-bar`).
+    """
+    resolved = cwd.expanduser().resolve()
+    return str(resolved).replace(os.sep, "-")
+
+
+class ClaudeCodeHarness(FileTailHarness):
+    name = "claude-code"
+
+    def __init__(self, root: Path | None = None) -> None:
+        self.root = root or (Path.home() / ".claude" / "projects")
+
+    def _project_dir(self, cwd: Path) -> Path:
+        return self.root / _encode_cwd(cwd)
+
+    def discover(self, cwd: Path) -> list[Session]:
+        pdir = self._project_dir(cwd)
+        if not pdir.is_dir():
+            return []
+        return [
+            Session(path=jsonl, session_id=jsonl.stem, cwd=cwd)
+            for jsonl in pdir.glob("*.jsonl")
+        ]
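The directory-name encoding described in the module docstring can be checked with a small example. This sketch covers only the separator replacement stated there and assumes POSIX-style paths; it skips the `resolve()` step, and `encode_cwd` is a hypothetical standalone name.

```python
from pathlib import PurePosixPath


def encode_cwd(cwd: PurePosixPath) -> str:
    """Swap every '/' for '-', so an absolute path gains a leading hyphen."""
    return str(cwd).replace("/", "-")


encoded = encode_cwd(PurePosixPath("/Users/x/proj"))
print(encoded)
```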
diff --git a/sdk/tracecraft/harness/codex.py b/sdk/tracecraft/harness/codex.py
new file mode 100644
index 0000000..a5bd85e
--- /dev/null
+++ b/sdk/tracecraft/harness/codex.py
@@ -0,0 +1,43 @@
+"""Codex CLI adapter.
+
+Codex writes session rollouts under
+  ~/.codex/sessions/<YYYY>/<MM>/<DD>/rollout-<YYYY-MM-DDThh-mm-ss>-<id>.jsonl
+
+Codex doesn't shard by cwd, so `discover` globs the whole sessions tree and
+returns every rollout it finds; there is no date scoping yet. The caller
+(the mirror loop) is responsible for picking which one to follow.
+"""
+
+from __future__ import annotations
+
+import re
+from pathlib import Path
+
+from .base import FileTailHarness, Session
+
+
+_ROLLOUT_RE = re.compile(r"rollout-\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-(?P<id>[A-Za-z0-9_-]+)\.jsonl$")
+
+
+class CodexHarness(FileTailHarness):
+    name = "codex"
+
+    def __init__(self, root: Path | None = None) -> None:
+        self.root = root or (Path.home() / ".codex" / "sessions")
+
+    def _all_rollouts(self) -> list[Path]:
+        if not self.root.is_dir():
+            return []
+        # YYYY/MM/DD/rollout-*.jsonl
+        return list(self.root.glob("*/*/*/rollout-*.jsonl"))
+
+    def discover(self, cwd: Path) -> list[Session]:
+        # Codex sessions are not partitioned by cwd; return everything we see.
+        # The mirror loop / caller decides which session to actually follow.
+        del cwd
+        sessions: list[Session] = []
+        for path in self._all_rollouts():
+            m = _ROLLOUT_RE.search(path.name)
+            session_id = m.group("id") if m else path.stem
+            sessions.append(Session(path=path, session_id=session_id))
+        return sessions
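How a session id falls out of a rollout filename can be shown with the same regular expression. This is a sketch only: `session_id_from_name` is a hypothetical helper and the filenames are invented.

```python
import re
from pathlib import PurePath

ROLLOUT_RE = re.compile(
    r"rollout-\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-(?P<id>[A-Za-z0-9_-]+)\.jsonl$"
)


def session_id_from_name(filename: str) -> str:
    """Use the id after the timestamp; fall back to the stem for odd names."""
    m = ROLLOUT_RE.search(filename)
    return m.group("id") if m else PurePath(filename).stem


rollout_id = session_id_from_name("rollout-2026-05-01T12-30-00-0f9d2c1e-aaaa-bbbb.jsonl")
fallback_id = session_id_from_name("notes.jsonl")
print(rollout_id, fallback_id)
```

The timestamp part of the pattern is fixed-width, so hyphens inside the id itself end up in the `id` group rather than being mistaken for timestamp separators.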
diff --git a/sdk/tracecraft/harness/hermes.py b/sdk/tracecraft/harness/hermes.py
new file mode 100644
index 0000000..671d0d0
--- /dev/null
+++ b/sdk/tracecraft/harness/hermes.py
@@ -0,0 +1,161 @@
+"""Hermes Agent adapter (Nous Research).
+
+Hermes moved off per-session JSONL to a single SQLite database:
+  ~/.hermes/state.db   (or $HERMES_HOME/state.db), WAL mode
+
+So this adapter does NOT tail a file. It opens the DB read-only and reads new
+rows from the `messages` table, synthesizing one JSON line per message. The
+mirror loop treats the synthesized bytes exactly like a file tail.
+
+Cursor semantics (the reason base.Harness decoupled cursor from byte count):
+  cursor == the highest `messages.id` already mirrored.
+  messages.id is INTEGER PRIMARY KEY AUTOINCREMENT — strictly increasing,
+  never reused even after Hermes prunes old sessions, so it's a safe
+  high-water mark. We read `WHERE session_id=? AND id>:cursor ORDER BY id`.
+
+Verified against hermes_state.py (github.com/NousResearch/hermes-agent) May 2026:
+  - sessions(id TEXT PK, source, model, started_at REAL, ended_at, title, ...)
+  - messages(id INTEGER PK AUTOINCREMENT, session_id TEXT FK, role, content,
+             tool_calls, tool_name, timestamp REAL, token_count, ...)
+  - content may be sentinel-prefixed '\\x00json:' for multimodal payloads.
+  - schema_version table; columns can be added across versions, so we read
+    whatever columns exist rather than a hardcoded list.
+
+Safety: open with mode=ro (NOT immutable — the DB is live). A short
+busy_timeout rides out the brief moments Hermes holds the write lock. We never
+write or checkpoint.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import sqlite3
+from pathlib import Path
+
+from .base import Session
+
+# Sentinel Hermes uses to mark a JSON-encoded (multimodal) content payload.
+_CONTENT_JSON_PREFIX = "\x00json:"
+_BUSY_TIMEOUT_MS = 4000
+
+
+def _resolve_db_path() -> Path:
+    home = os.environ.get("HERMES_HOME")
+    base = Path(home) if home else (Path.home() / ".hermes")
+    return base / "state.db"
+
+
+def _connect_ro(db_path: Path) -> sqlite3.Connection:
+    """Read-only connection safe to use against a live WAL database."""
+    uri = f"file:{db_path}?mode=ro"
+    conn = sqlite3.connect(uri, uri=True, timeout=_BUSY_TIMEOUT_MS / 1000)
+    conn.row_factory = sqlite3.Row
+    conn.execute(f"PRAGMA busy_timeout={_BUSY_TIMEOUT_MS}")
+    return conn
+
+
+def _decode_content(value):
+    """Hermes stores multimodal content as '\\x00json:<json>'; scalars as-is."""
+    if isinstance(value, str) and value.startswith(_CONTENT_JSON_PREFIX):
+        try:
+            return json.loads(value[len(_CONTENT_JSON_PREFIX):])
+        except json.JSONDecodeError:
+            return value
+    return value
+
+
+class HermesHarness:
+    name = "hermes"
+
+    def __init__(self, db_path: Path | None = None) -> None:
+        self.db_path = db_path or _resolve_db_path()
+
+    # ---- discovery ----
+
+    def discover(self, cwd: Path) -> list[Session]:
+        # Hermes sessions aren't keyed by cwd. We surface every session row;
+        # session.path is the DB (shared by all sessions), session_id is the
+        # sessions.id TEXT value.
+        del cwd
+        if not self.db_path.exists():
+            return []
+        conn = _connect_ro(self.db_path)
+        try:
+            rows = conn.execute(
+                "SELECT id FROM sessions ORDER BY started_at DESC"
+            ).fetchall()
+        except sqlite3.Error:
+            return []
+        finally:
+            conn.close()
+        return [Session(path=self.db_path, session_id=r["id"]) for r in rows]
+
+    def active_session(self, cwd: Path) -> Session | None:
+        if not self.db_path.exists():
+            return None
+        conn = _connect_ro(self.db_path)
+        try:
+            row = conn.execute(
+                # Most-recently-active = the session owning the highest message id.
+                "SELECT session_id FROM messages ORDER BY id DESC LIMIT 1"
+            ).fetchone()
+        except sqlite3.Error:
+            row = None
+        finally:
+            conn.close()
+        if not row:
+            # Fall back to newest session even if it has no messages yet.
+            sessions = self.discover(cwd)
+            return sessions[0] if sessions else None
+        return Session(path=self.db_path, session_id=row["session_id"])
+
+    # ---- read ----
+
+    def size(self, session: Session) -> int:
+        """Current max messages.id for this session — the cursor high-water."""
+        if not self.db_path.exists():
+            return 0
+        conn = _connect_ro(self.db_path)
+        try:
+            row = conn.execute(
+                "SELECT MAX(id) AS m FROM messages WHERE session_id = ?",
+                (session.session_id,),
+            ).fetchone()
+        except sqlite3.Error:
+            return 0
+        finally:
+            conn.close()
+        return int(row["m"]) if row and row["m"] is not None else 0
+
+    def read_new(self, session: Session, cursor: int) -> tuple[bytes, int]:
+        """Synthesize JSONL for messages with id > cursor; return (bytes, new_cursor)."""
+        if cursor < 0:
+            raise ValueError(f"cursor must be non-negative, got {cursor}")
+        if not self.db_path.exists():
+            return b"", cursor
+        conn = _connect_ro(self.db_path)
+        try:
+            rows = conn.execute(
+                "SELECT * FROM messages WHERE session_id = ? AND id > ? ORDER BY id ASC",
+                (session.session_id, cursor),
+            ).fetchall()
+        except sqlite3.Error:
+            return b"", cursor
+        finally:
+            conn.close()
+
+        lines: list[str] = []
+        max_id = cursor
+        for r in rows:
+            d = dict(r)
+            if "content" in d:
+                d["content"] = _decode_content(d["content"])
+            # tool_calls / reasoning_details are JSON strings; leave as-is —
+            # consumers can parse. We just emit the row faithfully.
+            lines.append(json.dumps(d, default=str, ensure_ascii=False))
+            if d.get("id") is not None:
+                max_id = max(max_id, int(d["id"]))
+
+        blob = ("\n".join(lines) + "\n").encode("utf-8") if lines else b""
+        return blob, max_id
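The rowid-as-cursor idea can be exercised against an in-memory database. The sketch below assumes a deliberately simplified `messages` table (four columns, not the real Hermes schema) and a hypothetical `read_new_rows` helper; it only illustrates that the cursor tracks ids while the returned bytes are synthesized JSONL.

```python
import json
import sqlite3


def read_new_rows(conn: sqlite3.Connection, session_id: str, cursor: int) -> tuple[bytes, int]:
    """Return (JSONL for rows with id > cursor, highest id read or the old cursor)."""
    rows = conn.execute(
        "SELECT * FROM messages WHERE session_id = ? AND id > ? ORDER BY id ASC",
        (session_id, cursor),
    ).fetchall()
    lines = [json.dumps(dict(r), ensure_ascii=False) for r in rows]
    new_cursor = max([cursor] + [r["id"] for r in rows])
    blob = ("\n".join(lines) + "\n").encode("utf-8") if lines else b""
    return blob, new_cursor


# Simplified stand-in table; two sessions interleave their messages.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, role TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
    [("s1", "user", "hi"), ("s2", "user", "other session"), ("s1", "assistant", "hello")],
)

blob, cursor = read_new_rows(conn, "s1", 0)
again, cursor_again = read_new_rows(conn, "s1", cursor)
```

Here the cursor jumps from 0 to 3 even though only two rows belong to `s1`, which is why the mirror loop cannot treat the cursor delta as a byte or row count.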
diff --git a/sdk/tracecraft/harness/openclaw.py b/sdk/tracecraft/harness/openclaw.py
new file mode 100644
index 0000000..e9b3129
--- /dev/null
+++ b/sdk/tracecraft/harness/openclaw.py
@@ -0,0 +1,76 @@
+"""OpenClaw adapter.
+
+OpenClaw persists session transcripts as append-only JSONL under
+  <stateDir>/agents/<agentId>/sessions/<sessionId>.jsonl
+
+where <stateDir> resolves (highest precedence first):
+  OPENCLAW_STATE_DIR  →  OPENCLAW_HOME  →  ~/.openclaw
+(--dev and --profile <name> map to ~/.openclaw-dev / ~/.openclaw-<name>; a
+caller using those can pass root= explicitly.)
+
+Verified against OpenClaw source (src/config/sessions/paths.ts) May 2026.
+
+Files in the sessions dir that are NOT transcripts and must be skipped:
+  - sessions.json          mutable session index, rewritten atomically
+  - *.tmp                  half-written atomic-store staging files
+
+Topic sessions are named  <sessionId>-topic-<topicId>.jsonl  and compaction
+successors  <sessionId>.checkpoint.<uuid>.jsonl  — both are real transcripts
+and we surface them as-is. Session ids are only unique within an agentId, so
+the stable key we expose is  <agentId>__<filename-stem>.
+"""
+
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+from .base import FileTailHarness, Session
+
+
+def _resolve_state_dir() -> Path:
+    """OpenClaw state dir, honoring its env-var precedence."""
+    if os.environ.get("OPENCLAW_STATE_DIR"):
+        return Path(os.environ["OPENCLAW_STATE_DIR"])
+    if os.environ.get("OPENCLAW_HOME"):
+        return Path(os.environ["OPENCLAW_HOME"])
+    return Path.home() / ".openclaw"
+
+
+class OpenClawHarness(FileTailHarness):
+    name = "openclaw"
+
+    def __init__(self, root: Path | None = None) -> None:
+        # `root` is the agents dir. Default derives from the active state dir.
+        self.root = root or (_resolve_state_dir() / "agents")
+
+    def _stable_id(self, path: Path) -> str:
+        """<agentId>__<stem> — agentId is the dir between 'agents/' and 'sessions/'.
+
+        Joined with '__' (not '/') so the id is safe as a single bucket-key
+        path segment; OpenClaw sessionIds are only unique within an agentId,
+        so the agentId prefix disambiguates across agents.
+        """
+        stem = path.stem  # filename without .jsonl
+        # path = <root>/<agentId>/sessions/<file>.jsonl
+        try:
+            agent_id = path.parent.parent.name
+        except Exception:
+            agent_id = "unknown"
+        return f"{agent_id}__{stem}"
+
+    def _all_sessions(self) -> list[Path]:
+        if not self.root.is_dir():
+            return []
+        out: list[Path] = []
+        for p in self.root.glob("*/sessions/*.jsonl"):
+            name = p.name
+            if name == "sessions.json" or name.endswith(".tmp"):
+                continue
+            out.append(p)
+        return out
+
+    def discover(self, cwd: Path) -> list[Session]:
+        # OpenClaw shards by agentId, not cwd — cwd is ignored.
+        del cwd
+        return [Session(path=p, session_id=self._stable_id(p)) for p in self._all_sessions()]
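The effect of the agentId prefix is easiest to see with two agents that reuse a session id. A small sketch, assuming POSIX-style paths under a made-up `/state/agents` root; `stable_id` is a hypothetical standalone version of `_stable_id`.

```python
from pathlib import PurePosixPath


def stable_id(path: PurePosixPath) -> str:
    """<agentId>__<stem>, where agentId is the directory above 'sessions/'."""
    agent_id = path.parent.parent.name
    return f"{agent_id}__{path.stem}"


# Hypothetical layout: the same session id appears under two agents.
a = stable_id(PurePosixPath("/state/agents/alpha/sessions/abc123.jsonl"))
b = stable_id(PurePosixPath("/state/agents/beta/sessions/abc123.jsonl"))
topic = stable_id(PurePosixPath("/state/agents/alpha/sessions/abc123-topic-7.jsonl"))
print(a, b, topic)
```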
diff --git a/sdk/tracecraft/redact.py b/sdk/tracecraft/redact.py
new file mode 100644
index 0000000..c670a05
--- /dev/null
+++ b/sdk/tracecraft/redact.py
@@ -0,0 +1,52 @@
+"""Redaction v0 — regex denylist applied before bytes leave the machine.
+
+Goal: catch the obvious shapes of credentials and tokens in trace data so that
+users mirroring sessions to a bucket don't accidentally publish keys. This is
+NOT a real DLP system — it cannot catch arbitrary secrets, custom internal
+token formats, or business-logic data. It catches well-known token shapes.
+
+Every redaction is *counted*, never silent. Counts go into meta.json so users
+can audit what was scrubbed.
+"""
+
+from __future__ import annotations
+
+import re
+from typing import Final
+
+# Each entry is (name, pattern); name is the key used in meta.json's redaction counts.
+# Patterns are intentionally strict: prefer a false negative over a false positive
+# (we'd rather miss a token than mangle source code that happens to look like one).
+_PATTERNS: Final[list[tuple[str, re.Pattern[bytes]]]] = [
+    ("aws_access_key", re.compile(rb"AKIA[0-9A-Z]{16}")),
+    ("aws_session_token", re.compile(rb"ASIA[0-9A-Z]{16}")),  # ASIA = temporary (STS) access key id, not the token itself
+    ("anthropic_key", re.compile(rb"sk-ant-[A-Za-z0-9_-]{20,}")),
+    ("openai_key", re.compile(rb"sk-(?:proj-|svcacct-)?[A-Za-z0-9]{20,}")),
+    ("hf_token", re.compile(rb"hf_[A-Za-z0-9]{30,}")),
+    ("github_pat", re.compile(rb"gh[pousr]_[A-Za-z0-9]{30,}")),
+    ("slack_token", re.compile(rb"xox[abprs]-[A-Za-z0-9-]{10,}")),
+    ("bearer_token", re.compile(rb"Bearer\s+[A-Za-z0-9_.\-]{20,}")),
+]
+
+
+def redact(blob: bytes) -> tuple[bytes, dict[str, int]]:
+    """Return (redacted_bytes, counts).
+
+    counts maps pattern_name -> number of replacements made. Patterns not
+    matched are absent from the dict (no zero entries).
+    """
+    counts: dict[str, int] = {}
+    out = blob
+    for name, pat in _PATTERNS:
+        out, n = pat.subn(f"[REDACTED:{name}]".encode(), out)
+        if n:
+            counts[name] = n
+    return out, counts
+
+
+def merge_counts(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
+    """Sum two redaction-count dicts. Used to accumulate across parts in meta.json."""
+    out = dict(a)
+    for k, v in b.items():
+        out[k] = out.get(k, 0) + v
+    return out
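Counted redaction and cross-part accumulation can be demonstrated end to end with a reduced pattern list. This is a sketch under stated assumptions: only two of the patterns are reproduced, and the key-shaped strings are synthetic values built to match the regex, not real credentials.

```python
import re

# Reduced denylist for illustration; the real module carries more patterns.
PATTERNS = [
    ("aws_access_key", re.compile(rb"AKIA[0-9A-Z]{16}")),
    ("hf_token", re.compile(rb"hf_[A-Za-z0-9]{30,}")),
]


def redact(blob: bytes) -> tuple[bytes, dict[str, int]]:
    """Replace each match with a named marker and count replacements per pattern."""
    counts: dict[str, int] = {}
    out = blob
    for name, pat in PATTERNS:
        out, n = pat.subn(f"[REDACTED:{name}]".encode(), out)
        if n:
            counts[name] = n
    return out, counts


def merge_counts(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Sum two count dicts without mutating either input."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0) + v
    return out


# Synthetic key-shaped strings: 'AKIA' followed by 16 uppercase letters.
fake_a = b"AKIA" + b"A" * 16
fake_b = b"AKIA" + b"B" * 16
part1 = b'{"env": "AWS_ACCESS_KEY_ID=' + fake_a + b'"}\n'
part2 = b'{"a": "' + fake_a + b'", "b": "' + fake_b + b'"}\n'

out1, c1 = redact(part1)
out2, c2 = redact(part2)
total = merge_counts(c1, c2)
print(out1, total)
```

Patterns that never match stay out of the counts entirely, so a key that is present in `redaction_counts` always means at least one replacement happened.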