agnt-diff

pg_stat for coding agents.

Import local Claude Code and Codex sessions, find wasted loops and failing command patterns, then mark workflow changes and compare whether the next sessions got better.

Local-first: no server, no upload, no account. All data stays in ~/.agnt-diff/.

Install

uv tool install --from git+https://github.com/albedoweb/agnt-diff.git agnt-diff

Requires Python 3.12+ and git. Tested on macOS 14+ and Linux x86_64 with bash/zsh. Windows is not yet validated; PRs welcome.

Quickstart

Import your recent sessions and get a workload overview. Run from inside your project repo:

cd ~/path/to/your/project

# Import last 50 sessions for this project
agnt-diff import claude-code --last 50 --cwd .
agnt-diff import codex --last 50 --cwd .

# View workload stats
agnt-diff stats --cwd .

# Diagnose error patterns
agnt-diff diagnose --cwd .

# See top failing commands
agnt-diff commands --cwd . --only-errors

# Get specific improvement recommendations
agnt-diff recommend --cwd . --top 5

If you don't specify --cwd, import defaults to the current git root. Use --all-cwds to import across all projects on the machine.

Example Output

Workload /Users/me/project   sessions=42

  Median active duration: 18m 42s
  Median wall duration:   41m 09s
  Visible tokens:         820k
  Cache tokens:           3100k
  Est cost:               $24.18
  Tool error rate:        12.4%
  Bash error rate:        18.1%

  Tools: Bash 54%, Read 22%, Edit 12%

Failing Bash subcommands:
  git status       12
  pytest           8
  uv               5

Error messages:
  not a git repository
  permission denied/sandbox
  ModuleNotFoundError
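
For intuition, a tally like the "Failing Bash subcommands" list above could be produced roughly as follows. This is a minimal sketch, not agnt-diff's actual code: the `(command, exit_code)` input shape, the `subcommand_key` and `top_failing` helpers, and the rule that only `git` keeps its first argument are all assumptions made for illustration.

```python
import shlex
from collections import Counter

# Assumption: only these tools are grouped together with their first argument.
MULTIWORD_TOOLS = {"git"}


def subcommand_key(command):
    """Reduce a shell command line to a short grouping key."""
    parts = shlex.split(command)
    if not parts:
        return ""
    has_subcommand = len(parts) > 1 and not parts[1].startswith("-")
    if parts[0] in MULTIWORD_TOOLS and has_subcommand:
        return f"{parts[0]} {parts[1]}"
    return parts[0]


def top_failing(results, n=3):
    """Count non-zero exits per key; results holds (command, exit_code) pairs."""
    counts = Counter(subcommand_key(cmd) for cmd, code in results if code != 0)
    return counts.most_common(n)
```

Note that `shlex.split` raises `ValueError` on unbalanced quotes, so real logs would need a fallback for malformed command strings.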

The Mark/Compare Loop

This is what sets agnt-diff apart: mark a workflow change, then measure whether it helped.

Before changing your agent instructions (CLAUDE.md, AGENTS.md):

agnt-diff mark "Add git preflight rule" --label git --cwd .

After 10-20 more sessions:

agnt-diff compare --label git --window 20 --cwd .

The marker records the point of change. The compare command takes sessions before and after that point and shows whether normalized stats moved: turns, tokens, duration, error rate, cost, and test coverage.
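
The before/after idea can be sketched in a few lines of Python. This is an illustration of the concept only, not how agnt-diff implements compare: the `Session` record, its fields, the `compare_around_marker` helper, and the choice to take up to `window` sessions on each side of the marker are all assumptions.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Session:
    started_at: float  # Unix timestamp (hypothetical field)
    error_rate: float  # fraction of tool calls that failed (hypothetical field)


def compare_around_marker(sessions, marker_ts, window):
    """Return (median before, median after) error rate around a marker."""
    ordered = sorted(sessions, key=lambda s: s.started_at)
    # Keep the `window` sessions closest to the marker on each side.
    before = [s for s in ordered if s.started_at < marker_ts][-window:]
    after = [s for s in ordered if s.started_at >= marker_ts][:window]
    if not before or not after:
        return None  # nothing to compare on one side of the marker
    return (
        median(s.error_rate for s in before),
        median(s.error_rate for s in after),
    )
```

Medians are used here because a single runaway session would otherwise dominate a small window; the same split works for any per-session metric such as tokens or duration.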

Commands

Command	Purpose
`agnt-diff workloads`	List all known projects
`agnt-diff import <source>`	Import Claude Code or Codex sessions
`agnt-diff stats`	Workload aggregate stats
`agnt-diff diagnose`	Error cluster diagnosis
`agnt-diff commands`	Top commands by source across sessions
`agnt-diff show <run_id>`	Single-run timeline drill-down
`agnt-diff mark`	Record a workflow change marker
`agnt-diff compare`	Before/after comparison
`agnt-diff recommend`	Agent instruction improvement ideas

Developer commands (AGNT_DEV=1): replay, serve, schema, run.

Privacy

agnt-diff reads local session files and writes normalized summaries under ~/.agnt-diff/runs. It does not upload data: there is no server, no telemetry, and no analytics. The only network access would come from explicitly running recommend --llm <provider>, which is not in the current release.

Agent session logs can contain prompts, file paths, command output, and code snippets. Do not publish your ~/.agnt-diff directory. Use --anonymize-paths to redact home directory paths when sharing output:

agnt-diff stats --cwd . --json --anonymize-paths
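
If you need to scrub text that did not come through --anonymize-paths (for example a pasted log excerpt), the same kind of redaction can be approximated by hand. The sketch below is an assumption about what home-path redaction might look like, not agnt-diff's implementation; `anonymize_paths` is a hypothetical helper.

```python
import re
from pathlib import Path


def anonymize_paths(text, home=None):
    """Replace occurrences of the home directory in text with '~'."""
    home = str(home or Path.home()).rstrip("/")
    if not home:
        return text  # home was "/" or empty; nothing sensible to redact
    # The lookahead avoids rewriting sibling paths such as /Users/merlin
    # when the home directory is /Users/me.
    pattern = re.escape(home) + r"(?![\w.-])"
    return re.sub(pattern, "~", text)
```

This only hides the home directory prefix. Project names, prompts, and command output remain visible, so review anything you share.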

Cost Disclaimer

Est cost is calculated from token usage and bundled pricing tables. It is useful for trend analysis, not official billing reconciliation. Cache token estimates vary by source.
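
As a rough illustration of how a token-based estimate works, the sketch below multiplies token counts by per-million-token prices. The model name, the prices, and the `estimate_cost` helper are placeholders invented for this example; they are not the bundled pricing tables and not real provider prices.

```python
# Hypothetical prices in USD per million tokens; real pricing tables differ.
PRICES = {
    "example-model": {"input": 3.00, "output": 15.00, "cache_read": 0.30},
}


def estimate_cost(model, input_tokens, output_tokens, cache_read_tokens=0):
    """Estimate the USD cost of one session from its token counts."""
    price = PRICES[model]
    total = (
        input_tokens * price["input"]
        + output_tokens * price["output"]
        + cache_read_tokens * price["cache_read"]
    )
    return total / 1_000_000
```

Because cache tokens are usually priced far below regular input tokens, small differences in how a source reports them can shift the estimate noticeably, which is one reason to read the number as a trend rather than a bill.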

Supported Sources

Source	Status	Notes
Claude Code	dogfood	Reads `~/.claude/projects/.../*.jsonl`
Codex CLI	dogfood	Reads `~/.codex/sessions/.../*.jsonl`
Gemini CLI	planned	Not implemented
Cursor CLI	planned	Not implemented
OpenCode	planned	Not implemented

Troubleshooting

No sessions found:

agnt-diff sessions list claude-code
agnt-diff sessions list codex

State directory permission error (point AGNT_DIFF_HOME at a writable directory):

AGNT_DIFF_HOME=/tmp/agnt-diff agnt-diff import codex --latest --cwd .

Accidentally imported sessions from another project:

agnt-diff workloads                      # see all known projects
agnt-diff stats --cwd <your-project>     # scope to one project

Small sample:

If you have fewer than ~10 sessions, compare and recommend output is directional, not statistically stable.

License

Apache-2.0. See LICENSE.
