Agent health diagnostics: drift report, context budget, and activity timeline per agent #385

@markhayden

Summary

Tooling to help a user understand their agents and keep them healthy: a guided way to see what an agent is actually doing, how its projected content has drifted from its package and managed sources, and how much context it loads in a fresh session. This is brainstorm-level; details are TBD.

(Filed after a debugging session in which it was hard to tell why an agent ignored a tool. Its AGENTS.md body was stale seed-once content, its managed block had un-applied drift, and a stale workflow-skill shadowed the package's version. None of this surfaced anywhere obvious.)

Possible capabilities

  • Drift report per agent: compare the agent's live runtime files (workspace AGENTS.md/SOUL.md/skills) against (a) what its package would project and (b) what bakin doctor managed blocks expect — flag seed-once staleness, un-applied managed-block drift, and shadowed/colliding skills.
  • Context budget per fresh session: estimate/measure how many tokens the agent's seeded context (SOUL/IDENTITY/AGENTS/TOOLS + enabled lessons + managed blocks) consumes at session start, and flag bloat.
  • "What is this agent doing" timeline: recent tasks, tool calls (from the audit/usage feed), and outcomes for an agent, in one view — e.g. "Pixel ran N image tasks; M used bakin_exec_images_generate, K used native".
  • Tool-usage sanity: surface when an agent is bypassing the preferred tool path (e.g. native image gen vs bakin_exec_images_generate).
  • Lesson/skill effectiveness: which lessons were retrieved, scores, whether they're enabled.
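
The drift-report bullet above could be sketched roughly as follows. This is only an illustration under stated assumptions: the marker strings, the `Finding` shape, and the functions `splitManaged`, `checkFile`, and `findShadowedSkills` are all hypothetical and do not come from the bakin codebase. It also assumes that a body differing from the package projection indicates seed-once staleness, which a real check would need to confirm against the projection records.

```typescript
// Hypothetical sketch: none of these names come from the bakin codebase.
type DriftKind = "seed-once-stale" | "managed-block-drift" | "shadowed-skill";

interface Finding {
  kind: DriftKind;
  subject: string; // file path or skill name
}

// Assumed marker strings; the real managed-block delimiters may differ.
const MANAGED_START = "<!-- managed:start -->";
const MANAGED_END = "<!-- managed:end -->";

// Split a file into its free-form body and its managed block (if any).
function splitManaged(text: string): { body: string; managed: string | null } {
  const start = text.indexOf(MANAGED_START);
  const end = text.indexOf(MANAGED_END);
  if (start === -1 || end === -1 || end < start) {
    return { body: text.trim(), managed: null };
  }
  const managed = text.slice(start + MANAGED_START.length, end).trim();
  const body = (text.slice(0, start) + text.slice(end + MANAGED_END.length)).trim();
  return { body, managed };
}

// Compare a live file against (a) the body the package would project and
// (b) the managed block the managed-block check expects.
function checkFile(
  path: string,
  live: string,
  expectedBody: string,
  expectedManaged: string | null,
): Finding[] {
  const { body, managed } = splitManaged(live);
  const findings: Finding[] = [];
  if (body !== expectedBody.trim()) {
    findings.push({ kind: "seed-once-stale", subject: path });
  }
  const wantManaged = expectedManaged === null ? null : expectedManaged.trim();
  if (managed !== wantManaged) {
    findings.push({ kind: "managed-block-drift", subject: path });
  }
  return findings;
}

// A workspace skill with the same name as a package skill is treated as
// shadowing (colliding with) the package's version.
function findShadowedSkills(workspaceSkills: string[], packageSkills: string[]): Finding[] {
  const fromPackage = new Set(packageSkills);
  return workspaceSkills
    .filter((name) => fromPackage.has(name))
    .map((name): Finding => ({ kind: "shadowed-skill", subject: name }));
}
```

Keeping each check a pure function over strings would let the same logic back both a CLI and a UI panel, with file reading handled by the caller.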

Likely shape: a bakin agents doctor <id> CLI plus a UI panel on the agent detail page. It could reuse existing primitives: the audit/usage feed (src/core/usage.ts), the managed-block check (bakin agent-rules --check), and the agent-package lockfile/projection records.
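
To make the timeline and tool-usage sanity ideas concrete, here is a rough sketch of tallying tool calls for one agent from usage-feed events. The `UsageEvent` shape and the `tallyToolUse` function are assumptions for illustration; the actual record format in src/core/usage.ts is not described in this issue and may look quite different.

```typescript
// Hypothetical event shape; the real usage feed may differ.
interface UsageEvent {
  agentId: string;
  taskKind: string; // e.g. "image"
  tool: string;     // e.g. "bakin_exec_images_generate" or "native"
}

interface ToolTally {
  total: number;                  // tasks of this kind run by the agent
  byTool: Record<string, number>; // count per tool
  bypassed: number;               // tasks that did not use the preferred tool
}

// Count which tools an agent used for one kind of task, and how many of
// those tasks bypassed the preferred tool path.
function tallyToolUse(
  events: UsageEvent[],
  agentId: string,
  taskKind: string,
  preferredTool: string,
): ToolTally {
  const byTool: Record<string, number> = {};
  let total = 0;
  for (const event of events) {
    if (event.agentId !== agentId || event.taskKind !== taskKind) continue;
    total += 1;
    byTool[event.tool] = (byTool[event.tool] ?? 0) + 1;
  }
  const bypassed = total - (byTool[preferredTool] ?? 0);
  return { total, byTool, bypassed };
}
```

A tally like this is enough to render a line such as "Pixel ran N image tasks; M used bakin_exec_images_generate, K used native", and a non-zero `bypassed` count could drive the tool-usage warning.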

Acceptance (rough)

  • A single command/view that tells a user "here's what your agent looks like vs what it should be, here's what it's been doing, and here's its session context cost."
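
For the "session context cost" part of this acceptance criterion, a minimal sketch of a per-session budget estimate might look like the following. The four-characters-per-token ratio is a crude heuristic rather than a real tokenizer, and `ContextSource`, `estimateTokens`, and `contextBudget` are hypothetical names introduced only for this example.

```typescript
// Hypothetical types; not taken from the bakin codebase.
interface ContextSource {
  name: string; // e.g. "SOUL.md", "AGENTS.md", "lesson:foo"
  text: string; // content loaded at session start
}

interface BudgetReport {
  totalTokens: number;
  perSource: { name: string; tokens: number }[]; // largest first
  overBudget: boolean;
}

// Rough estimate only: assumes about 4 characters per token.
// A real implementation would use the model's tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Sum the estimated tokens of all seeded sources and flag bloat
// when the total exceeds the given limit.
function contextBudget(sources: ContextSource[], limitTokens: number): BudgetReport {
  const perSource = sources
    .map((source) => ({ name: source.name, tokens: estimateTokens(source.text) }))
    .sort((a, b) => b.tokens - a.tokens);
  const totalTokens = perSource.reduce((sum, entry) => sum + entry.tokens, 0);
  return { totalTokens, perSource, overBudget: totalTokens > limitTokens };
}
```

Sorting sources by size puts the largest contributors first, which is usually what a user needs to see when deciding what to trim.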
