feat(core): run liveness classification — productive vs planning-only vs blocked #462

@markhayden

Follow-on to the execution-safety ledger.

Upgrade stuck-run detection from "is the heartbeat stale" to "is the run productive" (paperclip's run-liveness.ts pattern). Bakin already has the evidence streams: progress logs, assets saved (bakin_exec_assets_save), files written, step submissions. Classify live runs as producing | planning_only | blocked_on_gate | silent and:

  • feed the watchdog a richer supersede signal (a run that's been planning-only for 20 minutes is a better recovery candidate than one that just saved an asset, even with identical heartbeat ages)
  • reduce false supersedes further (the root cause of the overlap scenario)
  • surface the classification on the task drawer / agents page
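The classification and the richer supersede signal could be sketched roughly as below. This is a minimal sketch under assumptions, not Bakin's or paperclip's actual code: `RunEvidence`, `awaitingGate`, `WINDOW_MS`, `SUPERSEDE_PRIORITY`, and `rankForSupersede` are hypothetical names, and both the 20-minute window and the priority ordering are illustrative choices. Only the four class names come from this issue.

```typescript
// Hypothetical shapes: none of these names come from Bakin's codebase.
type Liveness = "producing" | "planning_only" | "blocked_on_gate" | "silent";

interface RunEvidence {
  lastProgressLogAt?: number; // epoch ms of the latest progress log
  lastArtifactAt?: number;    // latest asset saved, file written, or step submission
  awaitingGate: boolean;      // assumed flag: the run is parked on a gate
}

const WINDOW_MS = 20 * 60 * 1000; // assumed evidence window (20 minutes)

function classify(e: RunEvidence, now: number): Liveness {
  if (e.awaitingGate) return "blocked_on_gate";
  const recent = (t?: number) => t !== undefined && now - t <= WINDOW_MS;
  if (recent(e.lastArtifactAt)) return "producing";        // concrete output in window
  if (recent(e.lastProgressLogAt)) return "planning_only"; // logs only, no output
  return "silent";
}

// Assumed ordering: higher means a better recovery candidate.
const SUPERSEDE_PRIORITY: Record<Liveness, number> = {
  silent: 2,
  planning_only: 1,
  producing: 0,
  blocked_on_gate: 0,
};

interface Candidate {
  runId: string;
  heartbeatAgeMs: number;
  liveness: Liveness;
}

// Sort by liveness priority first, then by heartbeat age (oldest first).
function rankForSupersede(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort(
    (a, b) =>
      SUPERSEDE_PRIORITY[b.liveness] - SUPERSEDE_PRIORITY[a.liveness] ||
      b.heartbeatAgeMs - a.heartbeatAgeMs,
  );
}
```

With this ordering, two runs with identical heartbeat ages are no longer tied: the planning-only run sorts ahead of the one that just saved an asset. Whether a gated run should ever be superseded is left open here; the sketch simply ranks it lowest.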

The runs ledger (runs.heartbeat_at, settle reasons) is the natural home for the rollup; evidence counts can come from audit + assets without a new recorder.
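One way the evidence counts might be derived from existing audit and asset records, without a new recorder, is sketched below. The record shapes (`AuditEvent`, `AssetRecord`), the `kind` values, and `rollupEvidence` are assumptions for illustration; the real audit and asset schemas may look quite different.

```typescript
// Hypothetical record shapes; the real audit and asset schemas may differ.
interface AuditEvent {
  runId: string;
  kind: "progress_log" | "file_written" | "step_submitted";
  at: number; // epoch ms
}

interface AssetRecord {
  runId: string;
  savedAt: number; // epoch ms
}

interface EvidenceCounts {
  progressLogs: number;
  assetsSaved: number;
  filesWritten: number;
  stepSubmissions: number;
}

// Count evidence for one run since a cutoff timestamp.
function rollupEvidence(
  runId: string,
  since: number,
  audit: AuditEvent[],
  assets: AssetRecord[],
): EvidenceCounts {
  const counts: EvidenceCounts = {
    progressLogs: 0,
    assetsSaved: 0,
    filesWritten: 0,
    stepSubmissions: 0,
  };
  for (const ev of audit) {
    if (ev.runId !== runId || ev.at < since) continue;
    if (ev.kind === "progress_log") counts.progressLogs++;
    else if (ev.kind === "file_written") counts.filesWritten++;
    else counts.stepSubmissions++;
  }
  counts.assetsSaved = assets.filter(
    (a) => a.runId === runId && a.savedAt >= since,
  ).length;
  return counts;
}
```

A rollup like this could be computed on read or cached alongside the run's ledger row; which of the two fits better depends on how often the watchdog and the UI poll.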

Ref: .claude/specs/execution-safety-ledger.md §10.2.

🤖 Generated with Claude Code
