Trace JSONL retention/rotation for traces/ (bound unbounded growth on us-ny1) #10

Description

@aaronmarkham

Trace JSONL retention/rotation for the us-ny1 builder

Follow-up from the PR #9 review. The trace emitter (#7) writes one hash-chained JSONL per ingest run at <store.root>/traces/<run_id>.jsonl. The builder loops zeitghost ingest every ZEITGHOST_INTERVAL (3600s default), so over months on us-ny1, traces/ grows without bound.

Not urgent: each file is small (a couple of events per cycle when there's new news, zero-byte or absent when there's nothing new), but it should be bounded before a year of files accumulates.
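
For a sense of scale, here is a rough upper bound on the file count, assuming at most one trace file per ingest cycle and the default 3600s interval. The helper name `max_trace_files` is made up for illustration; the real count will be lower, since quiet cycles may write nothing.

```python
def max_trace_files(interval_s: int = 3600, days: int = 365) -> int:
    """Upper bound on trace files: at most one file per ingest cycle."""
    cycles_per_day = 86_400 // interval_s
    return cycles_per_day * days


print(max_trace_files())  # 8760
```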

Options

  • A --max-trace-age-days cleanup pass in ingest (or a tiny zeitghost prune-traces command) that deletes/compacts JSONLs older than N days.
  • Or roll traces into a single append log with periodic compaction.
  • Whatever the choice, keep it aligned with the shard-store backup story (traces live under store.root, so they're bind-mounted + rsync'd on us-ny1 — pruning should be safe w.r.t. backups).
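
A minimal sketch of the first option (age-based cleanup), not the actual zeitghost implementation. The function name `prune_traces`, its parameters, and the use of file mtime as the age signal are all assumptions; a real pass might read the run timestamp from the trace itself. The `dry_run` flag is there so the list of deletions can be checked against the rsync backup before anything is removed.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_traces(
    traces_dir: str,
    max_age_days: float,
    now: Optional[float] = None,
    dry_run: bool = False,
) -> List[Path]:
    """Delete *.jsonl files in traces_dir older than max_age_days (by mtime).

    Returns the paths that were removed, or that would be removed in dry_run mode.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86_400
    removed = []
    for path in sorted(Path(traces_dir).glob("*.jsonl")):
        if path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Usage would be something like `prune_traces("<store.root>/traces", max_age_days=90, dry_run=True)`, where the 90-day value is only a placeholder.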

Acceptance criteria

  • traces/ has a bounded retention policy (age- or count-based), documented.
  • Pruning never deletes a trace referenced by a still-live shard's trace_ref within the retention window (or that's an explicit, documented tradeoff).
  • Pairs with the eventual verify-trace <run_id> audit command — don't prune what we'd want to audit.
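
One way the second criterion could be enforced, as a sketch under assumptions: the caller supplies the set of run IDs that live shards still point at via trace_ref (how that set is gathered is not shown and depends on the shard store), and the file stem is taken as the run ID, following the traces/<run_id>.jsonl layout. `prunable_traces` is a hypothetical name. This version is stricter than the criterion requires: it never selects a referenced trace, even one older than the retention window.

```python
import time
from pathlib import Path
from typing import List, Optional, Set


def prunable_traces(
    traces_dir: str,
    max_age_days: float,
    live_run_ids: Set[str],
    now: Optional[float] = None,
) -> List[Path]:
    """Return expired trace files that no live shard references.

    Nothing is deleted here; the caller decides what to do with the result.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86_400
    candidates = []
    for path in sorted(Path(traces_dir).glob("*.jsonl")):
        if path.stem in live_run_ids:
            continue  # still referenced by a live shard: always keep
        if path.stat().st_mtime < cutoff:
            candidates.append(path)
    return candidates
```

Returning candidates instead of deleting keeps the selection logic testable and leaves room for an audit step (for example the eventual verify-trace command) before removal.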

Relates to #7 (trace emitter), #8 (reanalyze, which also writes traces).
