feat(#306): OwnerRefs + finalizers + cascade GC (D5) #480

Open
windoliver wants to merge 17 commits into main from feat/306-ownerrefs-cascade-gc

@windoliver

Summary

Kubernetes-style garbage collector for Grove's entity hierarchy (Epic D / #285, D5). Closes #306.

  • OwnerRef gains a controller?: boolean flag + taskGroup/agentTask kinds; new grove.dev/* finalizer constants (KindFinalizer.PendingReview/PendingMerge, PropagationFinalizer.Foreground/Orphan) and a CascadePolicy type.
  • owner-graph.ts — pure, I/O-free cascade/reap decision core (policyOf, planOwnerDeletion, planDanglingChild). Idempotent: returns [] at convergence.
  • garbage-collector.ts: GarbageCollector reconcile loop (same skeleton as task-controller/claim-controller) over a segregated GcStore interface; CAS via withIfMatch, plus cross-level re-drive so multi-level cascades converge without waiting for resync.
  • Real persistence: SqliteAgentTaskStore gains setAgentTaskDeletion / removeAgentTaskFinalizer / removeAgentTaskOwnerRef / reapAgentTask (finalizer-guarded hard delete; status row drops via the existing FK cascade). No schema migration (columns already existed).
  • AgentTaskGcStore + CompositeGcStore bridge the loop to real AgentTask rows and span an (in-memory) TaskGroup owner + real AgentTask children.
  • Wiring seam: garbage-collector-wiring.ts (gated by GROVE_GC); live serve.ts activation is deliberately deferred along with TaskGroup persistence.
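
To make the "pure, I/O-free decision core" idea concrete, here is a minimal sketch of what a planner in the style of planOwnerDeletion could look like. It is not the code in owner-graph.ts: the Child and Action shapes, the releaseOwner action, and the planOwnerDeletionSketch name and signature are all hypothetical, and the semantics follow the usual Kubernetes reading of Foreground, Background and Orphan.

```typescript
// Hypothetical shapes for illustration; the real Grove types are not shown in this PR.
type CascadePolicy = "Foreground" | "Background" | "Orphan";

interface Child {
  name: string;
  deletionRequested: boolean;
}

type Action =
  | { type: "markChildDeleted"; child: string }
  | { type: "stripOwnerRef"; child: string }
  | { type: "releaseOwner"; owner: string };

// Pure function: it only reads its arguments and returns a plan, it never writes.
// `children` is assumed to hold only children whose controller ownerRef still
// points at `owner`.
function planOwnerDeletionSketch(
  owner: { name: string; deletionRequested: boolean },
  policy: CascadePolicy,
  children: readonly Child[],
): Action[] {
  if (!owner.deletionRequested) return [];

  if (policy === "Orphan") {
    // Detach children first; release the owner once none point at it.
    if (children.length > 0) {
      return children.map((c): Action => ({ type: "stripOwnerRef", child: c.name }));
    }
    return [{ type: "releaseOwner", owner: owner.name }];
  }

  const pending: Action[] = children
    .filter((c) => !c.deletionRequested)
    .map((c): Action => ({ type: "markChildDeleted", child: c.name }));

  if (policy === "Background") {
    // Owner goes away immediately; children are marked and cleaned up afterwards.
    return [...pending, { type: "releaseOwner", owner: owner.name }];
  }

  // Foreground: the owner waits until every child is actually gone.
  if (children.length > 0) return pending; // [] while already-marked children drain
  return [{ type: "releaseOwner", owner: owner.name }];
}
```

Because the planner in this sketch returns an empty plan whenever there is nothing left to do at the current level, a reconcile loop can call it repeatedly without side effects, which is the property the idempotence note above relies on.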

Cascade policy is encoded as a propagation finalizer on the owner at delete time (mirrors k8s foregroundDeletion/orphan), so policy survives restarts.
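
As an illustration of why storing the policy as a finalizer makes it restart-safe, the sketch below stamps the policy onto the owner at delete time and recovers it after a serialize/parse round trip that stands in for a process restart. The finalizer string values, the OwnerMeta shape, the deletionTimestamp field and the requestDeletion helper are assumptions made for the example; the PR only states that the constants live under grove.dev/* and that a policyOf function exists.

```typescript
type CascadePolicy = "Foreground" | "Background" | "Orphan";

// Assumed string values; only the grove.dev/* prefix is given in the PR.
const PropagationFinalizerSketch = {
  Foreground: "grove.dev/foreground-deletion",
  Orphan: "grove.dev/orphan",
} as const;

// Hypothetical owner metadata shape.
interface OwnerMeta {
  name: string;
  finalizers: string[];
  deletionTimestamp?: string;
}

// Record the chosen policy on the owner itself when deletion is requested.
// Background needs no marker: it is the default when no propagation finalizer is present.
function requestDeletion(owner: OwnerMeta, policy: CascadePolicy, now: string): OwnerMeta {
  const finalizers = [...owner.finalizers];
  const marker =
    policy === "Foreground"
      ? PropagationFinalizerSketch.Foreground
      : policy === "Orphan"
        ? PropagationFinalizerSketch.Orphan
        : undefined;
  if (marker !== undefined && !finalizers.includes(marker)) finalizers.push(marker);
  return { ...owner, finalizers, deletionTimestamp: owner.deletionTimestamp ?? now };
}

// Recover the policy purely from persisted state.
function policyOfSketch(owner: OwnerMeta): CascadePolicy {
  if (owner.finalizers.includes(PropagationFinalizerSketch.Foreground)) return "Foreground";
  if (owner.finalizers.includes(PropagationFinalizerSketch.Orphan)) return "Orphan";
  return "Background";
}

// Simulated restart: nothing but the serialized owner survives.
const persisted = JSON.stringify(
  requestDeletion({ name: "tg-1", finalizers: [] }, "Orphan", "2026-06-08T16:36:00Z"),
);
const reloaded = JSON.parse(persisted) as OwnerMeta;
console.log(policyOfSketch(reloaded)); // "Orphan"
```

The point of the design is that no separate in-memory "pending policy" map is needed: whichever process picks up the owner after a crash can read the policy back from the owner's own finalizer list.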

Acceptance criteria (#306)

| Criterion | Status |
| --- | --- |
| Delete TaskGroup → children cascade | ✅ Foreground + Background, in-memory and over real SQLite |
| Finalizer blocks deletion until removed | ✅ generic gate proven via AgentTask pending-review (in-memory + real SQLite). MergeTask/pending-merge deferred (mechanism ready) |
| Orphan mode preserves children | ✅ child kept, controller ownerRef stripped, not marked deleted |
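
The "finalizer blocks deletion until removed" criterion can be pictured with a small in-memory store. This is a sketch under stated assumptions, not SqliteAgentTaskStore: the TaskRow shape, the InMemoryTaskStoreSketch class, its method signatures, the ifMatch parameter standing in for withIfMatch, and the "grove.dev/pending-review" string are all hypothetical.

```typescript
// Hypothetical row shape; the real AgentTask record has more fields.
interface TaskRow {
  name: string;
  resourceVersion: number;
  finalizers: string[];
  deletionRequested: boolean;
}

class ConflictError extends Error {}

class InMemoryTaskStoreSketch {
  private rows = new Map<string, TaskRow>();

  put(row: TaskRow): void {
    this.rows.set(row.name, { ...row, finalizers: [...row.finalizers] });
  }

  get(name: string): TaskRow | undefined {
    return this.rows.get(name);
  }

  // Compare-and-swap guard: reject writers holding a stale resourceVersion.
  private checked(name: string, ifMatch: number): TaskRow {
    const row = this.rows.get(name);
    if (row === undefined) throw new Error(`not found: ${name}`);
    if (row.resourceVersion !== ifMatch) {
      throw new ConflictError(`stale resourceVersion for ${name}`);
    }
    return row;
  }

  removeFinalizer(name: string, finalizer: string, ifMatch: number): TaskRow {
    const row = this.checked(name, ifMatch);
    // No-op path: nothing to remove, so the resourceVersion is left untouched.
    if (!row.finalizers.includes(finalizer)) return row;
    const next: TaskRow = {
      ...row,
      finalizers: row.finalizers.filter((f) => f !== finalizer),
      resourceVersion: row.resourceVersion + 1,
    };
    this.rows.set(name, next);
    return next;
  }

  // Hard delete only when deletion was requested and no finalizer remains.
  reap(name: string, ifMatch: number): boolean {
    const row = this.checked(name, ifMatch);
    if (!row.deletionRequested || row.finalizers.length > 0) return false;
    this.rows.delete(name);
    return true;
  }
}

const store = new InMemoryTaskStoreSketch();
store.put({
  name: "task-1",
  resourceVersion: 1,
  finalizers: ["grove.dev/pending-review"], // assumed constant value
  deletionRequested: true,
});
console.log(store.reap("task-1", 1)); // false: the finalizer still blocks the reap
const released = store.removeFinalizer("task-1", "grove.dev/pending-review", 1);
console.log(store.reap("task-1", released.resourceVersion)); // true
```

Keeping the finalizer check inside the reap itself, rather than in the caller, means a racing controller cannot hard-delete a row that another component is still holding open.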

Deferred follow-ups (documented in spec/plan/code, not dropped)

  • MergeTask entity + grove.dev/pending-merge (one constant + one assignment away).
  • TaskGroup persistence (SQLite/Nexus) + live server-side GC activation.
  • Terminating condition emission (needs status writes).
  • Full dangling-child sweep (listAll) — resync() only enumerates pending-deletion owners + their children; documented inline.

Notable review fixes (adversarial review during development)

  • Fixed a cross-level cascade convergence stall (owner waited a full resync per level) + added real-worker-loop tests that fail-before/pass-after.
  • Fixed an owner_ref_json "null"-string corruption + RV churn in the no-op write path.
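
The PR does not show the fix for the owner_ref_json issue, but this class of bug usually comes from JSON.stringify(null) producing the four-character string "null" rather than a SQL NULL. The sketch below shows one plausible guard, together with a no-op check that avoids bumping the resource version when nothing changes. The helper names and the OwnerRef shape are hypothetical, and tolerating legacy "null" strings on read is an extra defensive assumption, not something the PR describes.

```typescript
// Hypothetical OwnerRef shape, loosely based on the fields named in the summary.
interface OwnerRef {
  kind: string;
  name: string;
  controller?: boolean;
}

// Column value to persist: a real null when there is no ownerRef,
// never the string "null" that JSON.stringify(null) would produce.
function encodeOwnerRef(ref: OwnerRef | null | undefined): string | null {
  return ref == null ? null : JSON.stringify(ref);
}

// Read side: also tolerate a legacy "null" string (defensive assumption).
function decodeOwnerRef(column: string | null): OwnerRef | undefined {
  if (column === null || column === "null") return undefined;
  return JSON.parse(column) as OwnerRef;
}

// No-op write path: keep the resource version unless the stored value changes.
function nextResourceVersion(
  currentColumn: string | null,
  nextRef: OwnerRef | undefined,
  resourceVersion: number,
): number {
  return encodeOwnerRef(nextRef) === currentColumn ? resourceVersion : resourceVersion + 1;
}

console.log(JSON.stringify(null)); // "null" (a string, not SQL NULL)
console.log(encodeOwnerRef(undefined)); // null
```

Comparing the encoded form against the stored column is a simple way to detect no-op writes in this sketch; it assumes the encoder is deterministic for a given ownerRef, which holds here because the object is always built with the same key order.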

Test plan

  • bunx tsc --noEmit clean (the erasableSyntaxOnly/isolatedDeclarations gate).
  • Pre-push gate (typecheck + full build) green.
  • New GC suites stable across repeated isolated runs: owner-graph, garbage-collector (incl. real-worker-loop convergence), in-memory-gc-store, agent-task-gc-store (real SQLite Foreground/Background cascade + pending-review reap-block), sqlite-store lifecycle mutators (CAS + no-op + FK-cascade), garbage-collector-wiring.
  • Whole-implementation review: READY TO MERGE (reap-safety + RV contract hold globally; no CAS livelock).

Note: the full bun test suite is green on this code when not resource-contended; an over-parallel coverage run produced timeouts in unrelated subprocess-heavy CLI-init/Nexus tests + one pre-existing Docker-probe test (resolveBackend), none in the changed code — verified by isolated re-runs.

windoliver added 17 commits June 8, 2026 16:36
…p (Sqlite)

Extends AgentTaskStore interface with setAgentTaskDeletion, removeAgentTaskFinalizer, removeAgentTaskOwnerRef, reapAgentTask; implements them on SqliteAgentTaskStore via a shared mutateSpec CAS helper; widens AgentTaskSpecRecord/EntityMetadata finalizers to accept KindFinalizer|PropagationFinalizer; stubs four methods on TestAgentTaskStore and fakeTaskStore.