docs(controller-manager): assess disabling embedded GC at core scope #639

Open
ecv wants to merge 3 commits into main from docs/assess-core-scope-embedded-gc

ecv commented Jun 1, 2026

Contributor

Summary

Assessment (RFC) of whether the embedded Kubernetes garbage collector (GC) controller should be disabled when the controller-manager runs at --control-plane-scope=core, to cut goroutine/memory usage.

Doc only — no behavioral change. Follow-up to #631; complements #632 (which stripped managedFields from per-project quota caches).

Document

docs/proposals/core-scope-embedded-gc-assessment.md

It covers:

  • How the embedded GC is wired here (with file:line references).
  • The monitor-per-GVK cost model (goroutines + metadata heap), which is measurable today via --controllers=*,-garbagecollector.
  • Which cascade-deletion paths are covered without GC (namespace content deleter, project purger, explicit IAM finalizers) vs. which rely on the embedded GC (User-owned PolicyBinding/UserPreference; downstream anchor ConfigMaps; possibly membership PolicyBindings).
  • Options: leave enabled / disable via existing flag / default-disable in code at core scope.
  • Recommendation: do not disable yet; measure cost and prove cascade coverage first, with a concrete staging validation plan.
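
For readers unfamiliar with the selector syntax in the measurement step, here is a minimal sketch of how a `--controllers=*,-garbagecollector` list is conventionally evaluated. It is modelled on the upstream kube-controller-manager convention (an explicit `name` or `-name` entry wins, otherwise `*` enables the controller) and omits upstream's disabled-by-default set. Whether this repository's controller-manager follows exactly the same rules is an assumption; `controllerEnabled` and the `namespace` controller name are illustrative only.

```go
package main

import "fmt"

// controllerEnabled reports whether name is selected by a --controllers style
// list. An explicit "name" or "-name" entry decides immediately; otherwise the
// controller is enabled only if "*" is present.
// Assumption: simplified from the upstream convention (no disabled-by-default set).
func controllerEnabled(name string, selectors []string) bool {
	hasStar := false
	for _, s := range selectors {
		switch s {
		case name:
			return true
		case "-" + name:
			return false
		case "*":
			hasStar = true
		}
	}
	return hasStar
}

func main() {
	// The selector from the measurement plan: everything except the GC.
	selectors := []string{"*", "-garbagecollector"}
	for _, c := range []string{"garbagecollector", "namespace"} {
		fmt.Printf("%s enabled: %v\n", c, controllerEnabled(c, selectors))
	}
}
```

Because the explicit `-garbagecollector` entry takes precedence over `*`, the measurement run keeps every other controller active, which is what makes the goroutine and heap comparison meaningful.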

Note

This is an assessment with open questions and a measurement plan, not a proposal to disable. It explicitly does not claim disabling is safe today.

Refs #631, #632

🤖 Generated with Claude Code

Follow-up to #631. Assessment of the memory/goroutine cost of the
embedded Kubernetes garbage collector at --control-plane-scope=core, the
cascade-deletion risk of disabling it, and a measurement/validation plan.
Doc only; no behavioral change.

Refs #631

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ecv ecv enabled auto-merge June 2, 2026 20:17
@ecv ecv requested a review from scotwells June 4, 2026 18:17

ecv commented Jun 11, 2026

Contributor Author

Seeing if this is safe.

Verify the Section 4 cascade-deletion paths against current code and
resolve the two open questions:

- Organization -> namespace removal is GC-driven: the org controller has
  no finalizer and no explicit namespace delete, so disabling GC leaks
  the namespace and (via the stalled content deleter) all its contents.
  Moved from 4a "covered" to 4b as the highest-severity path.
- OrganizationMembership -> owned PolicyBindings relies on GC: no
  deletion finalizer, only steady-state pruning of undesired bindings.
- User -> PolicyBinding/UserPreference confirmed (three owned objects).
- Downstream anchors: no core controller uses the anchor strategy, so
  likely out of core scope; flagged to confirm.
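
To make the owner-reference dependency concrete, here is a toy model of what is left behind when an owner is deleted and nothing follows owner references. It is a sketch only, not the embedded GC: the `object` type, the `orphans` function, and all object names are hypothetical, the object counts are illustrative, and it does not model the stalled namespace content deleter.

```go
package main

import (
	"fmt"
	"sort"
)

// object is a toy stand-in for a Kubernetes object: a name plus owner names.
type object struct {
	name   string
	owners []string
}

// orphans returns the objects whose owners are all (transitively) gone after
// deleted is removed. With the GC running these would be collected; with it
// disabled and no explicit cleanup they would leak.
func orphans(objs []object, deleted string) []string {
	gone := map[string]bool{deleted: true}
	for changed := true; changed; {
		changed = false
		for _, o := range objs {
			if gone[o.name] || len(o.owners) == 0 {
				continue
			}
			allGone := true
			for _, ow := range o.owners {
				if !gone[ow] {
					allGone = false
					break
				}
			}
			if allGone {
				gone[o.name] = true
				changed = true
			}
		}
	}
	delete(gone, deleted)
	out := make([]string, 0, len(gone))
	for n := range gone {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical objects mirroring the paths listed above.
	objs := []object{
		{name: "organization/acme"},
		{name: "namespace/organization-acme", owners: []string{"organization/acme"}},
		{name: "user/alice"},
		{name: "policybinding/alice-self", owners: []string{"user/alice"}},
		{name: "userpreference/alice", owners: []string{"user/alice"}},
	}
	fmt.Println(orphans(objs, "organization/acme"))
	fmt.Println(orphans(objs, "user/alice"))
}
```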

Conclusion unchanged: disabling at core scope is not safe today. Add
Section 7 listing the required explicit-cleanup conversions (Changes
1-3) and the two verifications (4-5) that must land first.
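
As a rough illustration of what an explicit-cleanup conversion could look like, here is a minimal finalizer-driven reconcile loop over an in-memory store. This is a sketch under stated assumptions, not the code proposed in Section 7: the finalizer name, the `owner` and `store` types, and the `reconcile` function are all hypothetical, and a real controller would use the Kubernetes client rather than a map.

```go
package main

import "fmt"

// Hypothetical finalizer name, for illustration only.
const cleanupFinalizer = "example.invalid/cleanup-dependents"

// owner is a toy stand-in for an owning object such as an Organization.
type owner struct {
	name       string
	deleting   bool // stands in for a non-nil deletionTimestamp
	finalizers []string
}

// store holds dependents keyed by owner name.
type store struct {
	dependents map[string][]string
}

func has(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

func remove(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// reconcile runs one pass. While the owner is live it ensures the finalizer is
// present; once deletion is requested it deletes dependents explicitly, then
// drops the finalizer. It returns true when the owner may be removed.
func (s *store) reconcile(o *owner) bool {
	if !o.deleting {
		if !has(o.finalizers, cleanupFinalizer) {
			o.finalizers = append(o.finalizers, cleanupFinalizer)
		}
		return false
	}
	if has(o.finalizers, cleanupFinalizer) {
		delete(s.dependents, o.name) // explicit delete, no GC involved
		o.finalizers = remove(o.finalizers, cleanupFinalizer)
	}
	return len(o.finalizers) == 0
}

func main() {
	s := &store{dependents: map[string][]string{"organization/acme": {"namespace/organization-acme"}}}
	o := &owner{name: "organization/acme"}
	s.reconcile(o) // adds the finalizer
	o.deleting = true
	fmt.Println("removable:", s.reconcile(o), "dependents left:", len(s.dependents))
}
```

The point of the pattern is that the owner cannot disappear until its dependents have been deleted, so the cascade no longer depends on the garbage collector observing owner references.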

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>