Skip to content

feat(k8score): idle reaper for namespace-scoped dynamic informers#772

Open
nadaverell wants to merge 1 commit into
feature/per-namespace-dynamic-informersfrom
feature/dynamic-informer-idle-reaper
Open

feat(k8score): idle reaper for namespace-scoped dynamic informers#772
nadaverell wants to merge 1 commit into
feature/per-namespace-dynamic-informersfrom
feature/dynamic-informer-idle-reaper

Conversation

@nadaverell
Copy link
Copy Markdown
Contributor

@nadaverell nadaverell commented May 25, 2026

Summary

Follow-up to #770 (stacked on it — base is the B2.1 branch; retarget to main once #770 merges). B2.1 lets a GVR have many namespace-scoped informers; over the life of a long-running, namespace-restricted deployment those can accumulate as new namespaces get watched. This adds an idle reaper that stops a namespace-scoped informer after InformerIdleTTL (default 30m) without a read, re-creating it transparently on next access.

Liveness = the read path (no SSE lease needed)

Every read — List / Get / Count / the EnsureWatching confirm-step — stamps the entry's lastAccess (atomic, off the hot lock). The frontend re-lists open views every 60–120s (ResourcesView.tsx / GitOpsView.tsx React Query refetchInterval), and SSE topology rebuilds also List — both flow through that stamp. So an open view keeps its informer warm and can't be reaped out from under a live stream, which is the concern that would otherwise motivate an SSE subscriber lease. The TTL (30m) is comfortably above the poll interval, so the lease/refcount coupling is unnecessary; I went with the simpler read-driven design deliberately.

  • Cluster-wide informers are exempt (one per GVR, already bounded, serve every namespace — churning them is pure cost). The common cluster-wide-RBAC deployment is therefore completely unaffected: nothing is ever reaped.
  • A negative InformerIdleTTL disables reaping; zero selects the default.

Why informers are now built directly (not via a factory)

A DynamicSharedInformerFactory caches one informer per GVR, so after the reaper stops an informer the factory would hand back the stopped instance on re-request — and a SharedInformer can't be re-Run. Switching to NewFilteredDynamicInformer gives standalone informers, each under its own cancelable context, that can be stopped and re-created freely. This also let me drop the per-namespace factory map and simplify Stop() (closing stopCh cancels every informer's context; no factory shutdown).

Tests

pkg/k8score: idle namespace-scoped informer is reaped while a freshly-read one is kept and re-created on next access; cluster-wide informers are exempt even when ancient; negative TTL disables reaping. Reaper is driven with an explicit now for deterministic, timing-free tests. go build ./..., go test ./internal/server ./internal/k8s ./pkg/k8score all green.

Note

Default behavior change is limited to namespace-restricted deployments (reaper trims idle per-ns informers there). Cluster-wide deployments see no behavioral change.


Note

Medium Risk
Changes core dynamic cache lifecycle and informer construction; mis-tuned TTL or missed touch() paths could stop watches for idle namespaces (transparent re-sync on next access). Cluster-wide RBAC deployments are explicitly exempt from reaping.

Overview
Namespace-restricted dynamic CRD watches can accumulate one informer per namespace; this PR adds an idle reaper that stops namespace-scoped informers after InformerIdleTTL (default 30m without a read) and re-creates them on the next List/Get. Cluster-wide informers are never reaped; a negative TTL turns reaping off.

Liveness is read-driven: lastAccess is updated atomically on list/get/count paths (including EnsureWatching confirm and ListWatched for background scanners), so active UI polling keeps informers warm without an SSE lease.

Implementation shift: informers are built with NewFilteredDynamicInformer instead of a shared DynamicSharedInformerFactory, so reaped/stopped informers are not reused from a factory cache. Stop() no longer shuts down factories—per-informer contexts cancel via stopCh. Sync completion only marks synced if the entry is still the one that started syncing (avoids races after a reap/re-create).

DynamicCacheConfig.InformerIdleTTL is the new knob (zero = default, negative = disabled). Tests cover idle vs fresh namespaces, cluster-wide exemption, disable flag, and transparent re-create after reap.

Reviewed by Cursor Bugbot for commit 721654d. Bugbot is set up for automated code reviews on this repo. Configure here.

@nadaverell nadaverell requested a review from hisco as a code owner May 25, 2026 20:53
@nadaverell nadaverell force-pushed the feature/per-namespace-dynamic-informers branch 2 times, most recently from b073af6 to 979ddff Compare May 25, 2026 21:10
@nadaverell nadaverell force-pushed the feature/dynamic-informer-idle-reaper branch from e5f3876 to fd252dc Compare May 25, 2026 21:14
@nadaverell nadaverell force-pushed the feature/per-namespace-dynamic-informers branch from 979ddff to 789b641 Compare May 25, 2026 21:32
@nadaverell nadaverell force-pushed the feature/dynamic-informer-idle-reaper branch from fd252dc to a06adb5 Compare May 25, 2026 21:34
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit a06adb5. Configure here.

Comment thread pkg/k8score/dynamic_cache.go
@nadaverell nadaverell force-pushed the feature/per-namespace-dynamic-informers branch 2 times, most recently from 7c35d55 to bc65871 Compare May 25, 2026 21:53
@nadaverell nadaverell force-pushed the feature/dynamic-informer-idle-reaper branch from a06adb5 to c0fb7a0 Compare May 25, 2026 21:54
@nadaverell nadaverell force-pushed the feature/per-namespace-dynamic-informers branch from bc65871 to ee22b11 Compare May 25, 2026 22:02
@nadaverell nadaverell force-pushed the feature/dynamic-informer-idle-reaper branch from c0fb7a0 to a085db6 Compare May 25, 2026 22:04
@nadaverell nadaverell force-pushed the feature/per-namespace-dynamic-informers branch from ee22b11 to 12ad04e Compare May 25, 2026 22:10
Per-namespace dynamic informers (the B2.1 model) can accumulate over the
life of a long-running, namespace-restricted deployment as different
namespaces get watched. Add an idle reaper that stops a namespace-scoped
informer after InformerIdleTTL (default 30m) without a read, re-creating it
transparently on the next access.

Liveness is the read path itself: List/Get/Count/EnsureWatching stamp the
entry's lastAccess. The frontend re-lists open views every 60-120s (React
Query refetchInterval), and SSE topology rebuilds also list — both flow
through that stamp — so an open view's informer is never reaped out from
under it, and an explicit SSE lease/refcount is unnecessary. Cluster-wide
informers (one per GVR, serve every namespace) are exempt; a negative TTL
disables reaping.

To make eviction safe, informers are now constructed directly via
NewFilteredDynamicInformer instead of through a shared factory: a factory
caches one informer per GVR and would hand back the stopped instance after a
reap (a SharedInformer cannot be re-Run). Standalone informers, each under
its own cancelable context, can be stopped and re-created freely. This also
removes the per-namespace factory map and simplifies Stop().

Builds on #770.
@nadaverell nadaverell force-pushed the feature/dynamic-informer-idle-reaper branch from a085db6 to 721654d Compare May 25, 2026 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant