Skip to content

feat(helm): preflight check for cert-manager presence#383

Draft
WentingWu666666 wants to merge 1 commit into
documentdb:mainfrom
WentingWu666666:developer/wentingwu/helm-cert-manager-preflight
Draft

feat(helm): preflight check for cert-manager presence#383
WentingWu666666 wants to merge 1 commit into
documentdb:mainfrom
WentingWu666666:developer/wentingwu/helm-cert-manager-preflight

Conversation

@WentingWu666666
Copy link
Copy Markdown
Collaborator

Draft PR for review/discussion only. Part of the GA-readiness chart audit (#381).

What this PR does

Adds a Helm preflight check that fails the install/upgrade with an actionable error when cert-manager is not installed in the target cluster.

Why

The chart unconditionally creates cert-manager.io/v1 Issuer and Certificate resources for the validating webhook and the CNPG plugin sidecars. Today, if cert-manager is missing:

  • helm install returns success.
  • The operator pod never becomes ready (no webhook TLS Secret readiness probe fails forever).
  • The user is left debugging a stuck deployment with no breadcrumb pointing at the missing dependency.

Failing loud at install time is much friendlier.

How

templates/00_preflight.yaml uses .Capabilities.APIVersions.Has "cert-manager.io/v1" and {{- fail }} to abort if absent.

Gated by certManager.preflightCheck (default true) so it can be disabled for offline templating (GitOps render pipelines) where API discovery returns only stable Kubernetes APIs.

Local verification

Scenario Expected Result
helm template (no API discovery) fail fails with our message
helm template --api-versions cert-manager.io/v1 render renders
helm template --set certManager.preflightCheck=false render renders
helm upgrade --dry-run on kind (cert-manager installed) render ✅ renders

Tracking

The chart unconditionally creates cert-manager.io/v1 Issuer and
Certificate resources for the validating webhook and the CNPG plugin
sidecars. If cert-manager is not installed in the cluster, the install
`succeeds` from Helm's perspective but the operator never becomes
ready (webhook TLS Secret is never issued, readiness probe never
passes), and the user is left to figure out why.

Add a templates/00_preflight.yaml that uses
.Capabilities.APIVersions.Has to detect cert-manager.io/v1 and fail the
install/upgrade with an actionable error message naming the missing
dependency and how to install it.

The check is on by default and gated by certManager.preflightCheck so
it can be disabled for offline templating (GitOps render pipelines)
where API discovery is unreliable. Disabling does NOT remove the
dependency.

Addresses M4 from documentdb#381.

Verified locally:
  - `helm template` without cert-manager API: fails with our error.
  - `helm template --api-versions cert-manager.io/v1`: renders.
  - `helm template --set certManager.preflightCheck=false`: renders.
  - `helm upgrade --dry-run` on kind (cert-manager installed): passes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@documentdb-triage-tool
Copy link
Copy Markdown

🤖 Auto-triaged by documentdb-triage-tool.

Applied: enhancement
Project fields suggested: Component manifests · Priority P2 · Effort M · Status In Progress
Confidence: 0.82 (mixed)

Reasoning

effort from diff stats (30+0 LOC, 2 files); LLM: Adds a small Helm preflight template (~few lines) to fail fast when cert-manager is absent, improving UX for a silent failure mode; part of a tracked GA-readiness audit.

If a label is wrong, remove it manually and ping @patty-chow so the rules can be tuned. The bot will not re-label items that already have component labels.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request triage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants