fargocd: drop dead webhook port; fargocd-manager: add metrics + probes by tamalsaha · Pull Request #485 · kubeops/installer

tamalsaha · 2026-06-03T07:46:49Z

Summary

Two related chart changes that align the deployed port surface with what the binaries actually serve.

`charts/fargocd` — drop dead webhook port

Removes containerPort: 9443 from the deployment and the https 443 → 9443 entry from the service. The operator constructs a controller-runtime webhook.NewServer(...) but never registers any handlers (SetupWebhookWithManager is not called), so the TLS endpoint had nothing listening behind it. Metrics (8443) and the in-pod probes port (8081) are unchanged.

`charts/fargocd-manager` — add metrics surface and real probes

The hub-side OCM AddOn manager pod previously declared no ports, ran no probes, and had no Service. The addon-framework's genericapiserver already listens on :8443 HTTPS — this PR wires it up:

- `containerPort: 8443`, named `metrics`
- Liveness + readiness `httpGet` probes on `/healthz` against `port: metrics` with `scheme: HTTPS` (kubelet probes skip TLS verification, so the addon-framework's runtime self-signed cert is fine)
- ClusterIP Service exposing metrics 8443 → metrics, plus the prometheus.io/builtin annotations
- Optional ServiceMonitor + SA-token Secret, gated on `monitoring.agent == prometheus.io/operator`. Uses `insecureSkipVerify: true` because the addon-framework's serving cert SAN is localhost.
- `monitoring.{agent, serviceMonitor.labels}` values + helper templates mirroring the fargocd chart
- README regen reflects the new values keys
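The probe behaviour described above can be illustrated with a minimal Go sketch. It is not kubelet's code: `probeHealthz` and `newSelfSignedHealthz` are hypothetical helpers, and the local `httptest` TLS server only stands in for the addon-framework endpoint on :8443. The point it shows is that an HTTPS GET with certificate verification disabled succeeds against a certificate the client does not trust, and that any status from 200 up to (but not including) 400 counts as healthy.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// probeHealthz issues a GET against url without verifying the server
// certificate, roughly what an httpGet probe with scheme: HTTPS does.
// A status in [200, 400) is reported as healthy.
func probeHealthz(url string) (bool, error) {
	client := &http.Client{
		Timeout: 2 * time.Second,
		Transport: &http.Transport{
			// Assumption for this sketch: skip verification like the kubelet does.
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return false, fmt.Errorf("probe %s: %w", url, err)
	}
	defer resp.Body.Close()
	return resp.StatusCode >= 200 && resp.StatusCode < 400, nil
}

// newSelfSignedHealthz starts a local HTTPS test server whose certificate is
// not trusted by a default client. It answers /healthz with 200 and is only a
// stand-in for the real endpoint.
func newSelfSignedHealthz() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return httptest.NewTLSServer(mux)
}
```

Calling `probeHealthz(srv.URL + "/healthz")` on a server returned by `newSelfSignedHealthz()` reports healthy, while a path the server does not serve returns 404 and is reported as unhealthy.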

Pairs with kubeops/fargocd#27 which wires real workqueue + reflector metrics into the manager's /metrics. The ServiceMonitor will scrape successfully without that PR too — it'll just see Go runtime + process metrics until a fargocd image carrying #27 lands.

Caveat

charts/fargocd-manager/values.openapiv3_schema.yaml is auto-generated from apis/installer/v1alpha1/fargocd_manager_types.go via make manifests. The new monitoring block needs a Monitoring field added to FargocdManagerSpec (mirroring FargocdSpec) and a regen run before strict schema validation will accept it. Kept out of scope here to keep this PR chart-only.

Test plan

- `helm template charts/fargocd` renders without the https / 9443 entries
- `helm template charts/fargocd-manager` renders Service + Deployment with probes against `port: metrics`, `scheme: HTTPS` (no ServiceMonitor)
- `helm template charts/fargocd-manager --set monitoring.agent=prometheus.io/operator` renders the ServiceMonitor + SA-token Secret with `insecureSkipVerify: true`
- Deploy to a hub cluster: confirm kubelet liveness + readiness probes against /healthz pass
- With a kubeops/fargocd#27 image: confirm a Prometheus Operator scrapes the workqueue_* and reflector_* series via the ServiceMonitor
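The ServiceMonitor gating exercised by the second and third items can be sketched with Go's `text/template`, the engine Helm templates build on. The template below is a simplified, hypothetical stand-in and not the chart's actual file; only the `monitoring.agent == prometheus.io/operator` condition and `insecureSkipVerify: true` come from this PR. `renderServiceMonitor` is an illustrative helper name.

```go
package main

import (
	"bytes"
	"text/template"
)

// serviceMonitorTmpl is a hypothetical, trimmed-down ServiceMonitor template.
// The whole manifest is wrapped in the gating condition, so any other agent
// value renders nothing.
const serviceMonitorTmpl = `{{- if eq .Values.monitoring.agent "prometheus.io/operator" -}}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
spec:
  endpoints:
  - port: metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
{{- end -}}`

// renderServiceMonitor renders the template with a values tree shaped like
// Helm's (.Values.monitoring.agent) and returns the resulting manifest text.
func renderServiceMonitor(agent string) (string, error) {
	tmpl, err := template.New("servicemonitor").Parse(serviceMonitorTmpl)
	if err != nil {
		return "", err
	}
	data := map[string]any{
		"Values": map[string]any{
			"monitoring": map[string]any{"agent": agent},
		},
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```

With `agent` set to `prometheus.io/operator` the function returns the manifest; with any other value, such as `prometheus.io/builtin`, it returns an empty string, which mirrors the "no ServiceMonitor" expectation for the default render.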

fargocd chart -- drop the dead `https` (443 -> 9443) port from both the deployment and the service. The operator constructs a controller-runtime webhook server but never registers any handlers (`SetupWebhookWithManager` is not called anywhere), so the TLS endpoint had nothing listening behind it. Metrics (8443) and the in-pod probes port (8081) stay as is.

fargocd-manager chart -- declare matching port surface and a metrics service:

- containerPort 8081 (probes) and 8443 (metrics)
- HTTP /healthz readiness + liveness probes against the probes port (paired with the new --health-probe-bind-address plumbing in the fargocd manager binary)
- ClusterIP Service exposing metrics 8443 -> metrics, with the prometheus.io/builtin annotations matching the fargocd chart
- Optional ServiceMonitor + SA-token Secret gated on monitoring.agent == prometheus.io/operator
- monitoring.{agent,serviceMonitor.labels} values + helper templates mirroring the fargocd chart

The ServiceMonitor uses insecureSkipVerify because the addon-framework controller's serving cert is generated at runtime with SAN=localhost.

Note: values.openapiv3_schema.yaml is auto-generated; the new monitoring block will need a corresponding `Monitoring` field on FargocdManagerSpec (in kubeops.dev/installer/apis) and a `make manifests` run before strict schema validation will accept it.

Signed-off-by: Tamal Saha <tamal@appscode.com>

chart-doc-gen output picks up the new monitoring.agent and monitoring.serviceMonitor.labels keys added to values.yaml.

Signed-off-by: Tamal Saha <tamal@appscode.com>

The addon-framework controller exposes Prometheus /metrics on its :8443 HTTPS endpoint via genericapiserver, but the framework itself registers no collectors. Without a workqueue metrics provider, the endpoint only returns Go runtime + process collectors, which is enough to confirm the pod is alive but says nothing about reconcile backlog or throughput.

Blank-import k8s.io/component-base/metrics/prometheus/workqueue in the manager package so its init() registers the prometheus provider against client-go's workqueue (via workqueue.SetProvider) and adds the standard workqueue_{depth,adds_total,queue_duration_seconds,work_duration_seconds,retries_total,longest_running_processor_seconds,unfinished_work_seconds} collectors to legacyregistry.

The ServiceMonitor shipped by the fargocd-manager chart in kubeops/installer#485 picks these up automatically; no chart change needed.

Signed-off-by: Tamal Saha <tamal@appscode.com>
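The registration mechanism this commit relies on can be sketched in plain Go. Every type and function below is a hypothetical, heavily simplified stand-in written for illustration; it is not client-go's or component-base's code. It shows the general pattern: a package-level provider defaults to a no-op, a one-time `SetProvider` call swaps in a real one, and a queue picks up whichever provider is installed at the moment it is created. In the real change, the blank import's `init()` is what makes that call.

```go
package main

import "sync"

// GaugeMetric and MetricsProvider are simplified stand-ins for the kind of
// interfaces a workqueue package exposes to metrics backends.
type GaugeMetric interface {
	Inc()
	Dec()
}

type MetricsProvider interface {
	NewDepthMetric(queue string) GaugeMetric
}

// The no-op implementations are the default, so queues work without metrics.
type noopGauge struct{}

func (noopGauge) Inc() {}
func (noopGauge) Dec() {}

type noopProvider struct{}

func (noopProvider) NewDepthMetric(string) GaugeMetric { return noopGauge{} }

var (
	provider MetricsProvider = noopProvider{}
	setOnce  sync.Once
)

// SetProvider installs p for queues created afterwards; only the first call
// has an effect. This sketch does not guard reads for concurrent use.
func SetProvider(p MetricsProvider) { setOnce.Do(func() { provider = p }) }

// depthGauge and recordingProvider play the role of a metrics backend.
type depthGauge struct{ value int }

func (g *depthGauge) Inc() { g.value++ }
func (g *depthGauge) Dec() { g.value-- }

type recordingProvider struct{ gauges map[string]*depthGauge }

func (p *recordingProvider) NewDepthMetric(queue string) GaugeMetric {
	g := &depthGauge{}
	p.gauges[queue] = g
	return g
}

// Queue captures its depth gauge from the provider installed at creation time.
type Queue struct {
	items []string
	depth GaugeMetric
}

func NewQueue(name string) *Queue {
	return &Queue{depth: provider.NewDepthMetric(name)}
}

func (q *Queue) Add(item string) {
	q.items = append(q.items, item)
	q.depth.Inc()
}
```

A queue created before `SetProvider` keeps its no-op gauge, which is why the registration has to happen at package initialisation, before any controller builds its queues.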

The chart change in kubeops/installer#485 points liveness/readiness probes at the addon-framework's existing HTTPS /healthz on :8443 (kubelet's httpGet skips TLS verification, so the runtime self-signed cert is fine). With that, the dedicated plain-HTTP server on :8081 has no consumer -- drop the listener, the --health-probe-bind-address flag, and the ProbeAddr option. The workqueue and reflector metrics wired up in the previous commits stay (they live on the same :8443 endpoint as /healthz).

Signed-off-by: Tamal Saha <tamal@appscode.com>

Drop the dedicated probes (8081) container port and point the readiness + liveness httpGet at the metrics port (8443) with scheme: HTTPS. The OCM addon-framework's genericapiserver already serves /healthz there, and kubelet's httpGet probes skip TLS verification so the runtime self-signed cert (SAN=localhost) is not an issue. Drops the dependency on the now-removed --health-probe-bind-address plumbing in kubeops/fargocd#27.

Signed-off-by: Tamal Saha <tamal@appscode.com>

…hook port (#27)

* manager: serve health probes on :8081; drop dead webhook port

The fargocd chart's deployment declared a containerPort 9443 (`https`) and the service forwarded port 443 to it, but the operator only constructs a controller-runtime webhook server -- no admission handlers are ever registered, so nothing was listening behind that TLS endpoint. Drop both from the embedded chart copy under pkg/manager/agent-manifests (the installer-side copy is updated in the kubeops/installer repo).

For the OCM AddOn `fargocd manager` subcommand, the addon-framework controller binds HTTPS on :8443 with a runtime self-signed cert (SAN=localhost), which makes kubelet probes awkward. Stand up a plain HTTP probe server (default :8081) that serves /healthz and /readyz, gated by --health-probe-bind-address (set to empty to disable). The embedded fargocd chart still uses controller-runtime's own probe plumbing on the same port name, so naming stays consistent.

Signed-off-by: Tamal Saha <tamal@appscode.com>

* manager: wire client-go workqueue metrics into /metrics

The addon-framework controller exposes Prometheus /metrics on its :8443 HTTPS endpoint via genericapiserver, but the framework itself registers no collectors. Without a workqueue metrics provider, the endpoint only returns Go runtime + process collectors, which is enough to confirm the pod is alive but says nothing about reconcile backlog or throughput.

Blank-import k8s.io/component-base/metrics/prometheus/workqueue in the manager package so its init() registers the prometheus provider against client-go's workqueue (via workqueue.SetProvider) and adds the standard workqueue_{depth,adds_total,queue_duration_seconds,work_duration_seconds,retries_total,longest_running_processor_seconds,unfinished_work_seconds} collectors to legacyregistry.

The ServiceMonitor shipped by the fargocd-manager chart in kubeops/installer#485 picks these up automatically; no chart change needed.

Signed-off-by: Tamal Saha <tamal@appscode.com>

* manager: wire client-go reflector metrics into /metrics

Pairs with the workqueue blank import: workqueue metrics cover reconcile backlog/throughput on the addon controllers, reflector metrics cover the list/watch behaviour of the informers feeding those reconcilers.

client-go/tools/cache exposes a MetricsProvider interface but ships no off-the-shelf prometheus wrapper, so register a labelled set of collectors (reflector_{lists_total,list_duration_seconds,items_per_list,watches_total,short_watches_total,watch_duration_seconds,items_per_watch,last_resource_version}) against legacyregistry and bind them via SetReflectorMetricsProvider. Each metric is labelled by reflector name (the watched resource type) so per-informer behaviour is distinguishable.

go.mod: promote github.com/prometheus/client_golang from indirect to direct -- it was already vendored transitively, so no vendor/ change.

Signed-off-by: Tamal Saha <tamal@appscode.com>

* manager: drop dedicated probe server; use addon-framework /healthz

The chart change in kubeops/installer#485 points liveness/readiness probes at the addon-framework's existing HTTPS /healthz on :8443 (kubelet's httpGet skips TLS verification, so the runtime self-signed cert is fine). With that, the dedicated plain-HTTP server on :8081 has no consumer -- drop the listener, the --health-probe-bind-address flag, and the ProbeAddr option. The workqueue and reflector metrics wired up in the previous commits stay (they live on the same :8443 endpoint as /healthz).

Signed-off-by: Tamal Saha <tamal@appscode.com>

---------

Signed-off-by: Tamal Saha <tamal@appscode.com>

kodiakhq Bot previously approved these changes Jun 3, 2026

fargocd-manager: regen README for monitoring values

c82ea11

tamalsaha dismissed kodiakhq[bot]’s stale review via c82ea11 June 3, 2026 07:51

kodiakhq Bot previously approved these changes Jun 3, 2026

tamalsaha mentioned this pull request Jun 3, 2026

manager: expose Prometheus workqueue/reflector metrics; drop dead webhook port kubeops/fargocd#27

Merged

4 tasks

tamalsaha dismissed kodiakhq[bot]’s stale review via 492bfc9 June 3, 2026 08:44

kodiakhq Bot approved these changes Jun 3, 2026

tamalsaha changed the title ~~fargocd,fargocd-manager: rationalize chart port surface~~ fargocd: drop dead webhook port; fargocd-manager: add metrics + probes Jun 3, 2026

tamalsaha merged commit 5322394 into master Jun 3, 2026
4 of 8 checks passed

tamalsaha deleted the charts-port-surface branch June 3, 2026 09:46

tamalsaha merged 3 commits into master from charts-port-surface
