fargocd: drop dead webhook port; fargocd-manager: add metrics + probes #485
Merged
fargocd chart -- drop the dead `https` (443 -> 9443) port from both the
deployment and the service. The operator constructs a controller-runtime
webhook server but never registers any handlers (`SetupWebhookWithManager`
is not called anywhere), so the TLS endpoint had nothing listening
behind it. Metrics (8443) and the in-pod probes port (8081) stay as is.
fargocd-manager chart -- declare matching port surface and a metrics
service:
- containerPort 8081 (probes) and 8443 (metrics)
- HTTP /healthz readiness + liveness probes against the probes port
(paired with the new --health-probe-bind-address plumbing in the
fargocd manager binary)
- ClusterIP Service exposing metrics 8443 -> metrics, with the
prometheus.io/builtin annotations matching the fargocd chart
- Optional ServiceMonitor + SA-token Secret gated on
monitoring.agent == prometheus.io/operator
- monitoring.{agent,serviceMonitor.labels} values + helper templates
mirroring the fargocd chart
The ServiceMonitor uses insecureSkipVerify because the addon-framework
controller's serving cert is generated at runtime with SAN=localhost.
Note: values.openapiv3_schema.yaml is auto-generated; the new
monitoring block will need a corresponding `Monitoring` field on
FargocdManagerSpec (in kubeops.dev/installer/apis) and a `make manifests`
run before strict schema validation will accept it.
Signed-off-by: Tamal Saha <tamal@appscode.com>
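The `insecureSkipVerify` note in the commit message above can be illustrated with a small standard-library sketch. It builds a throwaway self-signed certificate whose only SAN is `localhost` and checks it against two names. The service name `fargocd-manager.example.svc` and the helper names are hypothetical stand-ins, and the real serving cert is generated by the addon-framework, not by this code.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedLocalhostCert builds an in-memory self-signed certificate whose
// only SAN is "localhost". It is a stand-in for a runtime-generated serving
// cert, not the addon-framework's actual generation code.
func selfSignedLocalhostCert() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "localhost"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     []string{"localhost"},
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// Template and parent are the same certificate, so it is self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// hostnameAccepted reports whether the certificate is valid for the name.
func hostnameAccepted(cert *x509.Certificate, name string) bool {
	return cert.VerifyHostname(name) == nil
}

func main() {
	cert, err := selfSignedLocalhostCert()
	if err != nil {
		panic(err)
	}
	// A scraper dials the pod or Service address, never "localhost".
	fmt.Println("localhost:", hostnameAccepted(cert, "localhost"))
	fmt.Println("service name (hypothetical):", hostnameAccepted(cert, "fargocd-manager.example.svc"))
}
```

A client that dials the Service or pod address would fail hostname verification against such a cert (and chain verification too, since it is self-signed), which is the situation `insecureSkipVerify` works around.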
chart-doc-gen output picks up the new monitoring.agent and monitoring.serviceMonitor.labels keys added to values.yaml.
Signed-off-by: Tamal Saha <tamal@appscode.com>
tamalsaha added a commit to kubeops/fargocd that referenced this pull request on Jun 3, 2026
The addon-framework controller exposes Prometheus /metrics on its
:8443 HTTPS endpoint via genericapiserver, but the framework itself
registers no collectors. Without a workqueue metrics provider, the
endpoint only returns Go runtime + process collectors, which is
enough to confirm the pod is alive but says nothing about reconcile
backlog or throughput.
Blank-import k8s.io/component-base/metrics/prometheus/workqueue in
the manager package so its init() registers the prometheus provider
against client-go's workqueue (via workqueue.SetProvider) and adds
the standard workqueue_{depth,adds_total,queue_duration_seconds,
work_duration_seconds,retries_total,longest_running_processor_seconds,
unfinished_work_seconds} collectors to legacyregistry.
The ServiceMonitor shipped by the fargocd-manager chart in
kubeops/installer#485 picks these up automatically — no chart
change needed.
Signed-off-by: Tamal Saha <tamal@appscode.com>
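The registration mechanism described in the commit message above (a package `init()` that installs a metrics provider once, before any queue is created) can be sketched in a single file using only the standard library. Every type and function here (`MetricsProvider`, `SetProvider`, `depthRegistry`, `Queue`) is a simplified, hypothetical stand-in and not the client-go or component-base API; in the real change the `init()` lives in the blank-imported package.

```go
package main

import (
	"fmt"
	"sync"
)

// MetricsProvider is a hypothetical, trimmed-down provider interface.
// It is NOT client-go's real workqueue.MetricsProvider.
type MetricsProvider interface {
	NewDepthMetric(queue string) func(delta int)
}

// noopProvider is the default: queues work, but nothing is recorded.
type noopProvider struct{}

func (noopProvider) NewDepthMetric(string) func(int) { return func(int) {} }

var (
	provider     MetricsProvider = noopProvider{}
	providerOnce sync.Once
)

// SetProvider installs a provider; only the first call takes effect.
func SetProvider(p MetricsProvider) {
	providerOnce.Do(func() { provider = p })
}

// depthRegistry plays the role of the registry that /metrics is served from.
type depthRegistry struct {
	mu    sync.Mutex
	depth map[string]int
}

func (r *depthRegistry) NewDepthMetric(queue string) func(int) {
	return func(delta int) {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.depth[queue] += delta
	}
}

// Depth returns the recorded depth for one queue.
func (r *depthRegistry) Depth(queue string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.depth[queue]
}

var registry = &depthRegistry{depth: map[string]int{}}

// init runs before main. A blank import relies on exactly this: importing a
// package only for its side effects is enough to install the provider.
func init() { SetProvider(registry) }

// Queue asks whichever provider is installed at construction time.
type Queue struct {
	items []string
	depth func(int)
}

func NewQueue(name string) *Queue {
	return &Queue{depth: provider.NewDepthMetric(name)}
}

// Add enqueues an item and bumps the depth metric.
func (q *Queue) Add(item string) {
	q.items = append(q.items, item)
	q.depth(1)
}

func main() {
	q := NewQueue("addon-controller") // hypothetical queue name
	q.Add("a")
	q.Add("b")
	fmt.Println("depth:", registry.Depth("addon-controller"))
}
```

Without the `init()` call the queue would keep the no-op provider and the registry would stay empty, which mirrors the "Go runtime + process collectors only" situation the commit describes.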
tamalsaha added a commit to kubeops/fargocd that referenced this pull request on Jun 3, 2026
The chart change in kubeops/installer#485 points liveness/readiness probes at the addon-framework's existing HTTPS /healthz on :8443 (kubelet's httpGet skips TLS verification, so the runtime self-signed cert is fine). With that, the dedicated plain-HTTP server on :8081 has no consumer -- drop the listener, the --health-probe-bind-address flag, and the ProbeAddr option. The workqueue and reflector metrics wired up in the previous commits stay (they live on the same :8443 endpoint as /healthz).
Signed-off-by: Tamal Saha <tamal@appscode.com>
Drop the dedicated probes (8081) container port and point the readiness + liveness httpGet at the metrics port (8443) with scheme: HTTPS. The OCM addon-framework's genericapiserver already serves /healthz there, and kubelet's httpGet probes skip TLS verification so the runtime self-signed cert (SAN=localhost) is not an issue. Drops the dependency on the now-removed --health-probe-bind-address plumbing in kubeops/fargocd#27.
Signed-off-by: Tamal Saha <tamal@appscode.com>
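The probe behaviour this commit relies on can be approximated locally. The sketch below starts a throwaway HTTPS server that serves `/healthz` with a certificate the default trust store does not know, then issues one GET with and without certificate verification. The `probe` and `newHealthzServer` helpers are hypothetical and this is not kubelet's code; it only mirrors the documented rule that an HTTPS httpGet probe skips verification and treats any status from 200 up to (but not including) 400 as success.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// probe performs one httpGet-style check against baseURL+"/healthz" and
// treats any status in [200, 400) as healthy.
func probe(baseURL string, skipVerify bool) error {
	client := &http.Client{
		Timeout: 2 * time.Second,
		Transport: &http.Transport{
			// skipVerify=true mimics how an HTTPS probe ignores the cert.
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipVerify},
		},
	}
	resp, err := client.Get(baseURL + "/healthz")
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 400 {
		return fmt.Errorf("unhealthy: status %d", resp.StatusCode)
	}
	return nil
}

// newHealthzServer starts a local HTTPS server whose certificate is not
// signed by any CA in the default trust store, standing in for the
// runtime self-signed serving cert.
func newHealthzServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "ok")
	})
	return httptest.NewTLSServer(mux)
}

func main() {
	srv := newHealthzServer()
	defer srv.Close()

	fmt.Println("probe without verification failed:", probe(srv.URL, true) != nil)
	fmt.Println("probe with verification failed:", probe(srv.URL, false) != nil)
}
```

The verifying client is rejected during the TLS handshake while the non-verifying one reaches `/healthz`, which is why the chart can point probes at the HTTPS metrics port without a dedicated plain-HTTP listener.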
tamalsaha added a commit to kubeops/fargocd that referenced this pull request on Jun 3, 2026
…hook port (#27)

* manager: serve health probes on :8081; drop dead webhook port
The fargocd chart's deployment declared a containerPort 9443 (`https`) and the service forwarded port 443 to it, but the operator only constructs a controller-runtime webhook server -- no admission handlers are ever registered, so nothing was listening behind that TLS endpoint. Drop both from the embedded chart copy under pkg/manager/agent-manifests (the installer-side copy is updated in the kubeops/installer repo).
For the OCM AddOn `fargocd manager` subcommand, the addon-framework controller binds HTTPS on :8443 with a runtime self-signed cert (SAN=localhost), which makes kubelet probes awkward. Stand up a plain HTTP probe server (default :8081) that serves /healthz and /readyz, gated by --health-probe-bind-address (set to empty to disable). The embedded fargocd chart still uses controller-runtime's own probe plumbing on the same port name, so naming stays consistent.
Signed-off-by: Tamal Saha <tamal@appscode.com>

* manager: wire client-go workqueue metrics into /metrics
The addon-framework controller exposes Prometheus /metrics on its :8443 HTTPS endpoint via genericapiserver, but the framework itself registers no collectors. Without a workqueue metrics provider, the endpoint only returns Go runtime + process collectors, which is enough to confirm the pod is alive but says nothing about reconcile backlog or throughput.
Blank-import k8s.io/component-base/metrics/prometheus/workqueue in the manager package so its init() registers the prometheus provider against client-go's workqueue (via workqueue.SetProvider) and adds the standard workqueue_{depth,adds_total,queue_duration_seconds, work_duration_seconds,retries_total,longest_running_processor_seconds, unfinished_work_seconds} collectors to legacyregistry.
The ServiceMonitor shipped by the fargocd-manager chart in kubeops/installer#485 picks these up automatically -- no chart change needed.
Signed-off-by: Tamal Saha <tamal@appscode.com>

* manager: wire client-go reflector metrics into /metrics
Pairs with the workqueue blank import: workqueue metrics cover reconcile backlog/throughput on the addon controllers, reflector metrics cover the list/watch behaviour of the informers feeding those reconcilers.
client-go/tools/cache exposes a MetricsProvider interface but ships no off-the-shelf prometheus wrapper, so register a labelled set of collectors (reflector_{lists_total, list_duration_seconds, items_per_list, watches_total, short_watches_total, watch_duration_seconds, items_per_watch, last_resource_version}) against legacyregistry and bind them via SetReflectorMetricsProvider. Each metric is labelled by reflector name (the watched resource type) so per-informer behaviour is distinguishable.
go.mod: promote github.com/prometheus/client_golang from indirect to direct -- it was already vendored transitively, so no vendor/ change.
Signed-off-by: Tamal Saha <tamal@appscode.com>

* manager: drop dedicated probe server; use addon-framework /healthz
The chart change in kubeops/installer#485 points liveness/readiness probes at the addon-framework's existing HTTPS /healthz on :8443 (kubelet's httpGet skips TLS verification, so the runtime self-signed cert is fine). With that, the dedicated plain-HTTP server on :8081 has no consumer -- drop the listener, the --health-probe-bind-address flag, and the ProbeAddr option. The workqueue and reflector metrics wired up in the previous commits stay (they live on the same :8443 endpoint as /healthz).
Signed-off-by: Tamal Saha <tamal@appscode.com>

---------
Signed-off-by: Tamal Saha <tamal@appscode.com>
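The "labelled by reflector name" idea from the reflector-metrics commit above can be sketched without any Prometheus dependency. The `labelledCounter` type below is a hypothetical stand-in for a counter vector with a single `name` label; it is not the code from kubeops/fargocd#27 and not the client-go `MetricsProvider` interface, and the two reflector names are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// labelledCounter is a hypothetical stand-in for a Prometheus counter vector
// keyed by one "name" label (the reflector name).
type labelledCounter struct {
	metric string
	mu     sync.Mutex
	values map[string]float64
}

func newLabelledCounter(metric string) *labelledCounter {
	return &labelledCounter{metric: metric, values: map[string]float64{}}
}

// ForName returns an increment function bound to one reflector name, similar
// in spirit to a provider handing each reflector its own counter.
func (c *labelledCounter) ForName(name string) func() {
	return func() {
		c.mu.Lock()
		defer c.mu.Unlock()
		c.values[name]++
	}
}

// Expose renders one line per label value, sorted by name, in a format that
// resembles the Prometheus text exposition.
func (c *labelledCounter) Expose() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	names := make([]string, 0, len(c.values))
	for n := range c.values {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "%s{name=%q} %g\n", c.metric, n, c.values[n])
	}
	return b.String()
}

func main() {
	lists := newLabelledCounter("reflector_lists_total")

	// Hypothetical reflector names; each informer gets its own series.
	addons := lists.ForName("ManagedClusterAddOn")
	cluster := lists.ForName("ClusterManagementAddOn")

	addons()
	addons()
	cluster()

	fmt.Print(lists.Expose())
}
```

Because each reflector increments its own series, a stalled or relisting informer shows up as a change in one label value instead of being averaged into a single counter.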
Summary
Two related chart changes that align the deployed port surface with what the binaries actually serve.

charts/fargocd: drop dead webhook port
Removes `containerPort: 9443` from the deployment and the `https` 443 → 9443 entry from the service. The operator constructs a controller-runtime `webhook.NewServer(...)` but never registers any handlers (`SetupWebhookWithManager` is not called), so the TLS endpoint had nothing listening behind it. Metrics (8443) and the in-pod probes port (8081) are unchanged.

charts/fargocd-manager: add metrics surface and real probes
The hub-side OCM AddOn manager pod previously declared no ports, ran no probes, and had no Service. The addon-framework's `genericapiserver` already listens on `:8443` HTTPS; this PR wires it up:
- `containerPort: 8443` named `metrics`
- `httpGet /healthz` against `port: metrics` with `scheme: HTTPS` (kubelet probes skip TLS verification, so the addon-framework's runtime self-signed cert is fine)
- `Service` exposing `metrics` 8443 → `metrics`, plus the `prometheus.io/builtin` annotations
- `ServiceMonitor` + SA-token `Secret`, gated on `monitoring.agent == prometheus.io/operator`. Uses `insecureSkipVerify: true` because the addon-framework's serving cert SAN is `localhost`.
- `monitoring.{agent, serviceMonitor.labels}` values + helper templates mirroring the fargocd chart
Pairs with kubeops/fargocd#27 which wires real workqueue + reflector metrics into the manager's `/metrics`. The ServiceMonitor will scrape successfully without that PR too; it'll just see Go runtime + process metrics until a fargocd image carrying #27 lands.

Caveat
`charts/fargocd-manager/values.openapiv3_schema.yaml` is auto-generated from `apis/installer/v1alpha1/fargocd_manager_types.go` via `make manifests`. The new `monitoring` block needs a `Monitoring` field added to `FargocdManagerSpec` (mirroring `FargocdSpec`) and a regen run before strict schema validation will accept it. Kept out of scope here to keep this PR chart-only.

Test plan
- `helm template charts/fargocd` renders without the `https` / 9443 entries
- `helm template charts/fargocd-manager` renders Service + Deployment with probes against `port: metrics scheme: HTTPS` (no ServiceMonitor)
- `helm template charts/fargocd-manager --set monitoring.agent=prometheus.io/operator` renders the ServiceMonitor + SA-token Secret with `insecureSkipVerify: true`
- liveness/readiness probes against `/healthz` pass
- Prometheus picks up `workqueue_*` and `reflector_*` series via the ServiceMonitor