feat(kubescape): route runtime-detection alerts to Headlamp, Slack, and Coroot by devantler · Pull Request #2445 · devantler-tech/platform

devantler · 2026-07-04T11:38:58Z

🤖 Generated by the Daily AI Assistant

Why

The Headlamp Kubescape plugin's Runtime Detection → Alerts page shows "Alertmanager URL is not configured", and Kubescape's runtime-detection alerts (rule violations, malware) were flowing nowhere. That tab reads only from a Prometheus Alertmanager, which the Coroot migration removed from the cluster — so there was no source to point it at.

What

Reintroduces a single, tiny, Kubescape-scoped Alertmanager (prod-only; not a return of the old Prometheus stack) and points the node-agent at it, so runtime alerts now reach all three intended places: the Headlamp plugin, Slack (the existing shared webhook), and Coroot (via the node-agent's stdout, which Coroot's log capture surfaces).

Operational notes

One manual step: the Headlamp plugin's Alertmanager address is a per-browser setting that can't be seeded declaratively (headlamp#3979). Set it once per operator to kubescape/alertmanager:9093. Until then the data source exists but the tab stays empty. Documented in docs/dr/alerting.md.
New dependency: the prometheus-community Alertmanager Helm chart.
Needs a direct merge after promotion (trusted-author PR).

…nd Coroot

The Headlamp Kubescape plugin's "Runtime Detection > Alerts" tab warned "Alertmanager URL is not configured" because that tab reads ONLY from a Prometheus Alertmanager (GET /api/v2/alerts), and the Coroot migration removed Alertmanager from the cluster, so there was no source and the node-agent exported its runtime alerts nowhere.

Reintroduce a single minimal Alertmanager (prometheus-community chart 1.40.1, ~10m/32Mi, emptyDir, hardened securityContext) scoped to the kubescape namespace, prod-only; NOT a re-adoption of the Prometheus stack. Wire the node-agent to fan each alert out to all three destinations:

* Headlamp: nodeAgent.config.alertManagerExporterUrls -> the Alertmanager, which the plugin queries. (One manual per-user step remains: set "kubescape/alertmanager:9093" in the plugin settings; the address is browser-local, not declaratively seedable; see headlamp#3979.)
* Slack: the Alertmanager slack_configs receiver -> the shared ${alertmanager_webhook_url} incoming-webhook (same channel as Coroot/Flux).
* Coroot: nodeAgent.config.stdoutExporter (default) -> Coroot's eBPF log capture surfaces the alert in its Logs view (Coroot CE has no alert receiver).

Adds a CiliumNetworkPolicy allowing the Headlamp API-server Service-proxy to reach :9093 and the Alertmanager to reach hooks.slack.com; documents the design and the manual Headlamp step in docs/dr/alerting.md.

Validated: ksail --config ksail.prod.yaml workload validate (485 files), kustomize build of the hetzner controllers overlay, and the naming CI check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
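For readers who want to picture the node-agent wiring described in the commit message, a minimal sketch of the HelmRelease values might look like the fragment below. The two key names come from the commit message; the value shapes, the Service DNS name, and the explicit `true` are assumptions, not the contents of the actual patch file.

```yaml
# Hypothetical fragment of the Kubescape HelmRelease patch (spec.values only).
# Key names are taken from the commit message; the values are assumptions.
spec:
  values:
    nodeAgent:
      config:
        # Assumed to be the in-cluster host:port of the Alertmanager Service
        # (Service "alertmanager" in the "kubescape" namespace).
        alertManagerExporterUrls: alertmanager.kubescape.svc.cluster.local:9093
        # The commit message says this exporter is on by default; it is spelled
        # out here only to show that stdout stays enabled for Coroot's log capture.
        stdoutExporter: true
```

With both exporters active, each alert would be sent to Alertmanager (feeding Headlamp and Slack) and also written to the node-agent's stdout (feeding Coroot).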

coderabbitai · 2026-07-04T11:39:07Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 7240fffb-40f3-4ef3-811d-d15841917556

📥 Commits

Reviewing files that changed from the base of the PR and between 4efee1b and dced919.

📒 Files selected for processing (1)

k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml

🚧 Files skipped from review as they are similar to previous changes (1)

k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml

📜 Recent review details

⏰ Context from checks skipped due to timeout. (2)

GitHub Check: CI - Required Checks
GitHub Check: Analyze (python)

📝 Walkthrough


This PR adds a Kubescape-scoped Alertmanager deployment, network policy, node-agent exporter wiring, and documentation for runtime-detection alert fan-out to Slack, Coroot, and Headlamp.

Changes

Kubescape Alertmanager deployment and wiring

| Layer / File(s) | Summary |
| --- | --- |
| Helm repository and Alertmanager release configuration: `k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-repository.yaml`, `k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml`, `k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml` | Adds the Helm repository, Alertmanager release, and Slack webhook Secret used by the release. |
| CiliumNetworkPolicy for Alertmanager traffic: `k8s/providers/hetzner/infrastructure/controllers/alertmanager/cilium-network-policy.yaml` | Adds ingress on port 9093 and egress to Slack and DNS for Alertmanager pods. |
| Alertmanager kustomization resource list: `k8s/providers/hetzner/infrastructure/controllers/alertmanager/kustomization.yaml` | Defines the Kustomize manifest that includes the Alertmanager repository, release, secret, and network policy resources. |
| Kubescape node-agent exporter patch and controller wiring: `k8s/providers/hetzner/infrastructure/controllers/kubescape/patches/helm-release-patch.yaml`, `k8s/providers/hetzner/infrastructure/controllers/kustomization.yaml` | Adds the Kubescape HelmRelease patch for runtime-detection exporters and wires the alertmanager directory and patch into the controllers kustomization. |
| Alerting documentation update: `docs/dr/alerting.md` | Documents the Kubescape runtime-detection alert integration, the three alert destinations, and the manual Headlamp Alertmanager URL configuration. |
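The network policy summarized in the table (ingress on port 9093, egress to Slack and DNS) could be sketched roughly as follows. This is an illustration under stated assumptions, not the file from the PR: the policy name, the pod label, the `kube-apiserver` entity for the Service-proxy path, and port 443 for Slack are all guesses that would need checking against the cluster.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: alertmanager                  # hypothetical name
  namespace: kubescape
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: alertmanager   # assumed chart label
  ingress:
    # Headlamp reads alerts through the API-server Service proxy, so the
    # source is assumed to be the kube-apiserver entity.
    - fromEntities:
        - kube-apiserver
      toPorts:
        - ports:
            - port: "9093"
              protocol: TCP
  egress:
    # DNS lookups; the dns rule lets Cilium observe answers for toFQDNs.
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    # Outbound webhook delivery to Slack (HTTPS assumed).
    - toFQDNs:
        - matchName: hooks.slack.com
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

Depending on where the API server runs, the ingress source may need a different entity (for example `remote-node` or `host`), which is why the entity above is only an assumption.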

Estimated code review effort: 3 (Moderate) | ~25 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: routing Kubescape runtime-detection alerts to Headlamp, Slack, and Coroot. |
| Description check | ✅ Passed | The description is directly related to the changeset and explains the Alertmanager reintroduction and alert routing changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


devantler · 2026-07-04T12:19:33Z

@coderabbitai review

coderabbitai · 2026-07-04T12:19:39Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

🧹 Nitpick comments (2)

k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml (1)
17-18: 🧹 Nitpick | 🔵 Trivial

Verify prod substitution and consider surfacing delivery failures.

Syntax and key-to-path coupling verified correct. One operational note: if alertmanager_webhook_url is ever missing/renamed in the prod variables Secret, this silently falls back to the .invalid placeholder rather than failing reconciliation, so Slack delivery would quietly break. Alertmanager exposes alertmanager_notifications_failed_total; consider ensuring it's scraped/alerted on (e.g., via Coroot) so a bad substitution doesn't go unnoticed.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml`
around lines 17 - 18, The Alertmanager secret’s
`${alertmanager_webhook_url:=...}` fallback can hide a missing or renamed prod
variable by silently using the placeholder URL, so check the `slack-webhook-url`
substitution path in the secret generation flow and make it fail or surface an
obvious configuration error when the variable is absent. Also ensure
`alertmanager_notifications_failed_total` is being scraped and alerted on (for
example through Coroot) so broken Slack delivery is detected quickly.
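To make the fallback under discussion concrete, here is a minimal sketch of a Secret using Flux post-build substitution with an inline default. Only the variable name and the `slack-webhook-url` key come from the review text; the Secret name is hypothetical, and `example.invalid` merely stands in for the real placeholder, which is elided in this thread.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-slack            # hypothetical name
  namespace: kubescape
stringData:
  # Flux postBuild substitution: if alertmanager_webhook_url is set, its value
  # is used; otherwise the text after ":=" is substituted instead.
  # The default shown here is a stand-in, not the repo's actual placeholder.
  slack-webhook-url: "${alertmanager_webhook_url:=https://example.invalid/placeholder}"
```

With this form a missing variable renders a syntactically valid but non-routable URL rather than failing the build, which is the trade-off the finding is questioning.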
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml (1)
37-86: 🔒 Security & Privacy | 🔵 Trivial | ⚡ Quick win

Consider disabling the default-mounted service account token.

The pod's securityContext is hardened extensively (drop ALL, readOnlyRootFilesystem, non-root, seccomp), but automountServiceAccountToken is left at the chart's default (true), even though this Alertmanager instance has no need to call the Kubernetes API.
🔒 Suggested addition
     fullnameOverride: alertmanager
     replicaCount: 1
+    automountServiceAccountToken: false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml`
around lines 37 - 86, Disable the default-mounted service account token in the
Alertmanager Helm values by setting automountServiceAccountToken to false
alongside the existing podSecurityContext and securityContext hardening in the
alertmanager Helm release values. This Alertmanager instance does not need
Kubernetes API access, so add the setting in the same values block that defines
fullnameOverride, persistence, and extraSecretMounts to keep the pod
least-privileged.
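As a rough picture of the values block this finding refers to, the sketch below combines the settings named in the review and the commit message. It is an assumption-laden illustration: the Secret and volume names, the mount path, and the reading of "~10m/32Mi" as resource requests are hypothetical, and the securityContext hardening is omitted for brevity.

```yaml
# Hypothetical excerpt of the Alertmanager HelmRelease values.
fullnameOverride: alertmanager
replicaCount: 1
automountServiceAccountToken: false   # this instance never calls the Kubernetes API
persistence:
  enabled: false                      # per the review, the chart then uses an emptyDir
resources:
  requests:                           # assumed interpretation of "~10m/32Mi"
    cpu: 10m
    memory: 32Mi
extraSecretMounts:
  - name: slack-webhook               # hypothetical volume name
    secretName: alertmanager-slack    # hypothetical Secret name
    mountPath: /etc/alertmanager/secrets
    readOnly: true
config:
  route:
    receiver: slack
  receivers:
    - name: slack
      slack_configs:
        # api_url_file makes Alertmanager read the webhook URL from the mounted
        # file, keeping it out of the rendered config.
        - api_url_file: /etc/alertmanager/secrets/slack-webhook-url
          send_resolved: true
```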


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5261491e-8f47-45cc-badd-4a6d7130a7f0

📥 Commits

Reviewing files that changed from the base of the PR and between 32ce888 and 4efee1b.

📒 Files selected for processing (8)

docs/dr/alerting.md
k8s/providers/hetzner/infrastructure/controllers/alertmanager/cilium-network-policy.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-repository.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml
k8s/providers/hetzner/infrastructure/controllers/kubescape/patches/helm-release-patch.yaml
k8s/providers/hetzner/infrastructure/controllers/kustomization.yaml

📜 Review details

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{yaml,yml}: Use Kustomize overlays rather than editing base resources directly; k8s/bases/ is immutable from overlays and changes should be made with patches: in provider or cluster overlays.
Keep manifest changes small and use YAML/schema validation before submitting a manifest PR; for files with cluster context, prefer ksail workload validate / kubectl kustomize / kubectl apply --dry-run=client as appropriate.

Files:

k8s/providers/hetzner/infrastructure/controllers/alertmanager/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-repository.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml
k8s/providers/hetzner/infrastructure/controllers/kubescape/patches/helm-release-patch.yaml
k8s/providers/hetzner/infrastructure/controllers/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/cilium-network-policy.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml

k8s/**

📄 CodeRabbit inference engine (AGENTS.md)

k8s/**: Respect Flux dependency order: bootstrap → infrastructure-controllers → infrastructure → apps, with the prod-only infrastructure-overprovisioning layer hanging off infrastructure without gating apps.
Follow the hierarchical Kustomization flow: base configurations in k8s/bases/ feed provider overlays in k8s/providers/, which feed cluster overlays in k8s/clusters/.

Files:

k8s/providers/hetzner/infrastructure/controllers/alertmanager/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-repository.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml
k8s/providers/hetzner/infrastructure/controllers/kubescape/patches/helm-release-patch.yaml
k8s/providers/hetzner/infrastructure/controllers/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/cilium-network-policy.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml

🧠 Learnings (1)

📚 Learning: 2026-07-01T21:13:36.950Z

Learnt from: devantler
Repo: devantler-tech/platform PR: 2359
File: k8s/bases/apps/actual-budget/helm-release.yaml:62-111
Timestamp: 2026-07-01T21:13:36.950Z
Learning: When reviewing Kustomize/Helm YAML in this repo, keep the base vs provider overlay split: `k8s/bases/apps/**` and `k8s/bases/infrastructure/**` should contain each app’s full, environment-agnostic configuration (including base-level postRenderer Kustomize patches such as deployment strategy, topology spread, probes, and env injection). `k8s/providers/{docker,hetzner}/**` should only add small provider-specific deltas (e.g., `interval`, `persistence.size`) via patch files (like `k8s/providers/<provider>/apps/<app>/patches/helm-release-patch.yaml`). If configuration is identical across providers (e.g., OIDC/OAuth env vars where `${domain}` is resolved per cluster via envsubst), it belongs in the base and must not be duplicated into provider overlays.

Applied to files:

k8s/providers/hetzner/infrastructure/controllers/alertmanager/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-repository.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/secret.yaml
k8s/providers/hetzner/infrastructure/controllers/kubescape/patches/helm-release-patch.yaml
k8s/providers/hetzner/infrastructure/controllers/kustomization.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/cilium-network-policy.yaml
k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml

🪛 markdownlint-cli2 (0.22.1)

docs/dr/alerting.md

[warning] 122-122: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (8)

docs/dr/alerting.md (2)

122-124: 📐 Maintainability & Code Quality | ⚡ Quick win

Add a language tag to the example fence.

Line 122 trips MD040. Mark the block as text (or console) so docs lint cleanly.

Proposed fix

-```
+```text
 kubescape/alertmanager:9093

Source: Linters/SAST tools

126-128: 🎯 Functional Correctness | ⚡ Quick win

Verify the proxy-RBAC note.

This section says the plugin reads via the API-server service proxy, but the `get/create` permission claim is specific enough that it should be confirmed against the actual RBAC rule before publishing. If the binding only grants `get` on `services/proxy`, this will mislead operators.

k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-repository.yaml (1)

1-10: LGTM!

k8s/providers/hetzner/infrastructure/controllers/alertmanager/helm-release.yaml (1)

1-118: LGTM! Chart version, `extraSecretMounts` field, and emptyDir-on-disabled-persistence behavior all verified against the upstream `prometheus-community/alertmanager` chart.

k8s/providers/hetzner/infrastructure/controllers/alertmanager/kustomization.yaml (1)

1-9: LGTM!

k8s/providers/hetzner/infrastructure/controllers/alertmanager/cilium-network-policy.yaml (1)

18-31: 🩺 Stability & Availability

Cross-file dependency is already covered

`allow-kubescape` already allows intra-namespace traffic and DNS egress for every `kubescape` pod, so Alertmanager doesn't need additional rules for the node-agent path or `hooks.slack.com` resolution.

Likely an incorrect or invalid review comment.

k8s/providers/hetzner/infrastructure/controllers/kubescape/patches/helm-release-patch.yaml (1)

1-34: LGTM!

k8s/providers/hetzner/infrastructure/controllers/kustomization.yaml (1)

11-17: LGTM!

Also applies to: 80-83


devantler · 2026-07-04T13:33:34Z

🤖 Generated by the Daily AI Assistant

Review-body nitpick resolution record (no threads exist for these):

automountServiceAccountToken — valid, fixed in dced919 (false at chart top level; this Alertmanager never calls the Kubernetes API).
.invalid fallback should fail / surface — refuted as-designed: the ${alertmanager_webhook_url:=…invalid} default is deliberate. Flux postBuild substitution failing on a missing var would block the ENTIRE hetzner infrastructure Kustomization (every controller), which is a far worse failure mode than one alert route going quiet; the placeholder keeps the blast radius at 'Slack delivery off'. The alertmanager_notifications_failed_total scrape/alert idea is a fair follow-up for the Coroot integration but out of scope for this PR.

devantler · 2026-07-04T23:23:03Z

🤖 Generated by the Daily AI Assistant

Resolution record for the CodeRabbit review-body nitpicks (2026-07-04 12:27Z review — no inline threads exist for these):

Silent ${alertmanager_webhook_url:=…invalid} fallback — keeping as-is, by design. The inline default is the repo-wide Flux-substitution convention that keeps ksail workload validate (no SOPS access) and the local/CI overlays building; a strict/hard-fail substitution would wedge the entire hetzner infrastructure Kustomization on one missing variable (the exact blast-radius class of the 2026-07-02 prod wedge). Misconfiguration is not silent in practice: the variable is the SAME one Coroot incidents and the Flux notification-controller already post to (a rename breaks visibly in three systems), the RFC-2606 .invalid placeholder makes a bad substitution obvious in the rendered config, and Alertmanager logs failed notifications — which Coroot's log capture ingests via the same channel as the node-agent stdout alerts.
automountServiceAccountToken: false — already in the HelmRelease values (helm-release.yaml, "never calls the Kubernetes API" block); the finding is stale against the current head.

github-project-automation Bot added this to 🌊 Project Board Jul 4, 2026

github-project-automation Bot moved this to 🫴 Ready in 🌊 Project Board Jul 4, 2026

This was referenced Jul 4, 2026:

feat(kubescape): route runtime threat alerts off stdout → Coroot-native PromQL alert → Slack #2449 (Closed)

roadmap: Kubescape security stack → 100% and hold (posture · CVE · runtime) #2447 (Open)

coderabbitai Bot reviewed Jul 4, 2026

coderabbitai Bot approved these changes Jul 4, 2026

fix(kubescape): do not mount the SA token in the runtime-alert Alertm…

dced919

…anager

It never calls the Kubernetes API; chart default is true.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

devantler wants to merge 2 commits into main from claude/kubescape-runtime-alerts
