🤖 Generated by the Daily AI Assistant
Part of #2447.
Problem
Kubescape runtime threat detection is enabled and working (node-agent healthy, 288 ApplicationProfiles learned) but its alerts go nowhere durable. The node-agent config.json shows stdoutExporter: true with alertManagerExporterUrls: [], prometheusExporterEnabled: false, syslogExporterURL: "", httpExporterConfig: null, and malwareDetectionEnabled: false. So every runtime alert is written to stdout and vanishes — no route to Coroot, Slack, or the daily engineer. There are 0 RuntimeRuleAlertBindings and 0 alerts routed anywhere in 24h.
Proposed direction (Coroot-native, minimal-custom — settled with the maintainer)
All declarative, no new infrastructure, both ends native to the existing stack:
- kubescape HelmRelease (k8s/bases/infrastructure/controllers/kubescape/helm-release.yaml): enable the node-agent Prometheus exporter (nodeAgent.config.prometheusExporterEnabled: true), exposing node_agent_alert_counter{rule_id,…} on :8080/metrics.
- Expose it to Coroot via the pod annotations Coroot's cluster-agent scrapes (coroot.com/scrape-metrics: "true", coroot.com/metrics-port: "8080"); Coroot uses annotation-based service discovery, not ServiceMonitor.
- Coroot CR (.../coroot/patches/coroot-patch.yaml): add a custom PromQL alertingRules[] entry on increase(node_agent_alert_counter[…]) > 0, routed to Slack via the existing notificationIntegrations webhook (scoped so it doesn't reopen the muted-alerts fatigue).
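A rough sketch of what the HelmRelease side of the first two bullets might look like, assuming a Flux-style HelmRelease with values under spec.values. Only nodeAgent.config.prometheusExporterEnabled and the two coroot.com/* annotations come from the proposal; the podAnnotations key is a hypothetical name and needs checking against the chart's actual values before use. The Coroot CR side is left out here because its field names are still to be validated.

```yaml
# Fragment of k8s/bases/infrastructure/controllers/kubescape/helm-release.yaml (sketch only).
spec:
  values:
    nodeAgent:
      config:
        # From the proposal: turns on the node-agent Prometheus exporter.
        prometheusExporterEnabled: true
      # ASSUMPTION: "podAnnotations" is a hypothetical key; confirm the real
      # values key for node-agent pod annotations in the chart.
      podAnnotations:
        # Annotations Coroot's cluster-agent uses for scrape discovery.
        coroot.com/scrape-metrics: "true"
        coroot.com/metrics-port: "8080"
```

If the chart has no values key for node-agent pod annotations, the same two annotations would have to be applied another way (for example a post-render patch), which is worth confirming before sizing the work.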
Trade-off (accepted): the Prometheus exporter emits counters (rule_id + pod + namespace), so the Slack message is "kubescape rule X fired on pod Y — click through"; the full incident payload stays in the node-agent logs Coroot already ingests. The non-native alternative (a dedicated Alertmanager for the richer AlertManager-exporter payload) is deliberately rejected.
Validate live before committing (the docs are unclear on both points): (a) that cluster-agent-scraped custom series are queryable in Coroot PromQL alert rules; (b) the exact Coroot CR notificationIntegrations / alertingRules field names against the live CRD (kubectl explain coroot.spec…). As a follow-up, consider whether to also enable malwareDetectionEnabled (resource cost vs. coverage).
Rough size
M.
Acceptance criteria
- Runtime detections surface as a Coroot alert and reach Slack (counter-level, with click-through), fully declarative in Git.
- No stdout-only dead end; no new standalone component.
- The rule is scoped so it does not reintroduce Slack alert fatigue.