fix: null-guard responseObject.metadata.name in ActivityPolicy creates by ecv · Pull Request #672 · milo-os/milo

ecv · 2026-06-27T03:18:43Z

What

Make the create rule summaries null-safe in six ActivityPolicies that
dereference audit.responseObject.metadata.name. Guard the leaf and fall back
to audit.objectRef.name.

Why

Same DLQ-leak class as the prod DLQSlowLeak on the NSO gateway policy
(milo-os/activity#212). A create rule matches on verb/requestObject, so a
rejected create (409/422/admission-deny) still matches — but the apiserver
returns a Status object as responseObject with no metadata.name. CEL
raises no such key: name, the event goes to the DLQ, and retries fail
identically. Latent until the first rejected create of each kind.

Fixed (create rules)

config/services/activity/policies/resourcemanager/ — project, organization
config/services/activity/policies/iam/ — role, group, serviceaccount
config/services/identity/policies/ — serviceaccount

Remediation

Merge + milo release → the milo-activity-policies Flux Kustomization
re-applies the corrected CRs; processor retries any DLQ backlog. No kubectl.

NSO fix (gateway + 5 more): fix: null-guard responseObject derefs across NSO ActivityPolicies (DLQ leak) datum-cloud/network-services-operator#230
billing fix: milo-os/billing (separate PR)
Runbook fix: fix: correct DLQ runbook CEL null-check guidance activity#213 · Investigation: fix: gateway ActivityPolicy summary CEL leaks DLQ on events missing responseObject.metadata.name activity#212

The create rules dereference audit.responseObject.metadata.name in their summary, but a rejected create (409/422/admission-deny) returns a Status object with no metadata.name. The rules still match (keyed on verb/request), so CEL raises "no such key: name", the event goes to the DLQ, and retries fail identically -- a slow DLQ leak (same class as the gateway policy leak that fired DLQSlowLeak in prod). Guard the leaf and fall back to audit.objectRef.name across the resourcemanager (project, organization), iam (role, group, serviceaccount), and identity (serviceaccount) create rules.

…creates The create summaries guarded responseObject.metadata.name and fell back to audit.objectRef.name, but a generateName create rejected before a name is assigned has an EMPTY objectRef.name. The else branch then derefs audit.objectRef.name on a Status responseObject and raises "no such key: name" again, so the event still dead-letters. The assigned name is carried on the Status at responseObject.details.name. Replace each create summary with a per-level-guarded fallback chain: responseObject.metadata.name -> objectRef.name -> responseObject.details.name -> literal The details branch is guarded with has(responseObject.details) because some Status responses carry no details. Confirmed live in the sibling NSO repo: activity-processor logs show generateName creates (Connector) re-failing the DLQ retry on the objectRef.name branch. Same latent gap exists for every metadata.name create rule here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ecv · 2026-06-27T04:58:25Z

Blast radius + suggested follow-up (not in this PR)

While extending this fix (added responseObject.details.name fallback for generateName creates rejected before naming — see latest commit), I swept the rest of the activity policies for the same responseObject deref bug class. Summary of what's left:

Already safe — these non-create rules deref responseObject but their match clause guards it down to the ref object, so a rejected/Status response never reaches the summary:

resourcemanager/organizationmembership (delete) — has(responseObject.spec) && has(responseObject.spec.userRef)
notification/contactgroupmembership (delete) — guards contactRef + contactGroupRef
iam/groupmembership (delete) — guards userRef + groupRef
iam/userdeactivation (delete) — guards userRef
quota/resourcegrant (delete) — guards consumerRef
iam/userinvitation (accept/decline/cancel) — guards responseObject.status.organization.displayName fully → safe

Residual (low-probability) gap: each of the above derefs one level deeper than its match guards — e.g. match checks ...spec.userRef, summary uses ...spec.userRef.name. If the ref object is ever present without name, CEL raises no such key: name → DLQ. In practice refs always carry a name, so this is latent, not active. No DLQ evidence for these.

Recommendation: a separate follow-up to push each guard to the leaf (&& has(...userRef.name)) or fall the summary back to objectRef.name. Out of scope for this PR — flagging so it's tracked, not pursuing it here.

This was referenced Jun 27, 2026

fix: guard ref names in billingaccountbinding create policy milo-os/billing#69

Open

Validate ActivityPolicy CEL for unguarded responseObject derefs (DLQ-leak guard) milo-os/activity#214

Open

ecv marked this pull request as ready for review June 28, 2026 04:56

ecv requested review from JoseSzycho and scotwells June 29, 2026 22:37

ecv enabled auto-merge June 30, 2026 01:44

JoseSzycho approved these changes Jun 30, 2026

View reviewed changes

ecv merged commit f865d85 into main Jun 30, 2026
5 checks passed

ecv deleted the fix/activitypolicy-create-summary-dlq-guard branch June 30, 2026 13:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: null-guard responseObject.metadata.name in ActivityPolicy creates#672

fix: null-guard responseObject.metadata.name in ActivityPolicy creates#672
ecv merged 2 commits into
mainfrom
fix/activitypolicy-create-summary-dlq-guard

ecv commented Jun 27, 2026

Uh oh!

ecv commented Jun 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ecv commented Jun 27, 2026

What

Why

Fixed (create rules)

Remediation

Related

Uh oh!

ecv commented Jun 27, 2026

Blast radius + suggested follow-up (not in this PR)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants