Skip to content

feat(sbommanager): key SBOM name on (digest, syftVersion)#823

Closed
jnathangreeg wants to merge 1 commit into
mainfrom
feat/naut-1310-sbom-digest-scoped-key
Closed

feat(sbommanager): key SBOM name on (digest, syftVersion)#823
jnathangreeg wants to merge 1 commit into
mainfrom
feat/naut-1310-sbom-digest-scoped-key

Conversation

@jnathangreeg
Copy link
Copy Markdown
Contributor

@jnathangreeg jnathangreeg commented May 26, 2026

Summary

Aligns sbom_manager's reservation contract with content-addressed storage: the SBOMSyft resource name is now derived from (imageDigest, syftVersion) rather than (imageTag, imageID). The tag becomes metadata via the existing ImageTagMetadataKey annotation.

Why

Two tag aliases of the same image (e.g. myapp:v1 + myapp:latest with identical digest) currently produce different SBOM names and get tracked as independent reservations — even though the underlying SBOM content is byte-identical. Downstream backends that store SBOMs content-addressed (e.g. kubescape/backend's gRPC PutSBOMStream keyed on (image_digest, syft_version)) can't faithfully implement the slug-scoped contract, so the adapter has to either dedupe across tags (which masks per-slug state) or layer machinery to maintain a slug→key index.

Per the discussion on armosec/private-node-agent#316: "Make the contract digest-scoped end to end in node-agent too (digest + syftVersion as the real key, tag/name only as metadata)."

What changes

pkg/sbommanager/v1/sbom_manager.go:

  • normalizedID := normalizeImageID(...) is computed first.
  • sbomName, err := names.ImageInfoToSlug(s.version, normalizedID) — last 6 chars of the digest hex provide the image identity, the sanitized syft version distinguishes tool-version upgrades.
  • The tag continues to be set on Annotations[ImageTagMetadataKey] for traceability.

No interface changes (SbomClient signatures unchanged); no in-cluster Storage impl changes; no mock changes. The semantic shift lives entirely in how sbomName is composed.

Behavioral notes

  • Same image, different tags, same digest → one SBOMSyft. (was: two)
  • Same image, same digest, different syft versions → two SBOMSyfts. (unchanged)
  • Tool-version upgrade still creates a new resource (different name).
  • TooLarge / Learning / NodeName branching in processContainerWithMetadata keeps the same shape because the annotations on the existing row carry that state.

Test plan

  • Existing pkg/sbommanager/v1/... tests pass.
  • Manual: two pods referencing the same digest under different tags produce a single SBOM row.
  • Manual: bumping syft version produces a new SBOM row.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor
    • Updated SBOM name generation to use normalized image identifiers, eliminating redundant computations during container processing.

Review Change Stack

Previously sbom_manager built the SBOMSyft resource name from
ImageInfoToSlug(imageTag, imageID), which mixed the image tag into the
reservation key. Two references to the same digest under different tags
(e.g. myapp:v1 + myapp:latest with identical digest) produced different
SBOM names and got tracked as independent reservations even though the
underlying SBOM content is byte-identical.

Switch the name composition to (digest, syftVersion):
- normalizedID is computed first.
- The slug is built from (s.version, normalizedID) so the digest hash
  stub identifies the image and the (sanitized) syft version
  distinguishes tool-version upgrades.
- The original image tag is preserved as ImageTagMetadataKey on the
  annotations for traceability — tag/name is metadata now, not key.

This aligns sbom_manager's reservation contract with the way
content-addressed storage layers (e.g. the kubescape/backend gRPC
store) actually key SBOMs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a71e2b3b-2c28-4696-9675-6fd96f10a898

📥 Commits

Reviewing files that changed from the base of the PR and between 11a512d and 9cfb8cd.

📒 Files selected for processing (1)
  • pkg/sbommanager/v1/sbom_manager.go

📝 Walkthrough

Walkthrough

The change refactors processContainerWithMetadata to compute normalized image identifiers once and reuse the result. Previously, normalizedID was derived later during SBOM construction; now it is calculated earlier and shared across SBOM name generation and metadata annotations.

Changes

Image ID Normalization Deduplication

Layer / File(s) Summary
Deduplicate image ID normalization
pkg/sbommanager/v1/sbom_manager.go
normalizedID is computed earlier using the image tag and ID, passed to names.ImageInfoToSlug(s.version, normalizedID) for SBOM naming, and reused for metadata annotations. The redundant later calculation of normalizedID is removed.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A normalized ID, computed with care,
No longer duplicated here and there!
One value early, reused throughout,
Cleaner code, without a doubt! 🎯

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: deriving SBOM names from (digest, syftVersion) instead of the previous approach using image tags and IDs.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/naut-1310-sbom-digest-scoped-key

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@matthyx matthyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking issue: this PR changes the SBOMSyft name contract in node-agent, but kubevuln still looks these objects up by the old image-slug-based name.

In node-agent the name now comes from names.ImageInfoToSlug(s.version, normalizedID) in pkg/sbommanager/v1/sbom_manager.go, while kubevuln still computes/uses names.ImageInfoToSlug(imageTagNormalized, imageHash) / workload.ImageSlug in controllers/http.go and core/services/scan.go, then does a direct SBOMSyfts(...).Get(name) in repositories/apiserver.go.

That changes the shared storage key, so kubevuln will stop finding node-agent-generated SBOMs and will regenerate them instead of reusing them. I think this has to be a coordinated cross-repo change (plus migration/cleanup for existing old-name SBOMSyft objects), or the existing name contract should stay in place and dedupe happen another way.

@matthyx
Copy link
Copy Markdown
Contributor

matthyx commented May 26, 2026

this won't work, we have to find another way without renaming the SBOM... because kubevuln expects a certain name to reuse the SBOM (and it'll be the same for backend)
let's discuss tomorrow

@github-actions
Copy link
Copy Markdown

Performance Benchmark Results

Node-Agent Resource Usage
Metric BEFORE AFTER Delta
Avg CPU (cores) 0.213 0.209 -2.0%
Peak CPU (cores) 0.223 0.217 -2.9%
Avg Memory (MiB) 323.997 334.188 +3.1%
Peak Memory (MiB) 325.504 346.473 +6.4%
Dedup Effectiveness (AFTER only)
Event Type Passed Deduped Ratio
capabilities 0 0 N/A
hardlink 6000 0 0.0%
http 1705 119456 98.6%
network 902 77862 98.9%
open 33489 622734 94.9%
symlink 6000 0 0.0%
syscall 989 1924 66.0%
Event Counters
Metric BEFORE AFTER
capability_counter 10 9
dns_counter 1395 1431
exec_counter 7041 7228
network_counter 92506 94956
open_counter 770140 792698
syscall_counter 3550 3546

@jnathangreeg
Copy link
Copy Markdown
Contributor Author

Confirmed the cross-coupling: kubevuln still computes names.ImageInfoToSlug(imageTagNormalized, c.ImageHash) at controllers/http.go:193 and does a direct SBOMSyfts(ns).Get(name) via repositories/apiserver.go:904 — renaming the SBOMSyft in node-agent would break kubevuln's SBOM-reuse path.

Converting this PR to draft until we settle on a different approach (no rename). Happy to sync whenever works for you.

@jnathangreeg jnathangreeg marked this pull request as draft May 27, 2026 07:12
@matthyx matthyx moved this to To Archive in KS PRs tracking May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

2 participants