Implement Runtime NVMe Instance Storage Discovery Using AWS EBS Symlinks #396

Merged
neddp merged 17 commits into main from fix-nvme-instance-storage-discovery on Jun 12, 2026
Conversation

@neddp

@neddp neddp commented Feb 2, 2026

Member

Problem

On AWS Nitro-based instances with NVMe devices, the kernel's PCIe enumeration order is non-deterministic. This means:

  • /dev/nvme0n1 could be the root EBS volume OR instance storage
  • /dev/nvme1n1 could be instance storage OR the root EBS volume
  • The order varies between boots and instance types
  • There is no guaranteed ordering

Solution

Implemented runtime discovery to reliably identify instance storage by excluding EBS volumes.

Discovery Algorithm

  1. Glob all NVMe devices: /dev/nvme*n1
  2. Glob EBS symlinks: /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*
  3. Resolve each symlink to its target device
  4. Subtract the EBS devices from the full NVMe device list; the remainder is instance storage
  5. Validate count matches CPI expectations
  6. Partition only the discovered instance storage devices
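
The steps above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: the function names `subtractDevices` and `discoverInstanceStorage` and the expected count of 1 are assumptions made for the example. Only `filepath.Glob` and `filepath.EvalSymlinks` are real standard-library calls.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// subtractDevices returns every device in all that is not in managed,
// sorted so the result does not depend on enumeration order.
// (Hypothetical helper, named for this sketch only.)
func subtractDevices(all, managed []string) []string {
	exclude := make(map[string]struct{}, len(managed))
	for _, d := range managed {
		exclude[d] = struct{}{}
	}
	var remaining []string
	for _, d := range all {
		if _, found := exclude[d]; !found {
			remaining = append(remaining, d)
		}
	}
	sort.Strings(remaining)
	return remaining
}

// discoverInstanceStorage covers steps 4 and 5: subtract the managed (EBS)
// devices, then check the count against what the CPI expects.
func discoverInstanceStorage(all, managed []string, expected int) ([]string, error) {
	instanceStorage := subtractDevices(all, managed)
	if len(instanceStorage) != expected {
		return nil, fmt.Errorf("expected %d instance storage devices, found %d: %v",
			expected, len(instanceStorage), instanceStorage)
	}
	return instanceStorage, nil
}

func main() {
	// Step 1: glob all NVMe namespaces.
	all, err := filepath.Glob("/dev/nvme*n1")
	if err != nil {
		fmt.Println("glob failed:", err)
		return
	}
	// Steps 2 and 3: glob the EBS symlinks and resolve each one.
	links, err := filepath.Glob("/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_*")
	if err != nil {
		fmt.Println("glob failed:", err)
		return
	}
	var managed []string
	for _, link := range links {
		target, err := filepath.EvalSymlinks(link)
		if err != nil {
			fmt.Println("cannot resolve", link, ":", err)
			return
		}
		managed = append(managed, target)
	}
	const expected = 1 // assumed CPI expectation, for illustration only
	devices, err := discoverInstanceStorage(all, managed, expected)
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println("instance storage:", devices)
}
```

Sorting the result is a choice made in this sketch so that the partitioning step sees the same order regardless of how the kernel enumerated the devices.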

Why EBS Symlinks Are Reliable

AWS automatically creates persistent symlinks for all EBS volumes via udev rules:

/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol{volume_id}

Backwards Compatibility

Non-NVMe instances: No changes to behavior

  • Traditional Xen instances (/dev/xvdb, /dev/sdb) use CPI paths directly
  • Paravirtual instances work as before

This must be merged together with the CPI changes - cloudfoundry/bosh-aws-cpi-release#196


Pair @Ivaylogi98

@rkoster rkoster left a comment

Contributor


In general I would have expected this logic to go into the https://github.com/cloudfoundry/bosh-agent/tree/main/infrastructure/devicepathresolver package.

Comment thread platform/linux_platform.go Outdated
@github-project-automation github-project-automation Bot moved this from Inbox to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group Feb 3, 2026
@neddp

neddp commented Feb 3, 2026

Member Author

> In general I would have expected this logic to go into the https://github.com/cloudfoundry/bosh-agent/tree/main/infrastructure/devicepathresolver package.

Thank you for the review! That was a big oversight on my end; I'll look into it.

@rkoster

rkoster commented Feb 3, 2026

Contributor

No worries 🙂

@beyhan

beyhan commented Feb 5, 2026

Member

We discussed this during the FI WG meeting, and this has to rely on the stemcell agent settings and the agent's strategy for disk handling.

@neddp neddp requested a review from rkoster February 9, 2026 14:27
@neddp neddp changed the title Implement Runtime NVMe Instance Storage Discovery Using EBS Symlinks Implement Runtime NVMe Instance Storage Discovery Using AWS EBS Symlinks Feb 9, 2026
@rkoster

rkoster commented Feb 12, 2026

Contributor

As discussed during the working group meeting, focus is now on validating: cloudfoundry/bosh-aws-cpi-release#196 (comment)

@rkoster

rkoster commented Feb 19, 2026

Contributor

As per: cloudfoundry/bosh-aws-cpi-release#196 (comment) this change is still needed. Please continue reviewing.

@rkoster rkoster requested review from a team and ramonskie and removed request for a team February 19, 2026 15:54

@rkoster rkoster left a comment

Contributor


I still think this PR could be done in an IaaS-agnostic way.

Comment thread infrastructure/devicepathresolver/aws_nvme_instance_storage_resolver.go Outdated
Comment thread infrastructure/devicepathresolver/aws_nvme_instance_storage_resolver.go Outdated
neddp and others added 3 commits February 25, 2026 14:06
* Refactor instance storage discovery into configurable component

Implement auto-detection for instance storage disk type

* Fix windows tests

* Fix windows tests (but for real this time)
@neddp neddp force-pushed the fix-nvme-instance-storage-discovery branch from e7d00b4 to dcd857a Compare February 25, 2026 12:09
@rkoster

rkoster commented Mar 26, 2026

Contributor

@neddp could you take a look at these failing unit tests?

@neddp

neddp commented Mar 26, 2026

Member Author

Hi @rkoster,

We still haven't had the time to test the changes on an actual deployment. I will move the PR to draft until we can confirm everything is working fine.

We'll address the tests as well.

@neddp neddp marked this pull request as draft March 26, 2026 13:04
* Make implementation iaas-agnostic

* Rename storage resolver files

* Fix tests

* Remove instance storage resolver

* Don't use the aws pattern as default

* Refactor NVMe instance storage discovery and remove unused symlink patterns

* Enhance NVMe instance storage discovery with managed volume pattern support

* Fix unit tests

* Don't run windows unit tests when not supported

* Simplify FakeDevicePathResolver by removing unused fields and methods

* Wait for udev to settle before resolving EBS symlinks

* Add debug logs

* Import udev and add comment about why it's needed
@coderabbitai

coderabbitai Bot commented Apr 30, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Walkthrough

Adds SymlinkDeviceResolver with NVMe constants, constructor, ResolveSymlinksToDevices, GetDevicesByPattern, and FilterDevices plus Ginkgo tests. Wires a SymlinkDeviceResolver into NewProvider and NewLinuxPlatform. Refactors Linux SetupRawEphemeralDisks to discover instance-storage devices (NVMe glob + symlink exclusion or identity resolution), sort/validate discovered devices, and partition discovered device paths. Also updates fake resolver recording and platform tests to inject and use the new resolver.

Suggested reviewers

  • aramprice
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: implementing runtime NVMe instance storage discovery using AWS EBS symlinks as an exclusion mechanism. |
| Description check | ✅ Passed | The description comprehensively explains the problem (non-deterministic NVMe enumeration), the solution (runtime discovery via EBS symlink exclusion), the algorithm, and backwards compatibility considerations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@neddp

neddp commented Apr 30, 2026

Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 30, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@neddp neddp marked this pull request as ready for review April 30, 2026 12:33
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 1, 2026
DiskSettings does not implement fmt.Stringer, so using %s produced
malformed %!s(...) output. Switch to %+v for actionable error messages.
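
The formatting problem described in this commit message can be reproduced with a small sketch. The `diskSettings` type below is a hypothetical stand-in with made-up fields, not the agent's actual `DiskSettings` struct, and `render` is a helper invented for this example. The point is only that `%s` applied to a struct with non-string fields yields `%!s(...)` fragments, while `%+v` prints field names and values.

```go
package main

import "fmt"

// diskSettings is a hypothetical stand-in for a settings struct that does
// not implement fmt.Stringer. The fields are invented for illustration.
type diskSettings struct {
	ID   string
	Path string
	Size int
}

// render formats d with the given verb. The format is passed as a variable
// so both verbs can be compared side by side.
func render(format string, d diskSettings) string {
	return fmt.Sprintf(format, d)
}

func main() {
	d := diskSettings{ID: "disk-1", Path: "/dev/nvme1n1", Size: 10}
	fmt.Println(render("%s", d))  // {disk-1 /dev/nvme1n1 %!s(int=10)}
	fmt.Println(render("%+v", d)) // {ID:disk-1 Path:/dev/nvme1n1 Size:10}
}
```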
@neddp

neddp commented Jun 1, 2026

Member Author

Both suggestions were addressed.

@coderabbitai coderabbitai Bot left a comment



Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
platform/linux_platform.go (1)

859-875: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fail closed when managed-volume symlink resolution is incomplete.

managedDevices is treated as authoritative here, but the resolver upstream currently skips unreadable symlinks. If one EBS symlink is missed and the filtered count still matches len(devices), the code below can mklabel a managed volume or even the root disk. Please make unresolved managed-volume symlinks abort discovery instead of continuing with a partial exclusion set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@platform/linux_platform.go` around lines 859 - 875, ResolveSymlinksToDevices
currently may skip unreadable symlinks, letting managedDevices be incomplete;
change the resolver to surface skipped/unresolved symlinks (e.g., change
ResolveSymlinksToDevices to return (devices []string, skipped int, err error) or
return an error when any symlink cannot be read) and update this call site in
linux_platform.go to treat any skipped/unresolved count or non-nil error as
fatal: after calling p.symlinkDeviceResolver.ResolveSymlinksToDevices, if
skipped>0 (or err != nil) return an error instead of continuing, so
managedDevices cannot be partial before calling FilterDevices/instanceStorage
and proceeding with mklabel operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@platform/linux_platform.go`:
- Around line 883-888: The loop calling devicePathResolver.GetRealDevicePath
ignores the boolean timedOut return; update the loop in linux_platform.go (where
devicePathResolver.GetRealDevicePath is invoked) to check the timedOut flag and
treat it as an explicit failure: if timedOut or realPath == "" return a wrapped
error (similar to other call sites) instead of continuing, so the function
returns a clear timeout error rather than allowing an empty path to be passed to
parted; use the same bosherr.WrapErrorf pattern and include context mentioning
the device and that resolution timed out.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ab6aff62-71d1-4d01-9a14-e284e52205e7

📥 Commits

Reviewing files that changed from the base of the PR and between c43c6a1 and 3b63c11.

📒 Files selected for processing (1)
  • platform/linux_platform.go

Comment thread platform/linux_platform.go
@neddp neddp requested a review from aramprice June 1, 2026 05:47
neddp added 2 commits June 1, 2026 09:01
Silently skipping a symlink that cannot be resolved leaves the managed
device exclusion set incomplete. If an EBS volume's by-id symlink is
broken, FilterDevices would not exclude it and the device could be
misidentified as instance storage, potentially causing data loss.

Return a wrapped error instead of continuing, so callers can propagate
the failure rather than proceeding with a partial exclusion set.
The platform-level test still expected the old skip-and-continue behavior.
Updated to assert that a broken managed volume symlink returns an error.
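
The fail-closed behavior described in these commit messages might look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's `ResolveSymlinksToDevices`: the names `resolveFunc`, `resolveAll`, and `fakeResolver` are hypothetical, and the resolver is injected as a function so the example runs without real device nodes. In real use, `filepath.EvalSymlinks` has a compatible signature and could be passed as the resolver.

```go
package main

import "fmt"

// resolveFunc maps a symlink path to its target device path.
// (Hypothetical type for this sketch.)
type resolveFunc func(symlink string) (string, error)

// resolveAll resolves every symlink or fails on the first one it cannot
// resolve. It never returns a partial list, so callers cannot build an
// incomplete exclusion set from it.
func resolveAll(symlinks []string, resolve resolveFunc) ([]string, error) {
	devices := make([]string, 0, len(symlinks))
	for _, link := range symlinks {
		target, err := resolve(link)
		if err != nil {
			return nil, fmt.Errorf("resolving managed volume symlink %q: %w", link, err)
		}
		devices = append(devices, target)
	}
	return devices, nil
}

// fakeResolver builds a resolver backed by a map; unknown symlinks fail,
// which simulates a broken or stale by-id symlink.
func fakeResolver(targets map[string]string) resolveFunc {
	return func(symlink string) (string, error) {
		target, ok := targets[symlink]
		if !ok {
			return "", fmt.Errorf("no such symlink target")
		}
		return target, nil
	}
}

func main() {
	resolve := fakeResolver(map[string]string{
		"/dev/disk/by-id/nvme-good": "/dev/nvme0n1",
	})
	fmt.Println(resolveAll([]string{"/dev/disk/by-id/nvme-good"}, resolve))
	fmt.Println(resolveAll([]string{"/dev/disk/by-id/nvme-good", "/dev/disk/by-id/nvme-invalid"}, resolve))
}
```

Returning `nil` alongside the error, rather than the devices resolved so far, is what keeps a caller from accidentally filtering with a partial list.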
aramprice
aramprice previously approved these changes Jun 1, 2026
@aramprice

Member

/coderabbitai review

@neddp

neddp commented Jun 3, 2026

Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 3, 2026

✅ Action performed

Review finished.

@coderabbitai coderabbitai Bot left a comment



Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@infrastructure/devicepathresolver/symlink_device_resolver_test.go`:
- Around line 88-90: The test currently expects an error from
resolver.ResolveSymlinksToDevices("/dev/disk/by-id/nvme-*") but doesn't assert
the returned mapping is nil; update the test to also assert that the returned
result (the value assigned from ResolveSymlinksToDevices) is nil on failure to
ensure no partial mapping is returned. Locate the call to
ResolveSymlinksToDevices and the variables holding its return values (e.g.,
result, err) and add an assertion like Expect(result).To(BeNil()) immediately
after Expect(err).To(HaveOccurred()) /
Expect(err.Error()).To(ContainSubstring("nvme-invalid")) so the contract on
error paths is enforced.

In `@platform/linux_platform.go`:
- Around line 884-889: The current call to
p.devicePathResolver.GetRealDevicePath handles err before checking timedOut,
which incorrectly treats cases where timedOut==true and err!=nil as a generic
resolver error; update the logic in the function containing this call so that
you check the timedOut boolean first and return the timeout-specific error
(bosherr.Errorf("Timed out resolving device path for %+v", device)) when
timedOut is true, otherwise handle err via bosherr.WrapErrorf; reference
GetRealDevicePath, devicePathResolver, timedOut, and the existing
bosherr.WrapErrorf/bosherr.Errorf calls to locate and adjust the branches
accordingly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8929d5c5-a5fd-4727-be1e-bbcf81c8484c

📥 Commits

Reviewing files that changed from the base of the PR and between c43c6a1 and 27f51d4.

📒 Files selected for processing (4)
  • infrastructure/devicepathresolver/symlink_device_resolver.go
  • infrastructure/devicepathresolver/symlink_device_resolver_test.go
  • platform/linux_platform.go
  • platform/linux_platform_test.go

Comment thread infrastructure/devicepathresolver/symlink_device_resolver_test.go Outdated
Comment thread platform/linux_platform.go
- discoverIdentityInstanceStorage: check timedOut before err so the
  explicit timeout error is never shadowed when both are set
- symlink_device_resolver_test: assert result is nil on failure to
  verify the fail-closed contract
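
The ordering fix in the first bullet can be illustrated with a short sketch. The three-value return shape mirrors what the review comments describe for `GetRealDevicePath`, but `pathLookup`, `resolveRealPath`, and `fixedLookup` are hypothetical names, and the sketch uses the standard library's `fmt.Errorf` rather than the `bosherr` helpers mentioned in the review.

```go
package main

import (
	"errors"
	"fmt"
)

// pathLookup models a resolver that reports a real path, a timeout flag,
// and an error. (Hypothetical type for this sketch.)
type pathLookup func(device string) (realPath string, timedOut bool, err error)

// resolveRealPath checks timedOut before err, so a timeout is always
// reported as a timeout even when the resolver also sets err.
func resolveRealPath(device string, lookup pathLookup) (string, error) {
	realPath, timedOut, err := lookup(device)
	if timedOut {
		return "", fmt.Errorf("timed out resolving device path for %s", device)
	}
	if err != nil {
		return "", fmt.Errorf("resolving device path for %s: %w", device, err)
	}
	if realPath == "" {
		return "", fmt.Errorf("empty device path for %s", device)
	}
	return realPath, nil
}

// fixedLookup returns a lookup that always yields the given values.
func fixedLookup(path string, timedOut bool, err error) pathLookup {
	return func(string) (string, bool, error) { return path, timedOut, err }
}

// errLookup is a sample resolver failure used in the examples below.
var errLookup = errors.New("lookup failed")

func main() {
	// Both timedOut and err are set: the timeout message wins.
	_, err := resolveRealPath("/dev/sdb", fixedLookup("", true, errLookup))
	fmt.Println(err)
	// Only err is set: the underlying error is wrapped.
	_, err = resolveRealPath("/dev/sdb", fixedLookup("", false, errLookup))
	fmt.Println(err)
}
```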
@neddp

neddp commented Jun 5, 2026

Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

✅ Action performed

Review finished.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 5, 2026
rkoster
rkoster previously approved these changes Jun 11, 2026
@github-project-automation github-project-automation Bot moved this from Waiting for Changes | Open for Contribution to Pending Merge | Prioritized in Foundational Infrastructure Working Group Jun 11, 2026
@neddp neddp dismissed stale reviews from rkoster and coderabbitai[bot] via 7c956d4 June 11, 2026 14:42
@neddp neddp merged commit 142e849 into main Jun 12, 2026
23 of 24 checks passed
@github-project-automation github-project-automation Bot moved this from Pending Merge | Prioritized to Done in Foundational Infrastructure Working Group Jun 12, 2026
neddp added a commit that referenced this pull request Jun 14, 2026
…nks (#396)

* Implement runtime NVMe instance storage discovery using EBS symlinks

* Fix nvme instance storage discovery (#400)

* Refactor instance storage discovery into configurable component

Implement auto-detection for instance storage disk type

* Fix windows tests

* Fix windows tests (but for real this time)

* Remove leftover path normalization (#401)

* Fix nvme instance storage discovery (#407)

* Make implementation iaas-agnostic

* Rename storage resolver files

* Fix tests

* Remove instance storage resolver

* Don't use the aws pattern as default

* Refactor NVMe instance storage discovery and remove unused symlink patterns

* Enhance NVMe instance storage discovery with managed volume pattern support

* Fix unit tests

* Don't run windows unit tests when not supported

* Simplify FakeDevicePathResolver by removing unused fields and methods

* Wait for udev to settle before resolving EBS symlinks

* Add debug logs

* Import udev and add comment about why it's needed

* Fix missing closing bracket

* Update infrastructure/devicepathresolver/symlink_device_resolver.go

Co-authored-by: Ivaylo Ivanov <ivaylogi98@gmail.com>

* Update infrastructure/devicepathresolver/symlink_device_resolver_test.go

Co-authored-by: Ivaylo Ivanov <ivaylogi98@gmail.com>

* Fix lint identation

* Fix: skip unresolvable symlinks instead of returning error

ResolveSymlinksToDevices now logs a warning and continues when a
symlink cannot be resolved (e.g. stale/broken symlinks in
/dev/disk/by-id/). This prevents unnecessary deploy failures while
the count validation in discoverNVMeInstanceStorage still catches
any real mismatches.

Co-authored-by: Ivaylo Ivanov <ivaylogi98@gmail.com>

* Use the already constructed udev instance

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix: use %+v instead of %s for DiskSettings in error message

DiskSettings does not implement fmt.Stringer, so using %s produced
malformed %!s(...) output. Switch to %+v for actionable error messages.

* fix: handle timedOut in discoverIdentityInstanceStorage

* resolver: fail hard on unresolvable symlinks in ResolveSymlinksToDevices

Silently skipping a symlink that cannot be resolved leaves the managed
device exclusion set incomplete. If an EBS volume's by-id symlink is
broken, FilterDevices would not exclude it and the device could be
misidentified as instance storage, potentially causing data loss.

Return a wrapped error instead of continuing, so callers can propagate
the failure rather than proceeding with a partial exclusion set.

* test: update linux_platform test for hard-fail on broken symlinks

The platform-level test still expected the old skip-and-continue behavior.
Updated to assert that a broken managed volume symlink returns an error.

* fix: prioritize timedOut over err; assert nil result on resolver error

- discoverIdentityInstanceStorage: check timedOut before err so the
  explicit timeout error is never shadowed when both are set
- symlink_device_resolver_test: assert result is nil on failure to
  verify the fail-closed contract

---------

Co-authored-by: Ivaylo Ivanov <ivaylogi98@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

6 participants