Add SLES support for AMD gpu-operator #365
Merged: sajmera-pensando merged 4 commits into ROCm:main from Priyankasaggu11929:enable-sles-support on May 19, 2026
Commits
55157d9 add support for detecting SLES nodes and automatically selecting appr… (Priyankasaggu11929)
332a340 add SLES Dockerfile template (DockerfileTemplate.sles) using prebuilt… (Priyankasaggu11929)
2adfee0 tests: update internal/utils_test.go for added support for SLES 15 SP7+ (Priyankasaggu11929)
5a0c824 use "registry.suse.com" as the default base image registry if OS == "… (Priyankasaggu11929)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| # SPDX-License-Identifier: MIT | ||
| # | ||
| # Uses a SUSE built AMD GPU driver image as source, avoiding the need | ||
| # for a SUSEConnect registration key. modules.dep uses relative paths so it is | ||
| # valid for any kernel version in the same codestream (stable kABI). | ||
| # | ||
| # Build args: | ||
| # KERNEL_FULL_VERSION - target kernel version, e.g. "6.4.0-150700.53.25-default" | ||
| # SUSE_PREBUILT_DRIVER_IMG - SUSE-published driver image (built against GA kernel) | ||
| | ||
| ARG SUSE_PREBUILT_DRIVER_IMG | ||
| | ||
| FROM ${SUSE_PREBUILT_DRIVER_IMG} AS driver-source | ||
| | ||
| FROM $$BASEIMG_REGISTRY/bci/bci-micro:$$VERSION | ||
| | ||
| ARG KERNEL_FULL_VERSION | ||
| | ||
| # Relocate modules, kernel tree and depmod metadata from the GA kernel path to | ||
| # the target kernel version path; no depmod re-run needed (relative paths). | ||
| COPY --from=driver-source /opt/lib/modules/ /opt/lib/modules-prebuilt/ | ||
| RUN set -euo pipefail; \ | ||
| GA_KERNEL=$(ls /opt/lib/modules-prebuilt/ | head -1); \ | ||
| mkdir -p /opt/lib/modules/${KERNEL_FULL_VERSION}/updates/dkms; \ | ||
| cp /opt/lib/modules-prebuilt/${GA_KERNEL}/updates/dkms/amd* \ | ||
| /opt/lib/modules/${KERNEL_FULL_VERSION}/updates/dkms/; \ | ||
| cp /opt/lib/modules-prebuilt/${GA_KERNEL}/modules.* \ | ||
| /opt/lib/modules/${KERNEL_FULL_VERSION}/; \ | ||
| cp -r /opt/lib/modules-prebuilt/${GA_KERNEL}/kernel \ | ||
| /opt/lib/modules/${KERNEL_FULL_VERSION}/kernel; \ | ||
| rm -rf /opt/lib/modules-prebuilt | ||
| | ||
| RUN mkdir -p /firmwareDir/updates/amdgpu | ||
| COPY --from=driver-source /firmwareDir/updates/amdgpu /firmwareDir/updates/amdgpu |
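The `FROM $$BASEIMG_REGISTRY/bci/bci-micro:$$VERSION` line in the template relies on the operator substituting the `$$` placeholders before the build runs. The following Go sketch illustrates that kind of substitution. It is not the operator's actual code: `renderTemplate` is a hypothetical helper, and using `15.7` as the image tag is an assumption made only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// renderTemplate is a hypothetical helper (not taken from the PR). It fills
// the $$BASEIMG_REGISTRY and $$VERSION placeholders that appear in the
// template with caller-supplied values, and refuses empty inputs so that a
// malformed FROM line is never produced.
func renderTemplate(tmpl, registry, version string) (string, error) {
	if registry == "" || version == "" {
		return "", fmt.Errorf("registry and version must both be set")
	}
	r := strings.NewReplacer(
		"$$BASEIMG_REGISTRY", registry,
		"$$VERSION", version,
	)
	return r.Replace(tmpl), nil
}

func main() {
	// The tag "15.7" is an assumed value for illustration only.
	out, err := renderTemplate(
		"FROM $$BASEIMG_REGISTRY/bci/bci-micro:$$VERSION",
		"registry.suse.com",
		"15.7",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Rejecting empty values up front keeps a missing registry or version from surfacing later as an opaque image-build failure.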
Bug: data tables are inconsistent. This prebuilt-image table only registers `15.7 → 7.0.3`, but `SLESDefaultDriverVersionsMapper` (`utils.go`) defaults to `7.0.2` for SP6 and `6.2.2` for SP5/base. With those defaults, the lookup in `getKM` (around line 572) silently misses and `SUSE_PREBUILT_DRIVER_IMG` is never injected. The new Dockerfile template has no fallback: `FROM ${SUSE_PREBUILT_DRIVER_IMG} AS driver-source` resolves to `FROM AS driver-source`, which fails the build.
Net effect: with the defaults this PR ships, SP5 and SP6 nodes (which the PR claims to support) cannot build. Either add prebuilt entries for the SP5/SP6 default driver versions, narrow `SLESDefaultDriverVersionsMapper` to only return versions present here, or surface an explicit error on lookup miss.
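The failure mode described in this comment can be reproduced in isolation. The Go sketch below is a simplified model, not the builder's real expansion logic: `expandFrom` and `checkFrom` are hypothetical names, and `os.Expand` stands in for build-argument substitution. It shows how a build argument that was never injected leaves the `FROM` line without an image reference, and how a cheap check could catch that before a build is attempted.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// expandFrom is a hypothetical stand-in for build-arg substitution: any
// ${NAME} that is missing from args expands to the empty string.
func expandFrom(line string, args map[string]string) string {
	return os.Expand(line, func(name string) string { return args[name] })
}

// checkFrom rejects an expanded FROM line that has no image reference left,
// i.e. where the token after FROM is missing or is the AS keyword itself.
func checkFrom(expanded string) error {
	fields := strings.Fields(expanded)
	if len(fields) < 2 || strings.EqualFold(fields[1], "AS") {
		return fmt.Errorf("FROM line has no image: %q", expanded)
	}
	return nil
}

func main() {
	line := "FROM ${SUSE_PREBUILT_DRIVER_IMG} AS driver-source"

	// Simulate the lookup miss: the build arg is never injected.
	expanded := expandFrom(line, map[string]string{})
	fmt.Printf("%q\n", expanded)
	fmt.Println(checkFrom(expanded))
}
```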
Thanks for the pointer. `SLESDefaultDriverVersionsMapper` in `utils.go` was supposed to be cleaned up as part of my previous commit refreshes.
I have narrowed the supported versions to SLES 15 SP7 only, keeping just the `15.7 -> 7.0.3` entry in the table, since we are going to build the prebuilt driver container images starting with SLES 15 SP7.
(I'll either send a new PR or update this one to add SLES 16.0 to the table as soon as the respective prebuilt container image is ready.)
I also addressed the silent-skip behavior: the lookup now returns an error message for any version other than SLES 15.7.
Also, please note that I will update the driver version to the new amdgpu driver version, `v31.20`, once the respective prebuilt container image is ready and published. Thanks!