Skip to content

Add CDI for WSLC GPU#40583

Open
kvega005 wants to merge 26 commits into
microsoft:masterfrom
kvega005:gpuCDI
Open

Add CDI for WSLC GPU#40583
kvega005 wants to merge 26 commits into
microsoft:masterfrom
kvega005:gpuCDI

Conversation

@kvega005
Copy link
Copy Markdown
Contributor

@kvega005 kvega005 commented May 18, 2026

Summary of the Pull Request

This pull request implements a CDI spec for the WSL GPU device, adds a runtime hook for container setup, and updates the Docker integration to request GPU resources via CDI. The changes also include schema additions, refactoring, and updates to tests to align with the new approach.

Container Device Interface (CDI) Integration for GPU:

  • Added CDI schema definitions in cdi_schema.h and logic in WSLCInit.cpp to generate a CDI spec for the WSL GPU device, including device nodes, mounts, and a runtime hook (wsl-gpu-hook). This spec is written to /etc/cdi/microsoft.com-wslc.json for use by container runtimes.

  • Implemented the wsl-gpu-hook entrypoint and registration, which configures the container's dynamic linker (ld.so.conf.d/ld.wsl.conf and ld.so.cache) to ensure GPU libraries are discoverable inside the container.

Docker Integration and Schema Updates:

  • Updated the Docker host config schema to support DeviceRequests, enabling containers to request the WSL GPU device via CDI rather than through direct device and bind mounts.

Refactoring and Removal of Legacy GPU Setup:

  • Removed previous logic that manually set LD_LIBRARY_PATH and bind-mounted GPU libraries/drivers, as this is now handled by the CDI spec and runtime hook.

Test Updates:

  • Updated tests to validate the new CDI-based GPU setup, checking for the presence of the ld.so.conf fragment and correct linker cache population, instead of checking LD_LIBRARY_PATH and manual mounts. [1] [2]

PR Checklist

  • Closes: Link to issue #xxx
  • Communication: I've discussed this with core contributors already. If work hasn't been agreed, this work might be rejected
  • Tests: Added/updated if needed and all pass
  • Localization: All end user facing strings can be localized
  • Dev docs: Added/updated if needed
  • Documentation updated: If checked, please file a pull request on our docs repo and link it here: #xxx

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

Copilot AI review requested due to automatic review settings May 18, 2026 21:49
Comment thread src/linux/init/WSLCInit.cpp Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds Container Device Interface (CDI) plumbing for WSLC GPU support: generating a CDI spec in the guest, enabling Docker’s CDI feature, and switching WSLC container creation from manual device/bind mounts + LD_LIBRARY_PATH manipulation to a CDI DeviceRequests flow with an OCI runtime hook to configure the dynamic linker inside the container.

Changes:

  • Generate a CDI spec for the WSL GPU device and register a wsl-gpu-hook OCI hook in the Linux init path.
  • Update Docker integration to request the GPU via HostConfig.DeviceRequests (driver cdi) instead of manual binds/devices and LD_LIBRARY_PATH.
  • Update Windows tests to validate dynamic linker configuration (ld.so.conf.d fragment + ld.so.cache) rather than LD_LIBRARY_PATH.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
test/windows/WSLCTests.cpp Updates GPU container validation to assert ld.so.conf.d fragment and linker cache presence.
src/windows/wslcsession/WSLCContainer.cpp Switches GPU container setup to Docker DeviceRequests (CDI) and removes legacy env/bind/device injection.
src/windows/inc/docker_schema.h Extends Docker schema with DeviceRequest and HostConfig.DeviceRequests serialization support.
src/shared/inc/lxinitshared.h Adds shared constants for the hook name and CDI kind/device identifiers.
src/shared/inc/cdi_schema.h Introduces JSON schema types for CDI spec generation.
src/linux/init/WSLCInit.cpp Generates /etc/cdi/microsoft.com-wslc.json, writes Docker daemon config to enable CDI, and implements the wsl-gpu-hook entrypoint.
src/linux/init/init.cpp Routes the new wsl-gpu-hook basename to the hook entry function.

Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp
Comment thread src/linux/init/WSLCInit.cpp
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp Outdated
@kvega005 kvega005 marked this pull request as ready for review May 18, 2026 23:00
@kvega005 kvega005 requested a review from a team as a code owner May 18, 2026 23:00
Copilot AI review requested due to automatic review settings May 18, 2026 23:00
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

src/linux/init/WSLCInit.cpp:143

  • WriteToFile() does not truncate existing files. Re-writing /etc/docker/daemon.json this way can leave trailing bytes from an older, longer config and result in invalid JSON (breaking Docker startup). Prefer a truncating/atomic write strategy for this file.
    config["features"]["cdi"] = true;

    THROW_LAST_ERROR_IF(UtilMkdirPath("/etc/docker", 0755) < 0);
    THROW_LAST_ERROR_IF(WriteToFile(c_daemonConfigPath, config.dump().c_str()) < 0);
}

src/linux/init/WSLCInit.cpp:176

  • Same truncation concern as above: the hook writes ld.wsl.conf using WriteToFile() without truncation, so if the file already exists with longer content, stale trailing bytes/lines can remain and affect linker configuration. Use a truncating/atomic write for this file.
    const std::string confDir = rootfs + "/etc/ld.so.conf.d";
    const std::string confPath = confDir + "/ld.wsl.conf";

    THROW_LAST_ERROR_IF(UtilMkdirPath(confDir.c_str(), 0755) < 0);
    THROW_LAST_ERROR_IF(WriteToFile(confPath.c_str(), LXSS_LIB_PATH "\n") < 0);

Comment thread src/linux/init/WSLCInit.cpp
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp Outdated
Comment thread src/linux/init/WSLCInit.cpp Outdated
Copilot AI review requested due to automatic review settings May 19, 2026 20:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

Comment thread src/linux/init/util.h
Comment thread src/linux/init/util.cpp
Comment thread src/linux/init/WSLCInit.cpp
Comment thread src/linux/init/WSLCInit.cpp
Comment thread test/windows/WSLCTests.cpp
Comment thread test/windows/WSLCTests.cpp
Comment thread test/windows/WSLCTests.cpp
Comment thread src/linux/init/util.h
uint16_t UtilWinAfToLinuxAf(uint16_t AddressFamily);

int WriteToFile(const char* Path, const char* Content, int permissions = 0644);
int WriteToFile(const char* Path, const char* Content, int OpenFlags = O_WRONLY | O_CLOEXEC | O_CREAT, int Permissions = 0644);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know why you did things in this order (to match real open argument order), but we should be careful that any callers that were previously passing optional vales for permission are not now passing them as OpenFlags.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you did an audit of everybody calling WriteToFile?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe safest to just remove the default parameters from the def and chase down compiler errors to make sure they are all right.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I checked. There were actually no callers using the third (permissions) parameter at all. We can actually remove that entirely if you want.

rootfsPath = bundle / rootfsPath;
}

rootfsPath = std::filesystem::canonical(rootfsPath);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, but is there not a canonicalize that modifies the existing path?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think canonicalization is only provided as non-member functions that return a new path. There is no path::canonical method.

@benhillis benhillis dismissed their stale review May 19, 2026 23:25

Looking good.

Copy link
Copy Markdown
Member

@benhillis benhillis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking close, just please make sure none of the write helpers are accidentally passing the wrong thing now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants