cozystack · Aleksei Sviridkin (lexfrei) · May 28, 2026 · gemini-code-assist · May 28, 2026
@@ -0,0 +1,129 @@
+---
+title: "Running Containerized GPU Workloads"
+linkTitle: "GPU Containers"
+description: "Run CUDA pods and other containerized GPU workloads on Cozystack management nodes that ship the NVIDIA driver and container toolkit via the distro package manager."
+weight: 160
+---
+
+This page covers running GPU workloads in regular Kubernetes pods (CUDA, ML training, inference) on Cozystack management cluster nodes. It targets the typical Linux GPU node shape — `apt`-installed (or `dnf`-installed) NVIDIA driver plus `nvidia-container-toolkit` — and uses the `container` variant of the `cozystack.gpu-operator` package.
+
+To pass whole GPUs to KubeVirt VMs instead, see [GPU Passthrough](/docs/next/virtualization/gpu/). For fractional sharing in tenant Kubernetes clusters, see [GPU Sharing with HAMi](/docs/next/kubernetes/gpu-sharing/); HAMi is orthogonal to this variant, and you can stack it on top once the container variant is up.
+
+## When to pick this variant
+
+The `cozystack.gpu-operator` package exposes three architectural variants. Pick `container` when **all** of the following are true:
+
+- The host already runs the NVIDIA driver, installed via the distro package manager (`apt install nvidia-driver-*` on Ubuntu/Debian or the equivalent on RHEL/Fedora/openSUSE). The operator must not load its own kernel module.
+- The host already has `nvidia-container-toolkit` installed (`apt install nvidia-container-toolkit`). The operator must not deploy its own toolkit DaemonSet — that would overwrite `/etc/containerd/config.toml` and the CDI hooks the host package already wired up.
+- You want GPUs exposed to containers as `nvidia.com/gpu`, not passed through to KubeVirt VMs.
+
+The other two variants exist for the opposite host shape: `default` (passthrough) unbinds the host driver and binds `vfio-pci` for VM passthrough, and `vgpu` requires the proprietary NVIDIA vGPU host driver plus a license server. Neither path produces a working setup on a host that already ships the driver and container toolkit through the distro package manager: the operator and the host install fight each other.
+
+## Prerequisites
+
+- A Cozystack management cluster with at least one GPU-enabled node.
+- The GPU node runs a supported Linux distribution (Ubuntu, Debian, RHEL, Fedora, openSUSE) with the NVIDIA driver installed via the distro package manager. Verify with `nvidia-smi` over SSH or `kubectl debug node/<node-name>` — it must enumerate the physical GPUs and report a working driver version.
+- `nvidia-container-toolkit` installed on the same node and registered with containerd (`grep nvidia /etc/containerd/config.toml` shows the runtime entry).
+- `kubectl` configured against the management cluster.
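
The two host-side checks in this list can be collected into a small preflight script that runs on the GPU node. This is a minimal sketch, assuming `bash` is available on the node; the function names `has_nvidia_runtime` and `preflight` are illustrative and not part of Cozystack.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helpers for a GPU node; names are illustrative.

# Succeeds when the given containerd config mentions an nvidia runtime entry.
# Mirrors the `grep nvidia /etc/containerd/config.toml` check from the list.
has_nvidia_runtime() {
  local config="${1:-/etc/containerd/config.toml}"
  [ -r "$config" ] && grep -q nvidia "$config"
}

# Runs both host checks and prints one line per result.
# Returns non-zero if either check fails.
preflight() {
  local config="${1:-/etc/containerd/config.toml}"
  local rc=0
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    echo "driver: ok"
  else
    echo "driver: nvidia-smi missing or failing"
    rc=1
  fi
  if has_nvidia_runtime "$config"; then
    echo "containerd: nvidia runtime registered"
  else
    echo "containerd: no nvidia entry in $config"
    rc=1
  fi
  return "$rc"
}
```

Source the file on the node and call `preflight`; pass a different config path as the first argument if containerd reads its configuration from elsewhere.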
+
+The operator-validator auto-detects pre-installed host drivers by probing `/host/usr/bin/nvidia-smi`, so on standard Ubuntu/Debian/RHEL/Fedora layouts no `hostPaths.driverInstallDir` override is needed. On Talos this probe misses (the extension installs `nvidia-smi` at `/usr/local/bin/`), so Talos requires a different starting point — see `packages/system/gpu-operator/examples/values-native-talos.yaml` in the [cozystack repo](https://github.com/cozystack/cozystack) for a working reference with the compat DaemonSet and the matching `driverInstallDir` override.
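
To see ahead of time which of the two layouts a node has, the same two locations can be checked by hand. The helper below is only an illustration of the path difference described above, not the operator-validator's own logic; the function name `locate_nvidia_smi` and its labels are made up for this sketch.

```bash
#!/usr/bin/env bash
# Illustrative only: this is not the operator-validator's code.
# Prints which of the two locations named above holds nvidia-smi under a
# filesystem root (empty root means the live host filesystem).
locate_nvidia_smi() {
  local root="${1:-}"
  if [ -x "$root/usr/bin/nvidia-smi" ]; then
    echo "standard"    # /usr/bin: the layout the probe expects
  elif [ -x "$root/usr/local/bin/nvidia-smi" ]; then
    echo "usr-local"   # /usr/local/bin: the Talos extension layout
  else
    echo "missing"
    return 1
  fi
}
```

Running `locate_nvidia_smi` with no argument on the node inspects the host directly. A result of `usr-local` suggests the node needs the `driverInstallDir` override from the Talos reference values rather than the plain container variant.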
+
+## 1. Install the GPU Operator (container variant)
+
+**Do not** add `cozystack.gpu-operator` to `bundles.enabledPackages` for this variant. The platform Helm chart's optional-package template hardcodes `spec.variant: default` for every name in `enabledPackages` and reconciles the resulting `Package` CR under Helm ownership — any user `Package` CR with `variant: container` is overwritten on the next reconcile. Apply the `Package` CR directly instead; the cozystack platform controller installs it without the bundle entry.
+
+Save the following `Package` CR with `variant: container` as `gpu-operator-container.yaml`:
+
+```yaml
+apiVersion: cozystack.io/v1alpha1
+kind: Package
+metadata:
+  name: cozystack.gpu-operator
+spec:
+  variant: container
-apiVersion: cozystack.io/v1alpha1
-kind: Package
-metadata:
-  name: cozystack.gpu-operator
-spec:
-  variant: container
+apiVersion: cozystack.io/v1alpha1
+kind: Package
+metadata:
+  name: cozystack.gpu-operator
+  namespace: cozy-system
+spec:
+  variant: container
+```
+
+```bash
+kubectl apply -f gpu-operator-container.yaml
+```
+
+The platform controller resolves the variant against the `PackageSource` (`packages/core/platform/sources/gpu-operator.yaml`), pulls `values.yaml` + `values-container.yaml` from the OCI repository, and installs the chart into `cozy-gpu-operator`.
+
+## 2. Verify the operator is healthy
+
+All pods in the `cozy-gpu-operator` namespace should reach `Running`, except the one-shot `nvidia-cuda-validator` pod, which ends in `Completed`:
+
+```bash
+kubectl get pods --namespace cozy-gpu-operator
+```
+
+Example output (pod names will vary):
+
+```console
+NAME                                                          READY   STATUS    RESTARTS   AGE
+gpu-feature-discovery-7jpzv                                   1/1     Running   0          2m
+gpu-operator-7976b5b8fb-xqg2z                                 1/1     Running   0          3m
+nvidia-cuda-validator-tjkfh                                   0/1     Completed 0          2m
+nvidia-dcgm-exporter-rmpfg                                    1/1     Running   0          2m
+nvidia-device-plugin-daemonset-cqj9w                          1/1     Running   0          2m
+nvidia-operator-validator-q5n4k                               1/1     Running   0          3m
+```
+
+The `container` variant does **not** spawn `nvidia-driver-daemonset`, `nvidia-container-toolkit-daemonset`, or `nvidia-vfio-manager` — all three are pinned off by design.
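
A quick way to confirm that none of the three components slipped in is to filter the DaemonSet names in the namespace. This is a minimal sketch; the function name `unexpected_components` is hypothetical, and the pattern only covers the three names listed above.

```bash
#!/usr/bin/env bash
# Reads resource names (one per line) on stdin and prints any of the three
# components that the container variant is expected to keep disabled.
# Prints nothing when none of them is present.
unexpected_components() {
  grep -E 'nvidia-(driver-daemonset|container-toolkit-daemonset|vfio-manager)' || true
}
```

Pipe the DaemonSet list into it, for example `kubectl get daemonsets --namespace cozy-gpu-operator --output name | unexpected_components`. Empty output is the expected result for this variant.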
+
+The node should advertise `nvidia.com/gpu` as an allocatable resource:
+
+```bash
+kubectl describe node <node-name>
+```
+
+```console
+...
+Capacity:
+  ...
+  nvidia.com/gpu:         2
+  ...
+Allocatable:
+  ...
+  nvidia.com/gpu:         2
+...
+```
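
For scripts or CI, the same value can be read directly with a JSONPath query instead of scanning the `describe` output. This is a sketch under the assumption that `kubectl` is already pointed at the management cluster; the helper names `gpu_allocatable` and `require_gpu` are illustrative.

```bash
#!/usr/bin/env bash
# Prints the allocatable nvidia.com/gpu count for a node (empty if the
# resource is not advertised). The dot in the resource name is escaped
# so JSONPath treats it as part of the key.
gpu_allocatable() {
  kubectl get node "$1" \
    --output jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}

# Fails unless the node advertises at least one allocatable GPU.
require_gpu() {
  local count
  count="$(gpu_allocatable "$1")"
  [ -n "$count" ] && [ "$count" -ge 1 ]
}
```

`require_gpu <node-name>` returns a non-zero status while the device plugin has not yet registered the resource, so it can gate later steps such as the sample pod.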
+
+## 3. Run a sample CUDA pod
+
+Save the following pod, which requests one GPU and runs `nvidia-smi`, as `cuda-smoke.yaml`:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: cuda-smoke
+spec:
+  restartPolicy: OnFailure
+  containers:
+  - name: cuda
+    image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
+    command: ["nvidia-smi"]
+    resources:
+      limits:
+        nvidia.com/gpu: 1
+```
+
+```bash
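+kubectl apply -f cuda-smoke.yaml
+# Wait for the one-shot pod to finish before reading its logs
+kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cuda-smoke --timeout=120s
+kubectl logs cuda-smoke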
+```
+
+The output should enumerate the GPU(s) visible to the pod and report the driver version that the host runs.
+
+## Fractional GPU sharing
+
+The `container` variant exposes whole GPUs through the upstream NVIDIA device plugin. To slice one GPU across multiple pods (memory and compute quotas per pod), enable HAMi on top — HAMi reuses the same device plugin layer and is wired in via the `cozystack.hami` package, which already depends on `cozystack.gpu-operator`. See [GPU Sharing with HAMi](/docs/next/kubernetes/gpu-sharing/) for the tenant Kubernetes flow; for management-cluster workloads the wiring is the same package set with HAMi enabled.
+
+## Variant comparison
+
+| Workload shape | Variant | Host driver | Host container toolkit | Notes |
+| --- | --- | --- | --- | --- |
+| Containers (CUDA pods, ML) | `container` | required | required | This page |
+| Whole GPU to one VM | `default` | must NOT be loaded — operator binds `vfio-pci` | not used | [GPU Passthrough](/docs/next/virtualization/gpu/) |
+| Sliced GPU to multiple VMs | `vgpu` | proprietary NVIDIA vGPU host driver | not used | Requires NVIDIA vGPU license + a Delegated License Service endpoint |