msft-preview: runtime and runtime-rs: Add support for non-VF physical network devices to both runtime and runtime-rs #441
Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/43668151
Rationale: This is a temporary solution for optimizing memory usage for the current mechanism of requesting resources through pod Limit annotations:
- if no Limits are specified and hence WorkloadMemMB is 0, set a default value 'StaticWorkloadDefaultMem' to allocate a default amount of memory for use for containers in the sandbox in addition to the base memory
- if Limits are specified, the base memory and the sum of Limits are allocated.
The end user needs to be aware of the minimum memory requirements for their pods, otherwise the pod will be stuck in the ContainerCreating state
Testing: Manual testing, creating pods with Limits and without limits, and with two containers where each container has a limit, tested with integration in a SPEC file where the config variables were set via environment variables via the make command
Adapted by @mfrw from 3.1.0 to apply to 3.2.0
Signed-off-by: Muhammad Falak R Wani <mwani@microsoft.com>
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
runtime: Remove unused VMM options for mem alloc
- We only ever tested these fork changes with CLH+MSHV
- Remove these options as we don't use QEMU/FC
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
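The sizing rule from the Rationale above can be sketched as follows. This is a minimal illustration, not the runtime's actual code: the function name `sandboxMemMB`, its parameter names, and the numbers in `main` are hypothetical; only the rule itself (default workload memory when no Limits are given, otherwise base memory plus the sum of Limits) comes from the commit message.

```go
package main

import "fmt"

// sandboxMemMB sketches the sizing rule described in the commit message.
// baseMB is the base memory of the VM, staticWorkloadDefaultMB stands in for
// the StaticWorkloadDefaultMem setting, and limitsMB holds the per-container
// memory Limits (empty when the pod specifies none). All names are assumed.
func sandboxMemMB(baseMB, staticWorkloadDefaultMB uint32, limitsMB []uint32) uint32 {
	var workloadMB uint32
	for _, l := range limitsMB {
		workloadMB += l
	}
	// No Limits (workload memory is 0): fall back to the static default.
	if workloadMB == 0 {
		workloadMB = staticWorkloadDefaultMB
	}
	return baseMB + workloadMB
}

func main() {
	// Hypothetical values: 256 MB base, 1792 MB static default.
	fmt.Println(sandboxMemMB(256, 1792, nil))                // prints 2048
	fmt.Println(sandboxMemMB(256, 1792, []uint32{512, 512})) // prints 1280
}
```

With two containers that each carry a limit, the limits are summed before the base memory is added, which matches the two-container case listed under Testing.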
This branch starts introducing additional scripting to build, deploy
and evaluate the components used in AKS' Pod Sandboxing and
Confidential Containers preview features. This includes the capability
to build the IGVM file and its reference measurement file for remote
attestation.
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
tools: Improve igvm-builder and node-builder/azure-linux scripting
- Support for Mariner 3 builds using OS_VERSION variable
- Improvements to IGVM build process and flow as described in README
- Adoption of using only cloud-hypervisor-cvm on CBL-Mariner
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
tools: Add package-tools-install functionality
- Add script to install kata-containers(-cc)-tools bits
- Minor improvements in README.md
- Minor fix in package_install
- Remove echo outputs in package_build
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
tools: Enable setting IGVM SVN
- Allow setting SVN parameter for IGVM build scripting
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: introduce BUILD_TYPE variable
This lets developers build and deploy Kata in debug mode without having to make
manual edits to the build scripts.
With BUILD_TYPE=debug (default is release):
* The agent is built in debug mode.
* The agent is built with a permissive policy (using allow-all.rego).
* The shim debug config file is used, i.e., we create the symlink
configuration-clh-snp-debug.toml <- configuration-clh-snp.toml.
For example, building and deploying Kata-CC in debug mode is now as simple as:
make BUILD_TYPE=debug all-confpods deploy-confpods
Also do note that make still lets you override the other variables even after
setting BUILD_TYPE. For example, you can use the production shim config with
BUILD_TYPE=debug:
make BUILD_TYPE=debug SHIM_USE_DEBUG_CONFIG=no all-confpods deploy-confpods
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
node-builder: introduce SHIM_REDEPLOY_CONFIG
See README: when SHIM_REDEPLOY_CONFIG=no, the shim configuration is NOT
redeployed, so that potential config changes made directly on the host
during development aren't lost.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
node-builder: Use img for Pod Sandboxing
Switch from UVM initrd to image format
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: Adapt README instructions
- Sanitize containerd config snippet
- Set podOverhead for Kata runtime class
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
tools: Adapt AGENT_POLICY_FILE path
- Adapt path in uvm_build.sh script to comply
with the upstream changes we pulled in
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: Use Azure Linux 3 as default path
- update recipe and node-builder scripting
- change default value on rootfs-builder
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: Deploy-only for AzL3 VMs
- split deployment sections in node-builder README.md
- install jq, curl dependencies within IGVM script
- add path parameter to UVM install script
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: Minor updates to README.md
- no longer install make package, is part of meta package
- remove superfluous popd
- add note on permissive policy for ConfPods UVM builds
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: Updates to README.md
- with the latest 3.2.0.azl4 package on PMC, can remove OS_VERSION parameter
and use the make deploy calls instead of copying files by hand for variant
I (now aligned with Variant II)
- with the latest changes on msft-main, set the podOverhead to 600Mi
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: Fix SHIM_USE_DEBUG_CONFIG behavior
Using a symlink would create a cycle after calling this script again when
copying the final configuration at line 74 so we just use cp instead.
Also, I moved this block to the end of the file to properly override the final
config file.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
node-builder: Build and install debug configuration for pod sandboxing
For ease of debugging, install a configuration-clh-debug.toml for pod
sandboxing as we do in Conf pods.
Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
runtime: remove clh-snp config file usage in makefile
Not needed to build vanilla kata
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
package_tools_install.sh: include nsdax.gpl.c
Include nsdax.gpl.c
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
node-builder: fix typo in string comparison
This also fixes a shellcheck error and lets us require the
shellcheck-required job:
In ./tools/osbuilder/node-builder/azure-linux/uvm_build.sh line 34:
if [ -z "${UVM_KERNEL_HEADER_DIR}}" ]; then
^-- SC2157 (error): Argument to -z is always false due to literal strings.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
docs: node-builder: fix static check error
This fixes the below static check error to follow up on the infra fix from
kata-containers#11646:
2025-07-31T19:32:45.0031829Z time="2025-07-31T19:32:44.990004665Z" level=fatal msg="found 2 parse errors:\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Set up environment\" (heading: {Name:Set up environment MDName:Set up environment LinkName:set-up-environment Level:2})\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Install build dependencies\" (heading: {Name:Install build dependencies MDName:Install build dependencies LinkName:install-build-dependencies Level:2})" commit=1d17f56b1aa7a880468b8e25d14467c92dca8eeb name=kata-check-markdown pid=9075 source=check-markdown version=0.0.1
Note: that is likely flagged because having two headings with the same
name, even under different sections, makes it impossible to create a
canonical heading link in Markdown.
This should eventually be squashed into the node-builder commit.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
docs: node-builder: Remove references to moby-containerd-cc
As we adopted containerd2, we remove references to our prior
forked containerd version.
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
node-builder: 2MB aligned guest image size
Build the mariner guest image using IMAGE_SIZE_ALIGNMENT_MB=2.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
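As a rough illustration of what a 2 MB alignment means for the image size, the sketch below rounds a byte count up to the next multiple of the alignment. It assumes the image builder rounds up; the helper `alignUp` and the sizes used are hypothetical and not taken from the build scripts.

```go
package main

import "fmt"

const mib = 1 << 20

// alignUp rounds size up to the next multiple of align (align must be > 0).
func alignUp(size, align uint64) uint64 {
	return (size + align - 1) / align * align
}

func main() {
	const alignment = 2 * mib // corresponds to IMAGE_SIZE_ALIGNMENT_MB=2

	// One byte past 255 MiB is padded to the next 2 MiB boundary.
	fmt.Println(alignUp(255*mib+1, alignment) / mib) // prints 256
	// A size that is already aligned is left unchanged.
	fmt.Println(alignUp(256*mib, alignment) / mib) // prints 256
}
```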
to-squash: node-builder: add reference to README.md
This is needed to avoid the following static-checks error:
2025-08-05T21:27:20.0028337Z [static-checks.sh:808] ERROR: Document tools/osbuilder/node-builder/azure-linux/README.md is not referenced
This commit is to be squashed into the node-builder commit.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
After these changes:
1. The value of the K8s runtime class memory overhead:
- Covers the memory usage from all the Host-side components (mainly
the Kata Shim and the VMM).
- Doesn't include the memory usage from any Guest-side components.
2. The value of a pod memory limit specified by the user:
- Is equal to the memory size of the Pod VM.
- Includes the memory usage from all the Guest-side components
(mainly user's workload, the Guest kernel, and the Kata Agent)
- Doesn't include the memory usage from any Host-side components.
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
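The split described above can be written out as simple arithmetic. The sketch below is illustrative only: the type and function names are hypothetical, and the 1024Mi limit is an example value. The 600Mi overhead matches the podOverhead mentioned in the README update earlier in this history, and the total assumes Kubernetes adds the runtime class overhead on top of the container values.

```go
package main

import "fmt"

// podMemoryMiB splits a pod's memory accounting into guest and host parts.
type podMemoryMiB struct {
	GuestVM       uint64 // memory size of the Pod VM (equals the pod memory limit)
	HostOverhead  uint64 // runtime class overhead: Kata shim, VMM, ...
	NodeAccounted uint64 // limit plus overhead, as seen on the node
}

// accountPod applies the rule: the limit sizes the VM, the overhead covers
// the Host-side components, and the two do not overlap.
func accountPod(limitMiB, overheadMiB uint64) podMemoryMiB {
	return podMemoryMiB{
		GuestVM:       limitMiB,
		HostOverhead:  overheadMiB,
		NodeAccounted: limitMiB + overheadMiB,
	}
}

func main() {
	m := accountPod(1024, 600)
	// prints: guest=1024Mi host=600Mi total=1624Mi
	fmt.Printf("guest=%dMi host=%dMi total=%dMi\n", m.GuestVM, m.HostOverhead, m.NodeAccounted)
}
```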
runtime: fix `make test`
This addresses the following errors from `make test` to allow us to require
that upstream CI:
https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142422035?pr=392#step:13:53
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
- similar to the static_sandbox_default_workload_mem option, assign a default number of vcpus to the VM when no limits are given, 1 vcpu in this case
- similar to commit c7b8ee9, do not allocate additional vcpus when limits are provided
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
Point to msft-preview
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
For our Kata UVM, we know we need at least 128MB of memory to prevent instability in the guest. Enforce this constraint with a descriptive error to prevent users from destabilizing the UVM with faulty k8s configurations.
Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
If memory limit is set and less than minimum, set it to minimum. This is to account for kata-containers@0ec3403
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
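The two behaviours described in these commits, rejecting a guest memory size below 128MB and raising a too-small limit to the minimum, could look roughly like the sketch below. The names `minGuestMemMB`, `checkGuestMemMB`, and `clampLimitMB` are hypothetical, and the error text is invented for illustration; only the 128MB threshold and the clamp-when-set rule come from the commit messages.

```go
package main

import "fmt"

// minGuestMemMB is the minimum guest memory named in the commit message.
const minGuestMemMB = 128

// checkGuestMemMB returns a descriptive error when the guest would be
// configured with less memory than the minimum.
func checkGuestMemMB(memMB uint32) error {
	if memMB < minGuestMemMB {
		return fmt.Errorf("guest memory %d MB is below the required minimum of %d MB", memMB, minGuestMemMB)
	}
	return nil
}

// clampLimitMB raises a limit that is set (non-zero) but below the minimum.
// A limit of 0 is treated as "not set" and left untouched (an assumption).
func clampLimitMB(limitMB uint32) uint32 {
	if limitMB != 0 && limitMB < minGuestMemMB {
		return minGuestMemMB
	}
	return limitMB
}

func main() {
	fmt.Println(checkGuestMemMB(64)) // prints the descriptive error
	fmt.Println(clampLimitMB(64))    // prints 128
	fmt.Println(clampLimitMB(0))     // prints 0
	fmt.Println(clampLimitMB(512))   // prints 512
}
```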
Add Microsoft mandatory file SECURITY.md
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
- Change Makefile to point to fork
- Change versions.yaml to point to proper version on fork
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This change mirrors host networking into the guest as before, but now also includes the default gateway neighbor entry for each interface. Pods using overlay/synthetic gateways (e.g., 169.254.1.1) can hit a first-connect race while the guest performs the initial ARP. Preseeding the gateway neighbor removes that latency and makes early connections (e.g., to the API Service) deterministic.
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This is a temporary measure in our fork to unblock the required CI tests, while we find a way to remove the hard-coded 'main' references from upstream.
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Background:
* `pull_request` runs on the PR branch code and has access to secrets ONLY if the PR is from microsoft/kata-containers (i.e. NOT from an external contributor who forked the repo).
* `pull_request_target` runs on the trusted main branch code by default and has access to secrets for any PR.
Reference: https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows#pull_request
Upstream uses `pull_request_target` (and manually checks out the PR code) to have access to secrets for PRs from external contributors. However, we don't expect external PRs, so we can use `pull_request`.
Furthermore, since `pull_request_target` only runs from the default branch, we need to use `pull_request` anyway as we have multiple leading branches (i.e., msft-main, msft-preview, and release branches).
https://github.blog/changelog/2025-11-07-actions-pull_request_target-and-environment-branch-protections-changes/
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
set default to msft-preview
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
use upstream cloud-hypervisor. This is to unblock the CI and let CLH build
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
update target branch to msft-preview
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
This fixes a CI static check failure
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
- tests that deploy pods with too small of a memory limit
- try to set a minimum memory limit for some containerd tests
- tests that use runners we don't have
- tests that depend on pushing to GHCR
Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
Enable VFIO device pass-through at VM creation time on Cloud Hypervisor, in addition to the existing hot-plug path.
Signed-off-by: Roaa Sakr <romoh@microsoft.com>
Pull request overview
This PR extends Kata’s Go runtime (virtcontainers) and Rust runtime (runtime-rs) to support physical network interfaces that are not SR-IOV VFs (e.g., VMBus-backed NICs) by using tap/bridge networking instead of assuming VFIO passthrough.
Changes:
- Go: Physical endpoint now branches VF vs non-VF behavior (VFIO passthrough vs tap/bridge connect/disconnect) and persists additional physical endpoint networking state.
- Go: Physical interface detection switches to using netlink ParentDevBus (PCI/VMBus).
- Rust: Adds bus-type detection (PCI/VMBus), VF detection, and physical endpoint creation that supports non-VF physical NICs.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/runtime/virtcontainers/physical_endpoint.go | Adds VF/non-VF branching, sysfs path handling for PCI/VMBus, persists NetPair/BusType. |
| src/runtime/virtcontainers/physical_endpoint_test.go | Expands unit tests for VF vs non-VF behavior, sysfs-path helpers, save/load behavior. |
| src/runtime/virtcontainers/persist/api/network.go | Extends persisted PhysicalEndpoint schema with NetPair and BusType. |
| src/runtime/virtcontainers/network_linux.go | Updates physical detection callsite and link typing for PhysicalEndpoint. |
| src/runtime-rs/crates/resource/src/network/utils/link/mod.rs | Adds bus-type detection, iface sysfs path resolution, and VF detection helpers. |
| src/runtime-rs/crates/resource/src/network/network_with_netns.rs | Updates physical endpoint creation to pass required params for non-VF setup. |
| src/runtime-rs/crates/resource/src/network/network_pair.rs | Adds NetworkPair::new_for_physical() and tests for VF vs non-VF behavior. |
| src/runtime-rs/crates/resource/src/network/endpoint/physical_endpoint.rs | Implements VF vs non-VF attach/detach logic (VFIO vs network device) and persists added state. |
| src/runtime-rs/crates/resource/src/network/endpoint/endpoint_persist.rs | Extends persisted physical endpoint state with VF/bus metadata. |
… fetch link from kernel Agent-Logs-Url: https://github.com/microsoft/kata-containers/sessions/37741095-6baf-41d0-be28-31710b28dbb4 Co-authored-by: sharath-srikanth-chellappa <115591284+sharath-srikanth-chellappa@users.noreply.github.com>
…/37741095-6baf-41d0-be28-31710b28dbb4 Co-authored-by: sharath-srikanth-chellappa <115591284+sharath-srikanth-chellappa@users.noreply.github.com>
…) fix, comment update Agent-Logs-Url: https://github.com/microsoft/kata-containers/sessions/3a97fe22-9884-4104-a6c0-60981cc1f63f Co-authored-by: sharath-srikanth-chellappa <115591284+sharath-srikanth-chellappa@users.noreply.github.com>
Use sysIfaceDevicePath when probing physfn Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… between integer types' Correcting the strconv function Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hers Agent-Logs-Url: https://github.com/microsoft/kata-containers/sessions/f9ce4ef3-9e58-4054-aac8-03c4e016b091 Co-authored-by: sharath-srikanth-chellappa <115591284+sharath-srikanth-chellappa@users.noreply.github.com>
…VF path Agent-Logs-Url: https://github.com/microsoft/kata-containers/sessions/3019da9e-d213-4361-9226-4e28fe0114be Co-authored-by: sharath-srikanth-chellappa <115591284+sharath-srikanth-chellappa@users.noreply.github.com>
- `TestIsPhysicalIface`: remove spurious `ParentDevBus` from Bridge and fetch link from kernel
- `netInfo.Link` panic in `addSingleEndpoint()`: resolve link via `netlink.LinkByName` when nil
- `*PhysicalEndpoint` to rate-limiter switches to prevent hard failure: VF → no-op, non-VF → tap interface name
- `PhysicalEndpoint.load()` to use `s.Physical.NetPair`, persist/restore `IsVF` via `persistapi.PhysicalEndpoint`
- `isPhysicalIface()` to reflect `ParentDevBus`-based detection
- `removeTxRateLimiter` error message ("adding" → "removing")
- `get_bus_type()`: only map `ErrorKind::NotFound` to `Ok(None)`; propagate all other I/O errors (comment 3033974934)
- `RxRateLimiter`/`TxRateLimiter` bool fields to `PhysicalEndpoint`; `SetRxRateLimiter()`/`SetTxRateLimiter()` set the flag and return nil for non-VF (comment 3034009975)