Skip to content

Compute node topology in virt-handler directly#34

Open
harshitgupta1337 wants to merge 20 commits into
microsoft:virtstack-plugin-refactorfrom
harshitgupta1337:nodelabeling-topology
Open

Compute node topology in virt-handler directly#34
harshitgupta1337 wants to merge 20 commits into
microsoft:virtstack-plugin-refactorfrom
harshitgupta1337:nodelabeling-topology

Conversation

@harshitgupta1337

@harshitgupta1337 harshitgupta1337 commented Aug 21, 2025

Copy link
Copy Markdown

virt-handler passes a VirtualMachineOptions object to the virt-launcher in multiple Command API calls. The VirtualMachineOptions object contains a field Topology that contains information about the node's NUMA topology. This PR modifies how the virt-handler computes the node's topology for building the VirtualMachineOptions object.

What this PR does

Before this PR: KubeVirt relies on Libvirt's node capabilities XML to extract the node topology. virt-handler generates the node capabilities XML for node labeling primarily, but it is also used to read the node topology. This is an easily replaceable dependency on Libvirt in virt-handler.

After this PR: virt-handler's now reads the node's topology from /sysfs.

Why we need it and why it was done in this way

This is a necessary step toward decoupling virt-handler from Libvirt.

The following tradeoffs were made: The downside of KubeVirt itself querying the node topology is that we would have to add test cases to ensure that the topology querying logic is functionally correct.

@harshitgupta1337 harshitgupta1337 marked this pull request as ready for review August 21, 2025 17:48
@harshitgupta1337 harshitgupta1337 marked this pull request as draft August 21, 2025 17:50

This comment was marked as outdated.

@harshitgupta1337 harshitgupta1337 marked this pull request as ready for review August 24, 2025 05:05

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR modifies virt-handler to compute node topology directly from /sysfs instead of relying on Libvirt's node capabilities XML, as part of an effort to decouple KubeVirt from Libvirt.

  • Replaces Libvirt XML parsing with direct /sysfs file system reading for topology information
  • Removes dependency on libvirtxml package from the topology computation path
  • Adds comprehensive test coverage for the new topology reading functionality

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pkg/virt-handler/vm.go Updates VirtualMachineController to use nodeTopology instead of capabilities
pkg/virt-handler/topology.go New implementation that reads node topology directly from /sysfs
pkg/virt-handler/topology_test.go Comprehensive test suite for the new topology reading functions
pkg/virt-handler/options.go Removes libvirt capabilities to topology conversion functions
pkg/virt-handler/options_test.go Removes tests for the old libvirt-based topology conversion
pkg/virt-handler/BUILD.bazel Updates build dependencies to include new topology files
cmd/virt-handler/virt-handler.go Initializes nodeTopology using the new ReadNodeTopology function

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +44 to +56
for _, line := range strings.Split(string(data), "\n") {
if strings.HasPrefix(line, "Node") && strings.Contains(line, "MemTotal") {
fields := strings.Fields(line)
if len(fields) >= 4 {
val, err := strconv.ParseUint(fields[3], 10, 64)
if err == nil {
return val, nil
} else {
return 0, err
}
}
}
}

Copilot AI Aug 24, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parsing logic assumes a specific format with MemTotal always being the 4th field (index 3). This is fragile and could break if the format changes. Consider using a more robust parsing approach that searches for the field containing 'MemTotal:' and then extracts the value from the next field.

Copilot uses AI. Check for mistakes.
Comment on lines +253 to +255
nodes, _ := filepath.Glob(filepath.Join(sysfsNodePath, "node[0-9]*"))
for _, node := range nodes {
cellId, err := strconv.ParseUint(strings.TrimPrefix(filepath.Base(node), "node"), 10, 32)

Copilot AI Aug 24, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error from filepath.Glob is silently ignored. If the glob pattern fails or there's an I/O error, the function will continue with an empty slice, potentially returning an empty topology without any indication of failure.

Suggested change
nodes, _ := filepath.Glob(filepath.Join(sysfsNodePath, "node[0-9]*"))
for _, node := range nodes {
cellId, err := strconv.ParseUint(strings.TrimPrefix(filepath.Base(node), "node"), 10, 32)
nodes, err := filepath.Glob(filepath.Join(sysfsNodePath, "node[0-9]*"))
if err != nil {
return nil, fmt.Errorf("failed to glob NUMA nodes: %w", err)
}
for _, node := range nodes {

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +78
func getHugepageSizes(hugepagesDir string) []uint64 {
hugepageSizes := make([]uint64, 0)
entries, err := os.ReadDir(hugepagesDir)
if err != nil {
return hugepageSizes

Copilot AI Aug 24, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silently returning an empty slice when the hugepages directory cannot be read may mask legitimate errors. Consider logging the error or returning it to the caller to distinguish between "no hugepages configured" and "error reading hugepages directory".

Suggested change
func getHugepageSizes(hugepagesDir string) []uint64 {
hugepageSizes := make([]uint64, 0)
entries, err := os.ReadDir(hugepagesDir)
if err != nil {
return hugepageSizes
func getHugepageSizes(hugepagesDir string) ([]uint64, error) {
hugepageSizes := make([]uint64, 0)
entries, err := os.ReadDir(hugepagesDir)
if err != nil {
return hugepageSizes, fmt.Errorf("failed to read hugepages directory %s: %w", hugepagesDir, err)

Copilot uses AI. Check for mistakes.
Comment on lines +255 to +256
cellId, err := strconv.ParseUint(strings.TrimPrefix(filepath.Base(node), "node"), 10, 32)
if err != nil {

Copilot AI Aug 24, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silently skipping nodes with unparseable IDs could hide configuration issues or unexpected filesystem states. Consider logging a warning when a node directory cannot be parsed to aid in debugging.

Suggested change
cellId, err := strconv.ParseUint(strings.TrimPrefix(filepath.Base(node), "node"), 10, 32)
if err != nil {
if err != nil {
log.Printf("Warning: unable to parse NUMA node directory '%s': %v", node, err)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants