kola: add bootc-base tag for kola tests #4469

Draft
yasminvalim wants to merge 4 commits into coreos:main from yasminvalim:poc-run-tests

Conversation

@yasminvalim
Contributor

@yasminvalim yasminvalim commented Mar 5, 2026

Adds a bootc-base tag for Kola tests that do not set register.Test.UserData (no test-specific Ignition/Butane).
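
As a rough illustration of the selection rule described above, the sketch below filters a list of tests down to those that carry the tag and set no user data. The `test` struct is a simplified, hypothetical stand-in for kola's `register.Test`, the tag value `bootc-base` is assumed from the PR title, and the test names are made up.

```go
package main

import "fmt"

// bootcBaseTag mirrors kola.BootcBaseTag; the literal value is an assumption.
const bootcBaseTag = "bootc-base"

// test is a simplified, hypothetical stand-in for kola's register.Test.
type test struct {
	Name     string
	Tags     []string
	UserData string // empty means no test-specific Ignition/Butane
}

// hasTag reports whether t carries the given tag.
func hasTag(t test, tag string) bool {
	for _, x := range t.Tags {
		if x == tag {
			return true
		}
	}
	return false
}

// bootcBaseCandidates returns the names of tests that carry the tag
// and do not set any user data.
func bootcBaseCandidates(tests []test) []string {
	var names []string
	for _, t := range tests {
		if t.UserData == "" && hasTag(t, bootcBaseTag) {
			names = append(names, t.Name)
		}
	}
	return names
}

func main() {
	tests := []test{
		{Name: "example.basic", Tags: []string{bootcBaseTag}},
		{Name: "example.with-config", Tags: []string{bootcBaseTag}, UserData: "variant: fcos"},
		{Name: "example.untagged"},
	}
	fmt.Println(bootcBaseCandidates(tests))
}
```

Checking both conditions keeps a test out of the set if it later gains user data but the tag is left behind by mistake.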

@openshift-ci

openshift-ci Bot commented Mar 5, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a --no-ignition flag to kola to allow running tests on pre-baked QCOW2 images, optimizing certain testing scenarios. However, the implementation of the --ssh-user flag introduces a potential SSH configuration injection vulnerability in mantle/platform/cluster.go as the user-supplied SSHUser string is written directly into the ssh-config file without sanitization. It is recommended to sanitize this input to prevent arbitrary SSH option injection. Additionally, consider improving the readability of the machine creation logic in the QEMU platform code.

Comment on lines +154 to +158
if bc.rconf.SSHUser != "" {
	if _, err := fmt.Fprintf(sshBuf, " User %s\n", bc.rconf.SSHUser); err != nil {
		return err
	}
}
Contributor

security-medium medium

The SSHUser command-line flag is written directly into the ssh-config file without sanitization. An attacker who can control the command-line arguments to kola can inject arbitrary SSH configuration options by including newlines in the SSHUser string. This can lead to arbitrary command execution if the ssh-config file is used by the user or another tool (e.g., via ProxyCommand).
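
One way to address this is to validate the value before it is written to the ssh-config file. The sketch below is only an illustration: `checkSSHUser` and the allowed pattern are hypothetical and not part of the PR. The pattern is a conservative guess at what a login name looks like; the important property is that it rejects whitespace and newlines.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// validSSHUser is a conservative login-name pattern (an assumption, not
// taken from the PR). \A and \z anchor to the whole input, so an embedded
// newline can never match.
var validSSHUser = regexp.MustCompile(`\A[a-z_][a-z0-9_.-]{0,31}\z`)

// checkSSHUser is a hypothetical helper that rejects values which could
// smuggle extra directives into an ssh-config file.
func checkSSHUser(user string) error {
	if user == "" {
		return errors.New("ssh user must not be empty")
	}
	if !validSSHUser.MatchString(user) {
		return fmt.Errorf("invalid ssh user %q", user)
	}
	return nil
}

func main() {
	candidates := []string{"root", "core", "root\n  ProxyCommand touch /tmp/pwned"}
	for _, u := range candidates {
		fmt.Printf("%q -> %v\n", u, checkSSHUser(u))
	}
}
```

An allowlist is used here rather than stripping newlines, since it fails loudly on unexpected input instead of silently rewriting it.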

Comment thread mantle/platform/machine/qemu/cluster.go Outdated
Comment on lines 71 to 96
qc.mu.Lock()

conf, err := qc.RenderUserData(userdata, map[string]string{})
if err != nil {
noIgnition := qc.RuntimeConf().NoIgnition
var conf *conf.Conf
var confPath string
var err error
if noIgnition {

qc.mu.Unlock()
return nil, err
} else {
conf, err = qc.RenderUserData(userdata, map[string]string{})
if err != nil {
qc.mu.Unlock()
return nil, err
}
qc.mu.Unlock()

if conf.IsIgnition() {
confPath = filepath.Join(dir, "ignition.json")
if err := conf.WriteFile(confPath); err != nil {
return nil, err
}
} else if !conf.IsEmpty() {
return nil, fmt.Errorf("qemu only supports Ignition or empty configs")
}
}
Contributor

medium

This block for handling Ignition is a bit complex and hard to follow due to the locking and branching. It can be simplified by restructuring the if condition and moving the lock to be more tightly scoped around the operation it protects. This will improve readability and maintainability.

	noIgnition := qc.RuntimeConf().NoIgnition
	var conf *conf.Conf
	var confPath string
	var err error
	if !noIgnition {
		qc.mu.Lock()
		conf, err = qc.RenderUserData(userdata, map[string]string{})
		qc.mu.Unlock()
		if err != nil {
			return nil, err
		}

		if conf.IsIgnition() {
			confPath = filepath.Join(dir, "ignition.json")
			if err := conf.WriteFile(confPath); err != nil {
				return nil, err
			}
		} else if !conf.IsEmpty() {
			return nil, fmt.Errorf("qemu only supports Ignition or empty configs")
		}
	}

@dustymabe
Member

> TASK:
>
> Running tests that do not need Ignition against a container image, instead of starting a VM, would drastically reduce the load on the Jenkins infra.
>
> IDEA:
>
> Adapt kola to be able to run tests without relying on Ignition
>
> * Assume that we get an SSH key that can log in as root
>
> * Assume that we get a QCOW image with that SSH key injected
>
> * Copy and run test scripts over SSH in a QEMU VM
>
> * Consider splitting kola for COSA

Do you have any context for all of this? Running nested container images inside kubernetes/openshift (where we run our pipeline today) isn't trivial so I'm not sure if it will save us much.

Also, the description here is contradictory. It says we should be able to run tests against a container, but then you mention a QCOW with an SSH key injected, which is a VM. What's the real goal here?

@yasminvalim
Contributor Author

> TASK:
> Running tests that do not need Ignition against a container image, instead of starting a VM, would drastically reduce the load on the Jenkins infra.
> IDEA:
> Adapt kola to be able to run tests without relying on Ignition
>
> * Assume that we get an SSH key that can log in as root
>
> * Assume that we get a QCOW image with that SSH key injected
>
> * Copy and run test scripts over SSH in a QEMU VM
>
> * Consider splitting kola for COSA
>
> Do you have any context for all of this? Running nested container images inside kubernetes/openshift (where we run our pipeline today) isn't trivial so I'm not sure if it will save us much.
>
> Also, the description here is contradictory. It says we should be able to run tests against a container, but then you mention a QCOW with an SSH key injected, which is a VM. What's the real goal here?

Hey Dusty, I’ll send over the task and the context I have. To be honest, I’m still figuring it out myself. Since this is a spike, the goal is to investigate and see what’s actually feasible. The DoD in the Jira ticket is to create a POC and document the different approaches.

@yasminvalim yasminvalim closed this Apr 7, 2026
@yasminvalim yasminvalim reopened this Apr 7, 2026
@yasminvalim yasminvalim changed the title kola: add --no-ignition flag to run tests without ignition kola: add bootc-base tag for kola tests Apr 8, 2026
Member

@joelcapitao joelcapitao left a comment

It looks good overall, though I think we can already add systemd/SMBIOS support in this PR to implement SSH key provisioning.
So, instead of injecting SSH keys via Ignition, QEMU would be started with:

-smbios type=11,value=io.systemd.credential.binary:tmpfiles.extra=<base64-encoded-tmpfiles-config>

That way, we'd be able to run the kola bootc-base tagged tests against a bootc image.

Comment thread mantle/kola/tests/ostree/sync.go Outdated
… set UserData.docs.

Tag the agreed core, ostree, and rpm-ostree upgrade-rollback tests so we can select them without custom Ignition
Provision SSH keys via systemd tmpfiles.extra credentials passed through QEMU SMBIOS so bootc-base tests can run without Ignition.
@yasminvalim
Contributor Author

@joelcapitao I added SMBIOS support. I tested it manually and it worked fine, but I still need to test it with bootc too.

Member

@jlebon jlebon left a comment

Awesome, thanks for working on this!

I think it's OK to try things out by having some of the built-in tests be base bootc-compatible to start. But the real value is in external tests. Once we add enablement for bootc-base for external tests, I would probably even drop all of the internal tags we added to tests here. I don't think it's a good idea to have kola built-in tests be a chokepoint/maintenance burden as we look to scale out kola usage across !CoreOS.

sv(&kola.Sharding, "sharding", "", "Provide e.g. 'hash:m/n' where m and n are integers, 1 <= m <= n. Only tests hashing to m will be run.")
bv(&kola.Options.SSHOnTestFailure, "ssh-on-test-failure", false, "SSH into a machine when tests fail")
bv(&kola.QEMUOptions.NoIgnition, "no-ignition", false, "Run without Ignition; provision SSH via systemd SMBIOS credentials (requires -p qemu and --qemu-image)")
sv(&kola.QEMUOptions.SSHUser, "ssh-user", "", "SSH user when using --no-ignition (default: root)")
Member

Should we bother with --ssh-user, versus just always using root? What's the rationale there?

},
// FIXME run on RHCOS once it has https://github.com/coreos/ignition-dracut/pull/93
Distros: []string{"fcos"},
Tags: []string{kola.BootcBaseTag},
Member

Not sure this should be a bootc-base test. It would pass of course, but the whole sentinel value + restamp on first boot dance is CoreOS-specific AFAIK.

Description: "Verify installing an rpm does not persist when using `ostree admin unlock`.",
FailFast: true,
-Tags: []string{"ostree", kola.NeedsInternetTag}, // need network to pull RPM
+Tags: []string{"ostree", kola.NeedsInternetTag, kola.BootcBaseTag}, // need network to pull RPM
Member

Probably should drop this and the following one too. It wouldn't work on a composefs-based system.

Description: "Verify an upgrade and rollback with a simulated update works.",
FailFast: true,
-Tags: []string{"rpm-ostree", "upgrade"},
+Tags: []string{"rpm-ostree", "upgrade", kola.BootcBaseTag},
Member

And this. There's no rpm-ostree in minimal.

// Use default builder if none provided
builder = qc.ensureBuilderDefaults(builder)

qm, config, err := qc.createMachine(userdata)
Member

createMachine seems to already handle the case where userdata could be nil. Would it be cleaner to instead keep using createMachine, and conditionalize whatever else is needed there on nil userdata?

}
builder.Smbios = append(builder.Smbios, smbios)
} else {
builder.SetConfig(config)
Member

Isn't this already called higher up?


qemuBuilder.UUID = qm.id
qemuBuilder.ConsoleFile = qm.consolePath
qemuBuilder.NumaNodes = options.NumaNodes
Member

We seem to have lost this.

Comment on lines +136 to +140
if qc.flight.opts.Arch != "" {
	if err := builder.SetArchitecture(qc.flight.opts.Arch); err != nil {
		return nil, err
	}
}
Member

This feels like something that should just already be taken care of when builder was constructed.

Comment on lines +141 to +143
if qc.flight.opts.Firmware != "" {
	builder.Firmware = qc.flight.opts.Firmware
}
Member

And this too.

4 participants