iscsi: replace host targetcli with VM-based iSCSI target #1660

Draft
bruno-fs wants to merge 8 commits into rhinstaller:main from bruno-fs:iscsi-4044
Conversation

@bruno-fs

Contributor

Summary

  • Replace host-side targetcli with a VM-based iSCSI target using QEMU mcast socket networking, making the tests fully self-contained — no kernel modules, no bridge networking, no special capabilities
  • Remove knownfailure tag from iscsi and iscsi-bind tests (disabled since May 2018)
  • Add iscsi-ordering test for INSTALLER-4044 (ignoredisk before iscsi command ordering bug)

How it works

A second tiny VM acts as the iSCSI disk server. Both VMs talk over a virtual network (QEMU multicast sockets on localaddr=127.0.0.1). On first run, a Fedora cloud image is downloaded and prepared with virt-customize (targetcli + sshd), then cached. Subsequent runs create a qcow2 overlay and configure the target via SSH (~30s setup).

Container
├── QEMU: iSCSI target VM (10.10.10.1:3260)
│   ├── mcast socket NIC (iSCSI traffic)
│   └── SLIRP NIC (SSH for setup)
└── QEMU: anaconda VM (10.10.10.2)
    ├── mcast socket NIC (iSCSI traffic)
    └── SLIRP NIC (internet for packages)
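
The two-NIC layout in the diagram can be sketched as QEMU command-line fragments. This is only an illustration: the function name, the netdev ids, and the `virtio-net-pci` device model are assumptions, and the harness may pass equivalent options through a different layer. The multicast address `230.0.0.1` and `localaddr=127.0.0.1` come from the description in this PR.

```shell
# Sketch: print the -netdev/-device pairs one VM would need.
# qemu_nic_args, the ids iscsi0/slirp0, and virtio-net-pci are assumptions.
qemu_nic_args() {
    local mcast_port="$1"   # UDP port of the shared multicast group
    local mac="$2"          # MAC for the iSCSI-facing NIC; must differ per VM
    printf '%s\n' \
        "-netdev socket,id=iscsi0,mcast=230.0.0.1:${mcast_port},localaddr=127.0.0.1" \
        "-device virtio-net-pci,netdev=iscsi0,mac=${mac}" \
        "-netdev user,id=slirp0" \
        "-device virtio-net-pci,netdev=slirp0"
}
```

Both VMs would call this with the same port and different MAC addresses, so they join the same group while staying distinguishable on the virtual segment.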

After installation, the target VM is shut down and RESULT is extracted from the iSCSI backing store via guestfish on the host side.
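
A minimal sketch of a checked extraction step, assuming the extractor writes the file to stdout (as `virt-cat` does). The helper name `extract_checked` and the paths in the usage comment are hypothetical; the point is that RESULT only appears when extraction succeeded.

```shell
# Run an extractor command and publish its stdout as DEST only on success.
# extract_checked is a hypothetical name, not a function from this PR.
extract_checked() {
    local dest="$1"; shift
    local tmp="${dest}.tmp"
    if "$@" > "$tmp"; then
        mv "$tmp" "$dest"      # rename, so readers never see a partial file
    else
        rm -f "$tmp"           # leave no empty or truncated RESULT behind
        return 1
    fi
}
# Possible use (both paths are placeholders):
#   extract_checked "$disksdir/RESULT" virt-cat -a "$backing_store" /root/RESULT
```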

Test plan

  • iscsi test passes in devcontainer (run_kickstart_tests.sh)
  • iscsi test passes in kstest-runner container (containers/runner/launch)
  • keyboard smoke test unaffected
  • iscsi-bind test
  • iscsi-ordering test
  • CI: /kickstart-test iscsi

Related: INSTALLER-4044

bruno-fs and others added 4 commits May 20, 2026 14:59
The iscsi and iscsi-bind tests have been knownfailure since May 2018
because prepare() called targetcli on the host/container, requiring
kernel modules unavailable in the container runner.

Replace the local targetcli approach with a VM-based iSCSI target
using QEMU mcast socket networking. The test is now fully
self-contained — no kernel modules, no bridge networking, no special
capabilities.

Architecture:
- Target VM: Fedora cloud image prepared with virt-customize
  (targetcli + sshd), boots with mcast socket NIC + SLIRP SSH NIC
- Test VM: anaconda boots with mcast NIC (iSCSI) + SLIRP user NIC
  (internet for packages)
- Both VMs join the same mcast group (230.0.0.1:PORT) with
  localaddr=127.0.0.1

The first run downloads and prepares a Fedora cloud image (~2-5 min).
The base image is cached at /var/tmp/kstest-iscsi-cache/; subsequent
runs create a qcow2 overlay (instant) and configure the target via
SSH (~30s).
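
The overlay step could look roughly like the following. The helper name and the `QEMU_IMG` override are assumptions made for illustration; only the `qemu-img create` options themselves are standard.

```shell
# Sketch: create a copy-on-write qcow2 overlay on top of the cached base image.
# create_overlay and the QEMU_IMG override are hypothetical.
create_overlay() {
    local base="$1"      # cached, read-only base image
    local overlay="$2"   # per-run overlay that receives all writes
    "${QEMU_IMG:-qemu-img}" create -q -f qcow2 -b "$base" -F qcow2 "$overlay"
}
```

Setting `QEMU_IMG=echo` prints the command instead of running it, which is handy for checking the arguments without touching any image.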

Also adds iscsi-ordering test for INSTALLER-4044 (ignoredisk before
iscsi command ordering bug).

Related: INSTALLER-4044

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CRITICAL: Use flock to serialize base image cache creation and PID-
  qualified temp files to prevent corruption from parallel cold-cache runs
- HIGH: Fix broken virsh domstate error detection (was grepping stderr
  that was suppressed); use dominfo existence check instead
- HIGH: Check virt-cat exit code; use .tmp + mv pattern to prevent
  empty RESULT file on extraction failure
- HIGH: Log guestfish errors to guestfish.log instead of suppressing
- MEDIUM: Fix anaconda log extraction directory nesting (copy to
  $disksdir/ not $disksdir/anaconda/)
- MEDIUM: Increase target VM shutdown timeout from 30s to 60s
- LOW: Fix iscsi-ordering.sh execute permission
- LOW: Replace deprecated egrep with grep -E in all kickstart files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
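
The cache serialization described in the CRITICAL item can be sketched as below. This is an illustration under assumptions: `ensure_cached` and the convention that the builder command receives the temp path as its last argument are hypothetical, not taken from functions.sh.

```shell
# Build the cached base image at most once, even with parallel cold-cache runs.
# ensure_cached and the builder calling convention are hypothetical.
ensure_cached() {
    local cache="$1"; shift          # final path of the cached image
    local tmp="${cache}.$$.tmp"      # PID-qualified temp file
    (
        flock 9                      # block until the exclusive lock is held
        if [ ! -s "$cache" ]; then   # re-check under the lock
            if "$@" "$tmp"; then     # builder writes to the temp path
                mv "$tmp" "$cache"
            else
                rm -f "$tmp"
                exit 1
            fi
        fi
    ) 9> "${cache}.lock"
}
```

A second caller waits on the lock, then finds the cache populated and skips the build.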
The SSH retry loop suppresses all output, so a missing sshpass
binary causes 120 silent iterations before timing out with a
misleading "target VM did not become reachable" error.

Add an explicit check at the start of create_iscsi_target_vm().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
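
A sketch of the fail-fast idea, assuming a generic helper rather than the PR's actual code: `require_cmd` and `wait_until` are hypothetical names. Checking for the binary up front means the quiet retry loop only ever reports genuine reachability failures.

```shell
# Fail immediately, with a clear message, if a required tool is missing.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: required command '$1' not found in PATH" >&2
        return 1
    fi
}

# Retry a probe command up to MAX times, DELAY seconds apart, discarding output.
wait_until() {
    local max="$1" delay="$2"; shift 2
    local _retry=0
    while [ "$_retry" -lt "$max" ]; do
        "$@" >/dev/null 2>&1 && return 0
        _retry=$((_retry + 1))
        sleep "$delay"
    done
    return 1
}
```

A function such as create_iscsi_target_vm() could start with `require_cmd sshpass || return 1` before entering the loop.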
Comment thread functions.sh Fixed
Comment thread functions.sh Fixed
Comment thread functions.sh Fixed
Comment thread functions.sh Fixed
Comment thread iscsi-bind.sh Fixed
Comment thread iscsi.sh Fixed
Comment thread iscsi.sh Fixed
Comment thread iscsi.sh Fixed
Comment thread iscsi.sh Fixed
Comment thread iscsi.sh Fixed
@bruno-fs

Contributor Author

Full disclosure: this PR was produced entirely by AI agents working autonomously. All I did was:

  • prepare a SPEC (in tandem with claude) to plan it
  • let agents implement this in a sandbox (in yolo mode)
  • after it was done, I manually ran containers/runner/launch keyboard iscsi

I still need to review/refine this...

Comment thread functions.sh Outdated
bruno-fs and others added 2 commits May 20, 2026 20:36
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace hardcoded Fedora 42 cloud image URL with dynamic lookup
  via getfedora.org releases API (falls back to KSTEST_ISCSI_TARGET_IMAGE
  env var)
- Fix ShellCheck SC2155: split local declaration and assignment to
  avoid masking return values
- Fix ShellCheck SC2034: annotate intentionally unused `initiator`
  parameter, rename loop variable `i` to `_retry`
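
For readers unfamiliar with SC2155, here is a small illustration of why the split matters. `fetch_image_url` is a stand-in that always fails; the other function names are likewise hypothetical.

```shell
fetch_image_url() { return 1; }   # stand-in for a lookup that fails

lookup_masked() {
    # SC2155: the exit status seen here is that of 'local', which is 0,
    # so the failed lookup goes unnoticed.
    local url="$(fetch_image_url)"
}

lookup_checked() {
    local url
    # Declared separately, the assignment reports the lookup's own status.
    url="$(fetch_image_url)" || return 1
}
```

Calling `lookup_masked` succeeds even though the lookup failed, while `lookup_checked` propagates the failure to its caller.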

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread iscsi-bind.sh
bruno-fs and others added 2 commits May 20, 2026 20:59
Extract common iSCSI target setup (test ID sanitization, IQN
construction, target VM creation, kickstart placeholder substitution)
into _prepare_iscsi_target() in iscsi.sh. The iscsi-bind and
iscsi-ordering variants now only add their extra sed substitutions.
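
The sanitization and IQN construction steps might look something like this. Everything here is an assumption made for illustration: the function names, the allowed character set, and the `iqn.2026-05.org.example.kstest` prefix are placeholders, not values from iscsi.sh.

```shell
# Reduce a test ID to lowercase letters, digits, dots, and hyphens.
sanitize_test_id() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9.-' '-'
}

# Build a target IQN; the date and naming authority are placeholders.
build_iqn() {
    printf 'iqn.2026-05.org.example.kstest:%s\n' "$(sanitize_test_id "$1")"
}
```

With helpers like these in a shared function, each variant script only has to append its own sed substitutions.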

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the python3 one-liner with jq for parsing getfedora.org
releases.json — simpler, no python dependency in the shell path.
Add jq to the kstest-runner Dockerfile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2 participants