feat: initial CI/release scaffolding + Phase 0 bug fixes (vsock/serial/timer) #2
Closed
tolgakaratas wants to merge 8 commits into master
Source code changes (no CI/infrastructure):
- Cross-platform module gating: storage/virtio keep tests portable, Linux-only modules gated with cfg(target_os = "linux")
- Shared compat module (IoctlReq, SendPthreadT) for glibc/musl differences
- All clippy lints resolved via cargo fix + cargo clippy --fix on Rust 1.95
- musl static build compatibility: SYS_renameat2 raw syscall, platform-correct ioctl types, Send wrapper for pthread_t
- Fix _host_offset naming bug in balloon inflate (compile error on Linux)
- Platform-conditional cast for libc::S_IFMT (u16 macOS, u32 Linux)
- dead_code allow on modules with forward-declared upstream API
- rustfmt applied with max_width=120
Verified: 0 clippy errors on Linux (rust:1.95) and macOS, 266+188 tests pass.
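The glibc/musl split behind the compat module can be sketched as follows. This is an illustration only, not the PR's actual code: the names `IoctlReq` and `SendPthreadT` come from the commit message, but their definitions here (and the `RawPthread` alias) are assumptions based on how the two C libraries declare `ioctl` and `pthread_t`.

```rust
use std::os::raw::{c_int, c_ulong};

/// ioctl request type. glibc declares `ioctl(int, unsigned long, ...)`,
/// while musl declares `ioctl(int, int, ...)`, so the alias is gated on
/// the target environment. (Assumed definition.)
#[cfg(target_env = "musl")]
pub type IoctlReq = c_int;
#[cfg(not(target_env = "musl"))]
pub type IoctlReq = c_ulong;

/// Stand-in for `libc::pthread_t` (hypothetical alias, so the sketch does
/// not need the libc crate): an integer on glibc, a raw pointer on musl.
#[cfg(target_env = "musl")]
pub type RawPthread = *mut std::os::raw::c_void;
#[cfg(not(target_env = "musl"))]
pub type RawPthread = c_ulong;

/// Wrapper that lets a thread handle cross thread boundaries even when
/// the underlying type is a raw pointer, which is not `Send` by default.
#[derive(Clone, Copy)]
pub struct SendPthreadT(pub RawPthread);

// SAFETY (assumption of this sketch): the handle is an opaque identifier
// that is only passed back to pthread functions and never dereferenced.
unsafe impl Send for SendPthreadT {}

/// Build a request value with the platform-correct width.
pub fn ioctl_req(raw: u32) -> IoctlReq {
    raw as IoctlReq
}
```

Call sites can then take `IoctlReq` and `SendPthreadT` without their own `cfg` attributes, which keeps the platform difference in one place.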
- profile.release: LTO fat, codegen-units=1, panic=abort, strip=true
- Cargo.toml: homepage, repository, keywords, MSRV 1.87
- Workspace members: add rust-version = "1.87"
- rustfmt.toml: max_width=120 matching original codebase style
- .editorconfig: consistent settings across editors
- Makefile: add shift-left targets (make ci, make fix, make lint)
- .gitignore: add VM artifact patterns (*.img, *.qcow2)
Workflows:
- build.yml: fmt, clippy, musl static build+test, MSRV 1.87 check, cargo-deny, security audit (with smart change detection)
- release-please.yml: conventional commits to automated release PRs
- release.yml: x86_64+aarch64 musl static binaries, SHA256 checksums, cosign keyless signing, SLSA attestation, SBOM (SPDX)
- security-scan.yml: weekly cargo audit, cargo deny, CodeQL Rust
- dependabot.yml: weekly cargo+actions updates with semantic grouping
- dependabot-auto-merge.yml: auto-squash-merge patch/minor updates
Templates:
- Issue templates (bug report, feature request)
- Pull request template with checklist
- SECURITY.md: vulnerability reporting via GitHub private advisories
- CONTRIBUTING.md: setup, shift-left local CI (make ci), pre-commit hooks, conventional commits, code style guide
- CHANGELOG.md: initial file for release-please automation
- README.md: CI status, license, and MSRV badges
- mise: rust + cargo-binstall + pre-commit; setup/ci tasks
- pre-commit: cargo autofix on commit, test+deny on push
- deny.toml: license allowlist (MIT/Apache/BSD/ISC), advisory checks
- release-please: Rust release type, version sync, changelog sections
clone-init forks the agent off PID 1 of the initrd before exec'ing systemd. Inside that fork-descendant, every blocking sleep call (usleep/nanosleep/std::thread::sleep) never wakes: the kernel timer state for the child is wedged. The pre-execve usleep(50_000) killed the child mid-sleep, and the agent's heartbeat loop wedged on its first SO_RCVTIMEO recv after sending Ready.
- crates/clone-init/src/main.rs: drop the pre-execve usleep; child setsid + execve immediately so the kernel doesn't park it.
- crates/guest-agent/src/main.rs: replace every blocking sleep with libc::sched_yield() loops; mark the vsock fd O_NONBLOCK and use MSG_PEEK + MSG_DONTWAIT for recv pacing.
- src/virtio/vsock.rs: log every TX op so heartbeat-cadence regressions are visible in the VMM stderr stream.
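The yield-based pacing described above might look roughly like the sketch below. It is not the agent's code: it uses a `UnixStream` in place of the vsock fd, `set_nonblocking` in place of setting O_NONBLOCK by hand, and `std::thread::yield_now` in place of `libc::sched_yield()`. It also does a plain non-blocking read rather than MSG_PEEK + MSG_DONTWAIT. The function names are hypothetical.

```rust
use std::io::{self, ErrorKind, Read};
use std::os::unix::net::UnixStream;
use std::thread;
use std::time::{Duration, Instant};

/// Wait until `deadline` without ever entering a blocking sleep:
/// the loop only yields the CPU and re-checks the clock.
pub fn yield_until(deadline: Instant) {
    while Instant::now() < deadline {
        thread::yield_now();
    }
}

/// Poll a non-blocking stream until data arrives or `timeout` elapses.
/// Returns `Ok(Some(n))` when `n` bytes were read and `Ok(None)` on
/// timeout. The caller must have put the stream in non-blocking mode.
pub fn recv_with_yield(
    stream: &mut UnixStream,
    buf: &mut [u8],
    timeout: Duration,
) -> io::Result<Option<usize>> {
    let deadline = Instant::now() + timeout;
    loop {
        match stream.read(buf) {
            Ok(n) => return Ok(Some(n)),
            // No data yet: yield instead of sleeping, then retry.
            Err(e) if e.kind() == ErrorKind::WouldBlock => {
                if Instant::now() >= deadline {
                    return Ok(None);
                }
                thread::yield_now();
            }
            // A signal interrupted the call: simply retry.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

The trade-off is CPU time: a yield loop keeps the thread runnable for the whole wait, which is acceptable here only because timer-backed sleeps do not wake at all in the forked child.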
Serial::write buffered guest stdout until \n or 256 bytes, so
no-trailing-newline payloads (notably the `clone login: ` prompt
agetty prints and then sleeps in ppoll) never reached the
/tmp/clone-{pid}.console socket. `clone attach` showed nothing.
- src/vmm/serial.rs: tee every byte to console_fd immediately;
retain an 8 KiB rolling history of recent output.
- src/vmm/mod.rs: on console-client attach, replay the history before
registering the live tee fd, so a late `clone attach` still sees
the boot banner and login prompt that were already printed.
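A minimal sketch of the tee-plus-replay idea, assuming a single attached client and a byte-oriented ring buffer. The `ConsoleTee` type and its methods are hypothetical and are not taken from src/vmm/serial.rs; only the 8 KiB history size and the replay-before-register ordering come from the commit message.

```rust
use std::collections::VecDeque;
use std::io::{self, Write};

/// Size of the rolling history, as stated in the commit message.
const HISTORY_CAP: usize = 8 * 1024;

/// Forwards guest output to an attached client byte by byte and keeps
/// the most recent `HISTORY_CAP` bytes for clients that attach late.
pub struct ConsoleTee<W: Write> {
    history: VecDeque<u8>,
    client: Option<W>,
}

impl<W: Write> ConsoleTee<W> {
    pub fn new() -> Self {
        Self { history: VecDeque::with_capacity(HISTORY_CAP), client: None }
    }

    /// Record the byte and forward it immediately. There is no line
    /// buffering, so a prompt without a trailing newline still shows up.
    pub fn write_byte(&mut self, byte: u8) -> io::Result<()> {
        if self.history.len() == HISTORY_CAP {
            self.history.pop_front(); // drop the oldest byte
        }
        self.history.push_back(byte);
        if let Some(client) = self.client.as_mut() {
            client.write_all(&[byte])?;
            client.flush()?;
        }
        Ok(())
    }

    /// Replay the stored history first, then register the client for
    /// live output, so a late attach sees what was already printed.
    pub fn attach(&mut self, mut client: W) -> io::Result<()> {
        let (head, tail) = self.history.as_slices();
        client.write_all(head)?;
        client.write_all(tail)?;
        client.flush()?;
        self.client = Some(client);
        Ok(())
    }
}
```

Replaying before registering matters for ordering: if the client were registered first, a byte written during the replay could arrive ahead of older history.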
Vcpu::new masked off TSC-deadline (CPUID.1.ECX[24]) and the kvmclock
feature bits (CPUID.0x40000001.EAX[0,3,24]) to dodge a fork/restore
bug where MSR_KVM_SYSTEM_TIME_NEW didn't round-trip through GET_MSRS.
Cost: the guest fell back to TSC calibration via PIT/HPET, the
in-kernel irqchip under-delivered ticks on idle APs, and idle CPUs
received ~zero LOC interrupts. systemd then wedged in
synchronize_rcu_normal because the grace period waits for every CPU
to pass through a quiescent state, which a tickless idle AP never does.
- src/vmm/vcpu.rs: keep both TSC-deadline and kvmclock bits in fresh
CPUID. Pin TSC frequency via set_tsc_khz(get_tsc_khz()) so the
guest doesn't have to calibrate against PIT/HPET. Fork path
(from_template) keeps its existing snapshot-aware MSR handling.
- src/main.rs: drop the rcupdate.rcu_expedited=1 +
rcu_normal_after_boot=0 cmdline workaround now that the underlying
timer path is fixed.
Verified on Ubuntu rootfs, 2 vCPUs, 1 GB RAM:
before: LOC cpu0=102 cpu1=0 over 17 min, clocksource=tsc-early,
systemd in D-state on synchronize_rcu_normal
after: LOC cpu0=17936 cpu1=1030 over ~1 min, clocksource=tsc
(kvm-clock available), systemd S-state and reaches login.
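The CPUID bits named in this commit message can be written down as masks. The sketch below shows what masking them off amounts to and how one might check that a fresh CPUID table still advertises them. The `CpuidEntry` struct and both functions are hypothetical; real code would operate on the entries returned by KVM, and this is not the logic in src/vmm/vcpu.rs.

```rust
/// Minimal stand-in for one CPUID leaf (hypothetical shape).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CpuidEntry {
    pub function: u32,
    pub eax: u32,
    pub ebx: u32,
    pub ecx: u32,
    pub edx: u32,
}

const LEAF_FEATURES: u32 = 0x1;
const LEAF_KVM_FEATURES: u32 = 0x4000_0001;
/// CPUID.1.ECX[24]: TSC-deadline timer.
const ECX_TSC_DEADLINE: u32 = 1 << 24;
/// CPUID.0x40000001.EAX[0,3,24]: the kvmclock feature bits.
const EAX_KVMCLOCK_BITS: u32 = (1 << 0) | (1 << 3) | (1 << 24);

/// Illustrates the old behaviour: clear the timer-related bits so the
/// guest cannot use TSC-deadline or kvmclock.
pub fn strip_timer_features(entries: &mut [CpuidEntry]) {
    for e in entries.iter_mut() {
        match e.function {
            LEAF_FEATURES => e.ecx &= !ECX_TSC_DEADLINE,
            LEAF_KVM_FEATURES => e.eax &= !EAX_KVMCLOCK_BITS,
            _ => {}
        }
    }
}

/// Check that a CPUID table still advertises both features, which is
/// the state the fix keeps for fresh vCPUs.
pub fn timer_features_present(entries: &[CpuidEntry]) -> bool {
    let tsc_deadline = entries
        .iter()
        .any(|e| e.function == LEAF_FEATURES && (e.ecx & ECX_TSC_DEADLINE) != 0);
    let kvmclock = entries.iter().any(|e| {
        e.function == LEAF_KVM_FEATURES && (e.eax & EAX_KVMCLOCK_BITS) == EAX_KVMCLOCK_BITS
    });
    tsc_deadline && kvmclock
}
```

A check like `timer_features_present` could serve as a regression guard in a unit test, so that a future change to the CPUID setup cannot silently reintroduce the masking.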
Author
Closing: wrong target repo. Will re-open against unixshells/clone:master once upstream PR strategy is finalized.
Summary
Foundation layer (8 commits): the initial GitHub Actions CI + release pipeline, README + LICENSE + dev tooling scaffold, and the three production bug fixes that closed out Phase 0.
Groups
Initial scaffolding (5 commits)
- 9c467c6 fix: code quality, musl compatibility, cross-platform module gating
- 72e8325 build: optimize release profile and project metadata
- d0ec713 ci: add GitHub Actions CI/CD with release automation
- 1bc4b37 docs: add project documentation and README badges
- f4c9ef8 chore: add development tooling and release configuration
Phase 0 bug fixes (3 commits)
- f62c673 fix(vsock): unblock guest agent loop after PID 1 fork
- ddecd5c fix(serial): tee guest output per-byte + replay on attach
- fec217a fix(timer): re-enable kvmclock + tsc-deadline on fresh vCPUs
Test plan
- make ci passes locally (fmt + clippy + test + deny + audit)
Stack
This is PR 1 of 8 in the CI/CD overhaul stack. Each subsequent PR builds on this one. Merge order: 1 → 2 → … → 8.