test(ssh): add opt-in live-host SSH/sudo integration test #577
Merged
Closes the project's biggest test blind spot: the dial, auth-ordering, and sudo -n/-S paths were only unit-tested at the command-construction level (stubbed transport), never against a real box. A wired-up host could regress and every test stay green.

internal/ssh/livehost_test.go drives the REAL ssh.Dial + ssh.RunSudo, the primitives shared by every host-talking path (scan, discovery, collector, liveness), against an operator-supplied inventory:

    OPENWATCH_LIVE_HOSTS=/path/to/test_hosts.csv   (hostname,ip,username,credential)
    OPENWATCH_LIVE_KEY=/path/to/id_rsa

With either unset the test t.Skip()s, so it never gates normal CI; the inventory + key stay on the operator's workstation, never in the repo.

The fleet is heterogeneous, so the test DISCOVERS each host's capabilities rather than demanding every method everywhere. Per host it asserts the machinery for whatever the host supports:

- key auth dials -> ObservedAuth == "key" (the value the memo records)
- password auth dials -> ObservedAuth == "password"
- sudo mode confirmed via the `true` sentinel (nopasswd | password)
- the real `sudo -S -k -p '' true` password-on-stdin path executes

A server-side auth rejection (key not authorized, or PasswordAuthentication off) is a tolerated host-config fact; an unreachable host is skipped; only an unexpected protocol-level error or a wrong ObservedAuth/sudo result fails the test. A host with no usable auth is skipped.

Validated against the dev fleet: 5 key+NOPASSWD hosts pass (real key dial, sudo -n, and sudo -S all exercised), key-rejecting and unreachable hosts skip. The password-AUTH assertion is live-unverified only because the dev fleet runs PasswordAuthentication=no everywhere (noted in BACKLOG); it runs as soon as one password-enabled host is in the inventory.

Also drops the completed "wire SSH auth/sudo learning" backlog entry (shipped in #575 + #576).
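The opt-in gate and inventory format described above can be sketched roughly as follows. This is not the code in livehost_test.go: `liveHost`, `liveConfig`, and `parseInventory` are hypothetical names, the header-row handling is an assumption, and the environment lookup is injected so the gate can be exercised without touching the real process environment.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// liveHost is a hypothetical row type for the operator-supplied inventory.
type liveHost struct {
	Hostname, IP, Username, Credential string
}

// liveConfig reports whether both opt-in variables are set. getenv would be
// os.Getenv in a real test; when ok is false the test would call t.Skip.
func liveConfig(getenv func(string) string) (hostsPath, keyPath string, ok bool) {
	hostsPath = getenv("OPENWATCH_LIVE_HOSTS")
	keyPath = getenv("OPENWATCH_LIVE_KEY")
	return hostsPath, keyPath, hostsPath != "" && keyPath != ""
}

// parseInventory reads hostname,ip,username,credential rows.
// Assumption: a first row whose first field is "hostname" is a header.
func parseInventory(data string) ([]liveHost, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.FieldsPerRecord = 4 // reject rows that do not have exactly four fields
	r.TrimLeadingSpace = true
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	var hosts []liveHost
	for i, row := range rows {
		if i == 0 && strings.EqualFold(row[0], "hostname") {
			continue
		}
		hosts = append(hosts, liveHost{
			Hostname: row[0], IP: row[1], Username: row[2], Credential: row[3],
		})
	}
	return hosts, nil
}

func main() {
	// Only one of the two variables is set, so the gate stays closed.
	env := map[string]string{"OPENWATCH_LIVE_HOSTS": "/path/to/test_hosts.csv"}
	_, _, ok := liveConfig(func(k string) string { return env[k] })
	fmt.Println("run live test:", ok)

	hosts, err := parseInventory("hostname,ip,username,credential\nweb1,192.0.2.10,ops,secret\n")
	fmt.Println(len(hosts), err)
}
```

Injecting the lookup function is a design choice for this sketch only; the actual test may simply read the environment directly.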
What
Adds `internal/ssh/livehost_test.go`, an opt-in, self-skipping integration test that drives the real `ssh.Dial` + `ssh.RunSudo` against actual hosts. Closes the project's biggest test blind spot: the dial, auth-ordering, and `sudo -n`/`-S` paths were only unit-tested at the command-construction level (stubbed transport), never against a live box, so a wired-up host could regress with every test still green.

How it runs
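Set `OPENWATCH_LIVE_HOSTS` to the inventory CSV (columns `hostname,ip,username,credential`) and `OPENWATCH_LIVE_KEY` to the private key path. With either env var unset the test `t.Skip()`s, so it never gates normal CI. The inventory and key stay on the operator's workstation, never in the repo.

What it validates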
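The fleet is heterogeneous, so the test discovers each host's capabilities rather than demanding every method on every host. Per reachable host it asserts the machinery for whatever the host supports, which is exactly the set of observations the per-host `connprofile` memo records:

- key auth: dials → `ObservedAuth == "key"`
- password auth: dials → `ObservedAuth == "password"`
- sudo mode: confirmed via the `true` sentinel (`nopasswd` | `password`)
- `sudo -S`: the real `sudo -S -k -p '' true` password-on-stdin path executes

Tolerance rules keep it meaningful but not brittle on a real fleet:

- a server-side auth rejection (key not authorized, or `PasswordAuthentication no`) is a host-config fact → logged, that method skipped;
- an unreachable host is skipped;
- only an unexpected protocol-level error or a wrong `ObservedAuth`/sudo result fails the test;
- a host with no usable auth is skipped.

Proven against the dev fleet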
Ran it against the 9-host dev inventory: 5 key+NOPASSWD hosts pass (real key dial → `ObservedAuth=="key"`, `sudo -n true` → nopasswd, and the `sudo -S` password-on-stdin path all exercised end-to-end); 4 key-rejecting hosts and 1 unreachable host skip. This is live proof of the three SSH-learning PRs just merged (#566 / #575 / #576).

Known gap (documented in BACKLOG): the password-AUTH branch (`ObservedAuth=="password"`) is live-unverified because the dev fleet runs `PasswordAuthentication=no` everywhere; it runs the moment a password-enabled host is in the inventory.

Also
Drops the completed "wire SSH auth/sudo learning into discovery/intelligence/liveness" backlog entry (shipped in #575 + #576) and marks the live-host-test item mostly-done.
`gofmt` / `go vet` / `go build ./...` clean; `specter check` 0 errors; the test compiles and skips cleanly in the normal suite.
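For reviewers who want a concrete picture of the tolerance rules, here is a minimal sketch of how one dial attempt might be classified. It is not taken from livehost_test.go: `outcome`, `classifyDial`, and the sample errors are hypothetical, and detecting an auth rejection by matching the substring "unable to authenticate" is an assumption about the error text that the project's `ssh.Dial` may or may not preserve.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// outcome is a hypothetical label for how the test treats one dial attempt.
type outcome string

const (
	outcomeOK           outcome = "ok"
	outcomeAuthRejected outcome = "auth rejected (tolerated, method skipped)"
	outcomeUnreachable  outcome = "unreachable (host skipped)"
	outcomeFail         outcome = "fail"
)

// Illustrative stand-ins for errors a real dial might return.
var (
	errAuth  = errors.New("ssh: unable to authenticate")
	errDial  = &net.OpError{Op: "dial", Net: "tcp", Err: errors.New("connection refused")}
	errProto = errors.New("ssh: unexpected message type")
)

// classifyDial applies the tolerance rules to one attempt with one method.
func classifyDial(err error, wantAuth, observedAuth string) outcome {
	if err == nil {
		// A successful dial must report the method that was actually used.
		if observedAuth != wantAuth {
			return outcomeFail
		}
		return outcomeOK
	}
	// Assumption: a server-side rejection is recognizable from the message.
	if strings.Contains(err.Error(), "unable to authenticate") {
		return outcomeAuthRejected
	}
	// Network-level failures (refused, timeout) mean the host is unreachable.
	var netErr net.Error
	if errors.As(err, &netErr) {
		return outcomeUnreachable
	}
	// Anything else is an unexpected protocol-level error.
	return outcomeFail
}

func main() {
	wrapped := fmt.Errorf("dial web1: %w", errDial)
	fmt.Println(classifyDial(nil, "key", "key"))
	fmt.Println(classifyDial(errAuth, "key", ""))
	fmt.Println(classifyDial(wrapped, "key", ""))
	fmt.Println(classifyDial(errProto, "key", ""))
}
```

Checking for the auth rejection before the network error keeps a rejected key from being misread as an unreachable host; a typed error from the dialer would be more robust than string matching if one is available.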