
feat(ssh): wire per-host sudo-mode learning into discovery/intelligence/liveness#576

Merged: remyluslosius merged 1 commit into main from feat/sudo-mode-learning on Jun 16, 2026


@remyluslosius

Contributor

What

Follow-up to #575 (which wired the SSH auth-method dimension). This wires the sudo-mode dimension of the connprofile memo into the same three paths:

  • liveness privilege probe (internal/sshprivilege)
  • OS discovery firewall probe (internal/intelligence/discovery)
  • OS intelligence collector (internal/intelligence/collector)

Each now leads sudo with the host's recorded mode (skipping the doomed sudo -n on a password-sudo host) and records the mode confirmed to work.

Division of labour

  • Liveness probe = authoritative learner. It runs an innocuous true sentinel every ~5 min, so it reliably confirms the mode regardless of any real command's exit code.
  • Discovery + collector = opportunistic. They learn from their existing real sudo commands — no extra round-trip. To avoid misrecording, a mode is recorded only on a confirmed exit-0 of a given sudo form, never inferred from a command that failed for its own reasons. (The scan's true-sentinel rule, C-04, binds only the scan; these paths defer to the liveness probe for authoritative confirmation.)
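The opportunistic rule above can be sketched as a small pure function. This is a minimal illustration, not the code in the PR: the `attempt` type, the `learnedMode` helper, and the string values behind `SudoNopasswd`/`SudoPassword` are assumptions made for the example.

```go
package main

// Plain-string sudo-mode tokens. The names appear in the PR description;
// the string values here are assumed for illustration.
const (
	SudoUnknown  = ""
	SudoNopasswd = "nopasswd"
	SudoPassword = "password"
)

// attempt is a hypothetical record of one real sudo command run by
// discovery or the collector.
type attempt struct {
	form     string // sudo form used: SudoNopasswd (sudo -n) or SudoPassword (sudo -S)
	exitCode int    // exit code of the whole command
}

// learnedMode returns the mode that is safe to record: the form of the
// first attempt that exited 0. A non-zero exit proves nothing about the
// sudo form, because the command may have failed for its own reasons,
// so in that case nothing is learned.
func learnedMode(attempts []attempt) string {
	for _, a := range attempts {
		if a.exitCode == 0 {
			return a.form
		}
	}
	return SudoUnknown
}
```

With attempts `{SudoNopasswd, 1}` followed by `{SudoPassword, 0}` the helper returns `SudoPassword`; if every attempt fails it returns `SudoUnknown`, leaving confirmation to the liveness probe.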

Mechanics

  • ssh.RunSudo (collector's shared primitive) gains prefer (in) + observed (out). With prefer=SudoPassword and the password gate satisfied it leads with sudo -S, issuing zero sudo -n calls; observed reports the confirmed form. Plain-string tokens (SudoNopasswd/SudoPassword) keep the ssh package decoupled from connprofile, mirroring PreferKey/PreferPassword.
  • discovery.runSudoWithFallback + probeFirewall thread prefer and return the learned mode; the discovery Service reads it (cfg.prefer) and records it.
  • sshprivilege.Probe extracts probeSudo: leads with sudo -S on a known password host, records the confirmed mode, and preserves the exact AC-18/19/21 error shapes.
  • collector threads the recorded mode across the cycle's sudo commands and records once at the end.
  • cmd/openwatch wires the shared connprofile store into the collector + discovery services (the probe already had it from #575).
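The prefer (in) / observed (out) contract described for ssh.RunSudo might look roughly like the sketch below. It is an assumption-laden simplification: the `runner` callback, the `runSudo` signature, and the token values are hypothetical, and any non-zero exit is treated as a reason to try the next form, which the real primitive may not do.

```go
package main

// Token values are assumed; only the names come from the PR description.
const (
	SudoNopasswd = "nopasswd"
	SudoPassword = "password"
)

// runner is a hypothetical callback that executes the command under one
// sudo form and returns its exit code.
type runner func(form string) int

// runSudo tries the sudo forms in order and reports which one was
// confirmed (exit 0). When prefer is SudoPassword and the password gate
// is open, it leads with the password form so a password-sudo host sees
// no sudo -n call at all. observed is "" when no form was confirmed.
func runSudo(run runner, prefer string, passwordAllowed bool) (exit int, observed string) {
	order := []string{SudoNopasswd} // default: only sudo -n is permitted
	if passwordAllowed {
		order = []string{SudoNopasswd, SudoPassword}
		if prefer == SudoPassword {
			// Lead with the recorded mode; keep sudo -n as the
			// self-healing fallback for a stale hint.
			order = []string{SudoPassword, SudoNopasswd}
		}
	}
	for _, form := range order {
		exit = run(form)
		if exit == 0 {
			return 0, form
		}
	}
	return exit, ""
}
```

A caller would pass the mode read from the store as `prefer` and write `observed` back only when it is non-empty, which matches the "record only on confirmed exit-0" rule.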

Safety

The sudo password gate (kill-switch + auth-method ∈ {password, both}) is unchanged — leading with sudo -S is allowed only when a password may already be fed. Learning is best-effort: a store miss/error escalates in the default order and never fails the connection; a stale mode self-heals (a sudo -S miss falls back to sudo -n).
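The safety properties above can be sketched as two small helpers. Everything here is hypothetical naming for illustration (`modeStore`, `mapStore`, `passwordGateOpen`, `preferredMode`, `leadForm`, and the token values); the sketch also assumes that an engaged kill-switch disables password sudo.

```go
package main

import "errors"

// Token values are assumed; only the names come from the PR description.
const (
	SudoNopasswd = "nopasswd"
	SudoPassword = "password"
)

// modeStore is a hypothetical stand-in for the shared connprofile store.
type modeStore interface {
	SudoMode(host string) (string, error)
}

// mapStore is an in-memory store used only for this illustration.
type mapStore map[string]string

var errNoMode = errors.New("no recorded sudo mode")

func (m mapStore) SudoMode(host string) (string, error) {
	mode, ok := m[host]
	if !ok {
		return "", errNoMode
	}
	return mode, nil
}

// passwordGateOpen mirrors the unchanged gate: the kill-switch must be
// off and the auth method must be "password" or "both".
func passwordGateOpen(killSwitch bool, authMethod string) bool {
	return !killSwitch && (authMethod == "password" || authMethod == "both")
}

// preferredMode is best-effort: a store miss or error yields "", which
// means "escalate in the default order" and never fails the connection.
func preferredMode(s modeStore, host string) string {
	mode, err := s.SudoMode(host)
	if err != nil {
		return ""
	}
	return mode
}

// leadForm picks the sudo form to try first. The recorded mode only
// reorders attempts the gate already permits; it never opens the gate.
func leadForm(s modeStore, host string, killSwitch bool, authMethod string) string {
	if preferredMode(s, host) == SudoPassword && passwordGateOpen(killSwitch, authMethod) {
		return SudoPassword
	}
	return SudoNopasswd
}
```

The design point is that learning reorders attempts but cannot widen what is permitted: with the gate closed, a recorded password mode is ignored.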

Spec / tests

  • system-connection-profile v1.2.0: adds C-07 and AC-10 (RunSudo primitive), AC-11 (discovery firewall probe), AC-12 (liveness probe).
  • New tests: TestRunSudo_SudoModeLearning (lead-with, observe nopasswd/password, no-observation-when-ambiguous, stale-hint self-heal), TestProbeFirewall_SudoModeLearning, TestPrivilegeProbe_SudoModeLearning. Existing sudo/firewall tests updated for the new signatures + learning assertions.
  • gofmt/go vet/go build ./... clean; touched-package suites green against the isolated test DB; specter check 0 errors; system-connection-profile 12/12 ACs have results, PASSES tier 2.

With this, the auth-method and sudo-mode learning the compliance scan already had (#566) now covers all four host-talking paths.

…ce/liveness

Follow-up to PR #575 (auth-method learning). The same three paths that
talk to a managed host still probed sudo mode from scratch every cycle —
running a doomed `sudo -n` on a password-sudo host before retrying
`sudo -S`. Extend the connprofile memo to the SUDO dimension so each path
leads with the host's recorded mode and records the mode confirmed to
work.

Division of labour:
- The liveness privilege probe is the AUTHORITATIVE sudo-mode learner: it
  runs an innocuous `true` sentinel every ~5 min, so it reliably confirms
  the mode regardless of any real command's exit code.
- OS discovery (firewall probe) and OS intelligence (collector) learn
  OPPORTUNISTICALLY from their existing real sudo commands — no extra
  round-trip. To avoid misrecording, a mode is recorded ONLY on a
  confirmed exit-0 of a given sudo form, never inferred from a command
  that failed for its own reasons.

Mechanics:
- ssh.RunSudo (collector's shared primitive) gains a prefer (in) + observed
  (out): when prefer=SudoPassword and the password gate is satisfied it
  leads with `sudo -S`, skipping the doomed `sudo -n`; observed reports the
  confirmed form. Plain-string tokens (SudoNopasswd/SudoPassword) keep the
  ssh package decoupled from connprofile, matching PreferKey/PreferPassword.
- discovery.runSudoWithFallback + probeFirewall thread prefer + return the
  learned mode; the discovery Service reads it (cfg.prefer) and records it.
- sshprivilege.Probe extracts probeSudo: leads with `sudo -S` on a known
  password host, records the confirmed mode (preserving the AC-18/19/21
  error shapes).
- collector threads the recorded mode across the cycle's sudo commands and
  records once at the end.
- cmd/openwatch wires the shared connprofile store into the collector and
  discovery services (the probe already had it from #575).

The sudo password GATE (kill-switch + auth-method) is unchanged: leading
with `sudo -S` is allowed only when a password may already be fed. Learning
stays best-effort — a store miss/error escalates in the default order and
never fails the connection; a stale mode self-heals (sudo -S miss falls
back to sudo -n).

Spec system-connection-profile -> v1.2.0: C-07, AC-10 (RunSudo primitive),
AC-11 (discovery firewall probe), AC-12 (liveness probe).
remyluslosius merged commit 767b1cd into main on Jun 16, 2026
13 checks passed
remyluslosius added a commit that referenced this pull request Jun 16, 2026
Closes the project's biggest test blind spot: the dial, auth-ordering,
and sudo -n/-S paths were only unit-tested at the command-construction
level (stubbed transport), never against a real box. A wired-up host
could regress and every test stay green.

internal/ssh/livehost_test.go drives the REAL ssh.Dial + ssh.RunSudo —
the primitives every host-talking path (scan, discovery, collector,
liveness) shares — against an operator-supplied inventory:

  OPENWATCH_LIVE_HOSTS=/path/to/test_hosts.csv  (hostname,ip,username,credential)
  OPENWATCH_LIVE_KEY=/path/to/id_rsa

With either unset the test t.Skip()s, so it never gates normal CI; the
inventory + key stay on the operator's workstation, never in the repo.
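A minimal sketch of the opt-in gating and inventory parsing described above. The environment variable names and the column order come from the commit message; `liveHost`, `liveEnabled`, and `parseInventory` are hypothetical names, and the sketch assumes the CSV has no header row.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// liveHost is one inventory row: hostname,ip,username,credential.
type liveHost struct {
	Hostname   string
	IP         string
	Username   string
	Credential string
}

// liveEnabled reports whether both variables are set. getenv is injected
// so the check can be exercised without touching the process
// environment; a real test would pass os.Getenv and call t.Skip when
// this returns false.
func liveEnabled(getenv func(string) string) bool {
	return getenv("OPENWATCH_LIVE_HOSTS") != "" && getenv("OPENWATCH_LIVE_KEY") != ""
}

// parseInventory parses CSV inventory text into hosts, rejecting any row
// that does not have exactly four fields. A real test would read the
// text from the file named by OPENWATCH_LIVE_HOSTS.
func parseInventory(data string) ([]liveHost, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.FieldsPerRecord = 4
	r.TrimLeadingSpace = true
	rows, err := r.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("read inventory: %w", err)
	}
	hosts := make([]liveHost, 0, len(rows))
	for _, row := range rows {
		hosts = append(hosts, liveHost{
			Hostname:   row[0],
			IP:         row[1],
			Username:   row[2],
			Credential: row[3],
		})
	}
	return hosts, nil
}
```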

The fleet is heterogeneous, so the test DISCOVERS each host's
capabilities rather than demanding every method everywhere. Per host it
asserts the machinery for whatever the host supports:

  - key auth dials      -> ObservedAuth == "key"      (the value the memo records)
  - password auth dials -> ObservedAuth == "password"
  - sudo mode confirmed via the `true` sentinel (nopasswd | password)
  - the real `sudo -S -k -p '' true` password-on-stdin path executes

A server-side auth rejection (key not authorized, or PasswordAuthentication
off) is a tolerated host-config fact; an unreachable host is skipped; only
an unexpected protocol-level error or a wrong ObservedAuth/sudo result
fails the test. A host with no usable auth is skipped.
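The tolerate / skip / fail policy above could be expressed as a single classification step. This is a sketch only: the `dialError` and `verdict` enums and the `classify` helper are hypothetical and are not taken from livehost_test.go.

```go
package main

// dialError is a hypothetical coarse classification of a dial failure.
type dialError int

const (
	errNone         dialError = iota // dial succeeded
	errUnreachable                   // host did not answer
	errAuthRejected                  // server refused this auth method
	errProtocol                      // unexpected protocol-level error
)

// verdict is what the live test does with one auth attempt on one host.
type verdict int

const (
	verdictPass      verdict = iota // machinery behaved as expected
	verdictTolerated                // host-config fact, not a failure
	verdictSkip                     // host cannot be tested right now
	verdictFail                     // real regression
)

// classify applies the policy: server-side auth rejection is tolerated,
// an unreachable host is skipped, and only a protocol-level error or a
// wrong ObservedAuth value fails the test.
func classify(e dialError, wantAuth, observedAuth string) verdict {
	switch e {
	case errUnreachable:
		return verdictSkip
	case errAuthRejected:
		return verdictTolerated
	case errProtocol:
		return verdictFail
	}
	if observedAuth != wantAuth {
		return verdictFail
	}
	return verdictPass
}
```

Keeping the policy in one function makes it clear that a heterogeneous fleet produces skips and tolerated rejections, while failures are reserved for behavior the memo would record incorrectly.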

Validated against the dev fleet: 5 key+NOPASSWD hosts pass (real key dial,
sudo -n, and sudo -S all exercised), key-rejecting and unreachable hosts
skip. The password-AUTH assertion is live-unverified only because the dev
fleet runs PasswordAuthentication=no everywhere (noted in BACKLOG); it
runs as soon as one password-enabled host is in the inventory.

Also drops the completed "wire SSH auth/sudo learning" backlog entry
(shipped in #575 + #576).
remyluslosius deleted the feat/sudo-mode-learning branch on June 17, 2026 03:13