M18.2: plumb ConnectTimeout/ConnectionAttempts; wrap connect with retry (FR-80)#29
Merged
…; wrap connect with retry (FR-80)
Closes the M12.6 ConnectTimeout / ConnectionAttempts deferral loop.
The primary `AnvilSession::connect` path now retries transient TCP
failures with jittered exponential backoff per the M18.1 retry
module; per-attempt connect timeouts wrap russh's `client::connect`
in `tokio::time::timeout` when configured.
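For readers unfamiliar with the technique, jittered exponential backoff can be sketched with the standard library alone. This is an illustration, not the M18.1 retry module: the names `backoff_ceiling` and `jittered_delay`, the per-mille jitter parameter, and the base/cap values in the usage note are all assumptions. The caller supplies the jitter fraction so the sketch stays deterministic; a real implementation would draw it from a random source.

```rust
use std::time::Duration;

/// Upper bound for the delay before retry number `attempt` (0-based):
/// `base * 2^attempt`, clamped to `cap`. Hypothetical helper.
pub fn backoff_ceiling(base: Duration, cap: Duration, attempt: u32) -> Duration {
    // Saturate instead of overflowing for large attempt numbers.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}

/// Scales the ceiling by a jitter fraction given in thousandths (0..=1000).
/// Values above 1000 are clamped, so the result never exceeds the ceiling.
pub fn jittered_delay(
    base: Duration,
    cap: Duration,
    attempt: u32,
    jitter_permille: u32,
) -> Duration {
    let ceiling = backoff_ceiling(base, cap, attempt);
    // Divide first so the multiplication cannot overflow; this rounds the
    // ceiling down to a multiple of 1000 ns before scaling.
    ceiling / 1000 * jitter_permille.min(1000)
}
```

With a 100 ms base and a 5 s cap, the ceilings for attempts 0 through 3 are 100 ms, 200 ms, 400 ms and 800 ms; the jitter fraction then spreads concurrent clients across each window so they do not retry in lockstep.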
src/config.rs:
- AnvilConfig: three new public fields:
connect_timeout: Option<Duration>
connection_attempts: Option<u32>
max_retry_window: Option<Duration>
Each None falls through to retry::RetryPolicy::default() at
session-build time.
- AnvilConfigBuilder: matching three setters. CLI applies them
AFTER apply_ssh_config so flags beat config (matches OpenSSH
precedence).
- apply_ssh_config: now consumes resolved.connect_timeout and
resolved.connection_attempts, but only if the builder field is
still None (preserves CLI-wins precedence). max_retry_window
is CLI-only — not in OpenSSH's grammar.
- warn_unhonored_directives helper REMOVED. Every ssh_config(5)
directive Anvil's resolver parses today is now consumed:
HostKeyAlgorithms / KexAlgorithms / Ciphers / MACs landed in
M17; ConnectTimeout / ConnectionAttempts in M18.
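The "only if the builder field is still None" rule above can be illustrated with a small stand-alone sketch. The types `ConnectSettings` and `ResolvedSshConfig` and the function `fill_from_ssh_config` are hypothetical stand-ins, not Anvil's actual `AnvilConfigBuilder` or resolver types; only the three field names come from this change.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the three new builder fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ConnectSettings {
    pub connect_timeout: Option<Duration>,
    pub connection_attempts: Option<u32>,
    pub max_retry_window: Option<Duration>,
}

/// Hypothetical stand-in for the values resolved from ssh_config.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ResolvedSshConfig {
    pub connect_timeout: Option<Duration>,
    pub connection_attempts: Option<u32>,
}

/// Fills each field from ssh_config only when it is still `None`, so a
/// value that was already set (for example by a CLI flag) always wins.
pub fn fill_from_ssh_config(
    mut settings: ConnectSettings,
    resolved: &ResolvedSshConfig,
) -> ConnectSettings {
    settings.connect_timeout = settings.connect_timeout.or(resolved.connect_timeout);
    settings.connection_attempts = settings
        .connection_attempts
        .or(resolved.connection_attempts);
    // max_retry_window has no ssh_config counterpart, so it is left untouched.
    settings
}
```

Using `Option::or` keeps the precedence rule in one expression per field: an already-set value short-circuits, and a `None` in both places stays `None` so a later default can apply.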
src/session.rs:
- AnvilSession: new private field `retry_history:
Vec<RetryAttempt>` capturing per-attempt failures during the
connect path. Surfaced via the new
pub fn AnvilSession::retry_history(&self) -> &[RetryAttempt]
accessor for `gitway --test --json`'s data.retry_attempts
envelope (FR-83). Empty when first attempt succeeded.
- connect(): wrapped in retry::run. Each attempt:
1. Build fresh HandlerPieces (russh consumes the handler).
2. Call client::connect(...), wrapped in tokio::time::timeout
when policy.connect_timeout is Some. Elapsed → mapped to
AnvilErrorKind::Io(io::ErrorKind::TimedOut) which the
FR-82 classifier treats as Retry.
3. Return (handle, ConnectArtifacts).
Retry history flows from retry::run into the AnvilSession
field. Auth / host-key / protocol errors are fatal per FR-82
and surface immediately without retry.
- New private struct ConnectArtifacts and helper
retry_policy_from_config(config) extracted to keep the closure
signature concise.
- connect_via_proxy_command and connect_via_jump_hosts (final
hop + per-hop) construct AnvilSession with retry_history:
Vec::new() for now. The plan PR body documents this
scope-narrowing: the ProxyCommand subprocess lifecycle and
per-hop direct-tcpip channels make retry semantics murkier
than the primary path; deferred to a follow-up sub-milestone.
FR-80..FR-83 are met for `gitway --test` against a direct
target host today; the proxy / jump paths fall back to the
pre-M18 single-attempt behaviour and surface the same opaque
IO error they did before.
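The retry-with-classification flow described for `connect()` can be sketched synchronously with the standard library. This is not the `retry::run` implementation: `run_with_retry`, `AttemptRecord`, `Disposition` and this `classify` are hypothetical, the set of retryable error kinds beyond `TimedOut` is an assumption, and recording the final failure in the history is a choice made for the sketch. Backoff sleeps and the async per-attempt timeout are omitted.

```rust
use std::io;

/// Hypothetical record of one failed attempt.
#[derive(Debug, Clone, PartialEq)]
pub struct AttemptRecord {
    pub attempt: u32,
    pub kind: io::ErrorKind,
}

#[derive(Debug, PartialEq)]
pub enum Disposition {
    Retry,
    Fatal,
}

/// Assumed classification: timeouts and a few transient TCP failures are
/// retried; everything else is treated as fatal.
pub fn classify(err: &io::Error) -> Disposition {
    match err.kind() {
        io::ErrorKind::TimedOut
        | io::ErrorKind::ConnectionRefused
        | io::ErrorKind::ConnectionReset => Disposition::Retry,
        _ => Disposition::Fatal,
    }
}

/// Calls `op` with a 1-based attempt number until it succeeds, fails
/// fatally, or `max_attempts` is reached. At least one attempt is made.
/// Every failed attempt is appended to the returned history.
pub fn run_with_retry<T, F>(max_attempts: u32, mut op: F) -> (Result<T, io::Error>, Vec<AttemptRecord>)
where
    F: FnMut(u32) -> Result<T, io::Error>,
{
    let mut history = Vec::new();
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return (Ok(value), history),
            Err(err) => {
                history.push(AttemptRecord { attempt, kind: err.kind() });
                if classify(&err) == Disposition::Fatal || attempt >= max_attempts {
                    return (Err(err), history);
                }
                // A real implementation would sleep for the backoff delay here.
                attempt += 1;
            }
        }
    }
}
```

Passing the operation as a closure mirrors the point made in step 1: each call builds whatever per-attempt state it needs from scratch, so nothing consumed by a failed attempt leaks into the next one.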
Public API: pure additive (three new pub fields on AnvilConfig +
three new builder setters + new retry_history accessor).
build_russh_config signature unchanged.
207 lib tests + integration tests still pass.
Plan: M18.2 of anvil-gitway-milestone-plan.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UnbreakableMJ added a commit that referenced this pull request on May 4, 2026
…t) (#30)
Final Anvil-side slice of M18. Bumps anvil-ssh from 0.8.0 to 0.9.0
to publish the M18.1 + M18.2 work as a single crates.io release.
The Gitway-side CLI flags + retry_attempts JSON envelope (M18.4 +
M18.5) land against this 0.9.0; the M18.X PRD doc PR closes the
milestone with Gitway v1.0.0-rc.9.
Cargo.toml:
- version "0.8.0" -> "0.9.0"
Cargo.lock:
- regenerated locally; reflects the 0.9.0 version.
CHANGELOG.md:
- 0.9.0 entry covering the new anvil_ssh::retry module
(RetryPolicy, classify, run, RetryAttempt), the new CAT_RETRY
tracing category, the AnvilError::io_kind + is_transient
predicates, the three new AnvilConfig fields + builder setters,
the apply_ssh_config consumption of ConnectTimeout /
ConnectionAttempts, the AnvilSession::connect retry+timeout
wrap, and the AnvilSession::retry_history accessor. Documents
the proxy/jump scope-narrowing and the HTTP 429/503
out-of-scope decision.
Stacked after PRs #28 (M18.1, merged) and #29 (M18.2, merged).
Plan: M18.3 of anvil-gitway-milestone-plan.md.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes the M12.6 `ConnectTimeout` / `ConnectionAttempts` deferral loop. The primary `AnvilSession::connect` path now retries transient TCP failures with jittered exponential backoff per the M18.1 retry module; per-attempt connect timeouts wrap russh's `client::connect` in `tokio::time::timeout` when configured.
AnvilConfig additions:
- `connect_timeout: Option<Duration>` (FR-80)
- `connection_attempts: Option<u32>` (FR-80)
- `max_retry_window: Option<Duration>` (FR-81, CLI-only)
- `AnvilConfigBuilder` setters
- `apply_ssh_config` now consumes `connect_timeout` + `connection_attempts` (only when the builder field is `None`; CLI-wins precedence preserved)
- `warn_unhonored_directives` helper removed: every parsed `ssh_config` directive is now consumed (M17 + M18 closed every deferral)
AnvilSession additions:
- `retry_history: Vec<RetryAttempt>` field
- `pub fn retry_history(&self) -> &[RetryAttempt]` accessor for the `gitway --test --json` envelope (FR-83)
- `connect()` wrapped in `retry::run`: each attempt rebuilds `HandlerPieces` (russh consumes the handler), calls `client::connect` inside `tokio::time::timeout` when `connect_timeout` is `Some`, and surfaces `Elapsed` as `Io(TimedOut)` so the FR-82 classifier retries it. Auth / host-key / protocol errors are fatal and surface immediately.
Scope-narrowing (documented as deferred to a follow-up):
`connect_via_proxy_command` and `connect_via_jump_hosts` (per-hop + final) construct `AnvilSession` with an empty `retry_history` for now, because the ProxyCommand subprocess lifecycle and per-hop `direct-tcpip` channels make retry semantics murkier than the primary path. FR-80..FR-83 are met for `gitway --test` against a direct target host today; the proxy / jump paths fall back to the pre-M18 single-attempt behaviour.
Public API: pure additive. Version bump to 0.9.0 lands in M18.3.
Test plan
- `cargo fmt --all -- --check`
- `cargo clippy --all-targets --all-features --locked -- -D warnings`
- `cargo test --lib --tests --locked`: all green (existing 207 lib + integration set)
🤖 Generated with Claude Code