fix(keepkey-webusb): correct stalled-read recovery + time-bound USB transfers#44
Closed
BitHighlander wants to merge 3 commits into
Closed
fix(keepkey-webusb): correct stalled-read recovery + time-bound USB transfers#44BitHighlander wants to merge 3 commits into
BitHighlander wants to merge 3 commits into
Conversation
readChunk() reads the IN endpoint (transferIn), but on a "stall" status it
called clearHalt("out", ...) -- the wrong endpoint and direction. On WinUSB
the halted IN pipe then stays halted and every subsequent read re-stalls
until a physical replug. It also fell through and returned the stalled
transfer's (empty/short) buffer, which the framing parser rejected as
"message not valid".
Clear the halt on the IN pipe that actually stalled and throw a retryable
"bad read" instead of returning a non-packet. Applied to both the Node
(nodewebusb) and browser (webusb) transports.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…can't hang forever transferIn/transferOut were submitted with no timeout (libusb defaults to infinite), so a device that stops responding -- mid-reboot, suspended by Windows USB selective-suspend, or unplugged mid-write -- left the read/write promise pending forever, holding the claimed interface and producing a false ConflictingApp on the next pair. Wrap each transfer in a timeout: DEFAULT_TIMEOUT (5s) for writes (which never legitimately block) and LONG_TIMEOUT (5min) for reads (which may wait on a device button press for PIN/passphrase/tx confirmation, so they must not be cut short). On timeout we throw; the caller's existing disconnect path closes the device, aborting the orphaned native transfer and releasing the interface. Note: the read ceiling is LONG_TIMEOUT because the transport cannot see the per-call timeout (msgTimeout in hdwallet-keepkey is set but never plumbed down). A follow-up could thread the call timeout through readChunk for tighter bounds on normal (non-button) operations. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
…sfer timeout The prior comment overstated that the caller's disconnect path releases the interface on timeout. It does on the syncState/getFeatures recovery path but NOT on the sign/address RPC paths. Document the real behavior + the remaining options (harden caller, or release in-transport).
Collaborator
Author
|
Superseded by the focused, lower-risk split: #45 contains just the clearHalt IN/OUT correctness fix. The transfer-timeout half of this PR is deferred (it touches the shared transport more invasively and needs its own cross-platform validation); it can come back as a separate PR if/when Windows USB diagnostics confirm a real wedged-transfer problem. Closing to keep the correctness fix clean and independently reviewable. |
BitHighlander
added a commit
that referenced
this pull request
Jun 7, 2026
…#45) readChunk() reads the IN endpoint (transferIn) but on a "stall" it cleared clearHalt("out", ...) -- the wrong endpoint and direction. A halted IN pipe is only resumed by resetting that same pipe, so it stayed halted. The browser transport also fell through and returned the stalled (empty/short) buffer, which the framing parser then rejected as "message not valid". Reset the IN pipe and throw a retryable "bad read" instead of surfacing a non-packet. Applied to both hdwallet-keepkey-nodewebusb (Node/libusb) and hdwallet-keepkey-webusb (browser). Split out from #44 (which also added transfer timeouts); this is the isolated, low-risk correctness fix.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two robustness fixes to the KeepKey WebUSB transports, surfaced by a Windows USB audit of keepkey-vault (companion PR #227). These address the "device froze / had to restart the app" class of Windows reports (distinct from the "device not detected" symptom, which is fixed app-side in the vault).
Commit 1 —
clearHalton the correct endpoint when a read stalls (both transports)readChunk()reads the IN endpoint (transferIn), but on a"stall"status it calledclearHalt("out", …)— the wrong endpoint and direction. On WinUSB a halted IN pipe is only resumed by resetting that same pipe, so the IN pipe stayed halted and every subsequent read re-stalled until a physical replug. It also fell through and returned the stalled transfer's empty/short buffer, which the framing parser then rejected as"message not valid".Fix:
clearHalt("in", …)on the pipe that actually stalled, and throw a retryable"bad read"instead of returning a non-packet. Applied to bothhdwallet-keepkey-nodewebusb(Node/libusb) andhdwallet-keepkey-webusb(browser).Commit 2 — time-bound USB transfers (
hdwallet-keepkey-nodewebusb)transferIn/transferOutwere submitted with no timeout (libusb defaults to infinite). A device that stops responding — mid-reboot, suspended by Windows USB selective-suspend, or unplugged mid-write — left the promise pending forever, holding the claimed interface and producing a falseConflictingApp(LIBUSB code 19) on the next pair.Fix: wrap each transfer in a timeout —
DEFAULT_TIMEOUT(5 s) — a write never legitimately blocks;LONG_TIMEOUT(5 min) — a read may legitimately wait on a device button press (PIN, passphrase, tx/seed confirmation), so it must not be cut short.On timeout we
throw; the caller's existing disconnect path closes the device, which aborts the orphaned native transfer and releases the interface.Scope / notes
LONG_TIMEOUT(5 min) because the transport can't see the per-call timeout —msgTimeoutinhdwallet-keepkeyis set but never plumbed down toreadChunk. A follow-up could thread the call timeout throughreadChunk/writeChunkfor tighter bounds on normal (non-button) operations. The current bound is strictly better than infinite and safe for all button flows.webusbtransport gets theclearHaltfix only. The same timeout pattern could be mirrored there.withTimeoutpattern used in keepkey-vault, andcore.DEFAULT_TIMEOUT/core.LONG_TIMEOUTare already used byhdwallet-keepkey/src/transport.ts.Testing requested before merge
Exercise on real hardware on Windows: PIN entry, passphrase, a tx signing left to sit >60 s, and seed verification — confirm none time out — plus an unplug-mid-operation to confirm clean recovery (no false "device in use" on reconnect).
After merge, bump the
modules/hdwalletsubmodule pointer in keepkey-vault.🤖 Generated with Claude Code