Add missing timeouts to put.io requests that can hang and stall all downloads (fixes #32) #33

Open
bugrax wants to merge 4 commits into wouterdebie:main from bugrax:timeout-on-send

Conversation

bugrax commented Jun 10, 2026

Contributor

Fixes #32.

Problem

Several put.io HTTP calls could block indefinitely if put.io accepts the connection but stalls before responding (which happens under load), because they had no request timeout — only the shared download client's connect_timeout.

The worst offenders are list_files and url, called from get_download_targets for every transfer. When they hang, the orchestration worker that called them is stuck before any download starts; once the few workers are stuck the whole process stops pulling while transfers pile up COMPLETED on put.io. A restart only clears it temporarily.

Observed live: after a restart with ~100 COMPLETED transfers, a handful of transfers log "generating targets" and then nothing more appears. get_download_targets is parked inside list_files/url with no timeout.

Fix

  • Add a 30s .timeout() to list_files, url, and account_info (the put.io calls that were missing it; the others already had one).
  • Wrap the download req.send() in a 60s tokio::time::timeout as well, so a stalled download request (after get_download_targets succeeds) also fails and is retried/resumed by the existing loop instead of hanging.

Together these ensure no single hung put.io request can freeze a worker indefinitely.
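
For readers who want to reproduce the failure mode in isolation, here is a minimal std-only sketch. It is not the PR's code: it uses a plain TcpStream rather than reqwest, and the function name read_from_stalled_server is hypothetical. A local listener accepts the connection (so the connect timeout is satisfied) but never writes, and only the read timeout stops the caller from waiting forever.

```rust
use std::io::{ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Connects to a local listener that accepts the connection but never
/// writes, then attempts one read bounded by `read_timeout`.
/// Returns the error kind produced by that read.
fn read_from_stalled_server(read_timeout: Duration) -> std::io::Result<ErrorKind> {
    // Bind to an ephemeral port. The OS completes the TCP handshake from the
    // listen backlog even though this sketch never calls accept() or writes.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // The connect timeout is satisfied right away: the connection succeeds.
    let mut stream = TcpStream::connect_timeout(&addr, Duration::from_secs(1))?;

    // Without this bound, the read below would block indefinitely.
    stream.set_read_timeout(Some(read_timeout))?;

    let mut buf = [0u8; 16];
    let kind = match stream.read(&mut buf) {
        Ok(_) => return Err(std::io::Error::new(ErrorKind::Other, "unexpected data")),
        Err(e) => e.kind(),
    };

    // Keep the listener alive until after the read so the peer does not see EOF.
    drop(listener);
    Ok(kind)
}
```

The reported kind is platform dependent (typically WouldBlock on Unix and TimedOut on Windows), so callers should accept either.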

…e#32)

fetch only set a connect_timeout and a per-chunk timeout around the byte
stream. The req.send() call (connect + waiting for response headers) had
no timeout of its own, so if put.io accepted the connection but stalled
before sending headers, send() blocked forever. The download worker
parked there, the orchestration worker blocked on its done channel, and
once the few download workers were stuck the whole process stopped
pulling (a restart only cleared it temporarily).

Wrap req.send() in a 60s timeout; a stalled request now errors and the
existing retry loop resumes it instead of hanging.
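
The timeout-then-retry behaviour described in this commit message can be sketched without any async runtime. The example below is an illustration under stated assumptions, not the code in download.rs: it swaps tokio::time::timeout for a helper thread plus mpsc::Receiver::recv_timeout, and with_timeout, retry_with_timeout, and flaky_request are hypothetical names. Unlike dropping a future, a timed-out helper thread here keeps running in the background until its operation returns.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Error returned when an operation did not finish within the limit.
#[derive(Debug, PartialEq)]
struct TimedOut(Duration);

/// Runs `op` on a helper thread and waits at most `limit` for its result.
/// In this sketch a panic inside `op` is also reported as `TimedOut`.
fn with_timeout<T, F>(limit: Duration, op: F) -> Result<T, TimedOut>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(op());
    });
    rx.recv_timeout(limit).map_err(|_| TimedOut(limit))
}

/// Calls `attempt` until it finishes within `limit` or `max_attempts` is used up.
fn retry_with_timeout<T, F>(max_attempts: usize, limit: Duration, attempt: F) -> Result<T, TimedOut>
where
    T: Send + 'static,
    F: Fn() -> T + Send + Sync + 'static,
{
    let attempt = Arc::new(attempt);
    let mut last = TimedOut(limit);
    for _ in 0..max_attempts {
        let a = Arc::clone(&attempt);
        match with_timeout(limit, move || (*a)()) {
            Ok(value) => return Ok(value),
            Err(e) => last = e, // stalled attempt: record the error and retry
        }
    }
    Err(last)
}

/// Hypothetical request: stalls on the first call, answers on later calls.
fn flaky_request(calls: &AtomicUsize) -> &'static str {
    if calls.fetch_add(1, Ordering::SeqCst) == 0 {
        thread::sleep(Duration::from_millis(500)); // simulated stall
    }
    "200 OK"
}
```

Calling retry_with_timeout(3, Duration::from_millis(100), ...) around flaky_request gives up on the stalled first attempt and succeeds on the second, which is the shape of recovery the commit relies on.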
Copilot AI review requested due to automatic review settings June 10, 2026 07:51

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a request-level timeout around reqwest’s send() to prevent stalled connections from blocking download workers indefinitely.

Changes:

  • Wrap req.send() in tokio::time::timeout to bound time waiting for response headers.
  • Add explanatory inline comment describing the stall scenario and retry behavior.

Comment thread src/download_system/download.rs Outdated
Comment thread src/download_system/download.rs Outdated
Comment thread src/download_system/download.rs

These three put.io API calls were missing the .timeout() the other
put.io calls already have. list_files and url are called from
get_download_targets for every transfer; when put.io accepted the
connection but stalled, they hung with no timeout, freezing the
orchestration worker before any download even started — the real cause
of the recurring download stall (the download send() timeout in the
previous commit covers a later point in the same path).
@bugrax bugrax changed the title Time out the download request, not just the connect (fixes #32) Add missing timeouts to put.io requests that can hang and stall all downloads (fixes #32) Jun 10, 2026
Every put.io call built a fresh reqwest::Client, so connections were
never reused. Resuming a large account generates targets for every
transfer (list_files + url each), and the resulting client/socket churn
exhausts file descriptors until requests hang with no progress and no
error. Share a single connection-pooled client via OnceLock instead.
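
The OnceLock pattern named in this commit message can be sketched as follows. This is a hedged illustration only: Client here is a hypothetical stand-in struct rather than reqwest::Client, and shared_client and CLIENTS_BUILT are made-up names used to show that construction happens exactly once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times a client was constructed (for demonstration only).
static CLIENTS_BUILT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a connection-pooled HTTP client.
struct Client {
    user_agent: &'static str,
}

impl Client {
    fn new() -> Self {
        CLIENTS_BUILT.fetch_add(1, Ordering::SeqCst);
        Client { user_agent: "example/0.1" }
    }
}

/// Returns the process-wide client, building it on first use only.
/// Every later call hands back a reference to the same instance.
fn shared_client() -> &'static Client {
    static CLIENT: OnceLock<Client> = OnceLock::new();
    CLIENT.get_or_init(Client::new)
}
```

Because every caller receives the same instance, a real pooled client in its place could reuse connections across calls instead of opening new sockets each time.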
Extract the request and stream-idle timeouts into REQUEST_TIMEOUT and
STREAM_IDLE_TIMEOUT constants, and include the duration (and target url)
in the timeout messages instead of a hard-coded host/duration.
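
A rough sketch of what named timeout constants and generic timeout messages might look like is given below. Only the constant names REQUEST_TIMEOUT and STREAM_IDLE_TIMEOUT come from this PR; the TimeoutError type, the message wording, the example URL, and the 30s stream-idle value are assumptions made for illustration.

```rust
use std::fmt;
use std::time::Duration;

/// Upper bound for sending a request and receiving response headers
/// (60s matches the send() timeout described in this PR).
const REQUEST_TIMEOUT: Duration = Duration::from_secs(60);

/// Upper bound between two chunks of the body stream.
/// The value is an assumption; the PR text does not state it.
const STREAM_IDLE_TIMEOUT: Duration = Duration::from_secs(30);

/// Hypothetical error type carrying the target URL and the elapsed limit.
#[derive(Debug)]
enum TimeoutError {
    Request { url: String, after: Duration },
    StreamIdle { url: String, after: Duration },
}

impl fmt::Display for TimeoutError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The duration and URL come from the error value, not from
            // a hard-coded host or number in the message text.
            TimeoutError::Request { url, after } => {
                write!(f, "request to {} timed out after {}s", url, after.as_secs())
            }
            TimeoutError::StreamIdle { url, after } => {
                write!(f, "no data from {} for {}s", url, after.as_secs())
            }
        }
    }
}

impl std::error::Error for TimeoutError {}
```

Deriving the message from the constant means a later change to either timeout cannot leave a stale duration in the logs.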

bugrax commented Jun 10, 2026

Contributor Author

Addressed (commit 2137007): extracted the two timeouts into REQUEST_TIMEOUT and STREAM_IDLE_TIMEOUT constants, and made the timeout messages generic — they now include the duration and the target URL instead of a hard-coded host/60s. Left the #32 reference in the comment as a short pointer; happy to drop it if you prefer the code stay issue-agnostic.

Development

Successfully merging this pull request may close these issues.

Downloads can hang forever in reqwest send() (no request timeout), freezing all download workers

2 participants