Fix worker-death stalls and pull orphaned files from a watch folder (fixes #34) #35
bugrax wants to merge 8 commits
…outerdebie#34) The Imported-message route to cleanup ran on an orchestration worker, which can be permanently busy in handle_queued while downloads are active, so orphan put.io deletes never fired. Do the delete in the watch_for_import task itself (which already deletes the local copy).
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds support for scanning configurable put.io “watch folders” to discover and process orphaned completed files (files that exist on put.io without a corresponding transfer record), and hardens worker orchestration so individual task failures don’t stall the system.
Changes:
- Introduces `OrphanFile` tracking in state and reports these orphans through the Transmission-compatible `torrent-get` endpoint so *arr apps can import them normally.
- Adds `watch_folders` config and a periodic scan that queues synthetic transfers for orphaned files.
- Refactors download orchestration and download workers to avoid cascading worker shutdowns on channel/task failures.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| src/state.rs | Adds in-memory orphan tracking (OrphanFile, orphans map) to support reporting/import lifecycle. |
| src/services/putio.rs | Extends file listing response with size (defaulted) to support torrent-get reporting. |
| src/main.rs | Adds watch_folders config to enable orphan scanning. |
| src/http/handlers.rs | Extends torrent-get results to include orphan downloads for *arr import. |
| src/download_system/transfer.rs | Adds orphan transfer construction + watch-folder scanning and queuing logic. |
| src/download_system/orchestration.rs | Refactors queued-download handling to avoid killing workers on errors; adds orphan-specific import cleanup path. |
| src/download_system/download.rs | Prevents download worker from dying when status reporting channel is gone. |
- handle_queued: skip transfers with no targets instead of vacuously marking them complete
- scan: O(1) HashSet membership for active_file_ids/orphan_seen
- synthetic hash from u64 so it stays a clean 40-hex value
- log the folder id (not the file id) on a state-save failure
- only add_orphan after the queue send succeeds
- torrent-get: keep orphan total_size/left_until_done consistent
Thanks for the thorough review — addressed the substantive points (commit 8b2fab5):
On the synthetic torrent |
- Track in-progress orphans via state.has_orphan instead of an ever-growing orphan_seen set; a failed orphan is dropped from it (orchestration) so a later scan retries it
- Route orphans with an explicit base dir (get_download_targets_in) instead of persisting per-orphan TransferState that never got cleaned
- Skip non-media / negative-id watch-folder entries up front
- torrent-get: checked u64::try_from for the orphan id
- looks_like_episode: case-insensitive 'season' check without allocating
- handle_queued reuses precomputed t.targets when present so watch-folder orphans download to their intended category dir (get_download_targets_in) instead of being mis-routed by a stateless get_download_targets()
- torrent-get uses the already-validated u64 id for the struct field too
- throttle watch-folder scans to a 60s interval instead of every poll
- log (not swallow) delete_file failures in the already-imported path
- debug_assert the non-negative file_id invariant in from_orphan
…erval
- torrent-get: report a non-zero remaining amount for an incomplete orphan even when put.io omitted its size, so 0/0 isn't read as complete
- get_download_targets[_in]: return an error instead of panicking when a transfer has no file_id (shared generate_targets helper)
- make the watch-folder scan interval configurable via watch_folder_interval_secs (default 60), documented in the template
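To illustrate the "non-zero remaining amount" point from this commit, here is a minimal sketch of keeping `total_size` and `left_until_done` consistent. The `OrphanProgress` struct and `report_sizes` function are hypothetical names for illustration, not the PR's actual code; the only behavior taken from the commit message is that an incomplete orphan with an unknown size must not report 0/0.

```rust
/// Hypothetical view of an orphan as reported through torrent-get.
/// Field names are assumptions made for this sketch.
#[derive(Debug, Clone, Copy)]
pub struct OrphanProgress {
    /// Size from put.io, or None when the API omitted it.
    pub size: Option<u64>,
    /// Bytes already written locally.
    pub downloaded: u64,
    /// Whether the download has finished.
    pub complete: bool,
}

/// Returns (total_size, left_until_done), keeping the two consistent.
pub fn report_sizes(o: OrphanProgress) -> (u64, u64) {
    if o.complete {
        // A finished orphan has nothing left; the total is never below
        // what was actually downloaded.
        let total = o.size.unwrap_or(o.downloaded).max(o.downloaded);
        return (total, 0);
    }
    // An unfinished orphan always has at least one byte left, so a
    // missing size cannot be read as "0 of 0, complete".
    let total = o.size.unwrap_or(0).max(o.downloaded.saturating_add(1));
    (total, total - o.downloaded)
}
```

With an omitted size and nothing downloaded, this reports one byte total and one byte remaining, which a client cannot mistake for a finished transfer.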
- add request timeouts to putio::list_files and putio::url so a hung connection during a watch-folder scan can't stall transfer production
- clamp watch_folder_interval_secs to >= 1s so a misconfigured 0 doesn't become a zero Duration that scans every poll
- derive OrphanFile.hash from the transfer's hash instead of formatting file_id a second time (single source of truth)
- clone only the Sender, not the unused Receiver, per done channel
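The interval clamp and scan throttling mentioned in these commits could look roughly like the sketch below. The function names `scan_interval` and `scan_due` are hypothetical; only the `watch_folder_interval_secs` option, its default of 60, and the clamp to at least 1 second come from the commit messages.

```rust
use std::time::{Duration, Instant};

/// Default taken from the commit message (60 seconds).
pub const DEFAULT_WATCH_FOLDER_INTERVAL_SECS: u64 = 60;

/// Turns the optional `watch_folder_interval_secs` setting into a Duration,
/// clamping to at least 1 second so a configured 0 cannot become a zero
/// Duration that scans on every poll.
pub fn scan_interval(configured_secs: Option<u64>) -> Duration {
    let secs = configured_secs
        .unwrap_or(DEFAULT_WATCH_FOLDER_INTERVAL_SECS)
        .max(1);
    Duration::from_secs(secs)
}

/// Returns true when a watch-folder scan should run: either no scan has
/// happened yet, or at least `interval` has elapsed since the last one.
pub fn scan_due(last_scan: Option<Instant>, now: Instant, interval: Duration) -> bool {
    match last_scan {
        None => true,
        Some(last) => now.duration_since(last) >= interval,
    }
}
```

A poll loop would call `scan_due` on each iteration and record `Instant::now()` after a scan, so the poll frequency and the scan frequency stay independent.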
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
src/services/putio.rs:1
`size` is modeled as `i64`, but file sizes are naturally non-negative and are typically represented as `u64` to avoid sentinel negatives and overflow edge cases. Since downstream code now has to clamp negatives (`max(0)`), consider switching `size` (and corresponding `OrphanFile.size`) to `u64` and performing a checked conversion when parsing API responses if needed.
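The checked conversion suggested in this comment might be as small as the following sketch. `parse_size` is a hypothetical helper name, and returning a `String` error is an assumption made to keep the example dependency-free; the project itself uses `anyhow`.

```rust
/// Converts a raw size from an API response into `u64`.
/// A negative value is reported as an error instead of being clamped.
pub fn parse_size(raw: i64) -> Result<u64, String> {
    u64::try_from(raw).map_err(|_| format!("negative file size from API: {raw}"))
}
```

Doing the conversion once at the parsing boundary means downstream code no longer needs `max(0)` guards.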
use anyhow::{bail, Result};
Fixes #34. Two related parts.
Part 1 — workers no longer die (and silently stall the whole process)
`work()` in both the orchestration and download loops ended on the first `?` that errored, and the spawned task's `Result` is dropped, so a dying worker logged nothing. The death also cascaded: one failing `get_download_targets` (e.g. a put.io error, common when a transfer is removed mid-flight) ends its orchestration worker → its `done` channels drop → download workers feeding it die on `send()` → orchestration workers waiting on them die on `recv()` → … until every worker is gone and the process sits idle (~0.1% CPU, ~7 fds, no error logged). Files would be fully downloaded on disk yet `download done` never fired (the download worker finished, then died at the reporting step). A restart "fixed" it until the next error.
- The `QueuedForDownload` body moved into `handle_queued()`; the worker loop logs its error and continues instead of propagating.
- A failed `done`-status `send()` is logged and ignored rather than `?`-propagated.
This (together with the request timeouts in #32/#33, which turn a hung put.io call into an error this change then survives) is what keeps the pipeline up under load. Verified live: a previously-stalling backlog now runs to the NAS write ceiling and stays there.
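The log-and-continue pattern described above can be sketched with standard-library channels. This is not the PR's code: the real workers are presumably async tasks, and `Message`, the `u64` payload, and the "odd ids fail" rule are assumptions used only to make the example self-contained.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical message type; the real project has its own.
#[derive(Debug)]
pub enum Message {
    QueuedForDownload(u64),
}

/// Stand-in for the per-message body that the PR moves into `handle_queued`.
/// Odd ids fail here, to simulate a put.io error.
fn handle_queued(id: u64, done: &Sender<u64>) -> Result<(), String> {
    if id % 2 == 1 {
        return Err(format!("transfer {id}: simulated put.io error"));
    }
    // A failed status send is logged and ignored, not propagated with `?`.
    if let Err(e) = done.send(id) {
        eprintln!("transfer {id}: could not report done: {e}");
    }
    Ok(())
}

/// Worker loop: a failing message is logged and the loop keeps going.
/// Returns how many messages failed.
pub fn work(rx: Receiver<Message>, done: Sender<u64>) -> usize {
    let mut failures = 0;
    // `recv` only errors once every sender is dropped, i.e. normal shutdown.
    while let Ok(msg) = rx.recv() {
        let Message::QueuedForDownload(id) = msg;
        if let Err(e) = handle_queued(id, &done) {
            eprintln!("worker: {e}; continuing");
            failures += 1;
        }
    }
    failures
}
```

Because the error is handled inside the loop, the worker's channel ends stay alive, so one bad transfer no longer starts the chain of dropped channels described above.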
Part 2 — pull orphaned files from a watch folder
putioarr only discovers work from `transfers/list`. If a transfer is removed before it's pulled, most commonly with put.io's "clear completed transfers" enabled (which deletes the transfer record while leaving the file in the folder), the file can never be pulled and is stranded. I had ~500 GB sitting in the download folder with no transfers.
New optional `watch_folders` config (list of put.io folder ids). Each poll, any video file in those folders with no active transfer and not already imported is pulled like a normal download and reported through `torrent-get` (as a synthetic transfer) so it imports normally.
Empty by default, so it's a no-op unless configured.
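The selection step of the scan can be sketched as follows. Everything here is illustrative: `FolderEntry`, `is_video`, the extension list, and the exact hash format are assumptions. The parts taken from this PR are the `HashSet` membership checks, skipping non-media and negative-id entries, and deriving a 40-character hex hash from the `u64` file id.

```rust
use std::collections::HashSet;

/// Hypothetical shape of one put.io folder entry (only the fields used here).
#[derive(Debug, Clone, PartialEq)]
pub struct FolderEntry {
    pub id: i64,
    pub name: String,
}

/// Assumed extension check; the real project may decide "video" differently.
fn is_video(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    [".mkv", ".mp4", ".avi"].iter().any(|ext| lower.ends_with(ext))
}

/// Builds a zero-padded 40-character lowercase hex string from a file id,
/// so the synthetic transfer has a hash shaped like a torrent info-hash.
pub fn synthetic_hash(file_id: u64) -> String {
    format!("{file_id:040x}")
}

/// Returns (file id, synthetic hash) for every video entry that has no
/// active transfer and has not been imported yet.
pub fn find_orphans(
    entries: &[FolderEntry],
    active: &HashSet<i64>,
    imported: &HashSet<i64>,
) -> Vec<(u64, String)> {
    entries
        .iter()
        .filter(|e| is_video(&e.name))
        .filter(|e| !active.contains(&e.id) && !imported.contains(&e.id))
        // Negative ids fail the conversion and are skipped up front.
        .filter_map(|e| u64::try_from(e.id).ok())
        .map(|id| (id, synthetic_hash(id)))
        .collect()
}
```

Each returned pair would then be queued as a synthetic transfer; an empty `watch_folders` list produces no entries, so the scan stays a no-op.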
Testing
Live, against a real Sonarr/Radarr setup with ~56 orphaned files (~500 GB): they were scanned, queued, pulled with correct tv/movie routing, imported by the *arr (no "No files found eligible"), the local copies removed, and the files deleted from put.io. End-to-end at the NAS write ceiling, no stalls.