Skip to content

Harden upstream checksum fetching and validation#1235

Open
ideaship wants to merge 5 commits into
mainfrom
harden-checksum-fetching
Open

Harden upstream checksum fetching and validation#1235
ideaship wants to merge 5 commits into
mainfrom
harden-checksum-fetching

Conversation

@ideaship

@ideaship ideaship commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Follow-up to the review of the checksum_url feature:

These robustness gaps predate that PR — it merely duplicated existing patterns onto the new checksum_url path — so they are fixed here uniformly across both checksum paths instead of being requested as changes on the contribution.

Changes

Harden both checksum fetch sites. get_checksum_from_checksums_url and get_checksum_from_checksum_url called requests.get() with no timeout, status check, or exception handling, outside any try/except on the process_image path. A connection failure or invalid URL aborted the entire run, a hung server blocked it indefinitely, and an HTTP error page was parsed as a checksums file, producing a misleading "could not find checksum" message. Both sites now use a bounded timeout, call raise_for_status(), and catch requests.RequestException by logging and returning an empty checksum, so a failing checksum URL only skips that image.

Skip cleanup of outdated images when processing errors occur. An image that fails to process (e.g. its upstream checksum could not be determined) is missing from the managed images set, so manage_outdated_images() would treat its healthy cloud images as removal candidates and could hide, deactivate, or delete them when the corresponding flags are enabled. A checksum lookup failure now sets exit_with_error (consistent with the other skip paths in process_image), and main() no longer runs the cleanup when that flag is set — cleanup resumes on the next error-free run. Note the behavior change: runs with processing errors now defer cleanup, and a checksum lookup failure now causes a non-zero exit status.

Validate hexadecimal content in is_checksum(). It previously accepted any 32/40/64/128-character string without a period (e.g. "x" * 64). Since this value is stored as the upstream_checksum property and drives latest-image change detection, fixed-length junk could suppress a needed update or force a spurious re-import.

Restrict checksum URL schemes to HTTP(S). The schema accepted ftp:///ftps:// for checksum_url, checksums_url, and latest_checksum_url, but requests.get() has no FTP adapter and raises InvalidSchema, so a schema-valid configuration could abort the manager. No existing image definition uses FTP URLs.

Add unit tests for is_checksum, both fetch functions (bare digest, manifest matching, junk body, HTTP error, connection failure, timeout usage), the schema URL scheme restriction, and the error-handling paths (checksum lookup failure sets the error flag; cleanup is skipped on errors).

Document the field contract: checksums_url expects a manifest containing the image filename; checksum_url expects a single bare digest (e.g. Alpine sidecar files).

🤖 Generated with Claude Code

@ideaship ideaship marked this pull request as draft June 11, 2026 05:37
ideaship added 5 commits June 11, 2026 07:44
The two checksum fetch paths used for latest-image change detection
(get_checksum_from_checksums_url and get_checksum_from_checksum_url)
called requests.get() without a timeout, status check, or exception
handling. Since these calls run outside any try/except on the
process_image path, a connection failure, DNS error, or invalid URL
aborted the entire run, and an unresponsive server could block it
indefinitely. An HTTP error response was silently parsed as a
checksums file, producing a misleading "could not find checksum"
message instead of a clear HTTP error.

Both fetch sites now use a bounded timeout, call raise_for_status(),
and catch requests.RequestException by logging an error and returning
an empty checksum, so a single unreachable checksum URL only skips
that image instead of aborting the run.

Add unit tests for both functions covering the bare-digest and
manifest-matching success paths, a junk response body, an HTTP error
status, a connection failure, and the timeout argument.

These weaknesses predate the checksum_url feature (#1202); that
change merely duplicated the existing pattern onto the new path.

Assisted-by: Claude:claude-fable-5
Signed-off-by: Roger Luethi <luethi@osism.tech>
is_checksum() accepted any 32/40/64/128-character string without a
period as a valid digest, e.g. "x" * 64 passed as a SHA-256 checksum.
Because the extracted value is stored as the upstream_checksum image
property and drives latest-image change detection, a fixed-length
junk response from a checksum URL could suppress a needed update or
force a spurious re-import.

Validate the digest with re.fullmatch against hexadecimal content of
the four supported lengths (MD5, SHA-1, SHA-256, SHA-512). The
previous "." exclusion is subsumed by the hex match.

This logic was extracted, not introduced, by #1202; the same gap
existed in the old inline get_checksum code on the checksums_url
path, so this hardens both paths at once.

Assisted-by: Claude:claude-fable-5
Signed-off-by: Roger Luethi <luethi@osism.tech>
The schema accepted ftp:// and ftps:// URLs for checksum_url,
checksums_url, and latest_checksum_url, but requests.get() has no FTP
adapter and raises InvalidSchema for such URLs. A schema-valid
configuration could therefore abort the manager at runtime.

Restrict the regex on these three fields to HTTP(S), so the schema
only accepts URLs the implementation can actually fetch. No image
definition in this repository uses FTP URLs in these fields. The
latest_url, url, mirror_url, and source fields are not fetched
through this code path and are left unchanged.

Add a schema unit test asserting that the three fields accept http
and https URLs and reject ftp and ftps URLs.

Assisted-by: Claude:claude-fable-5
Signed-off-by: Roger Luethi <luethi@osism.tech>
The distinction between checksums_url and checksum_url was not
documented anywhere: checksums_url expects a checksums file that
contains the image filename (e.g. the SHA256SUMS manifests published
by Ubuntu), while checksum_url, added in #1202, expects a file
containing a single bare digest and nothing else (e.g. the .sha512
sidecar files published by Alpine). The schema regex can only assert
that the value is a URL, not which content format the file must
have, so without documentation users are likely to pick the wrong
field and have their latest images silently skipped.

Spell out the contract in the README and in the docstrings of the
two checksum extraction functions.

Assisted-by: Claude:claude-fable-5
Signed-off-by: Roger Luethi <luethi@osism.tech>
When process_image() fails to determine the upstream checksum of a
latest image, it returns early without adding the image's name to
the managed images set. main() then ran manage_outdated_images()
regardless, which treats every cloud image missing from that set as
a removal candidate. A transient failure fetching a checksum file
could therefore get a perfectly healthy image hidden, deactivated,
or deleted when the corresponding cleanup flags are enabled. The
hardening of the checksum fetching widened this window: connection
errors and timeouts that previously crashed the run before cleanup
could start now continue into cleanup.

The same gap existed for the other skip paths in process_image()
(local file missing, download URL check failure): they set
exit_with_error, but main() only evaluated the flag after the
cleanup had already run.

Fix this in two places. The checksum lookup failure now sets
exit_with_error, consistent with the other skip paths. main() no
longer calls manage_outdated_images() when exit_with_error is set,
because the managed images set is incomplete at that point; cleanup
resumes on the next error-free run.

Note this changes behavior for runs with errors: cleanup is deferred
instead of operating on an incomplete managed images set, and a
checksum lookup failure now causes a non-zero exit status.

Assisted-by: Claude:claude-fable-5
Signed-off-by: Roger Luethi <luethi@osism.tech>
@ideaship ideaship force-pushed the harden-checksum-fetching branch from d5d7e73 to 70764ad Compare June 11, 2026 06:37
@ideaship ideaship marked this pull request as ready for review June 12, 2026 07:54
@ideaship ideaship requested a review from berendt June 12, 2026 07:54
@ideaship ideaship moved this from Ready to In review in Human Board Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

2 participants