Implement WHATWG URL spec change for Windows drive letter paths (whatwg/url#874) #1121
renezander030 wants to merge 1 commit into
Per whatwg/url#874 (still open at time of writing), the URL parser's scheme start state recognizes `<ASCII alpha> : \` (a Windows drive letter pattern) as a Windows drive path rather than a single-letter URL scheme. The parser sets scheme to "file", host to empty string, and transitions to path state, producing a URL of the form `file:///<drive>:/path`.

Forward-slash drive paths (`c:/foo`) are intentionally not covered: the `<alpha> : /` shape is ambiguous with single-letter scheme URLs (e.g. `c:` scheme, `a://example.net`) and rewriting them would regress legitimate scheme URLs.

Includes 8 unit tests covering basic conversion, drive-letter case preservation, mixed separators, percent-encoded path components, query/fragment, base-URL override, and regression guards for single-letter scheme URLs that must NOT be rewritten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
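
For readers who want the scope rule in concrete form, here is a minimal sketch of a backslash-only check. The function name `looks_like_backslash_drive_path` is hypothetical and the body is an illustration of the rule described above, not the code from this PR or from rust-url.

```rust
/// Hypothetical stand-in for the detection step described in the commit
/// message; this is NOT the rust-url implementation.
///
/// Returns true only when the input starts with `<ASCII alpha> : \`.
/// The forward-slash shape (`c:/foo`) is deliberately rejected because it
/// is indistinguishable from a single-letter scheme URL.
fn looks_like_backslash_drive_path(input: &str) -> bool {
    let mut chars = input.chars();
    matches!(
        (chars.next(), chars.next(), chars.next()),
        (Some(letter), Some(':'), Some('\\')) if letter.is_ascii_alphabetic()
    )
}
```

Under this sketch, `C:\path\file.txt` matches, while `c:/foo`, `a://example.net`, and `h://.` do not, so those inputs would continue through the ordinary scheme parsing.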

@renezander030 Did you use generative AI to produce the code in this pull request or the PR description?

Yes. I used Claude (an LLM) to help draft both the code and the PR description, working from my earlier Deno-side PR (denoland/deno#33097) that @crowlKats redirected here. The substance is mine: the conservative scope (backslash-only, not forward-slash drive paths) and the regression guards for
Happy to comply with whatever policy Servo / rust-url has on AI-assisted contributions. If you'd prefer I rewrite the PR description by hand, or if there's a contributor attestation you'd like me to add, please let me know.

Here's Servo's AI use policy: https://book.servo.org/contributing/getting-started.html#ai-contributions It's fine to use LLMs to help understand an issue, find bugs, or to plan / architect a change, but all code, issue text, and PR text must be your own (apart from translation and transcription).

I decline. Closing this PR now.

Summary
Implements whatwg/url#874: in the URL parser's scheme start state, `<ASCII alpha> : \` is recognized as a Windows drive letter path and parsed into a `file:///<drive>:/...` URL.

Conservative scope: only the backslash shape triggers the conversion. Forward-slash drive paths (`c:/foo`) are intentionally not rewritten because `<alpha>:/` is ambiguous with single-letter scheme URLs (`c:` scheme, `a://example.net`, `h://.`, `w://x:0`). Tests guard against regressing those.

Status: draft, pending whatwg/url#874 merge
The spec PR is still open (last activity 2025-11-28). Opening this as a draft so the implementation can be reviewed in parallel and land quickly once the spec merges. The spec PR cites multi-implementer interest (Ladybird, jsdom/Node, Chromium, Gecko, WebKit, Deno).
Context: a Deno-side port (denoland/deno#33097) was redirected upstream to rust-url by @crowlKats — "we would rather have this land in rust-url even if we have to wait a bit." This PR is that upstream landing.
Implementation
`url/src/parser.rs`:
- `starts_with_windows_drive_letter_path` peeks 3 chars for `<alpha>:\` (placed alongside the existing `starts_with_windows_drive_letter_segment` family).
- `parse_windows_drive_letter_path` runs the spec's path state directly: pushes `file://`, sets empty host, dispatches to `parse_path` with the original input.
- `parse_path` handles `\` → `/` normalization for special schemes, producing `/C:/path/file.txt`.
- `parse_url` checks the new helper before `parse_scheme`, short-circuiting only when the pattern matches.

The Windows-drive shortcut deliberately drops any base URL: the spec's scheme start state ignores base when scheme is set, and `<alpha>:\` is unambiguously absolute.

Tests (`url/tests/unit.rs`)
8 new unit tests:
- `windows_drive_path_basic`: `C:\path\file.txt` → `file:///C:/path/file.txt`
- `windows_drive_path_different_drives`: `D:`, `Z:`
- `windows_drive_path_preserves_drive_case`: lowercase + uppercase preserved
- `windows_drive_path_mixed_separators`: `\` and `/` interchangeable in body
- `windows_drive_path_percent_encodes_spaces`: path encoding still applies
- `windows_drive_path_drops_base`: base URL ignored when shortcut fires
- `windows_drive_path_with_query_and_fragment`: `?q=1#frag` flows correctly
- `windows_drive_path_does_not_rewrite_scheme_urls`: regression guard for `c:/foo`, `a://example.net`, `h://.`, `w://x:0`

Full suite (`cargo test -p url`) and WPT (`cargo test --test url_wpt -p url`) pass with no regressions.

Test plan
- `urltestdata.json`)
- `cargo fmt --check` clean
- `cargo clippy --all-targets` clean
- `urltestdata.json` once web-platform-tests/wpt#53459 lands and the spec PR merges
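
To make the behaviour listed under Implementation and Tests easier to picture, the following is a simplified, standalone model of the conversion. The function name `drive_path_to_file_url` is hypothetical, and the body is an assumption-laden sketch: it only normalizes backslashes and percent-encodes spaces, whereas the real path state in the parser handles the full percent-encode set, dot segments, and query/fragment splitting.

```rust
/// Hypothetical, simplified model of the drive-path shortcut; this is NOT
/// the code in `url/src/parser.rs`.
///
/// Returns `None` when the input does not start with `<ASCII alpha> : \`,
/// which is where a real parser would fall through to scheme parsing.
fn drive_path_to_file_url(input: &str) -> Option<String> {
    let mut head = input.chars();
    let is_drive_path = matches!(
        (head.next(), head.next(), head.next()),
        (Some(letter), Some(':'), Some('\\')) if letter.is_ascii_alphabetic()
    );
    if !is_drive_path {
        return None;
    }

    // Scheme "file", empty host, then a path that begins with "/".
    let mut url = String::from("file:///");
    for c in input.chars() {
        match c {
            // Backslashes act as path separators for special schemes.
            '\\' => url.push('/'),
            // Assumption: only spaces are encoded in this sketch.
            ' ' => url.push_str("%20"),
            // The drive letter keeps its original case.
            other => url.push(other),
        }
    }
    Some(url)
}
```

With this model, `C:\path\file.txt` yields `file:///C:/path/file.txt`, a lowercase drive letter stays lowercase, and `c:/foo` returns `None`, mirroring the regression guard for single-letter scheme URLs.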