diff --git a/url.bs b/url.bs
index c69d2349..b2834297 100644
--- a/url.bs
+++ b/url.bs
@@ -111,7 +111,7 @@ valid input. User agents, especially conformance checkers, are encouraged to rep
 [[UTS46]]
If details about Unicode ToASCII errors are recorded, user agents are encouraged to pass those along. -
"https://exa%23mple.org"
Unicode ToUnicode records an error. [[UTS46]] -
The same considerations as with domain-to-ASCII apply. -
Let result be the result of running Unicode ToASCII
- with domain_name set to domain, CheckHyphens set to beStrict,
- CheckBidi set to true, CheckJoiners set to true, UseSTD3ASCIIRules set to
- beStrict, Transitional_Processing set to false, VerifyDnsLength set to
- beStrict, and IgnoreInvalidPunycode set to false. [[!UTS46]]
-
-
If beStrict is false, domain is an ASCII string, and
- strictly splitting domain on U+002E (.) does not produce any
- item that starts with an ASCII case-insensitive match for
- "xn--", this step is equivalent to ASCII lowercasing domain.
-
-
If result is a failure value, domain-to-ASCII validation error, - return failure. +
If beStrict is true: + +
Let result be the result of running
+ Unicode ToASCII with domain_name set to domain,
+ CheckHyphens set to true, CheckBidi set to true, CheckJoiners set to true,
+ UseSTD3ASCIIRules set to true, Transitional_Processing set to false,
+ VerifyDnsLength set to true, and IgnoreInvalidPunycode set to false. [[!UTS46]]
+
+
If result is a failure value, domain-to-ASCII validation error, + return failure. + +
Return result. +
Let result be null.
If beStrict is false: +
If domain is an ASCII string:
If result is the empty string, domain-to-ASCII validation error, - return failure. +
If running Unicode ToASCII with domain_name set to
+ domain, CheckHyphens set to false, CheckBidi set to true,
+ CheckJoiners set to true, UseSTD3ASCIIRules set to false,
+ Transitional_Processing set to false, VerifyDnsLength set to false, and
+ IgnoreInvalidPunycode set to false is a failure value, domain-to-ASCII
+ validation error. [[!UTS46]]
+
+
Set result to domain, lowercased. +
If result contains a forbidden domain code point, - domain-invalid-code-point validation error, return failure. +
When beStrict is false and domain is an ASCII string,
+ Unicode ToASCII failures only result in validation errors
+ (instead of failing the whole algorithm) due to web compatibility. IgnoreInvalidPunycode
+ is not sufficient on its own, as Punycode can decode successfully yet still fail validity
+ criteria. E.g., xn--8i7caa decodes to ｗｗｗ, whose code points have
+ status "mapped". [[UTS46]]
+
+
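The note's example label can be checked with a short script. This is only an illustrative sketch using Python's standard-library `punycode` codec, which performs the raw Punycode decoding step and none of the UTS46 validity checks. It shows that `xn--8i7caa` decodes cleanly to three U+FF57 FULLWIDTH LATIN SMALL LETTER W code points. NFKC normalization is used here merely as an approximation of the UTS46 mapping step, to show why such code points are not "valid" as-is.

```python
import unicodedata

label = "xn--8i7caa"

# Strip the ACE prefix and decode the remainder with the stdlib Punycode codec.
# This is plain Punycode decoding; it does not apply any UTS46 validity criteria.
decoded = label[len("xn--"):].encode("ascii").decode("punycode")

# Inspect what the label actually contains.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
names = {unicodedata.name(ch) for ch in decoded}

# Approximation only: NFKC folds the fullwidth letters to ASCII, which is
# similar in spirit to (but not the same thing as) the UTS46 mapping table.
folded = unicodedata.normalize("NFKC", decoded)

print(code_points, names, folded)
```

Because decoding succeeds, IgnoreInvalidPunycode never comes into play for this label. The failure arises later, when the decoded code points are checked against the validity criteria.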
Otherwise: -
Due to web compatibility and compatibility with non-DNS-based systems the - forbidden domain code points are a subset of those disallowed when - UseSTD3ASCIIRules is true. See also - issue #397. +
Set result to the result of running
+ Unicode ToASCII with domain_name set to domain,
+ CheckHyphens set to false, CheckBidi set to true, CheckJoiners set to true,
+ UseSTD3ASCIIRules set to false, Transitional_Processing set to false,
+ VerifyDnsLength set to false, and IgnoreInvalidPunycode set to false. [[!UTS46]]
+
+
If result is a failure value, domain-to-ASCII validation error, + return failure.
If result is the empty string, domain-to-ASCII validation error, + return failure. +
Assert: result is not the empty string and does not contain a - forbidden domain code point. +
If result contains a forbidden domain code point, + domain-invalid-code-point validation error, return failure. -
Unicode IDNA Compatibility Processing guarantees this holds when - beStrict is true. [[UTS46]] +
Due to web compatibility and compatibility with non-DNS-based systems the + forbidden domain code points are a subset of those disallowed when UseSTD3ASCIIRules + is true. See also issue #397.
Return result.
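As a rough illustration of the new control flow, the following sketch covers only the branch where beStrict is false and domain is an ASCII string. The function name `domain_to_ascii_lenient_ascii` and the use of `None` for failure are assumptions made for this example. The Unicode ToASCII call is deliberately omitted: in this branch it can only produce a validation error, not change the returned value, and Python's standard library has no UTS46 implementation. The forbidden domain code point set follows the URL Standard's definition (forbidden host code points, C0 controls, U+0025, and U+007F).

```python
from typing import Optional

# Forbidden host code points, then the additions that make up the
# forbidden domain code points (C0 controls, U+0025 (%), U+007F DELETE).
FORBIDDEN_HOST = set("\x00\t\n\r #/:<>?@[\\]^|")
FORBIDDEN_DOMAIN = FORBIDDEN_HOST | {chr(c) for c in range(0x20)} | {"%", "\x7f"}


def domain_to_ascii_lenient_ascii(domain: str) -> Optional[str]:
    """Hypothetical helper: beStrict is false and domain is ASCII.

    Returns the resulting ASCII domain, or None to signal failure.
    """
    if not domain.isascii():
        # The "Otherwise" branch needs a real Unicode ToASCII; not sketched here.
        raise ValueError("non-ASCII input is outside the scope of this sketch")

    # Omitted: running Unicode ToASCII purely to report a validation error.
    # For ASCII input, str.lower() is equivalent to ASCII lowercasing.
    result = domain.lower()

    if result == "":
        return None  # domain-to-ASCII validation error
    if any(ch in FORBIDDEN_DOMAIN for ch in result):
        return None  # domain-invalid-code-point validation error
    return result
```

Under these assumptions, `domain_to_ascii_lenient_ascii("EXAMPLE.org")` yields `"example.org"`, `"xn--8i7caa"` is returned unchanged, and `"exa#mple.org"` (what the host parser would see after percent-decoding `exa%23mple.org`) is rejected.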
Signify domain-to-Unicode validation errors for any returned errors, and then, - return result. +
If an error was recorded, then return domain. + +
Because domain can only result from the host parser, any recorded
+ errors will already have been signified as validation errors. Returning domain
+ ensures domain to ASCII and domain to Unicode roundtrip on input such as
+ xn--8i7caa.
+
+
Return result.
@@ -4167,6 +4193,7 @@ Ian Hickson,
 Ilya Grigorik,
 Italo A. Casas,
 Jakub Gieryluk,
+James C. Wise,
 James Graham,
 James Manger,
 James Ross,