Expose a URLHost class to JavaScript#288

Open: annevk wants to merge 2 commits into main from annevk/urlhost

annevk commented Mar 31, 2017

Member
  • At least two implementers are interested (and none opposed):
  • Tests are written and can be reviewed and commented upon at:
  • Implementation bugs are filed:
    • Chromium: …
    • Gecko: …
    • WebKit: …
    • Deno: …
    • Node.js: …
  • MDN issue is filed: …

(See WHATWG Working Mode: Changes for more details.)



annevk mentioned this pull request Mar 31, 2017

annevk commented Mar 31, 2017

Member Author

@tabatkins it seems a little weird that just because toJSON is the same as the stringification behavior, it needs to be annotated as a stringifier rather than a method, is that really how this should work?

Comment thread url.bs Outdated
Comment thread url.bs Outdated
@tabatkins

Contributor

> it seems a little weird that just because toJSON is the same as the stringification behavior, it needs to be annotated as a stringifier rather than a method, is that really how this should work?

Just depends on how you want it to link. To get stringifier to link, define "stringification behavior". To get toJSON() to link, define the toJSON() method. Or do both.

The "stringification behavior" thing is mostly to handle anonymous stringifiers. I've thought about also just making it an implicit toString() method if no explicit one exists. (Even if you say stringifier toJSON(), you still get a toString() defined by it as well.)

annevk commented Apr 1, 2017

Member Author

@tabatkins I defined both, but toJSON() doesn't link and I end up with duplicate IDs somehow.

a2sheppy commented May 28, 2019

@annevk - Is this something that is expected to land? I am working on updating documentation around stuff defined in this spec and it would be nice to be clear on what the state of things is. :)

annevk commented May 28, 2019

Member Author

I don't expect this to land in this form. If something moves here I'll add the documentation team to make sure you all are informed.

annevk commented Apr 26, 2020

Member Author

@tabatkins so the weird thing is that in "stringifier attribute USVString href;" href is seen as an attribute, but in "stringifier USVString toJSON();" toJSON() is seen as a stringifier instead of a method. Why is that?

@tabatkins

Contributor

Hmm, I catch stringifier specially, so likely I just didn't catch it in the attribute form. I'll look into it.

annevk commented May 4, 2020

Member Author

@tabatkins wouldn't going that way break existing specifications? I guess that's another way to go though...

Base automatically changed from master to main January 15, 2021 07:41
annevk added the addition/proposal (New features or enhancements) and needs implementer interest (Moving the issue forward requires implementers to express interest) labels Oct 19, 2021
jasnell commented Mar 28, 2022

Collaborator

@annevk ... at this point, what is the likelihood of this moving forward?

annevk commented Apr 1, 2022

Member Author

This still seems like something the web platform should offer, but I'd rather wait until browsers have more aligned URL parsers and IDNA handling before making another push to expose this API.

annevk commented Feb 17, 2023

Member Author

@valenting @ricea is there interest from Gecko and Chromium in this API addition? Now that we're close with IDNA this seems like a nice improvement. Note that this intentionally does not expose ToUnicode. Doing that responsibly requires a separate effort.

ricea commented Feb 17, 2023

I don't feel like I could confidently write an explainer for this.

@valenting

Collaborator

I don't know if there's enough benefit to add it in its current form. If I'm reading it correctly then what it's bringing is an easy way to check whether a host is IPv4/v6/a domain. Is there much need for that?

annevk commented Feb 17, 2023

Member Author

It also gives an easy way to parse a host. Which can be useful if your chosen scheme always gives you an opaque host (or IPv6). And more ergonomic than something like new URL("https://" + host); which might also do the wrong thing for certain inputs.
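
As a rough illustration of the "wrong thing for certain inputs" point, the sketch below shows how the string-concatenation workaround silently accepts inputs that are more than a host. The helper name naiveHost is made up for this example; it only relies on the existing URL constructor.

```javascript
// Hypothetical helper: the concatenation workaround mentioned above.
function naiveHost(input) {
  return new URL("https://" + input).hostname;
}

// Works for a plain host.
console.log(naiveHost("EXAMPLE.com")); // "example.com"

// But anything after the host is silently dropped...
console.log(naiveHost("example.com/some/path")); // "example.com"
console.log(naiveHost("example.com:8080")); // "example.com"

// ...and userinfo changes which part is treated as the host.
console.log(naiveHost("example.com@example.org")); // "example.org"
```

A dedicated host parser could reject these inputs instead of quietly reinterpreting them.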

ricea commented Feb 17, 2023

It seems you can get much the same functionality by abusing URL.prototype.host:

const c = new URL('https://example.com')
c.host = '😀'
c.host    // 'xn--e28h'
c.host = 564
c.host    // '0.0.2.52'

Not that I'd call that a good API, but if the functionality is only needed by a small minority of developers, it might be good enough?
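
One wrinkle with the setter approach is that invalid values are ignored rather than reported. A minimal sketch of wrapping the trick so that failures become visible might look like this; canonicalizeHost and the sentinel host name are assumptions of the example, not part of any API.

```javascript
// Hypothetical helper built on the hostname setter. The sentinel host is an
// assumption of this sketch; an input that canonicalizes to the sentinel
// itself would be misreported as invalid.
const SENTINEL = "sentinel.invalid";

function canonicalizeHost(input) {
  const url = new URL("https://" + SENTINEL);
  url.hostname = String(input); // invalid values are ignored, no exception
  return url.hostname === SENTINEL ? null : url.hostname;
}

console.log(canonicalizeHost("EXAMPLE.com")); // "example.com"
console.log(canonicalizeHost(564)); // "0.0.2.52"
console.log(canonicalizeHost("a b")); // null (a space is not allowed in a host)
```

The need for a sentinel comparison is itself a hint that the setter was not designed for this use.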

annevk commented Feb 17, 2023

Member Author

Eww. Maybe? There definitely seems to be merit to this if we expose more of IDNA or the PSL: https://www.npmjs.com/search?q=domain%20parser and https://www.npmjs.com/search?q=idna. (And given the number of downloads of the packages there I'm not sure if it's a small minority that cares about hosts.)

@valenting

Collaborator

If it were to (safely) include ToUnicode or other new IDNA functionality it would be easy to say we should add it. But right now it seems to be more like syntactic sugar. I'm not strongly against it, but I don't think there's a strong case for it right now.

annevk commented Mar 6, 2023

Member Author

To address an earlier question, there is demand for checking whether a string is an IP address: #696. And judging from https://www.npmjs.com/package/ipaddr.js this is very popular. Given how many strings can be turned into IP addresses offering an authoritative answer to that question would be good I think.
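
To make "how many strings can be turned into IP addresses" concrete, here is a small sketch that classifies a host using only the existing URL constructor. The function name hostKind is hypothetical, and the sketch assumes the input is a bare host without port, path, or userinfo.

```javascript
// Hypothetical classifier; assumes `input` is a bare host.
function hostKind(input) {
  let hostname;
  try {
    hostname = new URL("http://" + input + "/").hostname;
  } catch {
    return "invalid";
  }
  if (hostname.startsWith("[")) return "ipv6";
  // The parser serializes every IPv4 address as four decimal numbers.
  if (/^\d+\.\d+\.\d+\.\d+$/.test(hostname)) return "ipv4";
  return "domain";
}

// Hex, shorthand, and single-number forms all end up as IPv4 addresses.
for (const input of ["127.0.0.1", "0x7f.1", "2130706433", "[::1]", "example.com"]) {
  console.log(input, "->", hostKind(input));
}
```

A string-based check such as a dotted-quad regular expression applied to the raw input would miss the second and third cases, which is where an authoritative answer from the URL parser helps.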

anonrig commented Jun 26, 2023

Contributor

We are definitely interested in implementing this in Ada & Node.js.

karwa commented Sep 12, 2023

Contributor

FYI I have implemented this in my Swift library: documentation.

For low-level networking applications, you'll find that they often pass the hostname through inet_pton to decide whether they have an IP address or should use getaddrinfo (or equivalent) to look up the name. That's not really ideal - the URL parser already knows this information, so we can just tell them directly what kind of host they're looking at. At the same time, we don't need to traffic in string values -- we can give them values using a rich IPv4Address type. That means things like filtering become a lot easier:

if case .ipv4Address(let address) = url.host,
   case (10, 0, 0, _) = address.octets {
  // URL has host "10.0.0.???"
}

It's quite nice. I'm very happy with it. Possibly less useful on the web, but I could imagine NodeJS could make use of something like this.

Another facet to this API that is quite useful is the ability to parse an opaque hostname in the context of a known URL scheme. In the documentation, I give the example of processing ssh: URLs -- the standard says their hostnames are opaque, but in reality, applications will want to process them as if they were part of an http: URL (so they get IDNA and IPv4 detection).

// 🚩 "http:" URLs use a special Unicode -> ASCII conversion
//    (called "IDNA"), designed for compatibility with existing
//    internet infrastructure.

let httpURL = WebURL("http://alice@أهلا.com/data")!
httpURL       // "http://alice@xn--igbi0gl.com/data"
              //               ^^^^^^^^^^^
httpURL.host  // ✅ .domain(Domain { "xn--igbi0gl.com" })

// 🚩 "ssh:" URLs have opaque hostnames, so Unicode characters
//    are just percent-encoded. The URL Standard doesn't even know
//    this is a network address, so we don't get any automatic processing.

let sshURL = WebURL("ssh://alice@أهلا.com/data")!
sshURL       // "ssh://alice@%D8%A3%D9%87%D9%84%D8%A7.com/data"
             //              ^^^^^^^^^^^^^^^^^^^^^^^^
sshURL.host  // 😐 .opaque("%D8%A3%D9%87%D9%84%D8%A7.com")

// 🚩 Using the WebURL.Host initializer, we can interpret our
//    SSH hostname as if it were in an HTTP URL.

let sshAsHttp = WebURL.Host(sshURL.hostname!, scheme: "http")
// ✅ .domain(Domain { "xn--igbi0gl.com" })

IPv4 support:

let url = WebURL("ssh://user@192.168.15.21/data")!
url       // "ssh://user@192.168.15.21/data"
url.host  // 😐 .opaque("192.168.15.21")
          //     ^^^^^^

WebURL.Host(url.hostname!, scheme: "http")
// ✅ .ipv4Address(IPv4Address { 192.168.15.21 })

Scenarios like that may be more broadly useful on the web.
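
For comparison, a JavaScript approximation of the same idea using today's URL class could look like the sketch below. The helper hostAsHttp is hypothetical and relies on string concatenation, so it is only safe for hostnames that already came out of a parsed URL.

```javascript
// Hypothetical helper: reinterpret an opaque hostname as if it were an http: host.
// Assumes `hostname` was taken from an already parsed URL.
function hostAsHttp(hostname) {
  return new URL("http://" + hostname + "/").hostname;
}

// ssh: is not a special scheme, so its host stays opaque.
const sshIPv4 = new URL("ssh://user@0xc0.168.15.21/data");
console.log(sshIPv4.hostname); // "0xc0.168.15.21"
console.log(hostAsHttp(sshIPv4.hostname)); // "192.168.15.21"

// The Arabic label from the example above, written with escapes.
const sshIDN = new URL("ssh://alice@\u0623\u0647\u0644\u0627.com/data");
console.log(sshIDN.hostname); // "%D8%A3%D9%87%D9%84%D8%A7.com"
console.log(hostAsHttp(sshIDN.hostname)); // "xn--igbi0gl.com"
```

The second call works because the http: host parser percent-decodes its input before applying IDNA.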

jasnell commented Aug 8, 2025

Collaborator

@annevk ... any updates on this and whether it might ever advance?

annevk commented Aug 15, 2025

Member Author

@jasnell maybe. @mikewest is exploring some ideas around exposing origin or site, and some of the ideas around public suffix and registrable domains overlap. Not sure about IP addresses vs domains though. https://github.com/mikewest/origin-api Not quite sure what to make of that yet, especially with "partition" (for lack of a better word) becoming the security boundary for state in browsers.

The other thing that concerns me a little bit is that I've learned that WebRTC wants something like this but for host + port. And so I've wondered if not addressing port is problematic.

jasnell commented Aug 15, 2025

Collaborator

Port certainly is less important for workers but I can see use for it in node use cases. I'm certainly not opposed to supporting it.

@mikewest

Member

> @jasnell maybe. @mikewest is exploring some ideas around exposing origin or site, and some of the ideas around public suffix and registrable domains overlap. Not sure about IP addresses vs domains though. https://github.com/mikewest/origin-api

Regardless of where the Origin discussion goes:

  1. It seems reasonable to me to create some kind of exposure for questions around IP addresses. I don't think that distinction is particularly relevant in the context of the Origin proposal (which intentionally hides details like scheme, host, and port behind high-level comparison methods), but could fit well into something like the URLHost proposal here.

  2. It also seems pretty reasonable to expose a user agent's understanding of the PSL somewhere in the context of URL. I could imagine doing that a few ways, via isSameSite()/isSchemelesslySameSite() methods, registrableDomain getters, etc.

> Not quite sure what to make of that yet, especially with "partition" (for lack of a better word) becoming the security boundary for state in browsers.

Perhaps we should try to piece together what that thing might look like? As I mentioned in WebKit/standards-positions#538 (comment), my feeling is that it would be something that held an Origin along with other bits and pieces, but I could imagine other representations.

> The other thing that concerns me a little bit is that I've learned that WebRTC wants something like this but for host + port. And so I've wondered if not addressing port is problematic.

Do you have a pointer to that discussion? I'd like to understand the use case.

annevk commented Sep 30, 2025

Member Author

@mikewest

Member

> See step 4 of https://w3c.github.io/webrtc-pc/#dfn-validate-an-ice-server-url. (Some discussion in https://bugs.webkit.org/show_bug.cgi?id=164508 and w3c/webrtc-pc#2660.)

Thanks. I agree that something cleaner than step 4 of https://w3c.github.io/webrtc-pc/#dfn-validate-an-ice-server-url would be nice to have, though I wonder how much of a one-off it would end up being.

For STUN and TURN specifically, the prevalence of those particular protocols might justify baking a scheme-specific parsing rule into URL more directly, such that new URL("stun:server.goes.here:1234") results in a URL object with a host and port.
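
For context, this is what the URL class does with such a string today: because stun: is not a special scheme and the input has no "//", everything after the scheme ends up in the path.

```javascript
const stun = new URL("stun:server.goes.here:1234");

console.log(stun.protocol); // "stun:"
console.log(stun.hostname); // ""
console.log(stun.port);     // ""
console.log(stun.pathname); // "server.goes.here:1234"
```

Callers therefore have to pull the host and port out of pathname themselves.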

annevk commented Sep 30, 2025

Member Author

I think we have to be very careful about playing favorites with schemes. The fact that we have special schemes to begin with is rather unfortunate. I'd rather figure out building blocks that work well for everyone.

@mikewest

Member

> I think we have to be very careful about playing favorites with schemes. The fact that we have special schemes to begin with is rather unfortunate. I'd rather figure out building blocks that work well for everyone.

That's a reasonable desire. I feel like the STUN and TURN schemes are defensible to add to the platform more centrally given how widely relied-upon they are, but I'm also in favor of providing the primitives that let them define behavior without inexplicable string concatenation.

Do you think both a URLHost and URLHostAndPort would be helpful? Or would you like to provide some spelling of the latter whose port could simply be ignored if irrelevant to the use case?

annevk commented Sep 30, 2025

Member Author

I don't have a great handle on it. It's not as easy as splitting the string on : due to IPv6, which makes me think that it would probably be desirable to have this operation available. Some other schemes that would benefit from having something here include irc and nntp. I have seen more come by, but I don't remember them.

If you don't want to allow ports you could check for that to be null. So maybe URLHostAndPort is indeed the way. It would be nice to avoid having two objects for mostly the same operation.
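
To show why splitting on ":" is not enough, here is a minimal sketch of a host-and-port splitter that respects IPv6 brackets and reports a missing port as null. The name splitHostAndPort and the returned shape are assumptions for illustration, not the proposed URLHostAndPort API, and the host itself is neither validated nor canonicalized.

```javascript
// Hypothetical helper, not the proposed API: splits "host[:port]" without
// validating or canonicalizing the host itself.
function splitHostAndPort(input) {
  let host = input;
  let portText = null;
  if (input.startsWith("[")) {
    // Bracketed IPv6: the port separator can only follow the closing bracket.
    const end = input.indexOf("]");
    if (end === -1) return null;
    host = input.slice(0, end + 1);
    const rest = input.slice(end + 1);
    if (rest !== "") {
      if (!rest.startsWith(":")) return null;
      portText = rest.slice(1);
    }
  } else {
    const colon = input.lastIndexOf(":");
    if (colon !== -1) {
      host = input.slice(0, colon);
      portText = input.slice(colon + 1);
    }
    // An unbracketed host must not contain further colons.
    if (host.includes(":")) return null;
  }
  if (host === "") return null;
  let port = null;
  if (portText !== null) {
    if (!/^\d{1,5}$/.test(portText)) return null;
    port = Number(portText);
    if (port > 65535) return null;
  }
  return { host, port };
}

// A naive split breaks the IPv6 address apart.
console.log("[2001:db8::1]:3478".split(":").length); // 5

console.log(splitHostAndPort("[2001:db8::1]:3478")); // { host: "[2001:db8::1]", port: 3478 }
console.log(splitHostAndPort("example.com")); // { host: "example.com", port: null }
```

A caller that does not want to allow ports can reject any result whose port is not null.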

@mikewest

Member

> If you don't want to allow ports you could check for that to be null. So maybe URLHostAndPort is indeed the way. It would be nice to avoid having two objects for mostly the same operation.

That's also how I'd evaluate it, FWIW.


Labels

addition/proposal (New features or enhancements)
needs implementer interest (Moving the issue forward requires implementers to express interest)
topic: api


10 participants