
Parts of a URL


This guide describes the parts of a URL as they are named in the browser standard and in other specifications.

URL vocabulary is split across three sources, each authoritative in its domain: the WHATWG URL Standard, RFC 3986, and the Public Suffix List.

The interactive explorer below shows the parts of a URL and how each standard names them. Type or paste in a URL to see how it is parsed:

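URL anatomy

Legend: the explorer marks RFC 3986 and Public Suffix List terms; unmarked terms are WHATWG.

WHATWG terms: protocol, username, password, hostname, port, host, pathname, search, hash, origin
RFC 3986 terms: scheme, userinfo, host, authority, path, query, fragment
Domain labels (Public Suffix List): subdomain, tld, public suffix, registrable domain

Example URL: https://user:pass@www.example.co.uk:8080/path/to/resource?query=string#section

protocol: https:
username: user
password: pass
hostname: www.example.co.uk
port: 8080
host: www.example.co.uk:8080
pathname: /path/to/resource
search: ?query=string
hash: #section
origin: https://www.example.co.uk:8080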

The parts of a URL

Components below use WHATWG names. The Rosetta table maps each to its RFC 3986 equivalent.

Rosetta table

| Concept | WHATWG | RFC 3986 | Note |
| --- | --- | --- | --- |
| Scheme | protocol | scheme | WHATWG includes the trailing :. |
| Bare host name / IP | hostname | host | |
| Hostname plus port | host | (none) | WHATWG host includes the port; RFC host does not. |
| Port | port | port | |
| Credentials | username + password | userinfo | |
| Path | pathname | path | |
| Query string | search | query | WHATWG includes the leading ?. |
| Fragment | hash | fragment | WHATWG includes the leading #. |
| Origin | origin | (none) | scheme + :// + host + port, excluding userinfo. Null for file:// and opaque-path schemes. |
| Subdomain | (none) | (none) | From the Public Suffix List, not either URL spec. The labels left of the registrable domain. |
| Public suffix | (none) | (none) | From the PSL. Longest suffix under which a registrar will let anyone register. |
| Registrable domain | (none) | (none) | From the PSL. Public suffix plus one more label. The unit that cookie scoping and same-site policies care about. |
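The naming differences in the Rosetta table can be sketched in code. The example below is a minimal illustration, not a WHATWG parser: it uses Python's `urllib.parse.urlsplit`, which exposes RFC-style names, and renames the fields to their WHATWG equivalents. The function name `whatwg_view` and the two-entry `DEFAULT_PORTS` table are assumptions made for this sketch, and IPv6 hosts are not handled.

```python
from urllib.parse import urlsplit

# Assumption: only http and https default ports are covered in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def whatwg_view(url: str) -> dict[str, str]:
    """Rename urlsplit()'s RFC-style fields to the WHATWG names in the table."""
    parts = urlsplit(url)
    hostname = parts.hostname or ""
    port = parts.port
    if port == DEFAULT_PORTS.get(parts.scheme):
        port = None  # WHATWG output omits the scheme's default port
    host = hostname if port is None else f"{hostname}:{port}"
    return {
        "protocol": parts.scheme + ":",        # trailing ":" included
        "username": parts.username or "",
        "password": parts.password or "",
        "hostname": hostname,
        "port": "" if port is None else str(port),
        "host": host,                          # hostname plus port
        "pathname": parts.path,
        "search": "?" + parts.query if parts.query else "",      # leading "?"
        "hash": "#" + parts.fragment if parts.fragment else "",  # leading "#"
        "origin": f"{parts.scheme}://{host}",  # userinfo excluded
    }
```

Calling `whatwg_view` on the example URL from the explorer should reproduce the component values listed there, which makes it a quick way to check the table against a concrete input.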

URI, URL, URN

The Public Suffix List

Cookies and same-site checks compare hostnames by registrable domain (CORS, by contrast, compares full origins). For kai.github.io, the registrable domain is kai.github.io and the public suffix is github.io. github.io isn’t a TLD, but it acts like one for registration, so a plain TLD list misses it. Browsers consult the Public Suffix List instead: a community-maintained list of every public suffix, including com, co.uk, github.io, and vercel.app. The PSL introduces three new terms: subdomain, public suffix, and registrable domain.
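To make the longest-suffix idea concrete, here is a minimal sketch that splits a hostname into subdomain, registrable domain, and public suffix. `SAMPLE_SUFFIXES` is a tiny hand-picked stand-in for the real list, and `split_hostname` is a hypothetical helper; the real PSL also has wildcard and exception rules that this sketch ignores.

```python
# Assumption: a tiny stand-in for the real Public Suffix List.
SAMPLE_SUFFIXES = {"com", "uk", "co.uk", "io", "github.io", "app", "vercel.app"}


def split_hostname(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return (subdomain, registrable_domain, public_suffix) for a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first; the first match is the public suffix.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                # The hostname is itself a public suffix: nothing is registrable.
                return "", None, candidate
            registrable = ".".join(labels[i - 1:])  # suffix plus one more label
            subdomain = ".".join(labels[: i - 1])   # everything to the left
            return subdomain, registrable, candidate
    return "", None, None  # no known suffix in the sample set
```

With this sample set, kai.github.io comes back as its own registrable domain under the public suffix github.io, while www.example.co.uk splits into subdomain www, registrable domain example.co.uk, and public suffix co.uk.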

Edge cases

A few URL shapes break the usual scheme://host/path mental model.

Parsing oddities

URLs whose scheme is not followed by a // authority, such as mailto:, urn:, data:, and javascript:, have an opaque path: everything after the scheme is treated as a single string.
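A quick way to see the opaque-path shape is to split such URLs and look at what lands where. The sketch below uses Python's `urllib.parse.urlsplit` (an RFC-style splitter, not a WHATWG parser); the helper name `describe` and the sample addresses are illustrative assumptions.

```python
from urllib.parse import urlsplit


def describe(url: str) -> tuple[str, str, str]:
    """Return (scheme, authority, path) as split by urlsplit()."""
    parts = urlsplit(url)
    # netloc is urlsplit()'s name for the authority; it is only filled in
    # when the scheme is followed by "//".
    return parts.scheme, parts.netloc, parts.path


for url in ("mailto:user@example.com", "urn:isbn:0451450523", "https://example.com/a/b"):
    scheme, authority, path = describe(url)
    print(f"{url}: scheme={scheme!r} authority={authority!r} path={path!r}")
```

For the mailto: and urn: inputs the authority is empty and the rest of the string ends up in the path, whereas the https: input yields a separate authority and a hierarchical path.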

Spec disagreements

IPv6 zone identifiers. Given https://[fe80::1%25eth0]/:

Browsers follow WHATWG, so zone IDs in URLs don’t work in practice.

Unencoded @ in userinfo. Given https://a@b@example.com:

Browsers follow WHATWG. To stay compatible with both specs, percent-encode @ as %40 inside userinfo.
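The percent-encoding advice can be applied when building a URL from parts. This is a minimal sketch using Python's `urllib.parse.quote`; the helper name `with_userinfo` and the fixed https scheme are assumptions for illustration.

```python
from urllib.parse import quote, urlsplit


def with_userinfo(username: str, host: str) -> str:
    """Build an https URL whose username is safely percent-encoded."""
    # safe="" tells quote() to escape every reserved character,
    # so "@" becomes "%40" and cannot be mistaken for the userinfo delimiter.
    return f"https://{quote(username, safe='')}@{host}/"


url = with_userinfo("a@b", "example.com")
print(url)                      # https://a%40b@example.com/
print(urlsplit(url).hostname)   # example.com
```

Because only one unencoded @ remains, the boundary between userinfo and host is unambiguous regardless of which spec a parser follows.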

References
