Format and Parse Amazon S3 URLs
Find the right Amazon S3 URL
Published Mar 11, 2024
Amazon S3 URLs come in different flavors. There are those starting with s3:, http:, or https:. Then, there are the ones with s3.amazonaws.com, s3.us-east-1.amazonaws.com, or even s3-us-west-2.amazonaws.com (note the dash instead of the dot between s3 and the region code). And where do you put the bucket: is it <bucket>.s3.us-east-1.amazonaws.com/<key> or s3.us-east-1.amazonaws.com/<bucket>/<key>? And when it comes to static website hosting, of course, there is also <bucket>.s3-website-us-east-1.amazonaws.com and <bucket>.s3-website.us-east-2.amazonaws.com (again, note the dash and the dot).
There are even more when you include the dual-stack, FIPS, access point, and S3 control endpoints. Here's the full list of Amazon S3 endpoints. But for this post, I will focus on the more common URLs that I mentioned before.
The global URL has the simplest format with the following structure: s3://<bucket>/<key>. This URL is also displayed by the AWS Management Console.
The difference between path-style and virtual-hosted-style URLs is how the bucket name is included in the URL. Path-style URLs have the bucket name in the pathname of the URL, for example https://s3.<region>.amazonaws.com/<bucket>/<key>.
On the other hand, virtual-hosted-style URLs have the bucket name in the hostname of the URL, for example https://<bucket>.s3.<region>.amazonaws.com/<key>.
Having the bucket name in the hostname has the advantage that DNS can route different buckets to different IP addresses. If the bucket name is in the path, all requests go to the same endpoint hostname, regardless of the bucket. That is the reason path-style URLs are deprecated. Support for this style was supposed to end in 2020, but AWS changed its plan and continues to support path-style URLs for buckets created on or before September 30, 2020. There's an interesting blog post about the background: Amazon S3 Path Deprecation Plan – The Rest of the Story
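To make the difference between the two styles concrete, here is a minimal sketch that builds both URL variants from the same bucket, key, and region. The function name buildS3Url and its options object are hypothetical and only serve as illustration; they are not part of any AWS SDK.

```javascript
// Hypothetical helper for illustration; not part of any AWS SDK.
function buildS3Url({ bucket, key, region, style = "virtual-hosted" }) {
  // Encode each key segment, but keep "/" as the separator.
  const encodedKey = key.split("/").map(encodeURIComponent).join("/");
  const host = `s3.${region}.amazonaws.com`;

  if (style === "path") {
    // Path-style: the bucket is the first segment of the pathname.
    return `https://${host}/${bucket}/${encodedKey}`;
  }
  if (style === "virtual-hosted") {
    // Virtual-hosted-style: the bucket is a subdomain of the endpoint.
    return `https://${bucket}.${host}/${encodedKey}`;
  }
  throw new Error(`Unknown style: ${style}`);
}

const object = { bucket: "my-bucket", key: "photos/cat.jpg", region: "us-east-1" };

console.log(buildS3Url({ ...object, style: "path" }));
// https://s3.us-east-1.amazonaws.com/my-bucket/photos/cat.jpg
console.log(buildS3Url(object));
// https://my-bucket.s3.us-east-1.amazonaws.com/photos/cat.jpg
```

The sketch does not validate the bucket name or region; it only shows where the bucket ends up in each style.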
Some regions, like US East (N. Virginia) us-east-1, have a legacy global endpoint that doesn't need a region code in the hostname, for example https://s3.amazonaws.com/<bucket>/<key> or https://<bucket>.s3.amazonaws.com/<key>.
If you use this type of URL for other regions that don't support it, you might either get an HTTP 307 Temporary Redirect or, in the worst case, an HTTP 400 Bad Request error, depending on when the bucket was created.
AWS recommends always using the regional endpoints with the region code in the hostname, for example https://s3.<region>.amazonaws.com/<bucket>/<key> or https://<bucket>.s3.<region>.amazonaws.com/<key>.
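As a rough illustration of the distinction between the legacy global endpoint and the regional endpoints, the following sketch inspects a hostname and reports which kind it is. The function name describeS3Host and the shape of its return value are assumptions made for this example, and the regular expression covers only the hostname shapes discussed so far.

```javascript
// Hypothetical helper; covers only "s3.amazonaws.com" and
// "s3.<region>.amazonaws.com", each with an optional "<bucket>." prefix.
function describeS3Host(hostname) {
  const match = /^(?:(.+)\.)?s3(?:\.([a-z0-9-]+))?\.amazonaws\.com$/.exec(hostname);
  if (!match) return null; // not one of the hostnames handled here

  const [, bucket, region] = match;
  return {
    endpoint: region ? "regional" : "legacy-global",
    region, // undefined for the legacy global endpoint
    bucket, // undefined when the bucket is not part of the hostname
  };
}

console.log(describeS3Host("s3.amazonaws.com").endpoint);
// legacy-global
console.log(describeS3Host("my-bucket.s3.us-west-2.amazonaws.com").region);
// us-west-2
```

A client could use a check like this to warn when a URL relies on the legacy global endpoint instead of a regional one.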
But here is a caveat as well: some regions used to have a dash (-) instead of a dot (.) between s3 and <region>, as in https://s3-<region>.amazonaws.com/<bucket>/<key>.
The US West (Oregon) region us-west-2, for instance, supports the legacy dash-style URL https://s3-us-west-2.amazonaws.com/<bucket>/<key>. Nevertheless, the standard format https://s3.us-west-2.amazonaws.com/<bucket>/<key> is also available for these outliers.
All the URL formats we have seen so far, except the global S3 URL, are called REST endpoints. They are hosted on either the s3.amazonaws.com or s3.<region>.amazonaws.com hostname, but more importantly, they support secure HTTPS connections. That means all these URLs work with https:// as the protocol.
Amazon S3 also has a website endpoint for static website hosting. The website endpoint does not support HTTPS, only HTTP. These URLs have the following formats: http://<bucket>.s3-website-<region>.amazonaws.com and http://<bucket>.s3-website.<region>.amazonaws.com.
Again, depending on the region, there is a dash (-) or a dot (.) separating s3-website and <region>. To see which one is right for your region, you have to check the list of Amazon S3 website endpoints.
Depending on how you interact with Amazon S3, you might use one of the previous URLs. For example, the AWS CLI for S3 expects the S3 URL in the global format s3://<bucket>/<key>. Other clients and SDKs probably use the regional REST endpoint with the bucket name either in the hostname or pathname.
If you're using the wrong format or endpoint, the request might fail with an error.
The right URL really depends on the individual client and how it sends requests to S3. To lift some of this burden, I created a tiny JavaScript library to check, format, and parse S3 URLs in the various formats I described earlier.
At the moment, the library exports only three functions: formatS3Url, parseS3Url, and isS3Url.
The library does only rudimentary validation of the URL structure; it doesn't validate bucket names, object keys, or regions. Also, it doesn't support the dual-stack, FIPS, access point, control, and website endpoints yet. But I'm happy to welcome any external contribution.
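To give an idea of what parsing these formats involves, here is a minimal sketch that handles the s3:// format plus path-style and virtual-hosted-style REST URLs, including the legacy global and dash-style hostnames. It is not the implementation of parseS3Url; the function name parseS3UrlSketch and the shape of the returned object are assumptions made for this example, and, like the library, it skips website and other special endpoints.

```javascript
// Hypothetical parser for illustration only; NOT the library's parseS3Url.
function parseS3UrlSketch(input) {
  // Global format: s3://<bucket>/<key>
  const s3Match = /^s3:\/\/([^/]+)\/(.+)$/.exec(input);
  if (s3Match) {
    return { style: "s3", bucket: s3Match[1], key: s3Match[2], region: undefined };
  }

  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a URL at all
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;

  // Optional "<bucket>." prefix, then "s3", then an optional region joined by
  // a dot or a legacy dash. Website endpoints (s3-website...) are excluded.
  const hostPattern = /^(?:(.+)\.)?s3(?:[.-](?!website)([a-z0-9-]+))?\.amazonaws\.com$/;
  const host = hostPattern.exec(url.hostname);
  if (!host) return null;
  const [, hostBucket, region] = host;

  const rawPath = url.pathname.slice(1); // drop the leading "/"
  if (hostBucket) {
    // Virtual-hosted-style: bucket in the hostname, the whole path is the key.
    return { style: "virtual-hosted", bucket: hostBucket, key: decodeURIComponent(rawPath), region };
  }

  // Path-style: the first path segment is the bucket, the rest is the key.
  const slash = rawPath.indexOf("/");
  if (slash <= 0) return null;
  return {
    style: "path",
    bucket: rawPath.slice(0, slash),
    key: decodeURIComponent(rawPath.slice(slash + 1)),
    region,
  };
}

console.log(parseS3UrlSketch("https://s3-us-west-2.amazonaws.com/my-bucket/photos/cat.jpg").region);
// us-west-2
console.log(parseS3UrlSketch("https://my-bucket.s3.us-east-1.amazonaws.com/photos/cat.jpg").style);
// virtual-hosted
```

For URLs on the legacy global endpoint, region stays undefined because the hostname carries no region code.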