Static Websites on AWS S3

Stanislav Saprankov · Published in AWS Tip · Feb 12, 2022

This article is aimed at developers who are unsure what goes into setting up a personal website from scratch, as well as those who like to compare their own experience with that of another person.

We will talk about the AWS services that are necessary or very useful for setting up a static website. We will also touch on CI/CD, in case your site is somewhat advanced and can make use of it.

I also dedicated a small section to calling AWS services from a browser. It is not directly related to the setup itself but rather to static websites in general: such websites have no backend by definition, so any API calls can only rely on what is available in the browser environment.

I will assume the reader recognizes the acronyms of the well-known AWS services and knows what those services are for. Otherwise, the best advice is to use the official AWS docs to quickly catch up.

Let us define a static website. Simply put, a static website is nothing but a bunch of files that get served to a user as is. Dynamic websites, on the contrary, serve pages that are generated upon a user’s request.

Even dynamically generated pages are normally cached and hence served without being regenerated each time. However, their origin is still the backend code that generates them, whereas static websites are usually uploaded to and served from some cloud file storage and are never built on request.

The use case for a static website might be as simple as a business card or more complex, like a photo gallery with animations and contact forms.

Let us define a couple of general requirements for the abstract static website on S3 that we are going to discuss here:

  1. It should be accessible via a domain name owned by us.
  2. It should only be accessible via HTTPS.
  3. Its assets should be cached close to the user (think CDN).

Those are all optional requirements, but I’d like to show all the main features that make hosting static sites on AWS so attractive. Let’s go through the setup process step by step.

Domain name and Hosted Zone

First things first — we need a domain name, as AWS won’t allow us to use its own.

You are probably familiar with Route53. It is a popular AWS service that combines multiple features, including a DNS service and routing configuration.

Here are a few things to know about domain management with Route53:

  1. A domain name can be purchased from Route53 or from another registrar.
  2. You need at least one hosted zone. A hosted zone describes ways of routing traffic to your domain. Among other things a hosted zone includes routing policies and aliases.
  3. Routing policies tell Route53 how to redirect the traffic coming to your domain. For instance, you may block your site for certain countries or redirect a certain percentage of the traffic to a different endpoint.
  4. Aliases point a DNS name to an AWS resource, like a CloudFront distribution or an API Gateway deployment.
  5. If you purchase your domain name from Route53, a hosted zone is created automatically.
  6. If your domain is purchased from another provider, you will have to create a hosted zone on your own. After that, you’ll need to set your domain’s name servers on the provider’s side to the ones associated with your hosted zone in Route53 (see the CLI sketch below).
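
For reference, one way to look up those name servers is with the AWS CLI. This is a minimal sketch, assuming the CLI is already configured; the hosted zone ID below is a placeholder.

```bash
# List hosted zones to find the ID of yours
aws route53 list-hosted-zones

# Print the name servers to configure at your external registrar
# (the zone ID below is a placeholder)
aws route53 get-hosted-zone --id Z0123456789ABCDEFGHIJ \
  --query 'DelegationSet.NameServers'
```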

I believe nothing explains the details better than AWS’ official documentation.

Infrastructure

I’ve mentioned S3 in the title of the article, haven’t I? S3 is a popular object storage service. It also provides a nice feature for hosting static websites.

However, we want to go a little more advanced, so S3 alone will not satisfy all our needs.

Besides S3, which is where our files are stored, we need a way to cache and serve them. We may need to set appropriate headers on our responses, as well as obtain a TLS certificate for our website. We can also set up logging, if only to know how many people access our website. The list is not exhaustive. We will call the services that enable all this our infrastructure.

One good thing about hosting stuff on AWS is that you almost never have to search for a 3rd party solution, since AWS has a dedicated service for pretty much any task related to website infrastructure and monitoring. In the following section we will see how to set up all of our infrastructure in one place.

CloudFormation

In order to make setting up multiple infrastructure components smooth and easily manageable, AWS offers a service named CloudFormation that embodies the concept of infrastructure as code (IaC). For those unfamiliar with CloudFormation, a short and simplified definition might sound as follows.

CloudFormation is a tool that creates AWS infrastructure on developers’ behalf, according to developers’ instructions. Instructions are written in the form of YAML or JSON files called templates.
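
To give a feel for what a template looks like, here is a minimal illustrative sketch — not the template we will actually use — that creates a single S3 bucket with static website hosting enabled:

```yaml
# Minimal illustrative template: one S3 bucket with website hosting.
AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal static website bucket (sketch)

Resources:
  WebsiteBucket:
    Type: AWS::S3::Bucket
    Properties:
      WebsiteConfiguration:
        IndexDocument: index.html
        ErrorDocument: 404.html

Outputs:
  WebsiteURL:
    Description: The S3 website endpoint of the bucket
    Value: !GetAtt WebsiteBucket.WebsiteURL
```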

It’s worth mentioning that CloudFormation is one of the fundamental services in AWS. More sophisticated and user-friendly solutions, such as SAM or the AWS CDK, are built on top of it.

Since it is such a fundamental service with innumerable use cases, by now there exists a huge variety of prepared templates to cover them completely or partially.

There is (at least) one template for a static website hosted on S3. The template itself is stored in a public S3 bucket and is accessible to everyone. It is also regularly updated. You can access it here.

The AWS documentation contains a description of the process, including diagrams.

Template for static website on S3

The template itself contains three nested templates. You can easily download and explore any of them on your own. By default, at the time of writing, those templates create the following resources:

  • S3 bucket where your website’s files will be stored. Its policy will only allow access from the CloudFront distribution*.
  • CloudFront distribution to serve your static assets.
  • Route53 records that will redirect requests to your CloudFront distribution.
  • S3 bucket for storing access logs from your root bucket.
  • A lambda function that will copy some default assets (index.html / 404.html) to the newly created bucket.
  • A TLS certificate issued by AWS Certificate Manager (ACM).
  • ResponseHeadersPolicy for CloudFront responses. In previous versions of the template, Lambda@Edge was used for that purpose, which made things a bit more complicated.

* The S3 website endpoint doesn’t support HTTPS, so the pattern here is to have external users access CloudFront via HTTPS, while CloudFront in turn connects to S3 via HTTP.

All the template files are fairly short and very well structured. It is not hard to fine-tune them to better suit your needs.

For instance, you may not need object access logging for your root bucket and can thus remove the Logs bucket and the references to it. Remember that any service you use will incur costs unless its usage stays within the quotas covered by the free tier.

Or you might want to customize the CloudFront settings, such as TTLs for different kinds of responses, response headers, the name of the index page to serve, and so on.
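
For orientation, TTLs live in the distribution’s cache behavior. The fragment below is a sketch of the relevant part of an AWS::CloudFront::Distribution resource, not a copy of the actual template; “S3Origin” is a placeholder that must match an origin defined elsewhere.

```yaml
# Fragment of a CloudFront DistributionConfig (sketch only)
DefaultCacheBehavior:
  TargetOriginId: S3Origin        # placeholder origin ID
  ViewerProtocolPolicy: redirect-to-https
  ForwardedValues:                # legacy setting; newer templates use cache policies
    QueryString: false
  MinTTL: 0
  DefaultTTL: 86400               # one day, unless response headers say otherwise
  MaxTTL: 31536000                # never cache longer than one year
```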

I recommend taking some time to familiarize yourself with the template. Even if you leave everything as is, you will be more confident about the behavior of your website, which might spare you some surprises later.

Also, if you ever find yourself in a situation where you want to change the infrastructure, the best practice is to only make changes in the template file and apply them by uploading that file to CloudFormation.
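
With the AWS CLI that can look as follows; the stack name and file name are placeholders.

```bash
# Create or update the stack from the local template file
aws cloudformation deploy \
  --template-file template.yaml \
  --stack-name my-static-site
```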

This way, a mere look at your template will explain the state of your AWS services. From my experience, it is hard to remember everything you set up a year ago, even if the infrastructure is simple.

Aye, it goes without saying that you don’t have to use the default template. In fact, if you don’t need HTTPS, you might be fine just creating a bucket and toggling the ‘Static website hosting’ setting. Don’t make it more complicated than it has to be.

Interaction with AWS in a browser

It can happen that you need to access the AWS API from client-side code, for example to list objects in another bucket. Feel free to skip this section if you don’t.

Normally you would use the official aws-sdk library to interact with AWS services. For some operations you may need to authenticate and be granted a role with specific permissions.

Performing auth against AWS services is an interesting topic of its own, and I will only cover a small part of it here.

AWS has the concept of a credential provider chain. The SDK goes through the links of the chain until it manages to pick up credentials. In the JavaScript aws-sdk package we can find the following:

/**
 * The default set of providers used by a vanilla CredentialProviderChain.
 *
 * In the browser:
 *
 * ```javascript
 * AWS.CredentialProviderChain.defaultProviders = []
 * ```
 *
 * In Node.js:
 *
 * ```javascript
 * AWS.CredentialProviderChain.defaultProviders = [
 *   function () { return new AWS.EnvironmentCredentials('AWS'); },
 *   function () { return new AWS.EnvironmentCredentials('AMAZON'); },
 *   function () { return new AWS.SharedIniFileCredentials(); },
 *   function () { return new AWS.ECSCredentials(); },
 *   function () { return new AWS.ProcessCredentials(); },
 *   function () { return new AWS.TokenFileWebIdentityCredentials(); },
 *   function () { return new AWS.EC2MetadataCredentials(); }
 * ]
 * ```
 */

Had we been doing backend development with Node.js, the default chain would look in every possible place, starting with environment variables and ending with the EC2 instance metadata service, provided we are running our code on EC2.

However, accessing AWS from the browser is a bit trickier, since we have neither environment variables nor access to the file system. The default chain is empty in that case.

One of the ways we can access AWS directly from the browser is by using Cognito Identity Pools.

An identity pool allows for user federation across multiple authentication mechanisms. It can also provide a role to an unauthenticated user. All you need in order to assume such a role from a web browser is the identity pool ID.

Be sure to limit the permissions available to holders of the unauthenticated role.
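
As a rough illustration, the IAM policy attached to the unauthenticated role could allow nothing beyond listing a single bucket. The bucket name below is a placeholder.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-public-bucket"
    }
  ]
}
```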

Another thing worth pointing out is that you should only import the particular clients you need from aws-sdk, as opposed to the whole module. This is crucial if you care about the size of your bundle.

Below is an example of how we might access AWS from the browser in order to list the “folders” in a public S3 bucket. Treat it as a sketch built on the v2 aws-sdk; the region, identity pool ID, and bucket name are placeholders:
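
```javascript
// Sketch only: the region, identity pool ID and bucket name are placeholders.
// Import just the pieces we need instead of the whole aws-sdk bundle.
const AWS = require('aws-sdk/global');
const S3 = require('aws-sdk/clients/s3');

AWS.config.region = 'eu-central-1';

// Credentials for the unauthenticated role of a Cognito identity pool.
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
  IdentityPoolId: 'eu-central-1:00000000-0000-0000-0000-000000000000',
});

const s3 = new S3();

// With a '/' delimiter, top-level "folders" come back as CommonPrefixes.
s3.listObjectsV2(
  { Bucket: 'my-public-bucket', Delimiter: '/' },
  (err, data) => {
    if (err) {
      console.error(err);
      return;
    }
    const folders = (data.CommonPrefixes || []).map((p) => p.Prefix);
    console.log(folders);
  }
);
```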

Continuous Integration / Continuous Deployment

We are going to make improvements to our website every now and then. That means our codebase will change, and naturally we will want to let our users enjoy those changes as soon as possible.

Let’s pick GitHub as our code repository. It has a CI/CD feature called GitHub Actions. GitLab, by the way, has a very similar concept simply called CI.

You may think: why not use one of the many CI/CD services AWS offers? We could, but since all we do is run a couple of simple CLI commands against S3 and CloudFront, a free solution from GitHub is more than enough.

AWS services such as CodePipeline might be a better option if we had a complex deployment workflow involving manual approvals, canary deployments, very heavy tests or other complexities.

Let’s come back to GitHub. We need to create a GitHub repo and upload our assets there. For the sake of example, let’s assume our code is something that needs to be built, like a React application, and not just plain HTML with linked stylesheets and JS files.

Now we want to make sure that the tip of the main branch corresponds to what CloudFront is serving our customers. But even before that, we need to check the quality of the code we merge into the main branch.

In GitHub Actions, when new code gets pushed or merged into the main branch, we will do the following:

  1. Perform production build of the application.
  2. Run the processes necessary to ensure code quality (e.g. linting, tests).
  3. Determine whether the build brings user-impacting changes. Changes to tsconfig or test files do not require cache invalidation, nor do those files need to be uploaded to the S3 bucket hosting our site.
  4. If the build has no user-impacting changes, do nothing.
  5. If it does, upload the production files to the S3 bucket and invalidate them in the CloudFront distribution.

Accessing AWS services requires credentials. You may use GitHub secrets to securely store the secret access key and the corresponding access key ID of the IAM user that performs deployments. GitLab has a similar feature called CI/CD variables.

I highly recommend creating a new AWS user for the GitHub job, with the minimal necessary permissions. Using an admin user’s credentials for that purpose is discouraged. If you have multiple projects that require interaction with AWS during deployment, I’d also recommend a separate user for each of them.

Below is a simplified example of a GitHub Actions workflow written in YAML format. It is a sketch rather than a drop-in file: the secret names, Node version and npm scripts are assumptions about the project, and the change-detection step (items 3–5 above) is left out for brevity.
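
```yaml
name: Deploy static site

on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-node@v3
        with:
          node-version: 18

      # Build and verify code quality (assumed npm scripts).
      - run: npm ci
      - run: npm test
      - run: npm run build

      # Upload the build output and invalidate the CDN cache.
      # The secret names below are placeholders you define yourself.
      - name: Deploy to S3 and invalidate CloudFront
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_DEFAULT_REGION: eu-central-1
        run: |
          aws s3 sync ./build "s3://${{ secrets.S3_BUCKET }}" --delete
          aws cloudfront create-invalidation \
            --distribution-id "${{ secrets.CF_DISTRIBUTION_ID }}" \
            --paths "/*"
```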

Outro

We have discussed how to set up a static website on AWS using a predefined CloudFormation template and what AWS resources are involved in the lifecycle of its assets.

We’ve also seen an example of a CI/CD pipeline that allows us to quickly and safely ship code changes.

Besides that, we saw how to perform AWS API calls from a client’s browser, should we ever need to.

I hope this was helpful and wish you all the best coding!
