Amazon CloudFront: Caching Deep Dive

Published on: Sun Nov 06 2022

Introduction

Amazon CloudFront caching is not fundamentally different from other caching technologies.

However, like many services within AWS, it has a few subtle nuances that you should know in order to get it to do what you want.

Cloudfront cache options

When we talk about caching, there are two things that we are typically interested in:

  1. What is being cached

  2. The cache expiration (TTL)

Of course, there are other options but these are by far the most important.

What are our options for changing the cache expiration (TTL)?

The options available are:

  1. HTTP caching (origin server) - This is the Cache-Control value returned from the origin server

  2. Cache policy caching - The cache policy has caching configurations (max, min, default TTL)

  3. Cache behavior - The cache behavior has its own (legacy) caching configurations (max, min, default TTL), used when no cache policy is attached

  4. Response header policy - The response header policy can change the Cache-Control header sent in the viewer response

These are the few options that can affect the cache expiration (TTL).
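To make the second option more concrete, here is a minimal sketch of the TTL settings of a cache policy, written as a plain Python dictionary. The field names follow the CachePolicyConfig structure of the CloudFront API as I understand it; the policy name and the TTL values are made up for illustration, and the cache key settings are left out for brevity.

```python
# Hypothetical cache policy; the name and TTL values are illustrative only.
cache_policy_config = {
    "Name": "example-static-assets",  # assumed name, not from a real setup
    "MinTTL": 60,        # lower limit for the shared cache, in seconds
    "DefaultTTL": 3600,  # used when the origin sends no caching headers
    "MaxTTL": 86400,     # upper limit for the shared cache, in seconds
}


def ttl_limits_are_consistent(config):
    """Check that min <= default <= max, which is what a sane policy needs."""
    return config["MinTTL"] <= config["DefaultTTL"] <= config["MaxTTL"]


print(ttl_limits_are_consistent(cache_policy_config))
```

Keeping the three values in this order is a quick sanity check before changing any TTL setting.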

In my previous articles, I talked about these topics in depth.

If you would like a review of these topics, feel free to check them out here:

In this guide, we’ll start off by going over cache behaviors and how they relate to CloudFront caching.

Then, we’ll dig into the nuances of how CloudFront resolves the cache expiration (TTL) when multiple options are set.

Let’s dive right in!

Cache behaviors

A cache behavior is a set of configurations used for caching, and it is matched against requests based on a configured URL path pattern.

This is different from CloudFront’s cache policy, whose function is to define the cache key (along with the TTL limits).

You can think of the cache behavior as a container that holds the rules which are applied to a matched request.

Furthermore, these behaviors are divided into two types.

Two types of cache behaviors:

  • Default cache behavior

  • Ordered cache behavior

As far as the configuration goes, both of these are identical.

Here are some settings:

  • Path pattern
  • Origin request policy ID
  • Cache policy ID
  • Default TTL
  • Min TTL
  • Allowed methods (e.g. GET, HEAD)
  • Many more

These two are very similar, but the functions they perform are slightly different. Their names actually describe their functions pretty well.

Default Cache behavior

Example of Cloudfront’s default caching behavior

This cache behavior will be defined when a distribution is created, and it will automatically forward all requests to your origin server.

As the name suggests, it’s the default.

If no other cache behavior exists, or none of them matches the request, this is the cache behavior that is used.

So, it is critical to keep this behavior as general as possible because it will be used as a “baseline” for caching the requests that come into the CloudFront distribution.

Lastly, a distribution has exactly one default cache behavior. If you want CloudFront to serve content from all of your origins, you need at least as many cache behaviors (including the default one) as you have origins (e.g. the default plus 1 ordered cache behavior for 2 origins).

💡 Tip: When coming up with your default cache behavior, make it as generic as possible.

Think about the baseline you are aiming for when setting the caching for these requests, and tie it to the impact on the end user.

Ordered Cache Behaviors

Example of Cloudfront’s ordered caching behavior

This cache behavior is defined after the distribution is created.

It works exactly the same way as the default cache behavior.

The only difference is that you can have multiple ordered behaviors, and requests are matched against their path patterns in the order the behaviors are listed, with the first match winning.

Here are a few key things you should know about it:

Few things you need to know about the ordered cache behavior
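The relationship between ordered cache behaviors and the default one can be sketched in a few lines of Python. This is only an approximation: the behavior names and path patterns are hypothetical, and `fnmatchcase` from the standard library stands in for CloudFront’s own wildcard matching (it also accepts `[seq]` patterns, which I am not assuming CloudFront supports).

```python
from fnmatch import fnmatchcase

# Hypothetical ordered cache behaviors as (path pattern, behavior name).
# The list order is the matching order.
ORDERED_BEHAVIORS = [
    ("/images/*.jpg", "images-jpg"),
    ("/images/*", "images"),
    ("/api/*", "api"),
]

# Name of the fallback behavior, used when nothing above matches.
DEFAULT_BEHAVIOR = "default"


def match_behavior(path, ordered=ORDERED_BEHAVIORS, default=DEFAULT_BEHAVIOR):
    """Return the first behavior whose pattern matches, else the default."""
    for pattern, name in ordered:
        if fnmatchcase(path, pattern):  # case-sensitive wildcard match
            return name
    return default


for request_path in ("/images/cat.jpg", "/images/cat.png", "/index.html"):
    print(request_path, "->", match_behavior(request_path))
```

Note how `/images/cat.jpg` would also match `/images/*`, but the more specific pattern wins only because it is listed first.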

Caching TTL configurations

Now that we know more about the cache behaviors and the places where the cache expiration can be set, there is still a lingering question...

How do all these different caching configurations mix together in practice?

More specifically, which cache expiration (TTL) will be used if multiple options are set?

For the most part, it is quite straightforward.

CloudFront passes any HTTP Cache-Control headers from the origin through to the client (browser) untouched, so the client cache follows the origin’s instructions.

On the other hand, when it comes to the shared cache on CloudFront, it tries to make the best decision based on the different configurations.

There are a few gotchas. Let’s go through them.

Minimum TTL and Cache-Control headers

CloudFront treats HTTP cache directives (or instructions) as first-class citizens, meaning these rules take precedence in most cases.

This is assuming that the Cache-Control: max-age value remains within the minimum and maximum caching time defined in CloudFront.

If these limits are not defined, then Cache-Control: max-age is used.

Cloudfront cache control example

Otherwise, CloudFront falls back to the nearest limit.

The caching selection falls into three choices:

  1. If it is within the limit range, use the Cache-Control value

  2. If it is out of range (high), use the maximum caching TTL defined by CloudFront caching

  3. If it is out of range (low), use the minimum caching TTL defined by CloudFront caching
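The three choices above amount to clamping the origin’s max-age between the two limits. Here is a minimal sketch of that rule, assuming all values are in seconds; the function name and the sample limits are my own and not part of any AWS API.

```python
def shared_cache_ttl(max_age, min_ttl, max_ttl):
    """Clamp the origin's Cache-Control max-age to the configured limits."""
    if max_age < min_ttl:
        return min_ttl  # out of range (low): the minimum TTL wins
    if max_age > max_ttl:
        return max_ttl  # out of range (high): the maximum TTL wins
    return max_age      # within range: the origin's value is used


# Illustrative limits: minimum 60 seconds, maximum 1 hour.
for origin_max_age in (10, 300, 86400):
    print(origin_max_age, "->", shared_cache_ttl(origin_max_age, 60, 3600))
```

With these sample limits, only the middle value (300) is used as sent by the origin; the other two are replaced by the nearest limit.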

Illustration of Cache-Control limits with cloudfront

It makes a lot of sense if you think about it like this.

If you go by the AWS documentation, it does take a few reads to really understand the essence of how this works.

❗️Important:

When serving sensitive (Cache-Control: no-store) or personalized (Cache-Control: private) content, be sure to check the minimum TTL setting within CloudFront.

This is important because if the minimum TTL is greater than 0, CloudFront may unintentionally keep this information in the shared cache for that duration.
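The warning above can be turned into a small check to run against your own responses. This is a sketch under the assumption described in this section (a minimum TTL above 0 takes effect even when the origin asks not to store the response); the helper names are hypothetical.

```python
# Directives that signal "do not keep this in a shared cache".
RESTRICTIVE_DIRECTIVES = {"no-store", "private"}


def parse_directive_names(cache_control):
    """Return the lower-cased directive names of a Cache-Control value."""
    names = set()
    for part in cache_control.split(","):
        name = part.strip().split("=", 1)[0].lower()
        if name:
            names.add(name)
    return names


def may_end_up_in_shared_cache(cache_control, min_ttl):
    """True if a restricted response could still be cached due to min TTL."""
    restricted = bool(parse_directive_names(cache_control) & RESTRICTIVE_DIRECTIVES)
    return restricted and min_ttl > 0


print(may_end_up_in_shared_cache("private, max-age=0", 60))
print(may_end_up_in_shared_cache("no-store", 0))
```

If the check returns True for a path that serves personal data, the safest fix is usually a separate cache behavior with a minimum TTL of 0 for that path.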

To learn more about these HTTP cache controls, please see:

Response header policy

Another setting that can change the cache expiration sent to the client is the response header policy.

How does the response header policy interact with Cache-Control?

With a response header policy, you can set an explicit option to “override” the value.

If the override is set to true, the policy’s value replaces the value from the origin server. If it is set to false, the origin’s value is kept, and the policy’s value is only used when the origin did not send the header.

So, whether it has an effect on Cache-Control depends on whether the override is set to true.

This setting only affects the browser (client) caching and not the shared cache (CloudFront).
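The override rule can be sketched as a small function that decides which Cache-Control value the viewer receives. The function and parameter names are hypothetical, and `None` stands for “the origin did not send the header”.

```python
def viewer_cache_control(origin_value, policy_value, override):
    """Pick the Cache-Control value that is sent to the viewer (browser)."""
    if origin_value is None:
        return policy_value  # nothing from the origin: the policy adds it
    if override:
        return policy_value  # override is true: the policy replaces it
    return origin_value      # override is false: the origin's value stays


print(viewer_cache_control("max-age=60", "max-age=3600", override=True))
print(viewer_cache_control("max-age=60", "max-age=3600", override=False))
```

Either way, this only changes what the browser sees; the TTL in the shared cache is still decided by the rules from the previous section.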

Conclusion

So, there you have it, a brief overview of how cache expiration works in Cloudfront.

In general, cloudfront will make the best decision when negotiating between competing caching options.

These options are:

  • HTTP Caching (origin server)

  • Cache policy caching

  • Cache behavior

  • Response header policy

When working with CloudFront, make sure to evaluate whether or not adding a minimum and maximum TTL is necessary.

These two options can change the way caching typically works because the TTL set through Cache-Control will be bounded by these limits.

That’s all for this guide.

If you want to dive deeper, be sure to check out my other articles related to this topic:

Finally, if you found this helpful or learned something new, please do share this article with a friend or co-worker 🙏❤️ (Thanks!)



Jerry Chang 2023. All rights reserved.