Amazon CloudFront: Cache policy (tutorial)

Published on: Sun Nov 20 2022

Introduction

One of the key benefits of using Amazon CloudFront is caching the resources served by your origin server.

Caching within CloudFront is configured using cache policies.

Much like origin request policies, cache policies are attached to cache behaviors.

In this tutorial, we will go through the steps to set this up using Terraform!

For this tutorial, we will be building on our code from Amazon Cloudfront: Origin request policy (tutorial).

Let’s jump right into it!

Before you start

Before you start going through the tutorial, make sure you are using the repository from the previous tutorial.

The repository can be found here - amazon-cloudfront-origin-request-policy.

It will be the base that we build on!

A Review

Here is a big picture overview of some theory and background before we go into the nitty gritty details.

Cache policy

By default, CloudFront uses the following as the cache key:

  • Domain (www.example.com)
  • URL (/index.html)

A cache policy lets you add the following to the cache key:

  • Headers
  • Cookies
  • Query strings
  • Compression support (based on the Accept-Encoding header)

So, if you want to adjust the caching beyond the defaults, that’s when you need a cache policy.
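
To make the idea of a cache key more concrete, here is a minimal TypeScript sketch of how a key could be composed from the domain, the URL path and an allow-list of query strings. The buildCacheKey helper and the key format are assumptions made purely for illustration; this is not how CloudFront represents keys internally.

```typescript
// Hypothetical helper (not a CloudFront API): composes a cache key from the
// domain, the path and only the query strings named in the allow-list.
function buildCacheKey(requestUrl: string, allowedQueryStrings: string[]): string {
  const url = new URL(requestUrl);
  const kept: string[] = [];
  // Walk the allow-list in sorted order so this sketch always produces
  // the same key for the same set of values.
  for (const name of allowedQueryStrings.slice().sort()) {
    const value = url.searchParams.get(name);
    if (value !== null) {
      kept.push(`${name}=${value}`);
    }
  }
  const query = kept.length > 0 ? `?${kept.join("&")}` : "";
  return `${url.hostname}${url.pathname}${query}`;
}

// Default key: domain and path only, so both languages share one entry.
console.log(buildCacheKey("https://www.example.com/index.html?language=en", []));
console.log(buildCacheKey("https://www.example.com/index.html?language=fr", []));

// With "language" in the key, each language gets its own entry,
// while an unrelated parameter such as "utm" is ignored.
console.log(buildCacheKey("https://www.example.com/index.html?language=en&utm=x", ["language"]));
console.log(buildCacheKey("https://www.example.com/index.html?language=fr", ["language"]));
```

The first two calls print the same key, while the last two print different keys. That difference is what a cache policy controls.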

The Architecture

Illustration of the sandbox architecture with CloudFront and Lambda

The steps

  1. The Viewer (client) request comes in

  2. CloudFront forwards the request to the origin server (Lambda URL + Lambda)

  3. The logs from the request are added to CloudWatch Logs

  4. The Lambda function returns a response

  5. The Viewer (client) receives the response

CloudFront cache policy

In our example, let’s set up our CloudFront distribution to cache based on a query string named language.

This is a common scenario as some websites may provide localized versions of the resources depending on the language.

1. Add the cache policy resource

Terraform has a dedicated resource for this: aws_cloudfront_cache_policy.

We’re going to set the default_ttl (expiration) to 300 seconds (5 minutes); this is just for testing purposes.

Add the following changes:

// infra/main.tf

resource "aws_cloudfront_cache_policy" "custom" {
  name        = "test-cache-policy"
  comment     = "test cache policy"
  min_ttl     = 0
  default_ttl = 300
  max_ttl     = 604800
}
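
To see how the three TTL values work together, here is a rough TypeScript sketch. The effectiveTtl helper is hypothetical, and it is a simplification: it only considers a max-age value sent by the origin and ignores directives such as no-cache, no-store and private.

```typescript
// Values mirror the Terraform resource above (all in seconds).
interface TtlSettings {
  minTtl: number;
  defaultTtl: number;
  maxTtl: number;
}

// Hypothetical helper: estimates how long an object would stay cached.
// `originMaxAge` is the max-age from the origin's Cache-Control header,
// or undefined when the origin sends no caching headers.
function effectiveTtl(settings: TtlSettings, originMaxAge?: number): number {
  if (originMaxAge === undefined) {
    // No caching header: fall back to the default TTL (never below the minimum).
    return Math.max(settings.minTtl, settings.defaultTtl);
  }
  // Otherwise clamp the origin's value between the minimum and maximum TTL.
  return Math.min(Math.max(originMaxAge, settings.minTtl), settings.maxTtl);
}

const ttlSettings: TtlSettings = { minTtl: 0, defaultTtl: 300, maxTtl: 604800 };

console.log(effectiveTtl(ttlSettings));          // 300
console.log(effectiveTtl(ttlSettings, 60));      // 60
console.log(effectiveTtl(ttlSettings, 9999999)); // 604800
```

Since our Lambda function does not set a Cache-Control header, the default of 300 seconds is the value that should apply in this tutorial.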

2. Add the cache key settings

This is the setting that will define our cache keys using language as a query string.

Add the following changes:

// infra/main.tf

resource "aws_cloudfront_cache_policy" "custom" {
  name        = "test-cache-policy"
  comment     = "test cache policy"
  min_ttl     = 0
  default_ttl = 300
  max_ttl     = 604800
  parameters_in_cache_key_and_forwarded_to_origin {
    query_strings_config {
      query_string_behavior = "whitelist"
      query_strings {
        items = ["language"]
      }
    }
  }
}

💡 Remember: The properties included in the cache key are also forwarded to the origin server!

3. Include the header and cookies cache key configs

The cookies_config and headers_config blocks are required, so we’ll need to define their behavior.

We’ll set both to none, meaning they won’t be part of the cache key.

Add the following changes:

// infra/main.tf

resource "aws_cloudfront_cache_policy" "custom" {
  name        = "test-cache-policy"
  comment     = "test cache policy"
  min_ttl     = 0
  default_ttl = 300
  max_ttl     = 604800
  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "none"
    }
    headers_config {
      header_behavior = "none"
    }
    query_strings_config {
      query_string_behavior = "whitelist"
      query_strings {
        items = ["language"]
      }
    }
  }
}

4. Create a new ordered cache behavior

Our default cache behavior will handle all other requests, while our new ordered cache behavior will be based on the path /test.

So, the custom cache policy we defined will only be applied when the request path matches /test (use a wildcard pattern such as /test* to also match variants).

// infra/main.tf

resource "aws_cloudfront_distribution" "cf_distribution" {
  origin {
    # CloudFront expects a bare hostname, so strip the scheme and the trailing slash from the function URL
    domain_name = replace(replace(aws_lambda_function_url.test.function_url, "https://", ""), "/", "")
    origin_id   = module.lambda_origin.lambda[0].function_name

    custom_origin_config {
      https_port             = 443
      http_port              = 80
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  default_cache_behavior {
    allowed_methods          = ["GET", "HEAD", "OPTIONS"]
    cached_methods           = ["GET", "HEAD", "OPTIONS"]
    target_origin_id         = module.lambda_origin.lambda[0].function_name
    viewer_protocol_policy   = "redirect-to-https"
    min_ttl                  = local.min_ttl
    default_ttl              = local.default_ttl
    max_ttl                  = local.max_ttl
    origin_request_policy_id = aws_cloudfront_origin_request_policy.custom.id
    # A cache policy is required when attaching an origin request policy; we’ll use the managed CachingOptimized policy id for now
    cache_policy_id          = "658327ea-f89d-4fab-a63d-7e88639e58f6"
  }

  ordered_cache_behavior {
    path_pattern              = "/test"
    allowed_methods           = ["GET", "HEAD", "OPTIONS"]
    cached_methods            = ["GET", "HEAD", "OPTIONS"]
    target_origin_id          = module.lambda_origin.lambda[0].function_name
    viewer_protocol_policy    = "redirect-to-https"
  }

  price_class         = var.cf_price_class
  enabled             = true
  is_ipv6_enabled     = true
  comment             = "origin request policy test"
  default_root_object = "index.html"

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  tags = {
    Environment = "production"
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
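
To picture how a request ends up in either the ordered or the default cache behavior, here is a small TypeScript sketch. The patternToRegExp and selectBehavior helpers are hypothetical names; the sketch assumes the documented rules that * matches zero or more characters, ? matches exactly one character, and the first matching ordered behavior wins.

```typescript
// Hypothetical helper: converts a path pattern into a regular expression.
// "*" matches zero or more characters and "?" matches exactly one.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except * and ?
    .replace(/\*/g, ".*")
    .replace(/\?/g, ".");
  return new RegExp(`^${escaped}$`);
}

// Returns the first ordered pattern that matches, or "default" when none does.
function selectBehavior(path: string, orderedPatterns: string[]): string {
  for (const pattern of orderedPatterns) {
    if (patternToRegExp(pattern).test(path)) {
      return pattern;
    }
  }
  return "default";
}

console.log(selectBehavior("/test", ["/test"]));       // /test
console.log(selectBehavior("/test/page", ["/test"]));  // default
console.log(selectBehavior("/test/page", ["/test*"])); // /test*
console.log(selectBehavior("/index.html", ["/test"])); // default
```

Note that only the path is matched here; the query string (for example ?language=en) plays no part in choosing the behavior.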

5. Attach the cache policy

The final touch to our ordered cache behavior is attaching our custom cache policy.

// infra/main.tf

resource "aws_cloudfront_distribution" "cf_distribution" {
  origin {
    # CloudFront expects a bare hostname, so strip the scheme and the trailing slash from the function URL
    domain_name = replace(replace(aws_lambda_function_url.test.function_url, "https://", ""), "/", "")
    origin_id   = module.lambda_origin.lambda[0].function_name

    custom_origin_config {
      https_port             = 443
      http_port              = 80
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  default_cache_behavior {
    allowed_methods          = ["GET", "HEAD", "OPTIONS"]
    cached_methods           = ["GET", "HEAD", "OPTIONS"]
    target_origin_id         = module.lambda_origin.lambda[0].function_name
    viewer_protocol_policy   = "redirect-to-https"
    min_ttl                  = local.min_ttl
    default_ttl              = local.default_ttl
    max_ttl                  = local.max_ttl
    origin_request_policy_id = aws_cloudfront_origin_request_policy.custom.id
    # A cache policy is required when attaching an origin request policy; we’ll use the managed CachingOptimized policy id for now
    cache_policy_id          = "658327ea-f89d-4fab-a63d-7e88639e58f6"
  }

  ordered_cache_behavior {
    path_pattern              = "/test"
    allowed_methods           = ["GET", "HEAD", "OPTIONS"]
    cached_methods            = ["GET", "HEAD", "OPTIONS"]
    target_origin_id          = module.lambda_origin.lambda[0].function_name
    viewer_protocol_policy    = "redirect-to-https"
    cache_policy_id           = aws_cloudfront_cache_policy.custom.id
  }

  price_class         = var.cf_price_class
  enabled             = true
  is_ipv6_enabled     = true
  comment             = "origin request policy test"
  default_root_object = "index.html"

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  tags = {
    Environment = "production"
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}

6. Change the response HTML

Let’s also include the language query string in the response, so that the response from our Lambda function changes with it.

Make the following change:

import {
  APIGatewayProxyEventV2,
  APIGatewayProxyResultV2
} from 'aws-lambda';

// Lambda function URLs deliver events in the API Gateway payload format 2.0
export const handler = async (
  event: APIGatewayProxyEventV2
): Promise<APIGatewayProxyResultV2> => {
  // The query string is optional, so this may be undefined
  const language: string | undefined = event?.queryStringParameters?.language;
  console.log('Event : ', JSON.stringify({
    event,
  }, null, 4));
  return {
    statusCode: 200,
    headers: {
      'Content-Type': 'text/html',
    },
    body: `<html>
      <body>
        <p>Hello there</p>
        <p> language ${language} </p>
      </body>
    </html>`,
  };
}

7. Apply the infrastructure

Now that we have all our definitions in Terraform, all that is left is to apply them.

Run the following:

# This will re-generate the assets
pnpm run generate-assets --filter "@function/*"

export AWS_ACCESS_KEY_ID=<your-key>
export AWS_SECRET_ACCESS_KEY=<your-secret>
export AWS_DEFAULT_REGION=us-east-1

terraform init
terraform plan
terraform apply -auto-approve

If the infrastructure is applied successfully, you should see something like this:

Illustration of the Terraform outputs

Testing it out

1. Make a request

Curl:

curl -vvv "<cf_distribution_domain_url>/test?language=en"
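
If you would rather check the result from code, here is a minimal TypeScript sketch that reads the X-Cache response header, which CloudFront sets to values such as "Miss from cloudfront" or "Hit from cloudfront". The classifyXCache and checkCache helpers are hypothetical names, the sketch assumes a runtime with a global fetch (for example Node 18+), and the domain is a placeholder for your own distribution.

```typescript
// Hypothetical helper: interprets the value of the X-Cache response header.
type CacheResult = "hit" | "miss" | "unknown";

function classifyXCache(header: string | null): CacheResult {
  if (header === null) return "unknown";
  const value = header.toLowerCase();
  if (value.startsWith("hit") || value.startsWith("refreshhit")) return "hit";
  if (value.startsWith("miss")) return "miss";
  return "unknown";
}

// Requests the URL once and logs whether CloudFront served it from the cache.
async function checkCache(url: string): Promise<void> {
  const response = await fetch(url);
  console.log(url, "->", classifyXCache(response.headers.get("x-cache")));
}

// Example usage (replace the placeholder with your distribution's domain):
// checkCache("https://<cf_distribution_domain_url>/test?language=en");
```

Calling checkCache twice with the same URL within the TTL should log a miss followed by a hit.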

Postman:

Initial Request - Cache Miss

Illustration of using Postman for the request with cache miss

Follow-up request - Cache Hit

Illustration of using Postman for the request with cache hit

🧠 Breaking it down:

When a request comes in, CloudFront first checks whether the cache already holds a response for its cache key, which now includes the language query string.

If it does, that’s considered a Cache Hit.

Otherwise, it’s a Cache Miss and CloudFront forwards the request to the origin server.
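
The hit and miss flow described above can be sketched with a small in-memory cache in TypeScript. The EdgeCacheSketch class is a hypothetical stand-in for an edge location and the origin callback stands in for our Lambda function; it only models the language query string and a fixed TTL of 300 seconds.

```typescript
// One cached response together with its expiry time.
interface CacheEntry {
  body: string;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical in-memory cache keyed by the path and the "language" query string.
class EdgeCacheSketch {
  private entries = new Map<string, CacheEntry>();
  private origin: (language: string) => string;
  private ttlSeconds: number;

  constructor(origin: (language: string) => string, ttlSeconds: number) {
    this.origin = origin;
    this.ttlSeconds = ttlSeconds;
  }

  // Returns the response body plus whether it was served from the cache.
  get(path: string, language: string, now: number): { body: string; result: "Hit" | "Miss" } {
    const key = `${path}?language=${language}`;
    const cached = this.entries.get(key);
    if (cached !== undefined && cached.expiresAt > now) {
      return { body: cached.body, result: "Hit" };
    }
    // Cache miss (or expired entry): ask the origin and store the response.
    const body = this.origin(language);
    this.entries.set(key, { body, expiresAt: now + this.ttlSeconds * 1000 });
    return { body, result: "Miss" };
  }
}

let originCalls = 0;
const edgeCache = new EdgeCacheSketch((language) => {
  originCalls += 1;
  return `<p> language ${language} </p>`;
}, 300);

console.log(edgeCache.get("/test", "en", 0).result);      // Miss
console.log(edgeCache.get("/test", "en", 1000).result);   // Hit
console.log(edgeCache.get("/test", "fr", 2000).result);   // Miss (different key)
console.log(edgeCache.get("/test", "en", 301000).result); // Miss (expired after 300 s)
console.log(originCalls);                                 // 3
```

Only the misses reach the origin, which is why you should see fewer entries in the Lambda logs than requests you send.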

2. Check the logs

Apart from the response from CloudFront, be sure to check the logs of your Lambda function (our origin server).

Make sure you are able to see them, as they will be useful later when we start to integrate the origin request policies.

Here is an example of the event that is logged from our origin server.

Request Event Example:

Illustration of the logs when adding a cache policy

Request Event (in JSON format):
{
  "event": {
    "version": "2.0",
    "routeKey": "$default",
    "rawPath": "/test",
    "rawQueryString": "",
    "headers": {
      "x-amzn-tls-version": "TLSv1.2",
      "x-forwarded-proto": "https",
      "x-forwarded-port": "443",
      "x-forwarded-for": "2001:569:7f36:4000:59d2:4451:b85c:dd8d",
      "via": "2.0 d7782b26e589b8e1397d352f4daf0d58.cloudfront.net (CloudFront)",
      "x-amzn-tls-cipher-suite": "ECDHE-RSA-AES128-GCM-SHA256",
      "x-amzn-trace-id": "Root=1-6379c843-5a977794388130833ee2d137",
      "host": "qlhr2r65yczsyuiwzwqwdckdqq0eukxl.lambda-url.us-east-1.on.aws",
      "x-very-secret-client-token": "secret-token",
      "cache-control": "no-cache",
      "accept-encoding": "br,gzip",
      "x-amz-cf-id": "e7D4qcWlIgzSWXm-EMof5oqJi0tr6Ii21E8RrmlJBiulN0GH27fslg==",
      "user-agent": "Amazon CloudFront"
    },
    "requestContext": {
      "accountId": "anonymous",
      "apiId": "qlhr2r65yczsyuiwzwqwdckdqq0eukxl",
      "domainName": "qlhr2r65yczsyuiwzwqwdckdqq0eukxl.lambda-url.us-east-1.on.aws",
      "domainPrefix": "qlhr2r65yczsyuiwzwqwdckdqq0eukxl",
      "http": {
        "method": "GET",
        "path": "/test",
        "protocol": "HTTP/1.1",
        "sourceIp": "64.252.71.170",
        "userAgent": "Amazon CloudFront"
      },
      "requestId": "7f3331cd-c417-495a-ab56-72891d04c5a4",
      "routeKey": "$default",
      "stage": "$default",
      "time": "20/Nov/2022:06:25:07 +0000",
      "timeEpoch": 1668925507591
    },
    "isBase64Encoded": false
  }
}

For reference, here is the repository of the completed tutorial - GitHub: amazon-cloudfront-cache-policy.

Conclusion

Now that you know how to create a cache policy, let’s do a quick review of some key points.

Review:

  • The properties you can use in a cache key are: headers, cookies, query strings and compression (Accept-Encoding)

  • A cache policy is attached to a cache behavior

  • The properties defined as the cache key are also automatically forwarded to the origin server

And... that’s all for now, stay tuned for more!

If you found this helpful or learned something new, please do share this article with a friend or co-worker 🙏❤️ (Thanks!)

🔥 Challenge:

Try to add caching based on cookie and header values.

See if you can get it to work!


Jerry Chang 2023. All rights reserved.