AWS Aurora Technical Series

Published on: Tue Jan 18 2022

Last Updated on: Thu Feb 10 2022

Content

Takeaways

AWS Aurora optimizes the database for performance and durability.

It maintains its own forks of MySQL and PostgreSQL, and provides compatibility with each. The one thing to be aware of is that, because it is a separate fork, it may lag behind the latest upstream releases.

Some key features:

  • Performance improvements (up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL, according to AWS)
  • Read scale out (up to 15 read replicas)
  • Fault tolerance and self-healing storage
    • 6 copies of the data stored across 3 Availability Zones (AZ), 2 copies in each
    • Global Database available for cross-Region replication (data replicated across multiple Regions, 6+ AZs in total)
  • Continuous backups synced to S3
  • Fully managed by AWS, including point-in-time recovery, software patching, etc.
  • Performance monitoring (CloudWatch, CloudWatch Logs, granular SQL statement optimization)
  • Automatic storage growth up to 128 TiB (you pay only for the Aurora storage volume space you use)

Security:

  • Encryption
    • In Transit - SSL/TLS support
    • At Rest - storage, snapshots and backups encrypted via AWS KMS (when enabled)
  • IAM authentication
  • Security groups and network ACLs
    • Restrict network connection access
  • Auditing and monitoring via CloudWatch and CloudTrail
  • DB event notifications

All that being said, AWS Aurora is likely not something you’d use for a hobby project. It is a service engineered for durability and scalability. If you have serious data compliance requirements, then it is worth considering.

Introduction

AWS Aurora is AWS’s solution for a fully managed relational database.

It is designed for performance, durability and security with MySQL and PostgreSQL compatibility.

AWS maintains separate forks of the databases, and claims up to 5x and 3x the performance of standard MySQL and PostgreSQL, respectively.

In addition, the AWS Aurora architecture is core to its availability, performance and durability.

AWS Aurora also integrates natively with other AWS services (S3, Lambda, machine learning services, etc.), which you can call from within the database using SQL.

Beyond the native integrations, the service is fully managed by AWS, from continuous backups and snapshots (which are synced to AWS S3) to software updates.

Conveniently, AWS Aurora also integrates with other AWS services such as IAM, CloudWatch and CloudWatch Logs.

Aurora Architecture

Example architecture with AWS Aurora, illustrating a write operation into Aurora’s shared storage volume distributed across 3 AZs.

Aurora’s architecture is what allows it to achieve its durability, performance, continuous backups and quick point-in-time recovery.

AWS Aurora takes a different approach from traditional databases: it separates the storage layer from the database servers and uses a distributed storage volume instead.

The volume is made up of 10 GB storage segments, spread across storage machines, that the database servers write their changes to. Each segment is replicated 6 times: 2 copies in each of 3 Availability Zones (AZ).

The storage volume grows automatically up to 128 TiB, and you are only charged for the space you use. The volume is also backed up continuously and automatically to S3.

A database cluster consists of a writer node and one or more reader nodes. When a failover event occurs, one of the reader nodes is promoted to writer; you can control which one by setting a priority for each node in the configuration.
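As a rough sketch of how failover priority might be expressed in Terraform, the fragment below declares a cluster with two instances and uses `promotion_tier` to indicate which one should be preferred when a new writer is promoted (lower tiers are promoted first). The resource names, instance class, username and the variables `var.project_id` and `var.db_master_password` are placeholders chosen for illustration.

```hcl
# Hypothetical names and sizes; adjust to your own project.
resource "aws_rds_cluster" "example" {
  cluster_identifier = "${var.project_id}-aurora"
  engine             = "aurora-postgresql"
  master_username    = "dbadmin"
  master_password    = var.db_master_password # assumed variable
}

# Two instances in the same cluster. Aurora decides which one acts as the
# writer; on failover, the instance with the lowest promotion_tier is preferred.
resource "aws_rds_cluster_instance" "preferred" {
  identifier         = "${var.project_id}-aurora-1"
  cluster_identifier = aws_rds_cluster.example.id
  engine             = aws_rds_cluster.example.engine
  instance_class     = "db.t3.medium"
  promotion_tier     = 0 # first choice for promotion
}

resource "aws_rds_cluster_instance" "fallback" {
  identifier         = "${var.project_id}-aurora-2"
  cluster_identifier = aws_rds_cluster.example.id
  engine             = aws_rds_cluster.example.engine
  instance_class     = "db.t3.medium"
  promotion_tier     = 1 # promoted only if tier 0 is unavailable
}
```

Valid tiers range from 0 to 15, so there is room to rank every reader in a full-size cluster.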

Under the hood, when the writer node writes a change, it is applied asynchronously to the copies in the storage volume. AWS Aurora waits for 4 of the 6 copies (in blocks across AZs) to confirm the write before it is considered done.

Security

Infrastructure

Because AWS Aurora (much like AWS RDS) runs within the AWS ecosystem, you get the benefit of running your database inside an AWS VPC, which keeps it private.

In addition to the VPC, AWS also offers security groups and IAM to limit access to the database.
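To make the security group part concrete, here is a minimal sketch of a group that only admits PostgreSQL traffic from another security group (for example, the one attached to your application or bastion host). The variables `var.vpc_id` and `var.app_security_group_id` are assumptions for illustration.

```hcl
resource "aws_security_group" "aurora" {
  name        = "${var.project_id}-aurora"
  description = "Allow PostgreSQL traffic from the application only"
  vpc_id      = var.vpc_id # assumed variable

  ingress {
    description     = "PostgreSQL from the app security group"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [var.app_security_group_id] # assumed variable
  }
}
```

The group would then be attached to the cluster through the `vpc_security_group_ids` argument of `aws_rds_cluster`.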

In terms of monitoring and logging, metrics and logs are available through CloudWatch, CloudWatch Logs, CloudTrail logs, event notifications and more.

Data Protection

When configured to do so, the data at rest is encrypted via AWS KMS. This means the underlying storage, backups and snapshots are all encrypted.
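A minimal sketch of turning on encryption at rest in Terraform could look like the following. The KMS key, the cluster name, the username and `var.db_master_password` are hypothetical; the relevant arguments are `storage_encrypted` and `kms_key_id`.

```hcl
# Customer managed key used for the cluster (hypothetical resource).
resource "aws_kms_key" "aurora" {
  description = "Key for Aurora storage encryption"
}

resource "aws_rds_cluster" "encrypted" {
  cluster_identifier = "${var.project_id}-aurora"
  engine             = "aurora-postgresql"
  master_username    = "dbadmin"
  master_password    = var.db_master_password # assumed variable

  # Encrypts the cluster volume, and with it the backups and snapshots.
  storage_encrypted = true
  kms_key_id        = aws_kms_key.aurora.arn
}
```

Encryption has to be chosen when the cluster is created; an existing unencrypted cluster cannot simply be switched over in place.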

In addition, AWS Aurora offers encryption in transit (TLS/SSL), and you can take it a step further by forcing all connections to your cluster to use SSL through a cluster parameter group.

Example of cluster parameter group:

resource "aws_rds_cluster_parameter_group" "default" {
  name        = "${var.project_id}-cluster-parameter-group"
  family      = "aurora-postgresql12"
  description = "Postgresql RDS default cluster parameter group"

  # rds.force_ssl = 1 rejects client connections that do not use SSL/TLS.
  parameter {
    name         = "rds.force_ssl"
    value        = "1"
    apply_method = "immediate"
  }
}
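
The parameter group only takes effect once it is attached to a cluster. A minimal sketch of that wiring, assuming a cluster resource named `example`, a placeholder username, and the variables `var.engine_version` and `var.db_master_password` (all hypothetical), might look like this:

```hcl
resource "aws_rds_cluster" "example" {
  cluster_identifier = "${var.project_id}-aurora"
  engine             = "aurora-postgresql"
  # Assumed variable; it must be a 12.x release to match the
  # "aurora-postgresql12" family of the parameter group.
  engine_version  = var.engine_version
  master_username = "dbadmin"
  master_password = var.db_master_password # assumed variable

  # Attach the cluster parameter group defined above so rds.force_ssl applies.
  db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.default.name
}
```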

Features

Multi-master

AWS Aurora supports multi-master (multiple writer nodes) for scaling writes.

The nodes will asynchronously sync with each other.

Note: This is only supported on MySQL version 5.6 at this time. It could change in the future.

Consistency models within Aurora:

  • Instance Read-After-Write (INSTANCE_RAW): a transaction can observe all transactions previously committed on the same instance, and transactions executed on other nodes (subject to replication lag)
  • Regional Read-After-Write (REGIONAL_RAW): cluster-wide consistency, a transaction can observe all transactions previously committed on all instances in the cluster

Keep these settings in mind and pick the one that fits your use case.

Backups

As mentioned, AWS Aurora backs up your cluster volume automatically. However, you can also configure daily backups with a specific retention period.

There are two important configurations to keep in mind:

  • backup retention period - How long do you want to keep the backup around? (range: 1 - 35 days)
  • backup window - When should the backup be taken? (e.g. between 8:00 and 9:00 pm)

Backups in AWS Aurora are always taken continuously and automatically, so the daily backup will be an additional backup.
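
The two settings above map onto two arguments of the `aws_rds_cluster` resource. The sketch below uses a 7 day retention period and the 8:00 - 9:00 pm window from the example; the cluster name, username and `var.db_master_password` are placeholders.

```hcl
resource "aws_rds_cluster" "with_backups" {
  cluster_identifier = "${var.project_id}-aurora"
  engine             = "aurora-postgresql"
  master_username    = "dbadmin"
  master_password    = var.db_master_password # assumed variable

  # Keep automated backups for 7 days (allowed range: 1 to 35).
  backup_retention_period = 7

  # Daily backup window, expressed in UTC as hh24:mi-hh24:mi.
  preferred_backup_window = "20:00-21:00"
}
```

Note that the window is interpreted in UTC, so convert from your local time when choosing it.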

Backtrack

You can “rewind” the state of your database to a time you specify.

A few use cases:

  • Undo mistakes (e.g. a bad database migration)
  • Explore earlier data - You can backtrack a DB cluster back and forth in time
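
Backtrack is enabled by giving the cluster a backtrack window. A minimal sketch, assuming a MySQL-compatible cluster (Backtrack is only available for Aurora MySQL, so it would not apply to a PostgreSQL cluster) and placeholder names:

```hcl
resource "aws_rds_cluster" "with_backtrack" {
  cluster_identifier = "${var.project_id}-aurora-mysql"
  engine             = "aurora-mysql"
  master_username    = "dbadmin"
  master_password    = var.db_master_password # assumed variable

  # Target backtrack window in seconds: 86400 = 24 hours.
  # The maximum is 259200 (72 hours); 0 disables backtracking.
  backtrack_window = 86400
}
```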

RDS Proxy

Not strictly an Aurora feature, but it is also supported by AWS Aurora.

Using the proxy gives you the ability to better manage the security and scalability of your database connections.

It will load balance traffic between your read replicas, and help prevent connection count exhaustion.

When the RDS Proxy reaches its connection limit, it will throttle the traffic. Latency will increase, but it will not cause a failover event on your database.
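
For orientation, a proxy in front of an Aurora PostgreSQL cluster could be declared roughly as follows. This is a sketch under assumptions: the IAM role, the Secrets Manager secret, the subnets and the cluster identifier are passed in through hypothetical variables, and the 75 percent pool limit is an arbitrary example value.

```hcl
resource "aws_db_proxy" "example" {
  name           = "${var.project_id}-proxy"
  engine_family  = "POSTGRESQL"
  role_arn       = var.proxy_role_arn     # assumed: IAM role allowed to read the secret
  vpc_subnet_ids = var.private_subnet_ids # assumed variable
  require_tls    = true

  auth {
    auth_scheme = "SECRETS"
    iam_auth    = "DISABLED"
    secret_arn  = var.db_secret_arn # assumed: Secrets Manager secret with the DB credentials
  }
}

# Caps how much of the database's connection limit the proxy may use.
resource "aws_db_proxy_default_target_group" "example" {
  db_proxy_name = aws_db_proxy.example.name

  connection_pool_config {
    max_connections_percent = 75
  }
}

# Points the proxy at the Aurora cluster.
resource "aws_db_proxy_target" "example" {
  db_proxy_name         = aws_db_proxy.example.name
  target_group_name     = aws_db_proxy_default_target_group.example.name
  db_cluster_identifier = var.cluster_identifier # assumed variable
}
```

Applications then connect to the proxy endpoint instead of the cluster endpoint.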

AWS service integration

With AWS Aurora, you can use standard SQL to call other AWS services like S3, Lambda, and even machine learning services.

These can be used for manual backups or exports, aggregations, fraud detection and product recommendations on your data, depending on your use case.

Technical Series

1. AWS Aurora - Setting up VPC with AWS Aurora

AWS Aurora Technical Series Part I - AWS Aurora & VPC

In this module we will be setting up a VPC which will host our AWS Aurora database cluster and instances. These will be within our private subnets.

[Figure: AWS Aurora architecture, database only]

In addition, we will have a bastion host which admins (us) can SSH into in order to manage the database (migrations, etc). It will be publicly accessible; however, access to it is restricted.

2. AWS Aurora - Setting up bastion hosts and testing via CLI

AWS Aurora Technical Series Part II - Bastion host

In this module, we will be securing our bastion host, which we will be using for database administration (user management, migrations etc).

This includes both network access from the internet and access to the specific resources required to perform management tasks (AWS SSM Parameter Store, S3 etc).

Network Access Management:

[Figure: bastion host network access]

Resources IAM Management:

[Figure: bastion host AWS access]

3. AWS Aurora - PostgreSQL Data Modelling

AWS Aurora Technical Series Part III - PostgreSQL data modelling

In this module, we will model the data for our blog site built with Next.js.

Rather than a static blog, we will make the content dynamic so authors can update it any time they want.

The focus of this series is not on PostgreSQL, but nonetheless, we will dig into some basic SQL like creating databases, tables and database triggers.

[Figure: blog data modelling]

The entities include:

  • Blogs
  • Comments
  • Users (Authors / Guests)
  • Social Accounts (User’s social media handles)

4. AWS Aurora - PostgreSQL integration with Next.js

AWS Aurora Technical Series Part IV - PostgreSQL integration with Next.js

In this module, we will look at setting up the local PostgreSQL for development.

The features will already be implemented, but we will take a tour through the relevant parts.

In addition, we will walk through how to set up the node-pg package to interface with PostgreSQL from Next.js locally, along with considerations for a production environment (in AWS).

Most of this section will be a walkthrough, but I will go through the different parts that are relevant to AWS Aurora and PostgreSQL.

[Figure: Next.js and PostgreSQL local development]

5. AWS Aurora - Integration with AWS ECS

AWS Aurora Technical Series Part V - Integrate with AWS ECS

In this module, we will integrate AWS Aurora into our AWS ECS setup (in the other series).

These include things like:

  • Updating task definitions
  • Injecting the password into AWS ECS environments from AWS Parameter Store (SSM)
  • Providing correct access between resources (Network and IAM)
  • and other miscellaneous tasks

[Figure: AWS Aurora with ECS]

Jerry Chang 2023. All rights reserved.