Published on: Wed Feb 02 2022
In this module, we will take a look at how we can connect to and manage our AWS Aurora instances within the AWS VPC.
One way to achieve this is by setting up a bastion host within the VPC to allow for connections to the database.
First, we will examine what this infrastructure will look like and how to secure the access points for this particular resource. Since this will be our data, we want to make sure it is properly gated.
Next, we will look at how to properly set up access for our bastion host instance to other resources like AWS S3 and the SSM Parameter Store. This is mainly so we can connect to the database and set it up when initializing our EC2 instance.
Finally, we tie it all together by going into the bastion host and verifying everything. Additionally, we briefly take a look at SSL modes when connecting to AWS Aurora (PostgreSQL).
I hope you are ready. It’s going to be a long one. Let’s get started!
The design we will be going with is to host our bastion host (EC2 instance) within the public subnet (accessible to the internet). Then, we will only allow SSH access if the request comes from our IP address.
In addition, access is also gated by security group IDs, where only hosts with a particular security group attached are allowed through.
Bastion host (IP and Security Group)
As part of the database initialization, we will need to connect to the AWS Aurora (writer) instance to run some SQL scripts.
So, we will attach an instance role to the bastion host which will grant it permission to access the S3 bucket (with the migration scripts) and the AWS SSM Parameter Store (database password).
Bastion host (Instance profile):
Since these database instances are within a VPC, and we have specified in the security group that only certain resources can access them, we will be creating a bastion host (or jump box) to allow us to administer these instances if needed.
A file named key.pem
will be created in your current directory which we will use to ssh into the bastion host.
## Key Pair
resource "tls_private_key" "dev" {
algorithm = "RSA"
rsa_bits = 4096
}
resource "aws_key_pair" "dev" {
key_name = "dev-key"
public_key = tls_private_key.dev.public_key_openssh
}
# Write the private key to a local file so we can SSH into the bastion host
resource "local_file" "key_pem" {
content = tls_private_key.dev.private_key_pem
file_permission = "400"
filename = "./key.pem"
}
We will restrict SSH access (port 22) to only our IPv4 address (var.ip_address). Since this value is used as a CIDR block, it needs to be in CIDR notation (e.g. x.x.x.x/32).
It will come from an environment variable when we apply our resources (i.e. TF_VAR_ip_address).
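If you are not sure of your public IP, you can set the variable straight from your shell. A quick sketch (checkip.amazonaws.com is a public AWS service that returns your IPv4 address; append /32 so it is valid CIDR notation):
# Export your public IPv4 address in CIDR notation for Terraform
export TF_VAR_ip_address="$(curl -s https://checkip.amazonaws.com)/32"
# Sanity check the value
echo "$TF_VAR_ip_address"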
# main.tf
## Security Group - bastion
resource "aws_security_group" "bastion" {
name = "sg_bastion"
description = "Custom default SG for Bastion"
vpc_id = module.networking.vpc_id
ingress {
description = "SSH from allowed IP"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [var.ip_address]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "web-bastion-sg"
}
}
Update the database security group to include the security group ID for the bastion host.
# main.tf
resource "aws_security_group" "private_db_sg" {
name = "aurora-custom-sg"
description = "Custom default SG"
vpc_id = module.networking.vpc_id
ingress {
description = "PostgreSQL from ECS and bastion security groups"
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [
# For the AWS ECS - we will deal with this later in the series
aws_security_group.ecs_sg.id,
# For the bastion host
aws_security_group.bastion.id
]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "PostgreSQL-db-access-${var.project_id}-${var.env}"
}
}
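Once this is applied, a quick sanity check from inside the bastion host is a plain TCP probe against the writer endpoint on port 5432. This is just a sketch; it assumes nc is available on the instance (on Amazon Linux 2 it comes from the nmap-ncat package) and that you substitute your own endpoint:
# From the bastion host: confirm the database port is reachable
# sudo yum install -y nmap-ncat   (if nc is not already installed)
nc -zv <your-aurora-writer-endpoint> 5432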
When initializing the EC2 instance, we are able to provide a script to set up our environment. We can take advantage of that to set up our database as well.
In our setup, we will sync the migration scripts (SQL files) to S3, which our EC2 instance (bastion host) will use to run those scripts against our AWS Aurora cluster during initialization.
The scripts will live in data/sql/, specifically 01_create.sql and 02_seed.sql.
⚠️ Note: You may need to change the S3 bucket name as it has to be globally unique
Here we are just syncing our create and seed SQL scripts into our S3 bucket.
resource "aws_s3_bucket" "bucket" {
bucket = "db-assets-${var.project_id}-123"
acl = "private"
}
resource "aws_s3_bucket_object" "create_sql" {
bucket = aws_s3_bucket.bucket.id
key = "01_create.sql"
source = "../data/sql/01_create.sql"
}
resource "aws_s3_bucket_object" "seed_sql" {
bucket = aws_s3_bucket.bucket.id
key = "02_seed.sql"
source = "../data/sql/02_seed.sql"
}
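After applying, you can sanity-check that both scripts landed in the bucket (substitute your own bucket name if you changed it):
aws s3 ls s3://db-assets-<your-project-id>-123/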
Let’s add some data into our database setup scripts so we can verify it once we apply our infrastructure.
-- 01_create.sql
-- Clear table if it already exists
DROP TABLE IF EXISTS social_accounts;
-- Tables
CREATE TABLE social_accounts(
id serial PRIMARY KEY
, username VARCHAR(255)
, social_type VARCHAR(255)
, created_at timestamptz DEFAULT current_timestamp
, updated_at timestamptz DEFAULT current_timestamp
);
-- Updated at trigger - refresh the timestamp during an update
CREATE OR REPLACE FUNCTION updated_at_trigger()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ language 'plpgsql';
-- Triggers
CREATE TRIGGER update_social_accounts_timestamp BEFORE UPDATE ON social_accounts FOR EACH ROW EXECUTE PROCEDURE updated_at_trigger();
-- 02_seed.sql
-- Feel free to change it if you like (this is only for testing purposes)
INSERT INTO social_accounts (username, social_type) VALUES ('Jareechang', 'Github');
INSERT INTO social_accounts (username, social_type) VALUES ('probablynull', 'Twitter');
Within our bastion host, we need permissions for the AWS SSM Parameter Store and S3 in order to connect to the database and run migrations.
Additionally, access to AWS KMS decrypt is needed as well, since the database password is a SecureString encrypted with a KMS key we created.
Here we are defining the IAM policies as data sources to be used later for AWS SSM, KMS and S3.
Likewise, we have narrowed down the resources to only the ones we have defined rather than all resources.
# main.tf
data "aws_iam_policy_document" "instance_assume_role_policy" {
statement {
actions = [
"sts:AssumeRole",
]
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}
data "aws_iam_policy_document" "bastion" {
version = "2012-10-17"
statement {
actions = [
"ssm:GetParameter",
"ssm:GetParameters",
"ssm:GetParametersByPath"
]
effect = "Allow"
resources = [
"arn:aws:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter/web/${var.project_id}/*"
]
}
statement {
actions = [
"kms:Decrypt"
]
effect = "Allow"
resources = [
aws_kms_key.default.arn
]
}
statement {
actions = [
"s3:GetObject",
"s3:ListBucket"
]
effect = "Allow"
resources = [
aws_s3_bucket.bucket.arn,
"${aws_s3_bucket.bucket.arn}/*"
]
}
}
Here we are just creating a role with this custom policy, then creating an instance profile using that role.
# main.tf
resource "aws_iam_policy" "bastion_host" {
name = "bastion"
policy = data.aws_iam_policy_document.bastion.json
}
resource "aws_iam_role" "instance_role" {
name = "instance-bastion-role"
assume_role_policy = data.aws_iam_policy_document.instance_assume_role_policy.json
}
resource "aws_iam_instance_profile" "bastion" {
name = "instance-bastion-profile"
role = aws_iam_role.instance_role.name
}
resource "aws_iam_policy_attachment" "attach_policy_to_role_instance" {
name = "instance-role-attachment-${var.project_id}-${var.env}"
roles = [aws_iam_role.instance_role.name]
policy_arn = aws_iam_policy.bastion_host.arn
}
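Once the bastion host (created below) is up with this instance profile attached, you can verify the SSM and KMS access from within the instance. The parameter name here is a placeholder, so substitute your actual path under /web/<project-id>/; if the policy and KMS grant are correct, this prints the decrypted database password:
# From the bastion host: fetch and decrypt the database password
# (add --region if the CLI has no default region configured)
aws ssm get-parameter \
  --name "/web/<your-project-id>/<db-password-parameter>" \
  --with-decryption \
  --query "Parameter.Value" \
  --output text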
Finally, we create our bastion host, passing our custom bootstrap script (which handles the database setup and migration) as the user_data.
## EC2 Instance
data "aws_ami" "amazon_linux_2" {
most_recent = true
owners = ["amazon"]
filter {
name = "owner-alias"
values = ["amazon"]
}
filter {
name = "name"
values = ["amzn2-ami-hvm*"]
}
}
data "template_file" "bootstrap" {
template = "${file("./bootstrap.sh.tpl")}"
vars = {
db_host = [for i, instance in aws_rds_cluster_instance.cluster_instances : instance if instance.writer == true][0].endpoint
ssm_db_param_path = aws_ssm_parameter.db_password.name
db_username = local.db_username
db_name = local.db_name
s3_bucket = aws_s3_bucket.bucket.id
}
}
# EC2 - Public
resource "aws_instance" "bastion" {
ami = data.aws_ami.amazon_linux_2.id
iam_instance_profile = aws_iam_instance_profile.bastion.name
instance_type = "t2.micro"
subnet_id = module.networking.public_subnets[0].id
vpc_security_group_ids = [
aws_security_group.bastion.id
]
associate_public_ip_address = true
user_data = data.template_file.bootstrap.rendered
key_name = aws_key_pair.dev.key_name
tags = {
Name = "Bastion-Instance"
}
}
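The bootstrap template itself is not listed here, but at a high level it installs the PostgreSQL client, pulls the migration scripts from S3, reads the database password from SSM, and runs the scripts against the writer endpoint. Below is a minimal sketch of what bootstrap.sh.tpl could look like; the exact contents in the completed repo may differ, and the postgresql13 extras topic and region are assumptions:
#!/bin/bash
# Region for the AWS CLI calls (adjust to your region)
export AWS_DEFAULT_REGION=us-east-1
# Install the PostgreSQL client (Amazon Linux 2; topic name may differ, see `amazon-linux-extras list`)
amazon-linux-extras install -y postgresql13
# Pull the migration scripts from S3
aws s3 cp "s3://${s3_bucket}/01_create.sql" /tmp/01_create.sql
aws s3 cp "s3://${s3_bucket}/02_seed.sql" /tmp/02_seed.sql
# Read the database password from the SSM Parameter Store
export PGPASSWORD="$(aws ssm get-parameter \
  --name "${ssm_db_param_path}" \
  --with-decryption \
  --query "Parameter.Value" \
  --output text)"
# Run the scripts against the writer instance
psql -h "${db_host}" -U "${db_username}" -d "${db_name}" -f /tmp/01_create.sql
psql -h "${db_host}" -U "${db_username}" -d "${db_name}" -f /tmp/02_seed.sql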
In order to authenticate to our AWS Aurora instance, we need the correct connection parameters (host, dbname, username, password, etc.).
To simplify the process, we will be using a Terraform template I have created called bastion-environment.tftpl, which will output all the environment variable exports needed for starting a psql session on our bastion host.
# output.tf
output "aurora_cluster_endpoint" {
value = aws_rds_cluster_endpoint.static.endpoint
}
output "bastion_ip" {
value = aws_instance.bastion.public_ip
}
output "bastion_environment" {
value = templatefile("./bastion-environment.tftpl", {
db_host = [for i, instance in aws_rds_cluster_instance.cluster_instances : instance if instance.writer == true][0].endpoint
ssm_db_param_path = aws_ssm_parameter.db_password.name
db_username = local.db_username
db_name = local.db_name
s3_bucket = aws_s3_bucket.bucket.id
})
}
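The template itself is not shown above; conceptually it just renders a set of standard psql environment variable exports from the values passed in. A rough sketch of what bastion-environment.tftpl could contain (fetching the password via SSM on the bastion host is an assumption, and it presumes the CLI there has a region configured):
export PGHOST=${db_host}
export PGPORT=5432
export PGUSER=${db_username}
export PGDATABASE=${db_name}
export PGPASSWORD="$(aws ssm get-parameter --name ${ssm_db_param_path} --with-decryption --query Parameter.Value --output text)"
export S3_BUCKET=${s3_bucket}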
Creating the AWS Aurora database will take some time (~15-20 mins). So, just keep that in mind.
export AWS_ACCESS_KEY_ID=<your-key>
export AWS_SECRET_ACCESS_KEY=<your-secret>
export AWS_DEFAULT_REGION=us-east-1
export TF_VAR_ip_address=<your-ip-address>
terraform init
terraform plan
terraform apply -auto-approve
ssh -i ./key.pem ec2-user@<bastion-public-ip>
In the output you should see a field called bastion_environment; these are the environment variables we need to set to access our AWS Aurora instance.
Copy and paste them into your terminal session on the bastion host.
Since we set most of the parameters via environment variables, we can simply run psql.
The -p 5432 flag is optional; the default port for PostgreSQL is 5432.
psql
Command:
\dt
You should see the following:
List of relations
Schema | Name | Type | Owner
--------+-----------------+-------+-----------
public | social_accounts | table | read_user
(1 row)
Command:
select * from social_accounts;
If you didn’t modify the seed data, you should see the following output:
blog=> select * from social_accounts;
id | username | social_type | created_at | updated_at
----+--------------+-------------+-------------------------------+-------------------------------
1 | Jareechang | Github | 2022-02-01 05:32:54.548331+00 | 2022-02-01 05:32:54.548331+00
2 | probablynull | Twitter | 2022-02-01 05:32:54.556633+00 | 2022-02-01 05:32:54.556633+00
(2 rows)
If it all looks good, then the infrastructure setup went as expected. If you see different results, retrace the steps to see if you missed anything. Feel free to use the completed repo as a reference.
This is just a quick checkpoint before we add in more changes in the later modules.
Now, remember that we added an additional database parameter, the rds.force_ssl option.
I am a big fan of seeing things in action to really understand how it all works.
Let’s test and verify that too!
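Before testing from psql, you can also confirm the parameter on the cluster parameter group via the AWS CLI (substitute the name of the parameter group you created in the earlier module):
aws rds describe-db-cluster-parameters \
  --db-cluster-parameter-group-name <your-cluster-parameter-group> \
  --query "Parameters[?ParameterName=='rds.force_ssl']"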
Here we are just doing a quick check of what happens when we connect without SSL.
You can change the SSL mode using the PGSSLMODE environment variable. In the case of no SSL, it will just be:
Note: You will need to quit out of your previous psql session by running \q
export PGSSLMODE=disable
Try connecting again with:
psql
You should see an error when connecting:
psql: FATAL: no pg_hba.conf entry for host "<redacted-ip>", user "read_user", database "blog", SSL off
So, looks like it is working as expected ✅.
Here we are connecting directly to the AWS Aurora writer instance.
Get the latest rds-ca-2019 root certificate:
curl https://s3.amazonaws.com/rds-downloads/rds-ca-2019-root.pem -O
Set the cert verification and location then connect:
export PGSSLMODE=verify-full
export PGSSLROOTCERT="$(pwd)/rds-ca-2019-root.pem"
psql
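To double-check that the session is actually encrypted, you can query pg_stat_ssl for your own backend once connected (available in PostgreSQL 9.5 and later):
psql -c "select ssl, version, cipher from pg_stat_ssl where pid = pg_backend_pid();"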
Here we are connecting to the RDS proxy (static endpoint) we created for our read only instances.
Get the latest global certificate bundle:
curl https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem -O
Set the host to use the static endpoint:
This should be available in your Terraform output under the same name as the placeholder below.
export PGHOST=<aurora_cluster_endpoint>
Set the cert verification and location then connect:
export PGSSLMODE=verify-ca
export PGSSLROOTCERT="$(pwd)/global-bundle.pem"
psql
I didn’t want to get too into the weeds with the nitty-gritty of SSL, but I mainly wanted to emphasize the difference in SSL setup between using the writer endpoint (directly) and the static endpoint (the RDS Proxy for our read-only instances).
We will be using the static endpoint to wire up our application to read the data. So, the setup will be almost identical but just in Node.js.
💡 Note: With proxy endpoints on Aurora and RDS, the limitation is that you can only connect using the REQUIRED and VERIFY_CA SSL modes. Hence why we are using a different PGSSLMODE here.
If you’d like a reference of the finished result, it is available at aws-aurora-part-2.
To summarize, we looked at how we can indirectly access our AWS Aurora instances, mainly via our bastion host.
The bulk of the setup is ensuring our bastion host has the proper infrastructure (security groups) and resource access (S3 and SSM).
During initialization, we leveraged the EC2 user_data to set up our database. This is just one approach; another is to run a database setup script via an S3-Lambda trigger.
There are various ways to do this, and it depends on your needs. I personally like a bastion host as I like having shell access to check or administer things.
Beyond that, we also took a brief look at the AWS Aurora (PostgreSQL) SSL modes; just remember that when using an RDS Proxy endpoint, it only supports REQUIRED and VERIFY_CA.
Of course, ideally you want something more automated rather than doing this manually every time.
Feel free to take that on as an exercise and think about how you would go about handling it!
In the next module, we will dive a little deeper into PostgreSQL and see how we will model our data in our blog.
See you in the next module!
Liked the content? Consider signing up to get notified when new content arrives!