AWS Aurora Technical Series Part III - PostgreSQL data modelling

Published on: Mon Feb 14 2022

Introduction

In this module, we will take a look at how to set up our data model.

Since this is a series mainly focused on AWS Aurora, we will just be setting up the basics in this section.

We’ll look at setting up the model and the database triggers, then go through a quick overview of running a local PostgreSQL instance inside Docker.

Let’s get started.

Understanding the data

Here is the data model for our application.

Brief explanation of the entities

  • Users: Users within our application
    • These can be an author (an individual who can publish posts) or just a guest
  • Posts: Posts created by the users (authors)
  • Social Accounts: These are the social media accounts linked to the user’s profile
    • These just contain string values of the user’s social media handles
  • Comments: These are the comments left by users viewing a particular post

(Diagram: blog data model)

Create database and tables

When we do a quick overview of the application code in the next modules, we will see how it all fits together.

1. Clear the tables

Since we may be running our scripts a few times against our database, let’s make sure we don’t run into issues with duplicates. So, to start, we’ll create statements to drop the tables if they already exist. Note that we drop dependent tables first (comments and posts before users, users before social_accounts), since the foreign key references would otherwise block the drops.

-- Clear tables if they already exist (drop dependents first)
DROP TABLE IF EXISTS comments;
DROP TABLE IF EXISTS posts;
DROP TABLE IF EXISTS users;
DROP TABLE IF EXISTS social_accounts;
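
Alternatively, if you would rather not track the dependency order yourself, PostgreSQL can drop multiple tables in one statement and cascade to any dependents; a minimal alternative:

-- Alternative: one statement, cascading to dependent objects
DROP TABLE IF EXISTS social_accounts, users, posts, comments CASCADE;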

2. Create Social Accounts table - "social_accounts"

In this case, the username is the user’s social media handle, and the social_type is the type of social media (e.g. Twitter, GitHub, LinkedIn).

-- 01_create.sql

-- Tables
CREATE TABLE social_accounts(
    id serial PRIMARY KEY
    , username VARCHAR(255)
    , social_type VARCHAR(255)
    , created_at timestamptz DEFAULT current_timestamp
    , updated_at timestamptz DEFAULT current_timestamp
);
Note: we are using "timestamptz" here, but a simple "date" would also work for our purpose. Just be mindful when working with production databases: timestamptz is most suitable for cases where time zones make a big difference, such as a scheduling or calendar app.
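
As a quick illustration of the difference (assuming a psql session; exact values will vary), a timestamptz stores an instant and renders it in the session’s time zone:

-- The same instant renders differently depending on the session time zone
SET TIME ZONE 'UTC';
SELECT now();   -- e.g. 2022-02-14 18:00:00+00
SET TIME ZONE 'America/Toronto';
SELECT now();   -- same instant, shown as 2022-02-14 13:00:00-05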

3. Create Users table - "users"

We reference our social_accounts table through the social_id column to establish the relationship: these social_accounts records belong to a user. (Note that, as written, the foreign key lives on users, so each user row links to at most one social_accounts record.)

-- 01_create.sql

CREATE TABLE users(
    id serial PRIMARY KEY
    , name VARCHAR(255)
    , email VARCHAR(255)
    , password VARCHAR(255)
    , user_type VARCHAR(255)
    , social_id INTEGER REFERENCES social_accounts(id)
    , created_at timestamptz DEFAULT current_timestamp
    , updated_at timestamptz DEFAULT current_timestamp
);
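
Once the seed data (added later in this module) is loaded, a quick sanity check of this relationship might look like the sketch below:

-- List each user with their linked social account (NULL when none is linked)
SELECT u.name, u.user_type, s.username, s.social_type
FROM users u
LEFT JOIN social_accounts s ON s.id = u.social_id;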

4. Create Posts table - "posts"

As with social_accounts, a user can have many posts.

-- 01_create.sql

CREATE TABLE posts(
    id serial PRIMARY KEY
    , title VARCHAR(255)
    , content TEXT
    , image_url VARCHAR(255)
    , slug VARCHAR(255)
    , user_id INTEGER REFERENCES users(id)
    , created_at timestamptz DEFAULT current_timestamp
    , updated_at timestamptz DEFAULT current_timestamp
);
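
Since the foreign key lives on posts, fetching an author’s posts is a simple join (again, assuming the seed data from later in this module):

-- All posts by a given author, newest first
SELECT p.title, p.slug, p.created_at
FROM posts p
JOIN users u ON u.id = p.user_id
WHERE u.name = 'Jerry'
ORDER BY p.created_at DESC;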

5. Create Comments table - "comments"

Similar to the above, a post can have many comments, and a comment belongs to a user (so we can identify who made the comment).

-- 01_create.sql

CREATE TABLE comments(
    id serial PRIMARY KEY
    , content VARCHAR(255)
    , user_id INTEGER REFERENCES users(id)
    , post_id INTEGER REFERENCES posts(id)
    , created_at timestamptz DEFAULT current_timestamp
    , updated_at timestamptz DEFAULT current_timestamp
);
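
Putting both foreign keys together, here is a sketch of the kind of query the application might run to resolve each comment on a post to its commenter (the slug value comes from the seed data below):

-- Comments on a post, with the commenter's name
SELECT c.content, u.name AS commenter, c.created_at
FROM comments c
JOIN users u ON u.id = c.user_id
JOIN posts p ON p.id = c.post_id
WHERE p.slug = 'pre-rendering';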

Creating database triggers

These are essentially the SQL equivalent of "callbacks" in JavaScript: a trigger associated with a table runs whenever a given event occurs on it.

In our case, we will add a BEFORE UPDATE trigger that refreshes the updated_at column in each of our tables, so we can track when entries were last updated.

1. Defining the trigger function

-- 01_create.sql

-- Updated at trigger - refresh the timestamp during an update
CREATE OR REPLACE FUNCTION updated_at_trigger()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ language 'plpgsql';

2. Defining the triggers

Whenever we update, we will run the trigger function defined above.

-- 01_create.sql

-- Triggers
CREATE TRIGGER update_social_accounts_timestamp BEFORE UPDATE ON social_accounts FOR EACH ROW EXECUTE PROCEDURE updated_at_trigger();
CREATE TRIGGER update_users_timestamp BEFORE UPDATE ON users FOR EACH ROW EXECUTE PROCEDURE updated_at_trigger();
CREATE TRIGGER update_posts_timestamp BEFORE UPDATE ON posts FOR EACH ROW EXECUTE PROCEDURE updated_at_trigger();
CREATE TRIGGER update_comments_timestamp BEFORE UPDATE ON comments FOR EACH ROW EXECUTE PROCEDURE updated_at_trigger();
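
To verify the triggers fire, update any row and compare the two timestamps; for example (note that this modifies the seed data):

-- After an update, updated_at should be later than created_at
UPDATE users SET name = 'Jerry C.' WHERE id = 1;
SELECT name, created_at, updated_at FROM users WHERE id = 1;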

Adding seed data

Here, we are just going to add some data into the database and tables we have created.

-- 02_seed.sql

INSERT INTO social_accounts (username, social_type) VALUES ('Jareechang', 'Github');
INSERT INTO users (name, email, password, user_type, social_id) VALUES ('Jerry', 'jerry@world.com', 'test123', 'author', 1);
INSERT INTO users (name, email, password, user_type) VALUES ('Bob', 'bob@world.com', 'test123', 'guest');
INSERT INTO users (name, email, password, user_type) VALUES ('Susan', 'susan@world.com', 'test123', 'guest');

INSERT INTO posts (title, content, image_url, slug, user_id) VALUES (
    'Two Forms of Pre-rendering',
    '
Next.js has two forms of pre-rendering: **Static Generation** and **Server-side Rendering**. The difference is in **when** it generates the HTML for a page.

- **Static Generation** is the pre-rendering method that generates the HTML at **build time**. The pre-rendered HTML is then _reused_ on each request.
- **Server-side Rendering** is the pre-rendering method that generates the HTML on **each request**.

Importantly, Next.js lets you **choose** which pre-rendering form to use for each page. You can create a "hybrid" Next.js app by using Static Generation for most pages and using Server-side Rendering for others.
',
    'https://images.unsplash.com/photo-1502759683299-cdcd6974244f?auto=format&fit=crop&w=440&h=220&q=60',
    'pre-rendering',
    1
);
INSERT INTO comments (content, user_id, post_id) VALUES ('What about Incremental Static Regeneration (ISR) ? Can you write about that ?', 2, 1);
INSERT INTO comments (content, user_id, post_id) VALUES ('When should I know to use SSG over SSR ? ', 3, 1);


INSERT INTO posts (title, content, image_url, slug, user_id) VALUES (
    'When to Use Static Generation v.s. Server-side Rendering',
    'We recommend using **Static Generation** (with and without data) whenever possible because your page can be built once and served by CDN, which makes it much faster than having a server render the page on every request.

You can use Static Generation for many types of pages, including:

- Marketing pages
- Blog posts
- E-commerce product listings
- Help and documentation

You should ask yourself: "Can I pre-render this page **ahead** of a users request?" If the answer is yes, then you should choose Static Generation.

On the other hand, Static Generation is **not** a good idea if you cannot pre-render a page ahead of a user''s request. Maybe your page shows frequently updated data, and the page content changes on every request.

In that case, you can use **Server-Side Rendering**. It will be slower, but the pre-rendered page will always be up-to-date. Or you can skip pre-rendering and use client-side JavaScript to populate data.',
    'https://images.unsplash.com/photo-1502759683299-cdcd6974244f?auto=format&fit=crop&w=440&h=220&q=60',
    'ssr-vs-ssg',
    1
);

INSERT INTO comments (content, user_id, post_id) VALUES ('Great post', 2, 1);

Running locally with Docker

Now that we have the database, tables, and seed data ready, let’s get a local PostgreSQL instance running in Docker to test it out.

1. Setting up docker-compose configuration

We will be using the postgres:12 image with the credentials shown below.

To avoid conflicts with any existing PostgreSQL instance on your machine, we will map the default PostgreSQL port (5432) to 5438.

As part of the configuration, we will also mount our SQL scripts into docker-entrypoint-initdb.d/, where they will be run on initialization.

Remember to create a new file in your root directory called docker-compose.yml (command: touch docker-compose.yml).

# docker-compose.yml

version: '3.7'
services:
  postgres:
    image: postgres:12
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=test123
    logging:
      options:
        max-size: 10m
        max-file: "3"
    ports:
      - '5438:5432'
    volumes:
      # Copy over db create sql
      - ./data/sql/01_create.sql:/docker-entrypoint-initdb.d/01_create.sql
      # Copy over seed sql
      - ./data/sql/02_seed.sql:/docker-entrypoint-initdb.d/02_seed.sql

2. Running the Docker image

Let’s get the local Docker image running.

Run command:

docker-compose up

Run command (Rebuild image and init again):

You may need to run this command if you made changes to the SQL scripts, as Docker will cache the image and the volume.

docker-compose up --build --force-recreate -V

3. Connect to the database

export PGPASSWORD=test123
psql -p 5438 --host 0.0.0.0 --username postgres
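
Once connected, psql’s \dt meta-command is a quick way to confirm that the init scripts ran:

\dt

You should see the four tables listed: comments, posts, social_accounts, and users.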

4. Test out a query

You should see the psql CLI display the comments in the database after running the commands below.

You can also try other queries (like select * from posts; or select * from users;).

They should all work as expected (assuming the setup was done correctly!).

\c postgres

select * from comments;

Conclusion

To summarize, we first took some time to understand the data model used in our application, implemented it using SQL, and added triggers to keep track of when each entry was last updated.

In addition, we also set up a local Docker image which initializes a database and runs the SQL scripts we defined.

Now that we have a working development database, the next step is to run through our application to see how it connects.

Also, it would be a good time to discuss the differences between running the database locally versus in production (on AWS ECS).

The code is already available in our repository, but we will do a quick walkthrough to highlight the main parts! See you in the next module!

