How to Set Up PostgreSQL High Availability with Patroni – Complete Guide


PostgreSQL is one of the most widely used open-source database management systems, and high availability is a must-have requirement for almost all applications nowadays. In this article, we will look at how to set up PostgreSQL high availability with Patroni.

Patroni is an open-source tool for deploying and managing highly available PostgreSQL clusters. Its developers call it a template rather than a finished product, because you can extend it in Python to build your very own high-availability solution.

How Does Patroni Work?

To understand how Patroni works, you first need to know which components are at play in a Patroni HA setup. Let’s look at them:

  • PostgreSQL – The database management system itself
  • Patroni Agent – Runs on each PostgreSQL node and manages the local database instance
  • DCS – Distributed Configuration Store, a distributed system that holds the cluster state and Patroni configuration. We will be using etcd for this setup.
  • Load Balancer – Routes client traffic to the current primary of the Patroni cluster. We will be using HAProxy for our setup.

This is the basic workflow for how Patroni manages the PostgreSQL DB HA setup:

  1. The Patroni agent detects that the primary node has failed.
  2. The agents promote a sufficiently healthy replica node to primary.
  3. The load balancer routes requests to the new primary node.
  4. If the failed primary comes back online, it rejoins the cluster as a replica node.
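
As a rough illustration of step 2, here is a toy sketch of how a failover candidate might be chosen. This is not Patroni’s actual code; the replica names and fields are hypothetical, but the idea matches the settings you will see later (for example, maximum_lag_on_failover and the nofailover tag): among healthy replicas that are allowed to fail over, prefer the one with the least replication lag.

```python
# Toy sketch of failover-candidate selection -- NOT Patroni's real logic,
# just an illustration of the idea behind step 2 above.
def pick_failover_candidate(replicas):
    """Return the name of the healthiest eligible replica, or None."""
    eligible = [
        r for r in replicas
        if r["healthy"] and not r.get("nofailover", False)
    ]
    if not eligible:
        return None
    # Prefer the replica with the smallest replication lag.
    return min(eligible, key=lambda r: r["lag_bytes"])["name"]

# Hypothetical cluster state for illustration only.
state = [
    {"name": "replica-a", "healthy": True, "lag_bytes": 1024},
    {"name": "replica-b", "healthy": True, "lag_bytes": 64, "nofailover": True},
    {"name": "replica-c", "healthy": False, "lag_bytes": 0},
]
print(pick_failover_candidate(state))  # replica-a
```

replica-b has the smallest lag but is tagged nofailover, and replica-c is unhealthy, so replica-a wins.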

How to Set Up PostgreSQL High Availability with Patroni

Prerequisites

We will be using four machines for this setup; however, you can set up etcd and HAProxy on the same machine if you are just trying things out. These are the machines we will be using:

Server | IP          | Application
------ | ----------- | -------------------
Node1  | 10.171.40.1 | PostgreSQL, Patroni
Node2  | 10.171.40.2 | PostgreSQL, Patroni
Node3  | 10.171.40.3 | etcd
Node4  | 10.171.40.4 | HAProxy

Installing Packages

1. Let’s first install PostgreSQL and other required packages on Node1 and Node2. Run these commands to install them.

$ sudo apt update
$ sudo apt install postgresql postgresql-contrib libpq-dev python3-dev -y

Stop the PostgreSQL service on both nodes after the installation (sudo systemctl stop postgresql). Now create symbolic links for the PostgreSQL binaries in the /usr/sbin directory, as Patroni will use them. Replace 12 in the path below with your installed PostgreSQL version.

$ sudo ln -s /usr/lib/postgresql/12/bin/* /usr/sbin/

2. Now install Patroni and the Python client packages for etcd and PostgreSQL on Node1 and Node2.

$ sudo python3 -m pip install patroni python-etcd psycopg2

3. Now install etcd on Node3.

$ sudo apt install etcd -y

4. Finally, install HAProxy on Node4.

$ sudo apt install haproxy -y

Configuring etcd

Edit the etcd configuration file located at /etc/default/etcd and set the configuration as follows:

ETCD_NAME="etcd0"
ETCD_LISTEN_PEER_URLS="http://10.171.40.3:2380"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://10.171.40.3:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.171.40.3:2380"
ETCD_INITIAL_CLUSTER="etcd0=http://10.171.40.3:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://10.171.40.3:2379"
ETCD_INITIAL_CLUSTER_TOKEN="node1"
ETCD_INITIAL_CLUSTER_STATE="new"

Note: Don’t forget to replace the IP 10.171.40.3 with your node’s IP address.
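
etcd is strict about its bootstrap settings: the member’s name (set with ETCD_NAME; the initial cluster string above uses etcd0), paired with its advertised peer URL, must appear verbatim in ETCD_INITIAL_CLUSTER, or the daemon refuses to start. A small sanity-check sketch, with the values inlined rather than parsed from /etc/default/etcd:

```python
# Sanity-check sketch: the advertised "name=peer-URL" pair must be listed in
# ETCD_INITIAL_CLUSTER, or etcd will refuse to start. Values are inlined
# here; in practice you would parse them out of /etc/default/etcd.
config = {
    "ETCD_NAME": "etcd0",
    "ETCD_INITIAL_ADVERTISE_PEER_URLS": "http://10.171.40.3:2380",
    "ETCD_INITIAL_CLUSTER": "etcd0=http://10.171.40.3:2380",
}

member = f'{config["ETCD_NAME"]}={config["ETCD_INITIAL_ADVERTISE_PEER_URLS"]}'
cluster = config["ETCD_INITIAL_CLUSTER"].split(",")
assert member in cluster, f"{member} missing from ETCD_INITIAL_CLUSTER"
print("etcd bootstrap configuration is consistent")
```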

Restart the etcd service after making the changes to apply them.

$ sudo systemctl restart etcd

Configuring Patroni

Let’s now configure Patroni on the nodes Node1 and Node2. We will first start with Node1.

Create a file /etc/patroni.yml with the following configuration.

scope: postgres-ha
name: node1

restapi:
    listen: 10.171.40.1:8008
    connect_address: 10.171.40.1:8008

etcd:
    host: 10.171.40.3:2379

bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true

    initdb:
    - encoding: UTF8
    - data-checksums

    pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 10.171.40.1/32 md5
    - host replication replicator 10.171.40.2/32 md5
    - host all all 0.0.0.0/0 md5

    users:
        admin:
            password: adminpass
            options:
                - createrole
                - createdb

postgresql:
    listen: 10.171.40.1:5432
    connect_address: 10.171.40.1:5432
    data_dir: /mnt/patroni
    pgpass: /tmp/pgpass
    authentication:
        replication:
            username: replicator
            password: rep-pass
        superuser:
            username: postgres
            password: postgres-pass
    parameters:
        unix_socket_directories: '.'

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false
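
One detail in the pg_hba section deserves a close look: the netmask length decides how many hosts an entry matches. A /0 suffix matches every IPv4 address, while /32 pins exactly one host, so for replication entries meant for a single node, /32 is the safe choice. Python’s ipaddress module illustrates the difference:

```python
import ipaddress

# /0 keeps no fixed bits, so it matches ANY address; /32 matches one host.
wide = ipaddress.ip_network("10.171.40.1/0", strict=False)   # 0.0.0.0/0
exact = ipaddress.ip_network("10.171.40.1/32")

print(ipaddress.ip_address("8.8.8.8") in wide)         # True
print(ipaddress.ip_address("8.8.8.8") in exact)        # False
print(ipaddress.ip_address("10.171.40.1") in exact)    # True
```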

The data_dir setting specifies the directory Patroni will use to store the database data. You will have to create this directory and hand it over to the postgres user so that Patroni can use it.

$ sudo mkdir -p /mnt/patroni
$ sudo chown postgres:postgres /mnt/patroni
$ sudo chmod 750 /mnt/patroni

Database files can grow quite large, so you may want to mount a separate disk under /mnt and use it as Patroni’s data_dir.
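
A YAML indentation slip is one of the most common reasons Patroni fails to start. Here is a minimal pre-flight sketch using PyYAML (which pip pulls in as a Patroni dependency); for illustration the configuration is inlined as a trimmed sample rather than read from /etc/patroni.yml:

```python
import yaml  # PyYAML ships as a Patroni dependency

# Pre-flight sketch: make sure the file parses and the top-level sections
# Patroni needs are present. The sample below is a trimmed, inlined copy;
# in practice you would read /etc/patroni.yml instead.
sample = """
scope: postgres-ha
name: node1
restapi:
    listen: 10.171.40.1:8008
etcd:
    host: 10.171.40.3:2379
postgresql:
    data_dir: /mnt/patroni
"""

cfg = yaml.safe_load(sample)
for key in ("scope", "name", "restapi", "etcd", "postgresql"):
    assert key in cfg, f"missing top-level section: {key}"
print("configuration parses; scope =", cfg["scope"], "name =", cfg["name"])
```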

Repeat the same steps on Node2: create the /etc/patroni.yml configuration file and the data directory with the correct permissions and ownership. However, use the following Patroni configuration on Node2.

scope: postgres-ha
name: node2

restapi:
    listen: 10.171.40.2:8008
    connect_address: 10.171.40.2:8008

etcd:
    host: 10.171.40.3:2379

bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true

    initdb:
    - encoding: UTF8
    - data-checksums

    pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 10.171.40.1/32 md5
    - host replication replicator 10.171.40.2/32 md5
    - host all all 0.0.0.0/0 md5

    users:
        admin:
            password: adminpass
            options:
                - createrole
                - createdb

postgresql:
    listen: 10.171.40.2:5432
    connect_address: 10.171.40.2:5432
    data_dir: /mnt/patroni
    pgpass: /tmp/pgpass
    authentication:
        replication:
            username: replicator
            password: rep-pass
        superuser:
            username: postgres
            password: postgres-pass
    parameters:
        unix_socket_directories: '.'

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

The scope needs to be the same for all the Patroni nodes; however, the name has to be distinct for every node. Also ensure that all the Patroni nodes point to the same etcd server.

Now let’s create a systemd service for Patroni to manage it. Repeat these steps on both Patroni nodes. Create the service file /etc/systemd/system/patroni.service with the following configuration.

[Unit]
Description=PostgreSQL High Availability Setup with Patroni
After=syslog.target network.target

[Service]
Type=simple

User=postgres
Group=postgres

ExecStart=/usr/local/bin/patroni /etc/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Now reload the daemon unit files.

$ sudo systemctl daemon-reload

Finally, start the Patroni service and enable it so that it comes up automatically at system boot.

$ sudo systemctl start patroni
$ sudo systemctl enable patroni

Also, disable the stock PostgreSQL service and let Patroni manage the database cluster.

$ sudo systemctl disable postgresql

Configuring HAProxy

Now let’s set up the load balancer for our PostgreSQL HA setup on Node4. Edit the HAProxy configuration file at /etc/haproxy/haproxy.cfg, remove the default configuration, and replace it with the following:

global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen postgres
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_10.171.40.1_5432 10.171.40.1:5432 maxconn 100 check port 8008
    server postgresql_10.171.40.2_5432 10.171.40.2:5432 maxconn 100 check port 8008
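
The key trick in this configuration is check port 8008: HAProxy does not probe PostgreSQL itself but Patroni’s REST API, which answers the HTTP health check with 200 only on the current primary (replicas answer with an error status, typically 503). A toy model of the resulting routing decision, not HAProxy code:

```python
# Toy model of the HAProxy health-check routing above -- not HAProxy code.
# A backend counts as "up" only when Patroni's REST API (port 8008) answers
# the HTTP check with 200, which it does only on the current primary.
def backends_up(health_by_server):
    return [srv for srv, status in sorted(health_by_server.items())
            if status == 200]

# node1 is primary; node2 is a replica answering 503 to the default check.
print(backends_up({"10.171.40.1:5432": 200, "10.171.40.2:5432": 503}))
# ['10.171.40.1:5432']
```

After a failover the status codes swap, so HAProxy automatically starts sending traffic to the new primary with no client-side changes.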

Save the file and restart the HAProxy service to apply the configuration.

$ sudo systemctl restart haproxy

The load-balanced PostgreSQL high availability cluster is finally up. Let’s now look at the ways you can access it.

Accessing PostgreSQL Patroni HA Setup

1. Web View

You can view your cluster in the browser using your load balancer’s IP and the stats port specified in the HAProxy configuration. The port is 7000 for the above setup, hence we can access it using the URL http://10.171.40.4:7000.

This page shows the cluster status, with the current primary node highlighted in GREEN and the replicas in RED. The replicas appear as down because Patroni’s REST API answers HAProxy’s health check with 200 only on the primary.

2. PostgreSQL Shell

You can access the cluster with the psql command-line client, either through the load balancer or directly against one of the Patroni nodes.

To go through the load balancer, connect on port 5000, the port the postgres frontend binds in the HAProxy configuration (7000 only serves the stats page). HAProxy passes the connection on to the current primary’s PostgreSQL on port 5432.

$ psql -h 10.171.40.4 -p 5000 -U postgres -W

Note: Use the IP of your load balancer node instead of 10.171.40.4.

If you want to access the PostgreSQL shell using your Patroni node’s IP use port 5432.

$ psql -h 10.171.40.1 -p 5432 -U postgres -W

At the password prompt, use the password set for the postgres superuser in the /etc/patroni.yml configuration file. It will drop you into your PostgreSQL shell.
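
Applications can use the same HAProxy endpoint. Here is a hedged sketch using psycopg2 (installed earlier alongside Patroni); it assumes the load balancer at 10.171.40.4:5000 and the superuser credentials from /etc/patroni.yml, so adjust both to your setup:

```python
# Hedged sketch: connect through the HAProxy frontend (port 5000 in the
# configuration above) so the client always reaches the current primary.
# Host, port, and credentials are this article's example values.
DSN = "host=10.171.40.4 port=5000 dbname=postgres user=postgres password=postgres-pass"

def primary_version(dsn=DSN):
    import psycopg2  # imported lazily; installed earlier alongside Patroni
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("SELECT version();")
        return cur.fetchone()[0]

# Calling primary_version() requires the cluster to be up and reachable.
```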

Note: The Patroni-managed cluster runs separately from the default PostgreSQL cluster, so your databases will not show up if you connect to the default instance. To reach the databases managed by Patroni, launch a PostgreSQL shell against the Patroni cluster using one of the commands above.

Conclusion

Patroni is a very handy tool that makes it easy to set up a PostgreSQL HA cluster, and it lets you bring up a large cluster quickly. If you do face any issues, let me know in the comments section below.

I also write articles about Linux and would highly recommend giving them a read.
