Streaming replication with PostgreSQL 9.6, 10 & 11 - PostgreSQL Standby Databases

Introduction

Various replication modes are available with PostgreSQL.

PostgreSQL Replication modes

In this article, a PostgreSQL 9.6 streaming replication is implemented on Linux Ubuntu 18.04 servers, it’s very easy. In the streaming replication mode, the standby connects to the primary, which streams WAL records (Write Ahead Log) to the standby as they’re generated, without waiting for the WAL file to be filled. Streaming replication allows a standby server to stay more up-to-date than is possible with file-based log shipping.

  • The standby server can be in read only mode for reporting purposes for example.
  • A replication slot ensures that the needed WAL files for the standby are not removed in the primary server before the standby server processes them. Multiple replication slots can be defined depending on the number of the standby servers.
PostgreSQL Streaming replication

(LSN : Log Sequence Number)

The procedure below is valid for PostgreSQL 9.6, 10 and 11. For PostgreSQL version 12 and above, the setup is slightly different and not discussed here. It is not important when configuring a PostgreSQL streaming replication, but default WAL files location is different on PostgreSQL 9.6 and PostgreSQL 10/11.

PostgreSQL 9.6$PGDATA/pg_xlog
PostgreSQL 10/11$PGDATA/pg_wal

The context is the following : 1 primary server, 1 standby server.

Primary server : VPSFRSQLPAC1 Standby server : VPSFRSQLPAC2 Binaries (PostgreSQL 9.6.15) : /opt/postgres/pgsql-9.6/bin $PATH : /opt/postgres/pgsql-9.6/bin:$PATH $PGLIB : /opt/postgres/pgsql-9.6/lib Port : 30001 $PGDATA : /sqlpac/postgres/srvpg1 $CFG : /opt/postgres/dba/srvpg1/cfg Configuration files : $CFG/postgresql.conf $CFG/pg_hba.conf $CFG/pg_ident.conf Controlling the PostgreSQL Server :
pg_ctl start|stop|restart…  -D $CFG

Preparing the primary server

System parameters

The primary server must be restarted, especially with PostgreSQL 9.6, to apply at least the following static parameters :

  • listen_addresses : *
  • wal_level : the wal level is replica for streaming replication.
  • max_replication_slots : at least 1 replication slot (1 standby). Higher values of replication slots if more standby or logical servers will be configured.
  • max_wal_senders : at least 3 wal senders (1 standby + 2 for pg_basebackup). Higher values if more standby servers will be configured.
$CFG/postgresql.conf
listen_addresses = '*'
wal_level=replica
max_replication_slots=3
max_wal_senders=3

With PostgreSQL 10 and 11, the default values are already adjusted for replication. However, check the settings.

PostgreSQL 10 / 11Default values
wal_level max_replication_slots max_wal_sendersreplica 10 10
postgres@vpsfrsqlpac1$ pg_ctl restart -D $CFG

Replication role

Create a role with the replication privilege, this role will be used by the standby server to connect to the primary server :

create role repmgr with replication login encrypted password '***********';

Add the role in the primary server file pg_hba.conf with the standby IP address server, this will allow connections from the standby server. Don’t forget to manage existing firewall rules.

$CFG/pg_hba.conf
host    replication     repmgr          51.xxx.xxx.xxx/32           md5

Here, SSL connections are not implemented.

Reload the configuration :

postgres@vpsfrsqlpac1$ pg_ctl reload -D $CFG

Replication slot

Create a replication slot in the primary server.

select * from pg_create_physical_replication_slot('standby1');

 slot_name | xlog_position
-----------+---------------
 standby1  |
(1 row)

          
select slot_name, restart_lsn from pg_replication_slots;
          
 slot_name | restart_lsn
-----------+-------------
 standby1  |
(1 row)

The replication slot (restart_lsn) will be initialized during the primary server backup with pg_basebackup.

Starting with PostgreSQL 11, it is not mandatory to create manually the replication slot, this one can be created and initialized with pg_basebackup.

Primary server backup (pg_basebackup)

The primary server backup is performed with pg_basebackup.

postgres@vpsfrsqlpac1$ pg_basebackup -D /sqlpac/postgres/backup/srvpg1 \
                         -X stream \
                         --write-recovery-conf \
                         --slot=standby1 \
                         --dbname="host=localhost user=postgres port=30001"

Starting with,PostgreSQL 11, add the argument --create-slot if the replication slot has not been previously created.

With the option --slot giving the replication slot name : that way, it is guaranteed the primary server does not remove any necessary WAL data in the time between the end of the base backup and the start of streaming replication.

When the backup is completed, the replication slot standby1 is then defined :

select slot_name, restart_lsn from pg_replication_slots;
          
 slot_name | restart_lsn
-----------+-------------
 standby1  | 0/33000000

The option --write-recovery-conf (or -R) writes a file recovery.conf in the root backup directory. This file will prevent any user error when starting the standby server, this file indeed indicates a standby server, and the slot name is given :

/sqlpac/postgres/backup/srvpg1/recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=postgres host=localhost port=30001 sslmode=prefer sslcompression=1 krbsrvname=postgres'
primary_slot_name = 'standby1'

Standby server activation

Install the primary server backup previously performed in the standby data directory ($PGDATA).

recovery.conf

Be sure the file recovery.conf is installed in the standby server root data directory with the option standby_mode = 'on' and the replication slot name.

Update the connection info parameters to the primary server in this file.

$PGDATA : /sqlpac/postgres/srvpg1/recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=repmgr host=vpsfrsqlpac1 port=30001 password=************'
primary_slot_name = 'standby1'

postgresql.conf

If read only connections are allowed, check the parameter hot_standby is set to on on the standby server (on by default starting with PostgreSQL 10):

$CFG/postgresql.conf
hot_standby = on

Starting the standby server

Now the standby server can be started.

postgres@vpsfrsqlpac2$ pg_ctl start -D $CFG

When there is no error, in the standby server log file :

LOG:  entering standby mode
LOG:  redo starts at 0/33000028
LOG:  consistent recovery state reached at 0/34000000
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 0/34000000 on timeline 1

The standby server is in recovery mode :

postgres@vpsfrsqlpac2$ psql -p30001

select pg_is_in_recovery();

 pg_is_in_recovery
-------------------
t

If the replication slot has not been defined and the needed WAL files removed in the primary server, an error occurs :

FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000000000000010 has already been removed

Testing replication

Create a table heartbeat in the primary server, this table will be updated every minute :

postgres@vpsfrsqlpac1$ psql -p30001

create table heartbeat ( reptimestamp timestamp );

insert into heartbeat values (now());

select * from heartbeat;

        reptimestamp
----------------------------
 2019-11-22 09:04:36.399274

Check the replication to the standby server :

postgres@vpsfrsqlpac2$ psql -p30001

select * from heartbeat;

        reptimestamp
----------------------------
 2019-11-22 09:04:36.399274

Pause / Resume replication

To pause/resume replication, on the standby server :

postgres@vpsfrsqlpac2$ psql -p30001
PostgreSQL 9.6PostgreSQL 10 / 11
Pause replication
select pg_xlog_replay_pause();
2019-11-22 19:58:33.686 CET LOG:  recovery has paused
select pg_wal_replay_pause();
Resume replication
select pg_xlog_replay_resume();
select pg_wal_replay_resume();
Replication paused ?
select pg_is_xlog_replay_paused();

 pg_is_xlog_replay_paused
--------------------------
 f
select pg_is_wal_replay_paused();

 pg_is_wal_replay_paused
--------------------------
 f

Essential replication informations

This article does not deal with replication monitoring, however below the essential informations about the replication state.

Standby server : pg_stat_wal_receiver

In the standby, use the view pg_stat_wal_receiver :

postgres@vpsfrsqlpac2$ psql -p30001

\x on;
select * from pg_stat_wal_receiver;

-[ RECORD 1 ]---------+-----------------------------------------------------
pid                   | 2262
status                | streaming
receive_start_lsn     | 0/97000000
receive_start_tli     | 1
received_lsn          | 0/99000920
received_tli          | 1
last_msg_send_time    | 2019-11-22 18:17:46.355579+01
last_msg_receipt_time | 2019-11-22 18:17:46.355835+01
latest_end_lsn        | 0/99000760
latest_end_time       | 2019-11-22 18:15:46.232277+01
slot_name             | standby1
conninfo              | user=repmgr password=******** dbname=replication host=vpsfrsqlpac1 port=30001 …

The WAL receiver process id is 2262 :

postgres@vpsfrsqlpac2$ ps -ef | grep 'postgres' | grep 2262

postgres  2262 32104  0 18:35 ?        00:00:04 postgres: wal receiver process   streaming 0/99000920

Primary server : pg_stat_replication and pg_replication_slots

In the primary server, use pg_stat_replication and pg_replication_slots :

postgres@vpsfrsqlpac1$ psql -p30001

\x on;
select * from pg_stat_replication;

-[ RECORD 1 ]----+-----------------------------
pid              | 6247
usesysid         | 16384
usename          | repmgr
application_name | walreceiver
client_addr      | 51.xxx.xxx.xxx
client_hostname  |
client_port      | 41354
backend_start    | 2019-11-22 09:35:42.41099+01
backend_xmin     |
state            | streaming
sent_location    | 0/99000920
write_location   | 0/99000920
flush_location   | 0/99000920
replay_location  | 0/99000840
sync_priority    | 0
sync_state       | async

\x on;
select * from pg_replication_slots;

-[ RECORD 1 ]-------+-----------
slot_name           | standby1
plugin              |
slot_type           | physical
datoid              |
database            |
active              | t
active_pid          | 6247
xmin                |
catalog_xmin        |
restart_lsn         | 0/99000920
confirmed_flush_lsn |

The WAL sender process id is 6247 :

postgres@vpsfrsqlpac1$ ps -ef | grep 'postgres' | grep 6247

postgres  6247  5576  0 18:35 ?        00:00:00 postgres: wal sender process repmgr 51.xxx.xxx.xxx(41354) streaming 0/99000920

Conclusion

Installing a streaming replication with PostgreSQL 9.6 is very easy, maybe one of the easiest replication architecture.

Do not forget replication slots ! Only replication slots guarantee standby servers won’t run out of sync from the primary server.