Enterprise Data Architect for Mitel Networks
Outages, corruption, and human error happen in the real world
Untargeted (and targeted) bitlocker-style ransom attacks are increasingly common
Generally we are referring to a Primary node (often called master) and at least one standby node (replica, slave)
A standby server can be cold, warm, or hot
A very little, promise ;-)
The Write-Ahead Log ensures that transactional changes are written to disk only after being written to a transaction log
This is what makes your database recoverable, and another word for this is durable(the D in ACID)
A logical backup (from
pg_dump) won't be enough
pg_basebackup command to seed a binary replica of the running primary database
The command can write the backup directly to another remote host. See the docs for more options.
wal_level = archive archive_mode = on archive_command = 'cp %p /path_to/archive/%f'
on standby host
standby_mode = true restore_command = 'cp /path_to/archive/%f %p' archive_cleanup_command = 'pg_archivecleanup /path_to/archive %r'
CREATE ROLE replicator WITH LOGIN REPLICATION;
# allow the standby host to connect host replication replicator 192.168.0.42/32 md5
wal_level = hot_standby full_page_writes = on wal_log_hints = on max_wal_senders = 5 wal_keep_segments = 32
on standby host
hot_standby = on
standby_mode = 'on' primary_conninfo = 'host=192.168.0.23 port=5432 user=replicator'
What if there's a gap in processing - maybe a network problem - and the WAL files turn over before replication has read the transactions?
ERROR: requested WAL segment 00000001000000010000002D has already been removed
With a streaming replica, you can always enable log-shipping as a fall-back plan.
Added in 9.4
They give the Primary a persistent record of the state of replicas
This means that the primary will keep ALL WAL records until slot is received from.
SELECT * FROM pg_create_physical_replication_slot('standby_replication_slot_alpha');
max_replication_slots = 5
primary_slot_name = 'standby_replication_slot_alpha'
Consider another what if. . .
What if an outage occurs after a transaction is logged to the Primary, but before the replica has accepted it?
We need a guaranty that the transaction has been written to one (or more than one) additional host.
synchronous_commit = on synchronous_standby_names = "1 (named_host1, named_host2)"
There is no formula to determine latency when syncronous replication is enabled. It is a function of your network conditions and the health and processing power of the available replicas
Basic monitoring is done on the Primary through
Additional checks for lag can be done by writing scripts, or using check_postgres.pl
From the introduction in the docs
This means that the subscriber can write to other tables while still receiving updates from the subscription
You can have any number of subscribers for a publication
TRUNCATEis not replicated
Done through DDL on Publisher
CREATE PUBLICATION pub_one -- FOR ALL TABLES FOR TABLE accounts, users, order_* WITH (publish = 'insert, update, delete') ;
A subscription requires a replication slot, but this is usually created for you, unless you specify
create_slot = false
CREATE SUBSCRIPTION sub_one CONNECTION 'host=192.168.0.23 port=5432 user=somebody dbname=some_db' PUBLICATION pub_one -- WITH (enabled = false) ;
A conflict on the local node will HALT replication. It must be corrected manually
Trigger based logical: Slony, Bucardo, xDB
Handling a failover event
Taking advantage of replicas for read-only queries
Building a datamart or reporting IDW
This presentation was built in reveal.js