PostgreSQLInactivePhysicalReplicationSlot #
Meaning #
Alert is triggered when a PostgreSQL physical replication slot is inactive.
Impact #
An inactive replication slot forces PostgreSQL to retain on its local storage every WAL file that the slot's consumer has not yet received.
It could lead to:
- Disk space saturation on the PostgreSQL server
- The replication slot becoming unusable if its retained WAL reaches the maximum allowed storage (the max_slot_wal_keep_size setting in PostgreSQL 13 and later)
Diagnosis #
Physical replication is only used by AWS to replicate RDS instances.
A newly created RDS instance may need time to replay WAL files since the last full backup. The replication slot will not be used until the replicas have replayed all the WAL files.
Prioritize. Look at the replication slot disk space consumption trend in the Replication slot available storage panel of the Replication slot dashboard to estimate the delay before reaching storage space saturation.
Find the RDS instance that uses the physical replication slot:
- Identify which replication slot is consuming disk space
- Extract the AWS RDS resource_id from the slot name (rds_[aws_region]_db_[resource_id])
- Find the RDS instance in the RDS instances dashboard
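The slot identification steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not a reference implementation: the query uses only the standard `pg_replication_slots` view and the `pg_wal_lsn_diff` / `pg_current_wal_lsn` functions (PostgreSQL 10+), while the `Slot` type, the helper names, and the exact character set accepted for the region and resource_id parts of the slot name are hypothetical and chosen for illustration.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Query to run on the primary instance with any PostgreSQL client.
# retained_bytes is the amount of WAL the slot forces the server to keep.
SLOT_QUERY = """
SELECT slot_name,
       slot_type,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
"""


class Slot(NamedTuple):
    """One row of SLOT_QUERY (hypothetical container type)."""
    slot_name: str
    slot_type: str
    active: bool
    retained_bytes: Optional[int]  # NULL when the slot has no restart_lsn


# Assumption: slot names follow rds_[aws_region]_db_[resource_id], with the
# region written with underscores and an alphanumeric resource_id.
SLOT_NAME_RE = re.compile(
    r"rds_(?P<region>[a-z0-9_]+?)_db_(?P<resource_id>[a-z0-9]+)"
)


def inactive_physical_slots(slots: Iterable[Slot]) -> List[Slot]:
    """Return inactive physical slots, largest retained WAL first."""
    candidates = [
        s for s in slots if s.slot_type == "physical" and not s.active
    ]
    return sorted(candidates, key=lambda s: s.retained_bytes or 0, reverse=True)


def extract_resource_id(slot_name: str) -> Optional[str]:
    """Return the resource_id part of the slot name, or None if it does not match."""
    match = SLOT_NAME_RE.fullmatch(slot_name)
    return match.group("resource_id") if match else None
```

Feeding the rows returned by `SLOT_QUERY` into `inactive_physical_slots` and passing the first slot's name to `extract_resource_id` gives the identifier to look up in the RDS instances dashboard. The identifier is returned exactly as it appears in the slot name, so it may need adjusting (for example in letter case) before it matches what the dashboard shows.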
Check the lag of the RDS replica in the RDS instance details dashboard
Check the replica instance logs in AWS CloudWatch
- You may see messages about replaying WAL files
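For the prioritization step, the trend read from the dashboard can be turned into a rough delay with a linear extrapolation. The sketch below assumes two readings of the available storage taken at different times and a constant consumption rate between them and afterwards. The function name and parameters are hypothetical, and the result is only an estimate, since WAL generation is rarely constant.

```python
from datetime import datetime, timedelta
from typing import Optional


def estimate_time_to_saturation(
    first_time: datetime,
    first_free_bytes: int,
    second_time: datetime,
    second_free_bytes: int,
) -> Optional[timedelta]:
    """Linearly extrapolate when the available storage reaches zero.

    Returns the estimated delay counted from the second reading, or None
    when the available storage is not shrinking between the two readings.
    """
    elapsed_seconds = (second_time - first_time).total_seconds()
    if elapsed_seconds <= 0:
        raise ValueError("second_time must be later than first_time")

    consumed_bytes = first_free_bytes - second_free_bytes
    if consumed_bytes <= 0:
        return None  # storage is stable or growing: no saturation expected

    # remaining / (consumed / elapsed), rearranged to limit rounding error
    seconds_left = second_free_bytes * elapsed_seconds / consumed_bytes
    return timedelta(seconds=seconds_left)
```

For example, two readings of 100 GiB and 90 GiB taken one hour apart give an estimated nine hours before saturation, which helps decide between waiting and acting immediately.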
Mitigation #
If an RDS replica instance was just created:
- If the RDS primary instance is not at risk of disk space saturation, wait until the RDS initialization is finished
- Otherwise, delete the RDS replica that owns the inactive replication slot
- Recreate the RDS replica after a full RDS snapshot and during a low-activity period to limit the WAL files to replay
Increase disk space on the primary instance
Open an AWS support case to report the inactive physical replication slot