Inactive physical replication slot

PostgreSQLInactivePhysicalReplicationSlot

Meaning

This alert fires when a PostgreSQL physical replication slot is inactive, meaning no replica is currently connected to it and consuming WAL.

Impact

A non-running replication slot forces PostgreSQL to keep on its local storage every WAL file that the slot's consumer has not yet received.

This can lead to:

  • Disk space saturation on the PostgreSQL server
  • The replication slot becoming unusable if the retained WAL reaches the slot's maximum allowed storage
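
To gauge how much WAL a non-running slot is holding back, the slot's restart position can be compared with the server's current write position. The sketch below assumes both values were read as text, for example from the `restart_lsn` column of `pg_replication_slots` and from `pg_current_wal_lsn()`. The function names and sample values are illustrative only.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two hexadecimal halves.
    return (int(high, 16) << 32) + int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart point and the current position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


# Hypothetical sample values, not taken from a real server.
retained = retained_wal_bytes("2/0", "1/80000000")
print(f"{retained / 1024**3:.1f} GiB retained")
```

A value that keeps growing between two checks suggests the slot is still not being consumed.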

Diagnosis

Physical replication is only used by AWS to replicate RDS instances.

A newly created RDS replica instance may need time to replay the WAL files generated since the last full backup. The replication slot will not be used until the replica has replayed all of these WAL files.

  1. Prioritize. Check the replication slot disk space consumption trend in the Replication slot available storage panel of the Replication slot dashboard to estimate how much time remains before storage saturation

  2. Find the RDS instance that uses the physical replication slot

    1. Identify which replication slot is consuming disk space
    2. Extract the AWS RDS resource_id from the slot name (rds_[aws_region]_db_[resource_id])
    3. Find the RDS instance in the RDS instances dashboard

  3. Check the lag of the RDS replica in the RDS instance details dashboard

  4. Check the replica instance logs in AWS CloudWatch

    You may see messages about WAL files being replayed
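
Extracting the resource_id from a slot name can be scripted. The following is a minimal sketch based on the rds_[aws_region]_db_[resource_id] pattern given above. It assumes the region is written with underscores (for example eu_west_1) and that the resource_id contains only lowercase letters and digits. The function name and the sample slot name are hypothetical.

```python
import re
from typing import Optional

# Assumed naming scheme: rds_[aws_region]_db_[resource_id].
# Assumption: the region uses underscores and the resource_id is
# made of lowercase letters and digits only.
SLOT_NAME = re.compile(
    r"^rds_(?P<region>[a-z0-9_]+?)_db_(?P<resource_id>[a-z0-9]+)$"
)


def parse_rds_slot_name(slot_name: str) -> Optional[dict]:
    """Return the region and resource_id parts, or None if the name does not match."""
    match = SLOT_NAME.match(slot_name)
    if match is None:
        return None
    return match.groupdict()


# Hypothetical slot name used only for illustration.
print(parse_rds_slot_name("rds_eu_west_1_db_abcdefghijklmnop1234567890"))
```

A `None` result means the slot does not follow the assumed pattern and should be investigated by hand.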

Mitigation

  1. If an RDS replica instance was just created

    • If the RDS primary instance doesn’t risk disk space saturation, wait until RDS initialization is finished

    • Otherwise, delete the RDS replica that owns the non-running replication slot

      Recreate the RDS replica after a full RDS snapshot and during a low-activity period to limit the number of WAL files to replay

  2. Increase disk space on the primary instance

  3. Open an AWS support case to report the non-running physical replication slot
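
Whether to wait or to delete the replica in step 1 depends on how quickly the primary is running out of disk. As a rough sketch, assuming two free-storage readings taken some hours apart (for instance from the dashboard panel mentioned under Diagnosis) and a constant WAL growth rate, the remaining time can be extrapolated linearly. The function name and sample figures are hypothetical.

```python
from typing import Optional


def hours_until_full(free_bytes_then: float, free_bytes_now: float,
                     hours_between: float) -> Optional[float]:
    """Linear estimate of the hours left before free storage reaches zero.

    Returns None when free storage is stable or growing.
    """
    if hours_between <= 0:
        raise ValueError("hours_between must be positive")
    consumed_per_hour = (free_bytes_then - free_bytes_now) / hours_between
    if consumed_per_hour <= 0:
        return None
    return free_bytes_now / consumed_per_hour


GIB = 1024**3
# Hypothetical readings: 120 GiB free six hours ago, 90 GiB free now.
remaining = hours_until_full(120 * GIB, 90 * GIB, hours_between=6)
print(remaining)
```

If the estimate is shorter than the time the replica is expected to need to finish initializing, waiting is probably not a safe option.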

Additional resources