postgres-ops
Production-grade PostgreSQL operations: diagnosis, performance, HA/DR, security, and observability. Assume the user is a senior engineer — skip introductory explanations of what Postgres is and go straight to evidence-driven SRE workflow.
When to use
Trigger on operational Postgres tasks:
- Incident diagnosis: slow queries, deadlocks, lock waits, runaway autovacuum, connection exhaustion, replication lag, disk pressure, OOMs.
- Performance review: `EXPLAIN (ANALYZE, BUFFERS, VERBOSE)` interpretation, index strategy, partitioning, vacuum/autovacuum tuning, `work_mem`/`shared_buffers` sizing.
- HA/DR: streaming replication, logical replication, Patroni, pgBackRest, WAL-G, PITR planning, RTO/RPO target validation.
- Migrations & upgrades: minor and major version upgrades (pg_upgrade vs logical replication cutover), schema migration tooling (EF Core migrations, Alembic, Flyway, sqitch, raw SQL), zero-downtime patterns.
- Connection management: pgBouncer (transaction vs session pooling tradeoffs), RDS Proxy, PgCat, pool sizing math.
- Security & compliance: role design, RLS, pgaudit, TLS enforcement, secret rotation, STIG/SRG line items, CIS benchmark gaps.
- Observability: postgres_exporter, pg_stat_statements, auto_explain, slow query log shipping (Loki), Grafana dashboards, SLO definition.
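
As an illustration of the "pool sizing math" item above, here is a minimal sketch of a connection-budget check. All names (`PoolPlan`, `starting_pool_size`) and the sample numbers are hypothetical and not part of this skill. The `(cores * 2) + effective spindles` figure is the commonly quoted starting heuristic from the PostgreSQL wiki; treat it as a first guess to validate under load, not a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolPlan:
    """Hypothetical description of a pooled deployment (illustrative only)."""
    max_connections: int         # Postgres max_connections
    reserved: int                # slots held back: superuser_reserved_connections,
                                 # replication, monitoring (assumed budget)
    pooler_instances: int        # number of pgBouncer processes
    pool_size_per_instance: int  # server-side connections each pooler may open

    def usable(self) -> int:
        # Connections actually available to application traffic.
        return self.max_connections - self.reserved

    def demanded(self) -> int:
        # Worst case: every pooler opens its full server-side pool.
        return self.pooler_instances * self.pool_size_per_instance

    def headroom(self) -> int:
        # Negative headroom means the poolers can exhaust max_connections.
        return self.usable() - self.demanded()


def starting_pool_size(cpu_cores: int, effective_spindles: int = 1) -> int:
    """Heuristic starting point for total active connections: (cores * 2) + spindles."""
    return cpu_cores * 2 + effective_spindles


if __name__ == "__main__":
    plan = PoolPlan(max_connections=200, reserved=10,
                    pooler_instances=4, pool_size_per_instance=40)
    print("usable:", plan.usable())
    print("demanded:", plan.demanded())
    print("headroom:", plan.headroom())
    print("heuristic start for 8 cores:", starting_pool_size(8))
```

A negative `headroom()` signals that the poolers together could request more server connections than Postgres will accept, which is one way connection exhaustion shows up during an incident.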