Top 15 Essential PostgreSQL Queries Every DBA Must Master

Managing a high-volume PostgreSQL database is a demanding responsibility. As data grows into hundreds of gigabytes or even terabytes, performance tuning, monitoring, and optimization become critical. While PostgreSQL offers advanced features, the effectiveness of a database administrator largely depends on their ability to use the right diagnostic queries at the right time.

This blog covers the top fifteen essential PostgreSQL queries every DBA should know. These queries help identify performance issues, track bottlenecks, analyze growth patterns, monitor activities, understand index usage, and keep large systems stable and efficient.

Each section explains what the query does, why a DBA needs it, and when it is useful.

1. Identify Unused Indexes (idx_scan = 0)

Unused indexes increase storage, slow down write performance, and create unnecessary maintenance overhead.

Purpose

To find indexes that have never been scanned by PostgreSQL.

Query

SELECT
    s.schemaname,
    s.relname AS table_name,
    s.indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    s.idx_scan
FROM
    pg_stat_user_indexes s
JOIN
    pg_index i ON i.indexrelid = s.indexrelid
WHERE
    s.idx_scan = 0
    AND i.indisunique = false
ORDER BY
    pg_relation_size(s.indexrelid) DESC;

Why it matters

Indexes with zero scans are candidates for removal because they:

  • Slow down INSERT, UPDATE and DELETE operations
  • Increase autovacuum load
  • Consume disk space

You can also explore the related PostgreSQL system views:

pg_stat_user_indexes - shows index usage statistics only for indexes on user-defined tables; system catalogs and internal tables are excluded.

pg_stat_all_indexes - provides index usage statistics for both user-defined tables and system catalog tables, including the internal indexes PostgreSQL uses itself.
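Before dropping anything, remember that these counters are per-server and are cleared by pg_stat_reset(), so confirm the index is also unused on replicas and across a full business cycle. A safe removal can then be done without blocking writes; the index name below is hypothetical:

DROP INDEX CONCURRENTLY idx_orders_legacy_status;  -- hypothetical unused index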

2. Fetch Long-Running Queries Using pg_stat_statements

Long-running queries are often the root cause of slow applications and database latency.

Before running this query, you need to install the pg_stat_statements extension:

CREATE EXTENSION pg_stat_statements;
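CREATE EXTENSION alone is not enough: the module must also be loaded via shared_preload_libraries, which requires a server restart; without it, querying the view raises an error. A minimal postgresql.conf snippet:

# postgresql.conf -- needs a server restart to take effect
shared_preload_libraries = 'pg_stat_statements'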

You can verify the installation like this; a non-null installed_version means the extension is installed:

SELECT * FROM pg_available_extensions WHERE name = 'pg_stat_statements';

Purpose

To find the queries with the highest total execution time, the most calls, or the slowest average time.

Query

SELECT
    query,
    calls,
    total_exec_time,
    mean_exec_time,
    rows
FROM
    pg_stat_statements
ORDER BY
    total_exec_time DESC
LIMIT 10;

Why it matters

This helps a DBA quickly detect:

  • Slow queries
  • Missing indexes
  • Inefficient query patterns
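Because pg_stat_statements accumulates statistics until they are cleared, a long-lived server mixes old and new workload. To start a fresh measurement window, for example before a load test, you can reset the view. Note that on PostgreSQL 12 and earlier the timing columns are named total_time and mean_time instead of total_exec_time and mean_exec_time.

SELECT pg_stat_statements_reset();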

3. Detect Table Bloat

Table bloat leads to performance degradation and wasted storage.

Before using this query, you need to install the pgstattuple extension:

CREATE EXTENSION pgstattuple;

Purpose

To estimate table bloat from the free space and dead tuples reported by pgstattuple. Note that pgstattuple reads each relation in full, so this query is expensive on large databases.

Query

SELECT
    t.schemaname,
    t.tablename,
    pg_size_pretty(pg_total_relation_size(format('%I.%I', t.schemaname, t.tablename)::regclass)) AS total_size,
    round(s.free_percent::numeric, 2) AS free_percent,
    round(s.dead_tuple_percent::numeric, 2) AS dead_tuple_percent
FROM pg_tables t,
     LATERAL pgstattuple(format('%I.%I', t.schemaname, t.tablename)) s
WHERE t.schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY s.free_percent DESC;

Why it matters

High bloat requires immediate action such as:

  • VACUUM
  • ANALYZE
  • REINDEX
  • CLUSTER
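As a first step, a manual vacuum with verbose output both reclaims dead tuples and refreshes planner statistics; the table name here is hypothetical:

VACUUM (VERBOSE, ANALYZE) public.orders;  -- hypothetical bloated table

Keep in mind that plain VACUUM does not shrink the file on disk. Returning space to the operating system requires VACUUM FULL, CLUSTER, or an online tool such as pg_repack, each with its own locking or disk-space cost.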

4. Check Autovacuum Activity

Autovacuum ensures table health by cleaning dead tuples.

Purpose

To see what autovacuum jobs are currently running.

Query

SELECT
    pv.pid,
    pv.datname,
    c.relname AS table_name,
    pv.phase,
    pv.heap_blks_total,
    pv.heap_blks_scanned,
    pv.heap_blks_vacuumed,
    round(100.0 * pv.heap_blks_scanned / NULLIF(pv.heap_blks_total, 0), 2) AS scan_percent,
    now() - pa.xact_start AS running_time,
    pa.wait_event_type,
    pa.wait_event
FROM pg_stat_progress_vacuum pv
LEFT JOIN pg_class c ON pv.relid = c.oid
LEFT JOIN pg_stat_activity pa ON pv.pid = pa.pid
ORDER BY pa.xact_start;

Why it matters

This helps identify:

  • Stuck vacuums
  • Long-running vacuum operations
  • Tables requiring attention
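If the same large tables keep appearing here, you can make autovacuum start earlier on just those tables using per-table storage parameters. The table name and values below are illustrative, not recommendations:

ALTER TABLE public.orders SET (             -- hypothetical hot table
    autovacuum_vacuum_scale_factor = 0.05,  -- vacuum after ~5% of rows are dead
    autovacuum_vacuum_cost_delay = 2        -- in ms; lower values let vacuum work faster
);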

5. Find the Largest Tables in the Database

Large tables are expensive to maintain and require careful planning.

Query

SELECT
    relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM
    pg_catalog.pg_statio_user_tables
ORDER BY
    pg_total_relation_size(relid) DESC
LIMIT 10;

Why it matters

Useful to identify:

  • Candidates for partitioning
  • Storage-heavy tables
  • Tables requiring optimization
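To see where the space actually goes, the built-in sizing functions split each table's footprint into heap-plus-TOAST versus indexes; this is a direct extension of the query above:

SELECT
    relname AS table_name,
    pg_size_pretty(pg_table_size(relid)) AS table_and_toast_size,
    pg_size_pretty(pg_indexes_size(relid)) AS index_size,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;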

6. Find the Largest Indexes

Indexes can sometimes grow larger than the underlying tables.

Query

SELECT
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM
    pg_stat_user_indexes
ORDER BY
    pg_relation_size(indexrelid) DESC
LIMIT 10;

Why it matters

  • Large indexes slow down query execution because PostgreSQL must scan bigger index structures.
  • They consume more shared memory and I/O, increasing system load.
  • Autovacuum and REINDEX operations take longer and become more expensive.
  • Oversized indexes often indicate bloat or poor indexing strategy.
  • They increase disk usage and can negatively affect write performance on the table.
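When a large index is bloated rather than legitimately big, rebuilding it compacts the structure. On PostgreSQL 12 and later this can be done without blocking writes; the index name is hypothetical:

REINDEX INDEX CONCURRENTLY idx_orders_created_at;  -- hypothetical bloated index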

7. Identify Missing Indexes Using Sequential Scan Counts

Sequential scans on large tables typically indicate a lack of proper indexing.

Query

SELECT
    relname AS table_name,
    seq_scan,
    seq_tup_read,
    idx_scan
FROM
    pg_stat_all_tables
ORDER BY
    seq_scan DESC
LIMIT 10;

Why it matters

  • High sequential scan counts on large tables usually indicate missing or ineffective indexes.
  • Sequential scans increase I/O load and slow down query response time.
  • They can cause unnecessary CPU usage because PostgreSQL must read more rows than needed.
  • Identifying these tables early helps prioritize index creation or optimization.
  • Reducing sequential scans leads to faster lookups and more efficient query plans.
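A useful refinement is the average number of rows each sequential scan reads: tiny lookup tables are scanned sequentially by design, so the interesting cases are large per-scan row counts. A sketch restricted to user tables:

SELECT
    relname AS table_name,
    seq_scan,
    idx_scan,
    seq_tup_read,
    seq_tup_read / NULLIF(seq_scan, 0) AS avg_rows_per_seq_scan
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY avg_rows_per_seq_scan DESC
LIMIT 10;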

8. Estimate Table Row Count Quickly

Counting rows in large tables using COUNT(*) is expensive.

Query

SELECT
relname AS table_name,
    n_live_tup AS estimated_rows
FROM
    pg_stat_all_tables
ORDER BY
    n_live_tup DESC
LIMIT 10;

Why it matters

  • It avoids costly full-table scans that can slow down large production systems.
  • Provides quick row-count estimates using PostgreSQL’s internal statistics.
  • Helps DBAs evaluate table growth trends without impacting performance.
  • Useful for capacity planning, partitioning decisions, and storage forecasting.
  • Enables faster analysis during migrations, optimizations, and troubleshooting.
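The same estimate is also available from pg_class.reltuples, which the planner itself relies on; it is refreshed by VACUUM and ANALYZE, so it can lag behind on rarely analyzed tables:

SELECT
    relname AS table_name,
    reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relkind = 'r'  -- ordinary tables only
ORDER BY reltuples DESC
LIMIT 10;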

9. Monitor Active Connections

Connection overload can exhaust memory and process slots, degrading performance or causing outages.

Query

SELECT
    datname,
    count(*) AS total_connections
FROM
    pg_stat_activity
GROUP BY 1;

Why it matters

  • Too many open connections can exhaust resources and degrade database performance.
  • Helps detect connection leaks caused by application bugs or improper session handling.
  • Identifies sudden traffic spikes that may require scaling or rate-limiting.
  • Reveals whether connection pooling is configured correctly or overused.
  • Prevents connection saturation, which can cause timeouts or database outages.
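A breakdown by state makes the total actionable, and comparing it against max_connections shows how close the server is to saturation; a pile-up of idle in transaction sessions is a common leak symptom:

SELECT
    state,
    count(*) AS connections
FROM pg_stat_activity
GROUP BY state
ORDER BY connections DESC;

SHOW max_connections;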

10. Check Row Locks and Blocking Sessions

Lock contention slows down queries and causes deadlocks.

Query

SELECT
    locktype,
    relation::regclass,
    mode,
    granted,
    pid,
    now() - query_start AS waiting_since,
    query
FROM pg_locks
JOIN pg_stat_activity USING (pid)
ORDER BY waiting_since DESC;

Why it matters

  • Detects blocking sessions that are slowing down or halting critical queries.
  • Helps identify long-running transactions that are holding locks unnecessarily.
  • Reveals lock types and lock modes, aiding in understanding concurrency issues.
  • Prevents deadlocks by showing queries waiting for each other in a chain.
  • Enables quick resolution of performance bottlenecks during high-load periods.
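The query above lists every lock, granted or waiting. To map each blocked session directly to the session blocking it, PostgreSQL 9.6 and later provide pg_blocking_pids(); a minimal sketch:

SELECT
    blocked.pid AS blocked_pid,
    blocked.query AS blocked_query,
    blocking.pid AS blocking_pid,
    blocking.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN LATERAL unnest(pg_blocking_pids(blocked.pid)) AS b(blocking_pid) ON true
JOIN pg_stat_activity AS blocking ON blocking.pid = b.blocking_pid;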

11. List All Active Queries

Useful when diagnosing slow systems or high CPU usage.

Query

SELECT *
FROM pg_stat_activity
WHERE state = 'active';

Why it matters

  • Provides real-time visibility into what each session is currently executing.
  • Helps diagnose slow performance by identifying long-running or expensive queries.
  • Useful for detecting unoptimized queries during peak system load.
  • Allows DBAs to pinpoint application components generating heavy workloads.
  • Essential for immediate troubleshooting when CPU, I/O, or memory usage suddenly spikes.
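In practice it is more useful to select a few columns and sort by runtime, excluding your own session. From there, a runaway query can be cancelled with pg_cancel_backend(pid) or, as a last resort, terminated with pg_terminate_backend(pid):

SELECT
    pid,
    usename,
    now() - query_start AS runtime,
    query
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()  -- skip this monitoring session itself
ORDER BY runtime DESC;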

12. Monitor All Progress Operations

PostgreSQL provides internal progress views for operations such as ANALYZE, COPY, CLUSTER, etc.

Purpose

To inspect the progress of internal operations.

Query

SELECT * FROM pg_stat_progress_analyze;
SELECT * FROM pg_stat_progress_basebackup;
SELECT * FROM pg_stat_progress_cluster;
SELECT * FROM pg_stat_progress_copy;
SELECT * FROM pg_stat_progress_create_index;
SELECT * FROM pg_stat_progress_vacuum;

Why it matters

  • Allows DBAs to track exactly how far maintenance tasks such as ANALYZE, VACUUM, and CREATE INDEX have progressed.
  • Helps identify operations that are stuck or running slower than expected.
  • Provides insight into long-running tasks that may be impacting system performance.
  • Useful for planning maintenance windows by estimating remaining task duration.
  • Enables quick diagnosis of bottlenecks during heavy maintenance operations.
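Availability varies by version (pg_stat_progress_create_index appeared in PostgreSQL 12, pg_stat_progress_copy in 14). As an example, an index build's block counters can be turned into a rough completion percentage; the exact meaning of the counters depends on the current phase:

SELECT
    pid,
    command,
    phase,
    round(100.0 * blocks_done / NULLIF(blocks_total, 0), 2) AS block_percent
FROM pg_stat_progress_create_index;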

13. Inspect Table Statistics Using pg_stats

pg_stats shows column-level statistics used by the query planner.

Query

SELECT *
FROM pg_stats
ORDER BY tablename, attname;

Why it matters

Helps understand:

  • Column distinctness
  • Histogram data
  • Most common values

Critical for diagnosing planner misestimation.
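When a particular query misestimates, narrow pg_stats to the relevant table; the schema and table names below are placeholders. If n_distinct or the common-value lists look stale, raising the sampling target with ALTER TABLE ... ALTER COLUMN ... SET STATISTICS and re-running ANALYZE often helps:

SELECT
    attname,
    n_distinct,
    null_frac,
    most_common_vals,
    most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'   -- placeholder schema
  AND tablename = 'orders';   -- placeholder table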

14. Identify Tables with Highest Dead Tuples

Dead tuples contribute to bloat and slow performance.

Query

SELECT
    schemaname,
    relname,
    n_dead_tup
FROM pg_stat_all_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

Why it matters

  • High dead-tuple counts indicate tables that are accumulating bloat and degrading performance.
  • Excessive dead tuples slow down sequential scans, index scans, and query execution.
  • They increase table and index size, consuming unnecessary disk space.
  • Identifying these tables early helps prioritize VACUUM or REINDEX operations.
  • Reducing dead tuples improves autovacuum efficiency and overall database stability.
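The raw count is more meaningful as a ratio of dead to total tuples, paired with the time of the last autovacuum run; both come from the same statistics views:

SELECT
    relname AS table_name,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_percent,
    last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;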

15. Monitor Database-Wide Cache Hit Ratio

A high cache hit ratio means most reads are served from shared buffers rather than from disk.

Query

SELECT
    round(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();

Why it matters

  • The cache hit ratio reflects how efficiently PostgreSQL serves data from memory instead of disk.
  • A low ratio indicates excessive disk reads, leading to slower query performance.
  • Helps identify memory pressure, poor caching behavior, or insufficient shared buffers.
  • Reveals indexing issues when many queries cannot be satisfied from cached index pages.
  • Monitoring this metric ensures the database is optimized for high-performance workloads.
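A healthy database-wide ratio can still hide individual cold tables, so a per-table view of the same metric pinpoints where the physical reads come from:

SELECT
    relname AS table_name,
    heap_blks_read,
    heap_blks_hit,
    round(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS hit_percent
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;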

Conclusion

A skilled PostgreSQL DBA must be able to diagnose problems quickly, identify bottlenecks, and keep large databases running efficiently. The fifteen queries in this guide form a core toolkit for managing real-world production systems. They provide insights into unused indexes, slow queries, bloat accumulation, lock contention, maintenance activity, and overall database health.

Using these queries regularly helps ensure better performance, more efficient indexing strategies, faster maintenance operations, and higher system stability. Whether you manage an ERP system, eCommerce platform, SaaS application, or analytics environment, these foundational queries will help you maintain a reliable and well-optimized PostgreSQL database.
