
Essential PostgreSQL Extensions Every Developer Should Know

Databases · 2026-02-15 · 5 min read · postgresql · extensions · pgvector · timescaledb · citus · database


PostgreSQL's extension system is one of its greatest strengths. Extensions let you add capabilities that would require switching databases entirely in other ecosystems. This guide covers the extensions that solve real developer problems: vector search for AI, time series optimization, scheduled jobs, geospatial queries, and performance analysis.


Why PostgreSQL Extensions Matter

Extensions add functionality to Postgres without forking the codebase. They're:

- installed with a single CREATE EXTENSION statement
- versioned and upgradable independently of the server
- able to define new types, functions, operators, and index methods

This means you can add vector search, time series optimizations, or cron jobs without leaving SQL.
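Before installing anything, you can check which extensions your server already has files for. The pg_available_extensions catalog view ships with Postgres itself; pg_trgm below is just one contrib extension used as an illustration:

```sql
-- Extensions the server can install, with default and currently installed versions
SELECT name, default_version, installed_version
FROM pg_available_extensions
ORDER BY name;

-- Install one (the extension's files must already be on disk)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```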

pgvector: Vector Search and Embeddings

pgvector adds vector similarity search to Postgres. This is essential for AI applications using embeddings (text, images, audio).

Installation

# Ubuntu/Debian
sudo apt install postgresql-16-pgvector

# macOS (Homebrew)
brew install pgvector

# From source
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make && sudo make install

Enable in your database:

CREATE EXTENSION vector;

Usage

Store embeddings as vectors:

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536)  -- OpenAI ada-002 dimension
);

Insert embeddings:

INSERT INTO documents (content, embedding)
VALUES
  ('PostgreSQL is a relational database', '[0.1, 0.2, ...]'),
  ('Vectors enable semantic search', '[0.3, 0.1, ...]');

Find similar documents:

-- Cosine similarity
SELECT content, 1 - (embedding <=> '[0.15, 0.18, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.15, 0.18, ...]'
LIMIT 10;
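Cosine distance (<=>) is only one of pgvector's distance operators; <-> (Euclidean) and <#> (inner product) suit other embedding models. A sketch of the alternatives, reusing the placeholder vector from above:

```sql
-- Euclidean (L2) distance: smaller is closer
SELECT content FROM documents
ORDER BY embedding <-> '[0.15, 0.18, ...]'
LIMIT 10;

-- Inner product: the operator returns the negative,
-- so ascending order still means most similar first
SELECT content FROM documents
ORDER BY embedding <#> '[0.15, 0.18, ...]'
LIMIT 10;
```

Pick the operator that matches the index opclass you create (vector_cosine_ops, vector_l2_ops, or vector_ip_ops), or the index won't be used.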

Indexing for Performance

Create an index for fast similarity search:

-- HNSW index (best for most cases)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- IVFFlat index (good for very large datasets)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

HNSW gives better recall but slower writes. IVFFlat is faster to build but needs tuning.
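Both index types also expose query-time knobs that trade recall for speed. In recent pgvector versions these are plain session settings:

```sql
-- HNSW: larger ef_search = better recall, slower queries (default 40)
SET hnsw.ef_search = 100;

-- IVFFlat: more probes = better recall, slower queries (default 1)
SET ivfflat.probes = 10;
```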

Real-World Example

Semantic search over documentation:

import { OpenAI } from 'openai'
import postgres from 'postgres'

const sql = postgres(DATABASE_URL)
const openai = new OpenAI()

async function semanticSearch(query: string) {
  // Get query embedding
  const response = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: query
  })
  const embedding = response.data[0].embedding

  // Search database
  const results = await sql`
    SELECT content, 1 - (embedding <=> ${embedding}::vector) AS similarity
    FROM documents
    WHERE 1 - (embedding <=> ${embedding}::vector) > 0.8
    ORDER BY embedding <=> ${embedding}::vector
    LIMIT 10
  `

  return results
}

const docs = await semanticSearch("How do I install Postgres extensions?")

pgvector eliminates the need for separate vector databases (Pinecone, Weaviate) for many use cases.

TimescaleDB: Time Series Optimization

TimescaleDB optimizes Postgres for time series data (metrics, logs, sensor data) with automatic partitioning and compression.

Installation

# Ubuntu/Debian
sudo add-apt-repository ppa:timescale/timescaledb-ppa
sudo apt install timescaledb-2-postgresql-16

# macOS
brew install timescaledb

# Configure
sudo timescaledb-tune

Enable in your database:

CREATE EXTENSION timescaledb;

Usage

Convert a table to a hypertable:

CREATE TABLE metrics (
  time TIMESTAMPTZ NOT NULL,
  device_id INT,
  temperature DOUBLE PRECISION,
  humidity DOUBLE PRECISION
);

-- Convert to hypertable (auto-partitions by time)
SELECT create_hypertable('metrics', 'time');

Insert and query like normal:

INSERT INTO metrics (time, device_id, temperature, humidity)
VALUES (NOW(), 1, 72.5, 45.2);

-- Queries use time-based indexes automatically
SELECT time_bucket('1 hour', time) AS hour,
       AVG(temperature) AS avg_temp
FROM metrics
WHERE time > NOW() - INTERVAL '7 days'
GROUP BY hour
ORDER BY hour DESC;

Continuous Aggregates

Pre-compute aggregations for fast queries:

CREATE MATERIALIZED VIEW hourly_metrics
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       device_id,
       AVG(temperature) AS avg_temp,
       MAX(temperature) AS max_temp,
       MIN(temperature) AS min_temp
FROM metrics
GROUP BY hour, device_id;

-- Automatically refreshes
SELECT add_continuous_aggregate_policy('hourly_metrics',
  start_offset => INTERVAL '3 hours',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 hour');

Compression

TimescaleDB compresses old data:

ALTER TABLE metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby = 'time DESC'
);

-- Compress chunks older than 7 days
SELECT add_compression_policy('metrics', INTERVAL '7 days');

Compression ratios of 10-20x are common.
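To see what compression actually buys you on your own data, TimescaleDB exposes before/after sizes (function names as of TimescaleDB 2.x):

```sql
-- Aggregate before/after sizes for the whole hypertable
SELECT * FROM hypertable_compression_stats('metrics');

-- Or per chunk
SELECT chunk_name,
       before_compression_total_bytes,
       after_compression_total_bytes
FROM chunk_compression_stats('metrics');
```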

pg_cron: Scheduled Jobs

pg_cron runs periodic jobs inside Postgres using standard cron syntax. Use it for cleanup, aggregations, or refreshing materialized views: anything you can express as SQL.

Installation

# Ubuntu/Debian
sudo apt install postgresql-16-cron

# macOS
brew install pg_cron

Add to postgresql.conf:

shared_preload_libraries = 'pg_cron'
cron.database_name = 'your_database'

Restart Postgres and enable:

CREATE EXTENSION pg_cron;

Usage

Schedule a job:

-- Delete old logs every day at 3 AM
SELECT cron.schedule('delete-old-logs', '0 3 * * *', $$
  DELETE FROM logs WHERE created_at < NOW() - INTERVAL '30 days'
$$);

-- Refresh a materialized view every hour
SELECT cron.schedule('refresh-stats', '0 * * * *', $$
  REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats
$$);

-- Run a function every 5 minutes
SELECT cron.schedule('send-notifications', '*/5 * * * *', $$
  SELECT send_pending_notifications()
$$);

List jobs:

SELECT * FROM cron.job;

Unschedule:

SELECT cron.unschedule('delete-old-logs');

pg_cron eliminates the need for external cron jobs or task queues for simple periodic work.
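Job runs are also logged, so you can audit failures without leaving SQL. In recent pg_cron versions the history lives in cron.job_run_details:

```sql
-- Recent runs with status and any error message
SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 20;
```

This table grows unbounded, so it's a natural candidate for its own scheduled cleanup job.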

PostGIS: Geospatial Data

PostGIS adds geometry types and spatial functions for location-based queries.

Installation

# Ubuntu/Debian
sudo apt install postgresql-16-postgis-3

# macOS
brew install postgis

Enable:

CREATE EXTENSION postgis;

Usage

Store locations:

CREATE TABLE locations (
  id SERIAL PRIMARY KEY,
  name TEXT,
  location GEOGRAPHY(POINT, 4326)  -- WGS84 lat/lon
);

INSERT INTO locations (name, location)
VALUES
  ('Office', ST_MakePoint(-122.4194, 37.7749)),  -- San Francisco
  ('Home', ST_MakePoint(-118.2437, 34.0522));    -- Los Angeles

Find nearby locations:

-- Find locations within 50 km of a point
SELECT name, ST_Distance(location, ST_MakePoint(-122.0, 37.5)::geography) / 1000 AS distance_km
FROM locations
WHERE ST_DWithin(location, ST_MakePoint(-122.0, 37.5)::geography, 50000)
ORDER BY distance_km;

Spatial index for performance:

CREATE INDEX ON locations USING GIST (location);

Real-world example (store locator):

-- Find nearest store to user
SELECT name,
       ST_Distance(location::geography, ST_MakePoint($1, $2)::geography) / 1000 AS km
FROM stores
WHERE ST_DWithin(location::geography, ST_MakePoint($1, $2)::geography, 10000)
ORDER BY location <-> ST_MakePoint($1, $2)::geography
LIMIT 5;
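Query results like these often feed a map UI. PostGIS can serialize geometries directly to GeoJSON, so no conversion is needed in application code:

```sql
-- Emit each location as a GeoJSON geometry string
SELECT name, ST_AsGeoJSON(location) AS geojson
FROM locations;
```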

pg_stat_statements: Query Performance Analysis

pg_stat_statements tracks execution stats for all queries. Essential for finding slow queries.

Installation

Add to postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all

Restart Postgres and enable:

CREATE EXTENSION pg_stat_statements;

Usage

Find slowest queries:

SELECT query,
       calls,
       total_exec_time / 1000 AS total_seconds,
       mean_exec_time / 1000 AS avg_seconds,
       max_exec_time / 1000 AS max_seconds
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Find queries with erratic execution times (stddev divided by mean is the coefficient of variation, a unitless measure of spread):

SELECT query,
       calls,
       mean_exec_time,
       stddev_exec_time,
       stddev_exec_time / NULLIF(mean_exec_time, 0) AS coeff_of_variation
FROM pg_stat_statements
WHERE calls > 100
ORDER BY coeff_of_variation DESC
LIMIT 10;

Reset stats:

SELECT pg_stat_statements_reset();

Use this to identify candidates for indexing or query optimization.
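pg_stat_statements also tracks buffer usage per query, which helps separate CPU-bound queries from IO-bound ones (column names per PostgreSQL 13+):

```sql
-- Queries reading the most from disk, with their cache hit ratio
SELECT query,
       shared_blks_hit,
       shared_blks_read,
       shared_blks_hit::float / NULLIF(shared_blks_hit + shared_blks_read, 0) AS hit_ratio
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
```

A heavy query with a low hit ratio suggests the working set doesn't fit in shared_buffers, or an index is missing and Postgres is scanning more data than it should.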

pgBouncer: Connection Pooling

pgBouncer is a lightweight connection pooler. Not an extension (runs as a separate process), but essential for production.

Installation

# Ubuntu/Debian
sudo apt install pgbouncer

# macOS
brew install pgbouncer

Configure (/etc/pgbouncer/pgbouncer.ini):

[databases]
mydb = host=localhost port=5432 dbname=mydb

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25

User file (/etc/pgbouncer/userlist.txt). With auth_type = md5, the second field is the string "md5" followed by md5(password + username), not a hash of the password alone:

"myuser" "md5_hash_of_password_plus_username"

Start:

sudo systemctl start pgbouncer

Connect to pgBouncer instead of Postgres:

postgres://myuser:password@localhost:6432/mydb

pgBouncer reduces connection overhead. Use transaction mode for stateless apps, session mode if you need prepared statements.
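pgBouncer exposes a virtual admin database for monitoring. Connect to it with psql (e.g. psql -h localhost -p 6432 -U myuser pgbouncer; the user must be listed in admin_users or stats_users) and inspect pool state:

```sql
SHOW POOLS;   -- active/waiting clients and server connections per pool
SHOW STATS;   -- per-database query counts and traffic totals
```

A growing cl_waiting count in SHOW POOLS means clients are queuing for connections and default_pool_size may be too small.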

Combining Extensions

These extensions compose well:

-- Time series with geospatial data
CREATE TABLE device_readings (
  time TIMESTAMPTZ NOT NULL,
  device_id INT,
  location GEOGRAPHY(POINT),
  temperature DOUBLE PRECISION
);

SELECT create_hypertable('device_readings', 'time');
CREATE INDEX ON device_readings USING GIST (location);

-- Query: avg temp in SF over last week
SELECT time_bucket('1 hour', time) AS hour,
       AVG(temperature) AS avg_temp
FROM device_readings
WHERE time > NOW() - INTERVAL '7 days'
  AND ST_DWithin(location, ST_MakePoint(-122.4194, 37.7749), 10000)
GROUP BY hour;

Extension Management

List installed extensions:

SELECT * FROM pg_extension;

Upgrade an extension:

ALTER EXTENSION vector UPDATE TO '0.5.0';

Drop an extension:

DROP EXTENSION vector CASCADE;

Conclusion

PostgreSQL extensions let you stay in one database while adding vector search, time series, geospatial, and more. Before reaching for a specialized database, check if a Postgres extension solves your problem.