Skip to main content

Overview

The PostgreSQL dialect provides comprehensive support for PostgreSQL databases, including advanced features like arrays, JSON types, range types, full-text search, and more.
Version Support
  • Normal support: PostgreSQL 9.6+
  • Best effort: PostgreSQL 9+

Installation

pip install psycopg
Modern driver, supports both sync and async modes.

Connecting to PostgreSQL

Connection Strings

from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg://user:password@localhost:5432/mydb"
)

Default Driver

If no driver is specified, psycopg is used by default:
engine = create_engine("postgresql://user:password@localhost/mydb")

PostgreSQL-Specific Features

ARRAY Types

PostgreSQL supports native array types:
from sqlalchemy import Table, Column, Integer, String, MetaData
from sqlalchemy.dialects.postgresql import ARRAY

metadata = MetaData()

users = Table(
    'users',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('tags', ARRAY(String)),
    Column('scores', ARRAY(Integer, dimensions=2))  # 2D array
)

# Querying with arrays
from sqlalchemy.dialects.postgresql import array

stmt = users.select().where(
    users.c.tags.contains(array(['python', 'sql']))
)

JSON and JSONB Types

PostgreSQL provides JSON storage with powerful querying capabilities:
Use JSONB over JSON for better performance. JSONB supports indexing and faster querying.

Range Types

PostgreSQL supports range types for representing ranges of values:
from sqlalchemy.dialects.postgresql import (
    INT4RANGE, INT8RANGE, NUMRANGE, DATERANGE, 
    TSRANGE, TSTZRANGE
)

reservations = Table(
    'reservations',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('room_id', Integer),
    Column('during', TSTZRANGE)  # Timestamp with timezone range
)

# Query with ranges
from sqlalchemy.dialects.postgresql import Range
from datetime import datetime

my_range = Range(
    datetime(2024, 1, 1),
    datetime(2024, 1, 7)
)

stmt = reservations.select().where(
    reservations.c.during.overlaps(my_range)
)

ENUM Types

Create custom enumerated types:
from sqlalchemy.dialects.postgresql import ENUM

status_enum = ENUM(
    'pending', 'active', 'inactive',
    name='status_type',
    create_type=True
)

users = Table(
    'users',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('status', status_enum)
)

HSTORE

Key-value store within a single column:
from sqlalchemy.dialects.postgresql import HSTORE

settings = Table(
    'user_settings',
    metadata,
    Column('user_id', Integer, primary_key=True),
    Column('preferences', HSTORE)
)

# Query HSTORE
stmt = settings.select().where(
    settings.c.preferences['theme'] == 'dark'
)
PostgreSQL provides powerful full-text search capabilities:
from sqlalchemy.dialects.postgresql import TSVECTOR, TSQUERY
from sqlalchemy.dialects.postgresql.ext import (
    to_tsvector, to_tsquery, plainto_tsquery
)

articles = Table(
    'articles',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('title', String),
    Column('content', String),
    Column('search_vector', TSVECTOR)
)

# Full-text search query
stmt = articles.select().where(
    articles.c.search_vector.match('python & database')
)

# Or using to_tsvector
stmt = articles.select().where(
    to_tsvector('english', articles.c.content)
    .match(plainto_tsquery('python database'))
)

UUID Type

Native UUID support:
from sqlalchemy.dialects.postgresql import UUID
import uuid

users = Table(
    'users',
    metadata,
    Column('id', UUID(as_uuid=True), primary_key=True, default=uuid.uuid4),
    Column('name', String)
)

Other PostgreSQL Types

from sqlalchemy.dialects.postgresql import (
    INET,        # IP addresses
    CIDR,        # Network addresses
    MACADDR,     # MAC addresses
    BIT,         # Bit strings
    BYTEA,       # Binary data
    MONEY,       # Currency amounts
    INTERVAL     # Time intervals
)

Sequences and IDENTITY

Using Sequences

from sqlalchemy import Sequence

users = Table(
    'users',
    metadata,
    Column(
        'id',
        Integer,
        Sequence('user_id_seq', start=1),
        primary_key=True
    ),
    Column('name', String)
)

IDENTITY Columns (PostgreSQL 10+)

from sqlalchemy import Identity

data = Table(
    'data',
    metadata,
    Column(
        'id',
        Integer,
        Identity(start=42, cycle=True),
        primary_key=True
    ),
    Column('value', String)
)
This generates:
CREATE TABLE data (
    id INTEGER GENERATED BY DEFAULT AS IDENTITY (START WITH 42 CYCLE),
    value VARCHAR,
    PRIMARY KEY (id)
)

INSERT with RETURNING

PostgreSQL supports RETURNING clause:
from sqlalchemy.dialects.postgresql import insert

stmt = insert(users).values(name='Alice').returning(users.c.id)
result = connection.execute(stmt)
inserted_id = result.scalar()

INSERT … ON CONFLICT (UPSERT)

from sqlalchemy.dialects.postgresql import insert

stmt = insert(users).values(
    id=1,
    name='Alice',
    email='alice@example.com'
)

# Update on conflict
stmt = stmt.on_conflict_do_update(
    index_elements=['id'],
    set_=dict(email='alice@example.com')
)

connection.execute(stmt)

EXCLUDE Constraints

from sqlalchemy.dialects.postgresql import ExcludeConstraint

reservations = Table(
    'reservations',
    metadata,
    Column('id', Integer, primary_key=True),
    Column('room', Integer),
    Column('during', TSRANGE),
    ExcludeConstraint(
        ('room', '='),
        ('during', '&&'),
        name='no_overlapping_reservations'
    )
)

Driver-Specific Features

psycopg Connection Pooling

import psycopg_pool
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

my_pool = psycopg_pool.ConnectionPool(
    conninfo="postgresql://scott:tiger@localhost/test",
    close_returns=True
)

engine = create_engine(
    url="postgresql+psycopg://",
    poolclass=NullPool,
    creator=my_pool.getconn
)

asyncpg Prepared Statements

engine = create_async_engine(
    "postgresql+asyncpg://user:pass@host/db",
    connect_args={
        "prepared_statement_cache_size": 500
    }
)

# Disable cache
engine = create_async_engine(
    "postgresql+asyncpg://user:pass@host/db?prepared_statement_cache_size=0"
)
Asyncpg caches prepared statements and PostgreSQL type OIDs. These can become stale when DDL operations modify types (like ENUM). You may need to reconnect after DDL changes.

Performance Tips

  1. Use JSONB over JSON for better indexing and query performance
  2. Create indexes on JSONB fields using GIN indexes:
    CREATE INDEX idx_data ON products USING GIN (data);
    
  3. Use connection pooling for better connection management
  4. Set pool_recycle to handle idle connection timeouts
  5. Use prepared statements (automatically handled by psycopg and asyncpg)

See Also

Dialect Overview

Learn about SQLAlchemy’s dialect system

PostgreSQL Documentation

Official PostgreSQL documentation