Lazy loading, eager loading strategies (joinedload, selectinload), and performance optimization for ORM relationships
Efficient relationship loading is critical for ORM performance. SQLAlchemy provides multiple loading strategies to control when and how related objects are loaded from the database.
When you access a relationship on an ORM object, SQLAlchemy needs to load the related data. The loading strategy determines when and how this happens:
Lazy Loading: Load related data on-demand when accessed
Eager Loading: Load related data immediately with the parent query
Joined Loading: Use SQL JOINs
Select IN Loading: Use separate SELECT with IN clause
Subquery Loading: Use correlated subquery
N+1 Query Problem: Lazy loading can cause performance issues when accessing relationships in loops, issuing one query per parent object. Use eager loading to prevent this.
from sqlalchemy import selectfrom models import User# Initial query loads only User objectsstmt = select(User)users = session.scalars(stmt).all()# SQL: SELECT users.id, users.name, users.email FROM users
2
Access relationship triggers additional query
# Each access to .addresses triggers a new queryfor user in users: print(user.addresses) # Separate SELECT for each user! # SQL (for each user): # SELECT addresses.* FROM addresses WHERE addresses.user_id = ?
This creates N+1 queries: 1 for users + N for each user’s addresses.
You can explicitly configure lazy loading using the lazyload() option:
from sqlalchemy import selectfrom sqlalchemy.orm import lazyloadfrom models import Userstmt = select(User).options(lazyload(User.addresses))users = session.scalars(stmt).all()# Addresses will be lazy-loaded when accessedfor user in users: print(user.addresses) # Triggers SELECT for this user
# Loading one user - lazy loading is fineuser = session.get(User, 1)print(user.addresses) # Only 1 additional query
✅ Selective relationship access
users = session.scalars(select(User)).all()# Only access relationship for some usersfor user in users: if user.is_premium: print(user.addresses) # Only loads when needed
✅ Large datasets where relationships rarely accessed
# Loading 10,000 users, but only viewing addresses for a fewusers = session.scalars(select(User).limit(10000)).all()
❌ Loop access (N+1 problem)
# BAD: Creates 101 queries (1 + 100)users = session.scalars(select(User).limit(100)).all()for user in users: print(user.addresses) # New query each iteration!
❌ Multiple relationship access
# BAD: Queries multiply quicklyusers = session.scalars(select(User)).all()for user in users: print(user.addresses) # N queries print(user.orders) # N more queries print(user.preferences) # N more queries
from sqlalchemy import selectfrom sqlalchemy.orm import joinedloadfrom models import User# Load users and their addresses in one querystmt = select(User).options(joinedload(User.addresses))users = session.scalars(stmt).all()# SQL: SELECT users.*, addresses.* # FROM users LEFT OUTER JOIN addresses ON users.id = addresses.user_id# No additional queries when accessing addressesfor user in users: print(user.addresses) # Already loaded!
from sqlalchemy.orm import joinedload# LEFT OUTER JOIN - includes users with no addressesstmt = select(User).options(joinedload(User.addresses))# SQL: FROM users LEFT OUTER JOIN addresses ...
from sqlalchemy import selectfrom sqlalchemy.orm import joinedloadfrom models import User, Order, OrderItem# Load User -> Orders -> OrderItemsstmt = select(User).options( joinedload(User.orders).joinedload(Order.items))# SQL includes two JOINs:# FROM users # LEFT OUTER JOIN orders ON users.id = orders.user_id# LEFT OUTER JOIN order_items ON orders.id = order_items.order_idusers = session.scalars(stmt).all()for user in users: for order in user.orders: print(order.items) # All loaded, no additional queries
from sqlalchemy.orm import joinedload# Load multiple relationships on same objectstmt = select(User).options( joinedload(User.addresses), joinedload(User.orders), joinedload(User.preferences))# Creates multiple LEFT OUTER JOINsusers = session.scalars(stmt).all()
Cartesian Product Risk: Joining multiple one-to-many relationships can create large result sets. A user with 10 addresses and 10 orders creates 100 rows (10 × 10). Consider using selectinload() instead.
For self-referential relationships, automatically load multiple levels:
from sqlalchemy import selectfrom sqlalchemy.orm import selectinloadfrom models import TreeNode# Load tree structure 3 levels deepstmt = select(TreeNode).options( selectinload(TreeNode.children, recursion_depth=3))roots = session.scalars(stmt.where(TreeNode.parent_id.is_(None))).all()# Children, grandchildren, and great-grandchildren all loadedfor root in roots: for child in root.children: for grandchild in child.children: print(grandchild.children) # All loaded!
recursion_depth currently only supports self-referential relationships. It’s an alpha feature in SQLAlchemy 2.0.
Subquery loading uses a correlated subquery to load related objects.
from sqlalchemy import selectfrom sqlalchemy.orm import subqueryloadfrom models import Userstmt = select(User).options(subqueryload(User.addresses))users = session.scalars(stmt).all()# SQL Query 1: SELECT users.* FROM users# SQL Query 2: SELECT addresses.*, anon_1.users_id FROM addresses# JOIN (SELECT users.id AS users_id FROM users) AS anon_1# ON anon_1.users_id = addresses.user_id
subqueryload() is not recommended for new code. Use selectinload() instead. Subquery loading has compatibility issues with queries that use LIMIT/OFFSET and may require the original query to be run twice.
from sqlalchemy.orm import raiseload# Only raise if SQL would be emitted, not for identity map checksstmt = select(User).options( raiseload(User.addresses, sql_only=True))users = session.scalars(stmt).all()# If address is already in identity map, this works# Otherwise raises errorprint(users[0].addresses)
When you manually JOIN to a related table, use contains_eager() to tell SQLAlchemy to populate the relationship from the join:
from sqlalchemy import selectfrom sqlalchemy.orm import contains_eagerfrom models import User, Address# Manual join with eager populationstmt = ( select(User) .join(User.addresses) .where(Address.email_address.like("%@gmail.com")) .options(contains_eager(User.addresses)))users = session.scalars(stmt).unique().all()# User.addresses is populated from the JOINfor user in users: print(user.addresses) # Already loaded from explicit JOIN
When using contains_eager() with one-to-many joins, use .unique() on the result to deduplicate parent objects.
Choosing the right loading strategy significantly impacts performance.
Query Count
Result Set Size
Recommendations
Strategy
Parent Query
Related Queries
Total
Lazy
1
N (one per parent)
1 + N
Joined
1
0
1
Select IN
1
1 per relationship
1 + R
Subquery
1
1 per relationship
1 + R
Immediate
1
N
1 + N
Example with 100 users:
Lazy: 101 queries
Joined: 1 query
Select IN: 2 queries
Subquery: 2 queries
Immediate: 101 queries
Joined Loading Cartesian Product:
# User has 10 addresses and 10 orders# Joined loading both creates 100 rows (10 × 10)stmt = select(User).options( joinedload(User.addresses), joinedload(User.orders))# Returns 100 rows for a single user!
Select IN Loading - No Cartesian:
# Same user: 1 user row + 10 address rows + 10 order rows = 21 rowsstmt = select(User).options( selectinload(User.addresses), selectinload(User.orders) )# Much more efficient!
from sqlalchemy.orm import selectinload, load_only# Load users with addresses, but only load email from addressesstmt = select(User).options( selectinload(User.addresses).load_only( Address.email_address, Address.id # Always include primary key ))
from sqlalchemy.orm import with_loader_criteria, selectinload# Only load active addressesstmt = select(User).options( selectinload(User.addresses), with_loader_criteria(Address, Address.is_active == True))users = session.scalars(stmt).all()# user.addresses only contains active addresses
import logging# Log all SQL statementslogging.basicConfig()logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)# Run your code and watch for repeated queriesusers = session.scalars(select(User)).all()for user in users: print(user.addresses) # You'll see individual SELECTs in logs
from sqlalchemy.orm import raiseload# Raise error on any lazy loadstmt = select(User).options(raiseload("*"))users = session.scalars(stmt).all()for user in users: print(user.addresses) # InvalidRequestError - tells you to add eager loading!