Database Connection Pooling Low-Level Design: PgBouncer and Pool Internals

Connection pooling maintains a set of pre-established database connections that are reused across requests, rather than opening and closing a connection for every query. Opening a PostgreSQL connection involves TCP handshake, TLS negotiation, authentication, and session setup — typically 20-100ms. With 1,000 requests/second, that’s 1,000 new connections/second, each consuming ~5MB of DB server memory. Connection pooling reduces this to tens of persistent connections handling millions of queries.
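The arithmetic behind these figures can be sanity-checked in a few lines (the 50ms setup cost and 5MB per backend are illustrative assumptions drawn from the ranges above):

```python
# Back-of-envelope cost of connecting per request vs. pooling.
# SETUP_MS and CONN_MEMORY_MB are assumed figures from the ranges above.
SETUP_MS = 50        # TCP handshake + TLS + auth + session setup
CONN_MEMORY_MB = 5   # server-side memory per PostgreSQL backend
RPS = 1_000          # request rate

# Without pooling: every request pays the setup cost.
setup_work_per_sec_ms = RPS * SETUP_MS        # 50,000 ms of setup work per second
# By Little's law, connections mid-handshake at any instant:
in_flight_setups = RPS * (SETUP_MS / 1000)    # 50.0

# With a 25-connection pool: setup is paid once at startup, memory is bounded.
pooled_memory_mb = 25 * CONN_MEMORY_MB        # 125 MB, constant
```

The point of the calculation: without pooling the database spends a fixed tax of handshake work and memory that grows with request rate; with pooling both costs become constants.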

Core Data Model

# Pool state tracked per-process (in-memory, not in the database)
# Each pool manages connections to one database + user combination

class ConnectionPool:
    pool_size: int          # max total connections
    min_idle: int           # minimum idle connections to keep warm
    max_idle: int           # maximum idle connections to retain
    acquire_timeout_ms: int # how long to wait for an available connection
    idle_timeout_s: int     # close idle connections older than this
    max_lifetime_s: int     # close connections older than this (force refresh)

    _idle: deque            # available connections, LIFO for connection warmth
    _in_use: dict           # id(conn) -> entry for connections currently held by requests
    _lock: threading.Lock   # protect shared state
    _not_empty: threading.Condition  # signal when a connection is returned

Pool Implementation

import threading
import time
from collections import deque
import psycopg2

class ConnectionPool:
    def __init__(self, dsn: str, pool_size: int = 20, min_idle: int = 5,
                 acquire_timeout_ms: int = 5000, max_lifetime_s: int = 3600):
        self.dsn = dsn
        self.pool_size = pool_size
        self.min_idle = min_idle
        self.acquire_timeout_ms = acquire_timeout_ms
        self.max_lifetime_s = max_lifetime_s
        self._idle: deque = deque()
        self._in_use: dict = {}   # id(conn) -> entry, so release() can restore pool metadata
        self._lock = threading.Lock()
        self._not_empty = threading.Condition(self._lock)
        self._total_created = 0

        # Pre-warm with min_idle connections
        for _ in range(min_idle):
            entry = self._create_connection()
            self._idle.append(entry)

        # Start background maintenance thread
        threading.Thread(target=self._maintenance_loop, daemon=True).start()

    def _create_connection(self) -> dict:
        conn = psycopg2.connect(self.dsn)
        conn.autocommit = False
        self._total_created += 1
        return {'conn': conn, 'created_at': time.monotonic(), 'last_used': time.monotonic()}

    def acquire(self) -> psycopg2.extensions.connection:
        deadline = time.monotonic() + self.acquire_timeout_ms / 1000
        with self._not_empty:
            while True:
                if self._idle:
                    # LIFO: take the most recently returned connection (warmer,
                    # less likely to have been dropped by a firewall or proxy)
                    entry = self._idle.pop()
                    if self._is_expired(entry):
                        entry['conn'].close()
                        entry = self._create_connection()
                    self._in_use[id(entry['conn'])] = entry
                    entry['last_used'] = time.monotonic()
                    return entry['conn']

                total = len(self._idle) + len(self._in_use)
                if total < self.pool_size:
                    # Room to grow — create a new connection
                    entry = self._create_connection()
                    self._in_use[id(entry['conn'])] = entry
                    return entry['conn']

                # Pool exhausted — wait for a connection to be returned
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError(
                        f"no connection available within {self.acquire_timeout_ms}ms")
                self._not_empty.wait(timeout=remaining)

    def release(self, conn: psycopg2.extensions.connection):
        with self._not_empty:
            entry = self._in_use.pop(id(conn), None)
            if entry is None:
                return  # not acquired from this pool; ignore
            try:
                # Always roll back: an open transaction must never leak
                # into the next request that acquires this connection.
                conn.rollback()
            except Exception:
                conn.close()  # broken connection: drop it rather than pool it
            else:
                entry['last_used'] = time.monotonic()
                self._idle.append(entry)
            self._not_empty.notify()

    def _is_expired(self, entry: dict) -> bool:
        age = time.monotonic() - entry['created_at']
        return age > self.max_lifetime_s
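The exhausted-pool branch is the subtlest part of acquire(): the waiter must recompute its remaining timeout on every wakeup, or spurious wakeups and repeated notifications can silently extend the deadline. A stripped-down, database-free sketch of just that wait pattern (TinyPool is illustrative, not part of the pool above):

```python
import threading
import time
from collections import deque

class TinyPool:
    """Condition-guarded deque demonstrating the deadline-based wait loop."""
    def __init__(self, items):
        self._idle = deque(items)
        self._cond = threading.Condition()

    def acquire(self, timeout_s: float):
        deadline = time.monotonic() + timeout_s
        with self._cond:
            while not self._idle:
                # Recompute remaining time each iteration: a wakeup without
                # an available item must not reset the overall deadline.
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError("pool exhausted")
                self._cond.wait(timeout=remaining)
            return self._idle.pop()

    def release(self, item):
        with self._cond:
            self._idle.append(item)
            self._cond.notify()
```

The releasing thread notifies under the same lock, so a waiter always rechecks the deque before returning rather than trusting the wakeup itself.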

    def _maintenance_loop(self):
        """Periodically close excess idle connections and validate stale ones."""
        while True:
            time.sleep(30)
            with self._lock:
                now = time.monotonic()
                to_close = []
                live_idle = deque()
                for entry in self._idle:
                    age = now - entry['last_used']
                    if age > 600 and len(live_idle) >= self.min_idle:
                        to_close.append(entry['conn'])
                    else:
                        live_idle.append(entry)
                self._idle = live_idle
            for conn in to_close:
                try:
                    conn.close()
                except Exception:
                    pass
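The pruning decision in the maintenance loop can be isolated as a pure function, which makes the policy easy to test without a database (prune_idle is a hypothetical helper, not part of the class above):

```python
from collections import deque

# Standalone sketch of the maintenance pass: close entries idle longer than
# idle_timeout_s, but always retain at least min_idle warm entries.
# Entries are the same plain dicts the pool implementation uses.
def prune_idle(idle: deque, now: float, idle_timeout_s: float, min_idle: int):
    keep, close = deque(), []
    for entry in idle:  # leftmost = coldest, since the pool pops/pushes on the right
        if now - entry['last_used'] > idle_timeout_s and len(keep) >= min_idle:
            close.append(entry)
        else:
            keep.append(entry)
    return keep, close

now = 1000.0
idle = deque([{'last_used': now - 700},   # idle 700s
              {'last_used': now - 650},   # idle 650s
              {'last_used': now - 10}])   # recently used
keep, close = prune_idle(idle, now, idle_timeout_s=600, min_idle=1)
# One cold entry is retained to satisfy min_idle; the second cold one is closed.
```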

from contextlib import contextmanager

@contextmanager
def get_connection(pool: ConnectionPool):
    conn = pool.acquire()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        pool.release(conn)
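The guarantee this wrapper provides (commit on success, rollback on any exception, release in both cases) can be exercised without a real database using stand-in objects. FakeConn and FakePool are test doubles invented here, not part of psycopg2:

```python
from contextlib import contextmanager

class FakeConn:
    def __init__(self): self.log = []
    def commit(self): self.log.append('commit')
    def rollback(self): self.log.append('rollback')

class FakePool:
    def __init__(self, conn): self.conn, self.released = conn, False
    def acquire(self): return self.conn
    def release(self, conn): self.released = True

@contextmanager
def get_connection(pool):
    conn = pool.acquire()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        pool.release(conn)

# Happy path: commit, then release.
pool = FakePool(FakeConn())
with get_connection(pool):
    pass
assert pool.conn.log == ['commit'] and pool.released

# Error path: rollback, release, and the exception still propagates.
pool = FakePool(FakeConn())
try:
    with get_connection(pool):
        raise RuntimeError('query failed')
except RuntimeError:
    pass
assert pool.conn.log == ['rollback'] and pool.released
```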

PgBouncer: Production-Grade Pooling

Don’t implement connection pooling in your application when PgBouncer exists. PgBouncer is a lightweight PostgreSQL-specific proxy that sits between your application and the database, managing pooling transparently. It offers three pooling modes:

  • session: one server connection per client session, which is the same as no pooling.
  • transaction: a server connection is held only for the duration of a transaction; the most efficient mode.
  • statement: the connection is returned after each statement, so multi-statement transactions cannot be used.
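A rough model shows why transaction mode multiplexes so aggressively (the client count, per-client rate, and transaction duration below are assumed figures): in session mode every connected client pins a server connection, while in transaction mode only clients currently inside a transaction do.

```python
clients = 5_000
txn_duration_ms = 5        # assumed average transaction time
requests_per_client_s = 2  # assumed per-client request rate

# Session mode: server connections == connected clients.
session_mode_conns = clients                    # 5,000

# Transaction mode: by Little's law, concurrent transactions =
# arrival rate x transaction duration.
arrival_rate = clients * requests_per_client_s  # 10,000 txn/s
transaction_mode_conns = arrival_rate * (txn_duration_ms / 1000)  # 50.0
```

Under these assumptions a 5,000-client fleet needs only ~50 server connections in transaction mode, which is why a default_pool_size in the 25-50 range is workable.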

# PgBouncer configuration (pgbouncer.ini)
[databases]
appdb = host=postgres-primary port=5432 dbname=appdb

[pgbouncer]
pool_mode = transaction          # most efficient for stateless apps
max_client_conn = 10000          # clients can open up to 10K connections to PgBouncer
default_pool_size = 25           # PgBouncer opens at most 25 connections to Postgres
min_pool_size = 5
reserve_pool_size = 5            # emergency connections for bursts
reserve_pool_timeout = 3
server_idle_timeout = 600
# Result: 10,000 app connections -> 25 Postgres connections
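Since pgbouncer.ini is plain INI, key ratios can be sanity-checked with Python's standard configparser before deployment. This is a sketch of such a check, not a full validator; the INI string below is a comment-free copy of the settings above:

```python
import configparser

INI = """
[databases]
appdb = host=postgres-primary port=5432 dbname=appdb

[pgbouncer]
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 25
reserve_pool_size = 5
"""

cfg = configparser.ConfigParser()
cfg.read_string(INI)

# Sanity checks: transaction mode is set, and the multiplexing ratio is sane.
assert cfg.get('pgbouncer', 'pool_mode') == 'transaction'
clients = cfg.getint('pgbouncer', 'max_client_conn')
pool = cfg.getint('pgbouncer', 'default_pool_size')
ratio = clients // pool   # 400 app connections per Postgres connection
```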

Key Interview Points

  • PostgreSQL has a hard limit on concurrent connections (typically 100-500). Without pooling, each app thread holds a connection — 100 servers × 50 threads = 5,000 connections, exceeding the limit. PgBouncer multiplexes 5,000 app connections onto 25 database connections.
  • LIFO acquisition (take the most recently returned connection) keeps a subset of connections warm and lets idle excess connections expire naturally — better than FIFO, which would age all connections equally.
  • Always rollback on release: if the caller forgets to commit or roll back, returning a connection with an open transaction poisons the next request that uses it.
  • Max lifetime prevents connections from staying open indefinitely — databases, proxies, and firewalls may silently close long-lived TCP connections. Proactively cycling connections prevents “broken pipe” errors.
  • Pool timeout errors are the signal to increase pool size or scale the database — they indicate the database cannot keep up with query throughput, not a pool implementation bug.
  • In transaction mode, PgBouncer cannot support prepared statements or advisory locks (they are session-scoped, not transaction-scoped). Disable prepared statements in your driver when using PgBouncer in transaction mode.
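The LIFO-vs-FIFO point lends itself to a toy simulation (pure Python, no database): with 10 pooled connections but only 3 needed concurrently, LIFO keeps reusing the same warm 3, while FIFO cycles every connection through use so none ever idles long enough to be pruned.

```python
from collections import deque

def simulate(policy: str, pool_size: int = 10, rounds: int = 100, concurrency: int = 3):
    """Return the set of connections ever used under the given acquisition policy."""
    idle = deque(range(pool_size))
    touched = set()
    for _ in range(rounds):
        # Acquire `concurrency` connections: LIFO takes from the tail,
        # FIFO from the head.
        held = [idle.pop() if policy == 'lifo' else idle.popleft()
                for _ in range(concurrency)]
        touched.update(held)
        for c in held:
            idle.append(c)   # release to the tail in both policies
    return touched

assert len(simulate('lifo')) == 3    # same warm connections reused every round
assert len(simulate('fifo')) == 10   # every connection cycles through use
```

With LIFO, the 7 untouched connections accumulate idle time and can be closed by the maintenance loop; with FIFO, all 10 look "recently used" forever.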

Frequently Asked Questions

Why does connection pooling matter at scale, and what problem does it solve?
Opening a PostgreSQL connection involves TCP handshake, TLS negotiation, authentication, and session parameter setup — typically 20-100ms and ~5MB of memory on the database server. Without pooling, each application thread opens its own connection for every request. At 100 app servers × 50 threads each, that is 5,000 simultaneous connections, which exhausts PostgreSQL’s default connection limit (100-500) immediately. With PgBouncer in transaction mode, 5,000 app-side connections are multiplexed onto 25-50 actual Postgres connections, each reused across thousands of queries. Connection cost drops to near zero per query: the connection is already open and ready.

What is the difference between PgBouncer’s session, transaction, and statement pooling modes?
Session mode assigns one Postgres server connection to a client for the entire session — equivalent to no pooling, useful only for compatibility. Transaction mode assigns a server connection only for the duration of a transaction (BEGIN to COMMIT/ROLLBACK); the connection returns to the pool between transactions. This is the most efficient mode — 5,000 clients can share 25 connections if they are not all in a transaction simultaneously — but session-scoped features (prepared statements in some drivers, advisory locks, LISTEN/NOTIFY, SET LOCAL) cannot be used. Statement mode returns the connection after each individual statement, so multi-statement transactions are impossible. Use transaction mode for stateless web apps; use session mode only when session-level features are required.

How do you size a connection pool for a given workload?
Rule of thumb (from HikariCP research): pool_size = num_cpu_cores × 2 + effective_spindle_count. For a 16-core Postgres instance with SSDs, that is roughly 32-40 connections. Adding connections beyond this causes contention (lock waits, context switching) rather than improved throughput. On the application side, set PgBouncer’s default_pool_size to this number (25-50) and max_client_conn to the total number of application threads across all instances (e.g., 5,000). Monitor pg_stat_activity for connection counts and wait events: if queries are waiting on "Client" (waiting for a pool connection), increase pool size; if they are waiting on "Lock" or "IO", more connections will not help — the bottleneck is elsewhere.

What happens if a connection is returned to the pool with an open transaction?
The next request that acquires that connection inherits the uncommitted transaction: it sees data written by the previous request, and its own writes are committed or rolled back together with the orphaned transaction. This corrupts data silently — no error is raised. Prevention: always roll back on release (pool.release() must call conn.rollback() before returning the connection). PgBouncer in transaction mode handles this automatically: it only releases a connection to the pool at COMMIT or ROLLBACK, so an open transaction always holds its connection. Application-level pools must roll back explicitly; use the context manager pattern (with get_connection(pool) as conn) so that commit or rollback always happens.

How does connection pooling interact with database read replicas?
Maintain separate pools for the primary and the replicas — never mix primary and replica connections in one pool, or writes would randomly be routed to replicas (which reject them). A typical setup: primary_pool (pool_size=20, for all writes and consistency-sensitive reads) and replica_pool (pool_size=50 across N replicas, for read-heavy queries). The application routes queries by intent: writes go to primary_pool; dashboard queries and feed reads go to replica_pool. PgBouncer supports this with multiple entries in its [databases] section, each pointing to a different Postgres host. Monitor each pool’s utilization separately — a replica pool at 100% utilization while the primary sits at 20% indicates a query-routing imbalance.
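The sizing rule of thumb above, as a quick calculation (the 16-core/SSD inputs are example values):

```python
# pool_size ~= cpu_cores * 2 + effective_spindle_count (HikariCP's guidance).
# "Effective spindles" for SSDs is debatable; 1 is a common conservative choice.
def suggested_pool_size(cpu_cores: int, effective_spindles: int) -> int:
    return cpu_cores * 2 + effective_spindles

size = suggested_pool_size(16, 1)   # 33 for a 16-core box with one SSD
```

The result lands in the 25-50 range used for default_pool_size in the PgBouncer configuration earlier.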

