Low Level Design: Connection Pool Design

A connection pool maintains a set of pre-established connections to a resource (database, HTTP service, message broker) that can be reused across requests, avoiding the overhead of establishing a new connection for each operation. Establishing a TCP connection, performing the TLS handshake, and authenticating to the database typically takes 50-200ms. With connection pooling, that cost is paid once at startup, and subsequent requests borrow a connection in microseconds. PgBouncer and HikariCP are production connection poolers used at scale. Understanding connection pool design is essential for any system with a database or external service dependency.

Connection Pool Internals

A connection pool manages a fixed set of connections: idle connections waiting to be borrowed, and active connections in use by requests. Key parameters:

  • min_connections: minimum connections to keep open (warm pool, no cold starts)
  • max_connections: maximum connections (prevents overwhelming the database with too many concurrent connections)
  • acquire_timeout: maximum time to wait for an available connection before failing the request
  • max_lifetime: maximum time a connection stays in the pool before being replaced (prevents stale connections from server-side timeouts)
  • idle_timeout: how long an idle connection stays open before being closed (reduces server-side connection pressure during off-peak hours)

Pool size guideline: if the database can handle C concurrent connections, set each instance's max_connections to roughly C / number_of_app_server_instances.

// Go database/sql connection pool configuration
import (
    "database/sql"
    "log"
    "time"

    _ "github.com/lib/pq" // registers the "postgres" driver
)

db, err := sql.Open("postgres", dsn)
if err != nil {
    log.Fatal(err)
}

// Pool sizing: PostgreSQL default max_connections = 100
// With 4 app server instances: 100/4 = 25 connections per instance
db.SetMaxOpenConns(25)         // max active connections
db.SetMaxIdleConns(10)         // keep 10 idle connections warm
db.SetConnMaxLifetime(30 * time.Minute)  // replace connections older than 30m
db.SetConnMaxIdleTime(5 * time.Minute)   // close idle connections after 5m

// HikariCP (Java) equivalent:
// hikari.maximumPoolSize=25
// hikari.minimumIdle=10
// hikari.connectionTimeout=30000   (30s acquire timeout)
// hikari.idleTimeout=600000        (10m idle timeout)
// hikari.maxLifetime=1800000       (30m max lifetime)

// PgBouncer: proxy-level connection pooler
// Sits between app servers and PostgreSQL
// transaction mode: one PostgreSQL connection per transaction (most efficient)
// session mode: one PostgreSQL connection per client session (compatible with all features)
// [pgbouncer]
// pool_mode = transaction
// max_client_conn = 1000  -- app servers can open 1000 connections to PgBouncer
// default_pool_size = 25  -- PgBouncer opens only 25 real PostgreSQL connections

Connection Pool Sizing and Saturation

Setting max_connections too high: the database server runs out of memory (each PostgreSQL connection is a separate backend process using roughly 5-10MB) and context switching between backend processes becomes a bottleneck. The classic mistake is setting max_connections = 1000 per app server: 10 app servers overwhelm Postgres with 10,000 connections. Setting max_connections too low: requests queue waiting for a free connection (pool saturation), increasing response latency. Detecting saturation: monitor pool wait time (P99 time to acquire a connection), the active vs. idle connection ratio, and the connection error rate. HikariCP exposes these as JMX metrics; Go's database/sql exposes them via db.Stats(). Use PgBouncer in transaction mode between app servers and PostgreSQL to multiplex thousands of app connections over a small PostgreSQL connection pool.

Connection Health Checking

A pooled connection can become invalid: the database server restarted, a network timeout caused a silent disconnect, or the server closed the connection due to idle timeout. Validation strategies:

  • Test on borrow: execute a lightweight query (SELECT 1) before returning the connection to the caller. Catches invalid connections but adds latency to every borrow.
  • Test on return: validate when the connection is returned to the pool. Cheaper for callers, but a connection can still die while sitting idle.
  • Background validation: a pool maintenance thread periodically runs SELECT 1 on idle connections and evicts failures.
  • Error-based eviction: when a query fails with a connection error, remove the connection from the pool and retry on a fresh one.

By default HikariCP validates connections on borrow (via JDBC4 isValid() or a configured connectionTestQuery); the optional keepaliveTime setting additionally pings idle connections at a fixed interval.
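Test-on-borrow combined with error-based eviction can be sketched as a minimal single-goroutine pool (no locking); the conn type and its ping method are hypothetical stand-ins for a real driver connection and a SELECT 1 check:

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a stand-in for a real driver connection; ping plays the
// role of a lightweight validation query such as SELECT 1.
type conn struct {
	id     int
	broken bool
}

func (c *conn) ping() error {
	if c.broken {
		return errors.New("connection lost")
	}
	return nil
}

// pool validates each idle connection on borrow and evicts ones that
// fail, replacing them with fresh connections. Single-goroutine
// sketch: a real pool needs synchronization around nextID.
type pool struct {
	idle   chan *conn
	nextID int
}

func newPool(size int) *pool {
	p := &pool{idle: make(chan *conn, size)}
	for i := 0; i < size; i++ {
		p.nextID++
		p.idle <- &conn{id: p.nextID}
	}
	return p
}

func (p *pool) get() *conn {
	for {
		select {
		case c := <-p.idle:
			if c.ping() == nil {
				return c // healthy idle connection
			}
			// Eviction: drop the dead connection, add a fresh
			// replacement, and try again on the next iteration.
			p.nextID++
			p.idle <- &conn{id: p.nextID}
		default:
			p.nextID++
			return &conn{id: p.nextID} // pool empty: dial a new one
		}
	}
}

func (p *pool) put(c *conn) { p.idle <- c }

func main() {
	p := newPool(2)
	c := p.get()
	c.broken = true // simulate a server-side disconnect
	p.put(c)

	// Both borrows return healthy connections; the broken one is
	// evicted on borrow rather than handed to the caller.
	a, b := p.get(), p.get()
	fmt.Println(a.ping() == nil, b.ping() == nil) // true true
}
```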

Key Interview Discussion Points

  • Thread-per-request vs async: traditional thread-per-request models block threads waiting for connections (pool saturation visible as thread saturation); async models (Node.js, Java Reactor, Go goroutines) allow thousands of logical requests to share a small connection pool without blocking threads
  • PgBouncer transaction mode limitations: PostgreSQL features that require session state (prepared statements, advisory locks, SET LOCAL, LISTEN/NOTIFY) break under transaction-mode connection pooling because the same PostgreSQL session is not guaranteed across transactions
  • Connection pool per database: maintain separate pools for read replicas and the primary — route read-only queries to the replica pool, writes to the primary pool; replica pool can be sized for high read throughput
  • Circuit breaker integration: when the database is overloaded, rapidly failing connections should trigger a circuit breaker that rejects new requests with a 503 immediately rather than queueing indefinitely and exhausting the connection pool
  • Serverless and connection pools: serverless functions create new processes per request, each opening connections directly — use a proxy connection pooler (RDS Proxy, PgBouncer, Neon) to avoid overwhelming the database with thousands of short-lived connections