Low Level Design: Database Connection Pool

A database connection pool is one of the most impactful infrastructure components in any application that talks to a relational database. Poor pool configuration causes connection wait latency, connection exhaustion under load, and cascading failures. This design covers the full lifecycle from pool architecture to circuit breaker integration.

Connection Pool Architecture

A connection pool maintains N pre-established, authenticated connections to the database server. When application code needs to execute a query, it borrows a connection from the pool, uses it, and returns it when done. The pool handles the underlying TCP connection and authentication handshake upfront at startup, so subsequent borrows are nearly instantaneous. Without pooling, each query incurs TCP connection setup (~1ms LAN, ~50ms WAN) plus database authentication (~5–50ms depending on auth method and server load). At 1000 requests/second, eliminating this overhead saves substantial CPU and latency. The pool is implemented as a blocking queue: borrow blocks if no connection is available (up to a configurable timeout), return adds the connection back to the queue. Thread safety is achieved via synchronized access to the queue or lock-free CAS operations on the connection state.
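The borrow/return mechanics above can be sketched with a blocking queue. This is a minimal illustration under stated assumptions, not a production pool: the generic type C stands in for a real driver connection, and validation, lifetime tracking, and metrics are omitted.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Minimal borrow/return pool over a blocking queue. borrow() blocks up to a
// timeout waiting for a free connection; release() returns one for reuse.
class SimplePool<C> {
    private final BlockingQueue<C> idle;

    SimplePool(List<C> connections) {
        // fair = true: waiting borrowers are served in FIFO order
        this.idle = new ArrayBlockingQueue<>(connections.size(), true, connections);
    }

    C borrow(long timeoutMillis) {
        try {
            C conn = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (conn == null) {
                throw new RuntimeException("pool borrow timed out after " + timeoutMillis + "ms");
            }
            return conn;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting for a connection", e);
        }
    }

    void release(C conn) {
        idle.offer(conn);
    }
}
```

A real pool layers validation, maxLifetime checks, and metrics around the same queue discipline.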

Pool Sizing

Pool sizing has two failure modes: too small causes connection wait (threads block waiting for a connection), too large wastes database server resources (each idle connection consumes memory and a file descriptor on the DB server). The HikariCP research (based on queuing theory and PostgreSQL benchmarks) concludes that optimal pool size per application instance is: (core_count * 2) + effective_spindle_count. For a 4-core app server with SSD storage (spindle count = 1), that’s 9 connections per instance. If you run 10 application instances, the database server sees 90 connections — well within PostgreSQL’s default max_connections = 100. The formula accounts for I/O wait time: while one thread waits for disk I/O, another can use the CPU, so you can support more concurrent queries than cores. min_connections sets the pre-warmed baseline to avoid cold-start latency on the first requests; setting it equal to max_connections is common for steady-state services.
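The sizing arithmetic above is trivial, but encoding it makes the deployment math checkable; the class and method names here are illustrative, not any library's API.

```java
// Pool-size heuristic from the text: connections = cores * 2 + effective spindles.
class PoolSizing {
    static int optimalPoolSize(int coreCount, int effectiveSpindleCount) {
        return coreCount * 2 + effectiveSpindleCount;
    }

    // Total connections the database server sees across all app instances.
    static int totalDbConnections(int poolSizePerInstance, int instanceCount) {
        return poolSizePerInstance * instanceCount;
    }
}
```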

Connection Lifecycle

Connections degrade over time. Database servers close idle connections after a server-side timeout (wait_timeout in MySQL, idle_session_timeout in PostgreSQL). If the pool doesn’t know about these server-side closures, it hands out a dead connection and the application sees a broken pipe error. The pool handles this via maxLifetime: connections are proactively retired and replaced after a maximum age (e.g., 30 minutes), shorter than the server-side timeout, ensuring the pool never holds connections the server has already closed. On borrow, a validation query (e.g., SELECT 1) or JDBC isValid() call confirms the connection is alive — but this adds latency to every borrow, so it’s only used when the pool lacks a better liveness signal (like TCP keep-alive). When a query throws a SQL exception indicating connection-level failure (as opposed to a query-level error), the pool marks that connection as invalid, closes it, and opens a replacement rather than returning it to the pool.
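The maxLifetime retirement decision can be isolated as a pure function of the connection's creation time. A sketch, with the clock passed in so the policy is testable; the class name is an assumption for illustration.

```java
// maxLifetime policy: on borrow, a connection older than maxLifetimeMillis is
// discarded and replaced rather than handed to the application.
class LifetimePolicy {
    private final long maxLifetimeMillis;

    LifetimePolicy(long maxLifetimeMillis) {
        this.maxLifetimeMillis = maxLifetimeMillis;
    }

    // The clock is a parameter rather than System.currentTimeMillis()
    // so the policy can be tested deterministically.
    boolean shouldRetire(long createdAtMillis, long nowMillis) {
        return nowMillis - createdAtMillis >= maxLifetimeMillis;
    }
}
```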

Health Checking

Background health checks run independently of application borrows. A dedicated thread wakes every N seconds (configurable, typically 30–60s), iterates over idle connections, and validates each one with a lightweight ping. Connections failing validation are removed from the pool and replaced with fresh connections. This proactive approach prevents the pool from silently accumulating stale connections that would fail at borrow time and surface as errors to the application. The health check uses the most lightweight available mechanism: JDBC Connection.isValid(timeoutSeconds) for drivers that implement it efficiently, or a driver-specific ping command (/* ping */ for MySQL Connector/J). Logging health check results with connection ID and failure reason provides observability into connection churn, which can indicate network instability or database server issues before they cause visible application errors.
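A single health-check sweep can be sketched independently of the scheduling (a ScheduledExecutorService would invoke it every N seconds). The liveness predicate and replacement supplier below are injected stand-ins for Connection.isValid() and the pool's connection factory; this is a sketch, not a real pool's internals.

```java
import java.util.Queue;
import java.util.function.Predicate;
import java.util.function.Supplier;

// One sweep over the idle connections: connections failing the liveness check
// are dropped and fresh ones created in their place. Returns the number replaced,
// which is a useful observability signal for connection churn.
class HealthSweep {
    static <C> int sweep(Queue<C> idle, Predicate<C> isAlive, Supplier<C> fresh) {
        int replaced = 0;
        int n = idle.size();
        for (int i = 0; i < n; i++) {
            C conn = idle.poll();
            if (isAlive.test(conn)) {
                idle.offer(conn);        // still healthy: keep it
            } else {
                idle.offer(fresh.get()); // stale: replace with a new connection
                replaced++;
            }
        }
        return replaced;
    }
}
```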

Read Replica Routing

Separating read and write traffic is a common scaling pattern. The application maintains two pools: a primary pool pointing to the read-write primary, and a replica pool pointing to one or more read replicas. Write transactions and transactions requiring read-your-writes consistency use the primary pool. Read-only transactions use the replica pool, distributing read load horizontally. In Java Spring, @Transactional(readOnly=true) annotations can trigger automatic routing to the replica pool via a routing DataSource implementation. The replica pool is sized independently — often larger than the primary pool since read queries dominate. Replication lag monitoring is critical: if the replica is significantly behind the primary (e.g., lag > 5 seconds), a fallback mechanism routes reads to the primary temporarily to avoid serving stale data. Lag is measured by comparing primary LSN (log sequence number) to replica LSN, available via pg_stat_replication in PostgreSQL.
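The routing decision, including the lag fallback, reduces to a small pure function. A sketch under assumed names; the pool type parameter and lag threshold are illustrative, not a Spring API.

```java
// Choose the replica pool only for read-only work when replication lag is
// within bounds; writes, and reads during excessive lag, go to the primary.
class ReadWriteRouter<P> {
    private final P primaryPool;
    private final P replicaPool;
    private final long maxLagMillis;

    ReadWriteRouter(P primaryPool, P replicaPool, long maxLagMillis) {
        this.primaryPool = primaryPool;
        this.replicaPool = replicaPool;
        this.maxLagMillis = maxLagMillis;
    }

    // currentLagMillis would come from a background monitor polling
    // pg_stat_replication; here it is a parameter for testability.
    P choose(boolean readOnly, long currentLagMillis) {
        if (readOnly && currentLagMillis <= maxLagMillis) {
            return replicaPool;
        }
        return primaryPool;
    }
}
```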

Slow Query Detection

The connection pool is a natural interception point for query performance monitoring. The pool wraps returned connections with a proxy (e.g., via JDBC Connection interface) that records the start time before query execution and the end time after. Queries exceeding a configurable threshold (e.g., 500ms) are logged with the full SQL text, elapsed time, and stack trace to identify the calling code. Raw SQL logging is supplemented by query fingerprinting: parameter values are replaced with placeholders (WHERE id = ?) to group distinct queries with different parameters into the same pattern. Per-pattern statistics (count, p50/p95/p99 latency, total time) are aggregated in-process and exported to a metrics system (Prometheus, Datadog). This identifies the specific query patterns responsible for database load without requiring access to the database’s own slow query log, and works even when the database is a managed service (RDS, Cloud SQL) with restricted access to server logs.
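A rough fingerprinting pass can be done with regular expressions over literals. This is a sketch only: a production fingerprinter would use a real SQL tokenizer to handle comments, escaped quotes, and other corner cases these regexes miss.

```java
// Replace numeric and string literals with '?' so queries differing only in
// parameter values collapse to one pattern for aggregation.
class QueryFingerprint {
    static String fingerprint(String sql) {
        return sql
            .replaceAll("'(?:[^']|'')*'", "?")          // quoted string literals
            .replaceAll("\\b\\d+(?:\\.\\d+)?\\b", "?"); // standalone numeric literals
    }
}
```

Per-pattern statistics are then keyed on the fingerprint rather than the raw SQL.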

Multi-Tenant Connection Pooling

Multi-tenant systems add complexity: tenants need isolation from one another, but a dedicated pool per tenant is resource-intensive. Three patterns exist. Schema-per-tenant with a shared pool: all tenants share one pool, queries set the search path (SET search_path TO tenant_123) before execution. Simple but requires careful session state cleanup on connection return. Pool-per-tenant: each tenant gets a dedicated pool. Strong isolation, but with 1000 tenants at min_connections=2 each, that’s 2000 persistent connections — unmanageable without connection multiplexing middleware. PgBouncer as a connection multiplexer sits between application and PostgreSQL, maintaining a smaller number of server-side connections and multiplexing many client connections onto them. In transaction-mode pooling, PgBouncer can serve 10,000+ application connections through 100 server connections, reducing PostgreSQL connection overhead drastically. The tradeoff: session-level features (prepared statements, advisory locks, SET commands) don’t work in transaction mode.
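The schema-switching pattern hinges on constructing the search_path statement safely and resetting session state on return. A sketch of that helper, with the tenant_<id> naming convention taken from the example above and the identifier whitelist as an assumption.

```java
// Build the per-tenant search_path statement, allowing only a safe identifier
// character set so a tenant id can never smuggle SQL into the session.
class TenantSchema {
    static String searchPathSql(String tenantId) {
        if (!tenantId.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unsafe tenant id: " + tenantId);
        }
        return "SET search_path TO tenant_" + tenantId;
    }

    // Run on connection return so the next borrower cannot see
    // the previous tenant's schema.
    static String resetSql() {
        return "RESET search_path";
    }
}
```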

Circuit Breaker Integration

A database outage without a circuit breaker causes thread exhaustion: all application threads pile up waiting for connections that never come, consuming memory and preventing recovery. The circuit breaker pattern wraps pool borrow operations with failure counting logic. Three states: CLOSED (normal operation), OPEN (fast-fail all requests), HALF-OPEN (probe with limited traffic). Consecutive connection failures (e.g., 5 failures in 10 seconds) trip the breaker to OPEN state. In OPEN state, borrow attempts immediately throw an exception rather than waiting for the pool timeout — this frees threads instantly and allows the application to serve degraded responses (cached data, error messages) rather than hanging. After a cooldown period (e.g., 30 seconds), the breaker transitions to HALF-OPEN and allows a single probe request. If the probe succeeds, the breaker resets to CLOSED; if it fails, it returns to OPEN. Libraries like Resilience4j implement this pattern with configurable thresholds and integrate cleanly with Spring Boot and connection pool configuration.
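The three-state machine can be sketched compactly. This simplifies the windowed threshold described above ("5 failures in 10 seconds") to consecutive failures, and takes the clock as a parameter so transitions are testable; a library like Resilience4j handles the windowing, concurrency, and metrics this sketch omits.

```java
// Minimal circuit breaker: CLOSED until failureThreshold consecutive failures,
// then OPEN (fast-fail) until cooldownMillis elapses, then HALF_OPEN for one probe.
class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long cooldownMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMillis = 0;

    CircuitBreaker(int failureThreshold, long cooldownMillis) {
        this.failureThreshold = failureThreshold;
        this.cooldownMillis = cooldownMillis;
    }

    // Called before borrowing: false means fast-fail without touching the pool.
    boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= cooldownMillis) {
            state = State.HALF_OPEN; // cooldown elapsed: allow one probe
        }
        return state != State.OPEN;
    }

    void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN; // a failed probe, or too many failures, re-opens
            openedAtMillis = nowMillis;
        }
    }

    State state() { return state; }
}
```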

Frequently Asked Questions

What is the formula for optimal connection pool size and what are HikariCP’s defaults?

The widely cited formula from the HikariCP documentation (derived from the PostgreSQL wiki) is: pool size = (core count × 2) + effective_spindle_count. For a 4-core server with SSD storage (spindle count ≈ 1), that yields a pool of 9. This counterintuitively small number reflects the fact that more connections than CPU cores causes context-switch overhead and lock contention that reduces throughput. HikariCP defaults: maximumPoolSize=10, minimumIdle=10 (i.e., a fixed pool), connectionTimeout=30000ms, idleTimeout=600000ms, maxLifetime=1800000ms. In practice you should benchmark your specific workload: I/O-heavy queries with long wait times can justify larger pools because threads block on network/disk rather than CPU.

How do you detect and resolve connection pool exhaustion?

Detection: instrument pool metrics (active connections, pending waiters, acquisition time p99) via JMX or Micrometer and alert when pending waiters > 0 for more than a few seconds, or when connectionTimeout exceptions appear in logs. HikariCP logs a warning with a thread dump when a connection cannot be acquired within half the connectionTimeout. Resolution paths: (1) increase pool size if CPU headroom exists; (2) identify and fix slow queries holding connections too long; (3) add a read replica and route read traffic there to reduce primary pool pressure; (4) introduce a connection proxy layer (PgBouncer, ProxySQL) for transaction-mode pooling that multiplexes many app threads onto fewer backend connections; (5) implement circuit breakers so a downstream slowdown doesn’t cascade into exhaustion.

How do you route read replica traffic from within a connection pool?

The cleanest approach is to maintain two separate pool instances, one pointing at the primary and one at the replica set, and expose them as named DataSource beans (e.g., primaryDataSource and readDataSource). Application code or a routing DataSource wrapper (Spring’s AbstractRoutingDataSource) selects the appropriate pool based on the transaction read-only flag or an explicit annotation (@Transactional(readOnly=true)). Alternatively, a proxy layer like ProxySQL or AWS RDS Proxy can perform read/write splitting transparently at the network level without application changes. Key concerns: replica lag means reads may return stale data; you must decide per use-case whether eventual consistency is acceptable, and fall back to the primary for reads that require strong consistency immediately after a write.

How do you implement slow query detection and logging via a connection pool wrapper?

Wrap the JDBC Connection or DataSource with a proxy (P6Spy, datasource-proxy, or a custom InvocationHandler) that records the wall-clock time around each Statement.execute() call. If execution time exceeds a configurable threshold (e.g., 500ms), log the full SQL, bound parameters, execution time, and calling thread/stack trace. This avoids relying solely on database-side slow query logs, which may not capture client-perceived latency including network time. In HikariCP you can set leakDetectionThreshold to log connections held open longer than a threshold, which catches cases where application code forgets to close a connection or holds it across slow external calls. Combine with distributed tracing (OpenTelemetry) to correlate slow queries with upstream request traces.

What strategies handle multi-tenant connection pooling efficiently?

Three main patterns: (1) Pool-per-tenant: each tenant gets a dedicated pool pointing at their schema or database. Strong isolation and easy per-tenant limits, but memory overhead grows linearly with tenant count, so it is practical only for hundreds of tenants. (2) Shared pool with schema switching: a single pool connects to a shared database; on borrow, the connection runs SET search_path or USE to switch tenant context. Lower overhead, but schema-switch bugs can leak data between tenants, and switching adds latency per acquisition. (3) Proxy-based pooling (PgBouncer per tenant, or a smart proxy that maps tenant ID to backend): combines multiplexing efficiency with isolation. For large SaaS platforms, pattern 3 or a tiered approach (dedicated pools for large tenants, shared pool for small ones) is typical.

