GraphQL is a query language for APIs where clients specify exactly the data they need and receive nothing more. Unlike REST (where the server defines fixed response shapes), GraphQL lets clients compose queries from a typed schema. This eliminates over-fetching (receiving unused fields) and under-fetching (making multiple requests to assemble data). GraphQL is particularly well-suited for frontend applications with complex, varying data requirements across different views.
Schema Design
The GraphQL schema is the API contract. Define types (objects, scalars, enums, unions, interfaces) and operations (queries for reads, mutations for writes, subscriptions for real-time updates). Design from the client’s perspective: types should model domain objects as the UI presents them, not as the database stores them. Avoid leaking database implementation details (column names, join tables) into the schema. Use meaningful names: Order, not OrderRecord; cancelOrder, not updateOrderStatus. Define pagination on list fields: the connections pattern (edges/nodes/pageInfo with cursor-based pagination) is the de facto standard for production GraphQL APIs. Use non-null types (!) where null would be unexpected; this spares clients from null checks on fields that are always present.
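To make the connections pattern concrete, here is a minimal Python sketch of cursor-based pagination over an in-memory list. The names paginate, encode_cursor, and decode_cursor are illustrative assumptions, and the cursor simply wraps a list offset; a production cursor would more likely encode a stable sort key such as an ID or timestamp.

```python
import base64
from typing import Optional


def encode_cursor(offset: int) -> str:
    # Opaque cursor: clients should treat it as a token, not parse it.
    return base64.b64encode(f"offset:{offset}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])


def paginate(items: list, first: int, after: Optional[str] = None) -> dict:
    """Return a connection-shaped page: edges (cursor + node) plus pageInfo."""
    start = decode_cursor(after) + 1 if after is not None else 0
    window = items[start:start + first]
    edges = [
        {"cursor": encode_cursor(start + i), "node": node}
        for i, node in enumerate(window)
    ]
    return {
        "edges": edges,
        "pageInfo": {
            "endCursor": edges[-1]["cursor"] if edges else None,
            "hasNextPage": start + len(window) < len(items),
        },
    }
```

A client would pass the endCursor of one page as the after argument of the next request and stop once hasNextPage is false.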
N+1 Query Problem and DataLoader
The N+1 problem is GraphQL’s most common performance pitfall. A query for 100 orders, each with a user, triggers 1 query for the orders plus 100 queries for the users (one per order), 101 queries in total. DataLoader (by Facebook) solves this with batching and caching. Instead of fetching one user at a time, DataLoader collects all user_id lookup requests made within a single execution tick, deduplicates them, makes one batch query (SELECT * FROM users WHERE id IN (…)), and resolves each individual request from the batch result. Create a new DataLoader instance for every request: its cache then lives only as long as that request, which prevents cached data from leaking between users. DataLoader, or an equivalent batching layer, is effectively required for any production GraphQL API with nested relationships.
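The batching behavior can be sketched with asyncio. This is not the DataLoader library itself, only a simplified illustration of the same idea; SimpleLoader, batch_users, resolve_order_users, and the BATCH_CALLS log are hypothetical names, and the batch function stands in for a real database query.

```python
import asyncio


class SimpleLoader:
    """Batches load() calls made in the same event-loop tick (one per request)."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async: list of keys -> {key: value}
        self._futures = {}         # key -> Future; deduplicates and caches
        self._pending = []         # keys waiting for the next batch
        self._task = None

    def load(self, key):
        if key in self._futures:
            return self._futures[key]
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._futures[key] = future
        if not self._pending:
            # First key of this tick: schedule a single dispatch for the batch.
            loop.call_soon(self._start_dispatch)
        self._pending.append(key)
        return future

    def _start_dispatch(self):
        self._task = asyncio.ensure_future(self._dispatch())

    async def _dispatch(self):
        keys, self._pending = self._pending, []
        try:
            rows = await self._batch_fn(keys)
        except Exception as exc:
            for key in keys:
                self._futures[key].set_exception(exc)
            return
        for key in keys:
            self._futures[key].set_result(rows.get(key))


BATCH_CALLS = []  # records each batch so the effect is observable


async def batch_users(user_ids):
    # Stand-in for one SELECT ... WHERE id IN (...) query.
    BATCH_CALLS.append(list(user_ids))
    return {uid: {"id": uid, "name": f"user-{uid}"} for uid in user_ids}


async def resolve_order_users(orders):
    loader = SimpleLoader(batch_users)  # new loader for every request
    return await asyncio.gather(*(loader.load(o["user_id"]) for o in orders))
```

Calling asyncio.run(resolve_order_users(orders)) for 100 orders issues one call to batch_users containing only the distinct user IDs, rather than 100 separate lookups.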
Query Complexity and Depth Limiting
GraphQL clients can request arbitrarily deep or wide queries. A malicious client could send: { users { friends { friends { friends { posts { comments { author { … } } } } } } } }. Because every level multiplies the number of objects to resolve, the number of database queries behind such a query can grow exponentially with its depth. Protect with three limits: a query depth limit (reject queries deeper than N levels, typically 7-10), a query complexity limit (assign a cost to each field and reject queries whose total cost exceeds a threshold; list fields cost more than scalar fields), and a query timeout (abort execution that exceeds a wall-clock time budget). Libraries such as graphql-depth-limit and graphql-query-complexity implement the first two; graphql-shield addresses a related but separate concern, field-level permissions. Rate limit by query complexity rather than request count, since a single complex query can consume as many resources as many simple ones.
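The depth and complexity checks can be sketched as follows. This is a simplified model, not how any of the named libraries work internally: the parsed query is represented as nested dicts (field name to sub-selection, None for a scalar leaf), and LIST_SIZES, the cost formula, and the default limits are assumptions chosen for illustration.

```python
# Assumed typical page sizes for list fields; anything else counts as one item.
LIST_SIZES = {"users": 50, "friends": 50, "posts": 20, "comments": 20}


def depth(selection) -> int:
    """Number of nested selection levels; a scalar leaf (None) adds nothing."""
    if not selection:
        return 0
    return 1 + max(depth(sub) for sub in selection.values())


def cost(selection) -> int:
    """Each field costs 1; a list field multiplies the cost of its children."""
    total = 0
    for field, sub in selection.items():
        child_cost = cost(sub) if sub else 0
        total += 1 + LIST_SIZES.get(field, 1) * child_cost
    return total


def check_query(selection, max_depth: int = 7, max_cost: int = 10_000):
    """Raise ValueError before execution if either limit is exceeded."""
    d, c = depth(selection), cost(selection)
    if d > max_depth:
        raise ValueError(f"query depth {d} exceeds limit {max_depth}")
    if c > max_cost:
        raise ValueError(f"query cost {c} exceeds limit {max_cost}")
    return d, c
```

Under these assumptions, { users { name } } has depth 2 and cost 51, while a selection nested through several list fields is rejected before any resolver runs.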
Persisted Queries
In production, clients send the same queries repeatedly. Persisted queries replace the full query string with a hash: the client sends {id: "sha256_hash"} instead of the full query text, and the server maps the hash to the pre-registered query string. Benefits: smaller request payloads (a hash instead of the full query text), the ability to reject arbitrary queries (only allowlisted hashes are accepted, so clients cannot run queries that were never registered), and CDN cacheability for GET-based queries (the URL carries a short hash rather than the full query text). Persisted queries are registered at build time; the registry is stored server-side (Redis, a database, or embedded in the server). Automatic persisted queries (APQ) instead register queries dynamically on first execution; because any client can register a query this way, APQ reduces payload size but does not by itself act as an allowlist.
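A build-time registry of this kind can be sketched in a few lines. PersistedQueryRegistry and UnknownQueryError are hypothetical names, and the in-memory dict stands in for whatever server-side store (Redis, a database) is actually used.

```python
import hashlib


class UnknownQueryError(Exception):
    """Raised when a client sends a hash that was never registered."""


class PersistedQueryRegistry:
    def __init__(self):
        self._by_hash = {}  # sha256 hex digest -> query text

    def register(self, query: str) -> str:
        """Called at build time; returns the ID the client will send."""
        query_id = hashlib.sha256(query.encode("utf-8")).hexdigest()
        self._by_hash[query_id] = query
        return query_id

    def resolve(self, query_id: str) -> str:
        """Called per request; unregistered hashes are rejected."""
        try:
            return self._by_hash[query_id]
        except KeyError:
            raise UnknownQueryError(query_id) from None
```

Rejecting unknown hashes in resolve is what turns the registry into an allowlist; an APQ-style server would instead ask the client to resend the full query text and store it.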
Subscriptions for Real-Time
GraphQL subscriptions push real-time updates to clients. The client subscribes: subscription { orderStatusUpdated(orderId: "42") { status, updatedAt } }. The server then sends a message each time the order status changes. Transport options are WebSocket (most common, using the graphql-ws protocol) or Server-Sent Events (SSE; simpler, but unidirectional). On the backend, when an order status changes, publish an event to a pub/sub channel (for example Redis pub/sub); every subscription server subscribed to that channel pushes the event to its matching client subscriptions. Filter subscriptions on the server (only push to clients subscribed to the specific order ID) to avoid unnecessary traffic. Scale subscription servers horizontally with the same pattern used for WebSocket servers: sticky sessions so each client stays on the server holding its connection, plus pub/sub so every server sees every event.
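The server-side filtering step can be sketched with an in-memory hub. OrderStatusHub is a hypothetical name; the publish method stands in for a handler attached to the pub/sub channel, and each send callback stands in for writing to one client's WebSocket or SSE connection.

```python
from collections import defaultdict


class OrderStatusHub:
    """Tracks which client connections care about which order ID."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # order ID -> send callbacks

    def subscribe(self, order_id: str, send):
        self._subscribers[order_id].append(send)

        def unsubscribe():
            self._subscribers[order_id].remove(send)

        return unsubscribe

    def publish(self, event: dict) -> int:
        """Deliver an event only to subscribers of that order; return the count."""
        payload = {
            "orderStatusUpdated": {
                "status": event["status"],
                "updatedAt": event["updatedAt"],
            }
        }
        targets = list(self._subscribers.get(event["orderId"], []))
        for send in targets:
            send(payload)
        return len(targets)
```

Keying subscribers by order ID means an event for one order never touches the connections of clients watching a different order.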
Federation for Microservices
GraphQL Federation (Apollo Federation) composes multiple GraphQL subgraph schemas into a unified supergraph. Each service owns its own GraphQL schema and resolvers. The federated gateway routes each field to the correct service. Example: the User service owns the User type; the Order service owns the Order type and extends the User type with an orders field, so the gateway resolves User.orders by calling the Order service with the user_id. This allows teams to evolve their schemas independently while presenting a unified API to clients. The gateway handles query planning: splitting client queries into subgraph sub-queries, executing them in parallel where possible, and assembling the final response.
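The User.orders example can be illustrated with a toy gateway. This is only a conceptual sketch with hypothetical function names and in-memory data; a real federated gateway generates a query plan and talks to subgraphs over GraphQL rather than through hand-written calls.

```python
# Hypothetical in-memory stand-ins for two subgraph services.
USERS = {"u1": {"id": "u1", "name": "Ada"}}
ORDERS = [
    {"id": "o1", "user_id": "u1", "total": 30},
    {"id": "o2", "user_id": "u2", "total": 12},
    {"id": "o3", "user_id": "u1", "total": 8},
]


def user_service_get_user(user_id: str) -> dict:
    """User subgraph: owns the User type's own fields."""
    return dict(USERS[user_id])


def order_service_orders_for_user(user_id: str) -> list:
    """Order subgraph: contributes the orders field to User, keyed by user ID."""
    return [
        {"id": o["id"], "total": o["total"]}
        for o in ORDERS
        if o["user_id"] == user_id
    ]


def gateway_resolve_user(user_id: str, want_orders: bool) -> dict:
    """Gateway: fetch the user first, then fill User.orders from the Order subgraph."""
    user = user_service_get_user(user_id)
    if want_orders:
        user["orders"] = order_service_orders_for_user(user["id"])
    return user
```

The second call depends on the ID returned by the first, so these two steps run sequentially; fields with no such dependency are the ones a gateway can fetch in parallel.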
Caching Challenges
GraphQL requests sent as POST are generally not cached by HTTP caches: caches key on method and URL and typically do not store POST responses, while the query that distinguishes one request from another sits in the request body. Solutions: GET requests for queries (include the query and variables in the URL query string; cacheable by a CDN, but limited by URL length), persisted queries (hash-based GET requests are short and cacheable), and response caching at the resolver level (cache individual field resolutions in Redis keyed on field + arguments, which is finer-grained than HTTP response caching). For authenticated queries, GraphQL endpoints typically send Cache-Control: no-store, because per-user data must not be stored by a shared cache. Public, unauthenticated queries (product catalog, public content) can use CDN caching with appropriate Cache-Control headers and persisted query GET requests.
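Resolver-level caching keyed on field + arguments can be sketched as below. FieldCache is a hypothetical name, the in-memory dict stands in for Redis, and the TTL-based expiry is an assumption rather than something the text prescribes.

```python
import json
import time


class FieldCache:
    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # cache key -> (expires_at, value)

    @staticmethod
    def key(field: str, args: dict) -> str:
        # sort_keys makes the key independent of argument order.
        return f"{field}:{json.dumps(args, sort_keys=True)}"

    def get_or_resolve(self, field: str, args: dict, resolver):
        """Return a fresh cached value, or call resolver(**args) and store it."""
        cache_key = self.key(field, args)
        now = self._clock()
        entry = self._entries.get(cache_key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = resolver(**args)
        self._entries[cache_key] = (now + self._ttl, value)
        return value
```

For per-user data, the key would also need to include the user's identity (or the field should skip this cache entirely), for the same reason shared HTTP caches must not store authenticated responses.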