Every system you design needs an API. In system design interviews, choosing the right API style — and explaining why — separates senior candidates from mid-level ones. The choice between REST, GraphQL, and gRPC has real implications for performance, developer experience, and operational complexity.
Strategy
Don’t approach this as a dogmatic choice. Each style has a home. The interview question is usually implicit: you’re designing a system and you need to decide how clients talk to it. Pick the API style that fits the requirements, then justify it. If you say “REST” by default without considering the use case, an experienced interviewer will push back.
REST
Representational State Transfer. Resources are identified by URLs; actions are expressed via HTTP verbs (GET, POST, PUT, PATCH, DELETE). Stateless — each request contains all the information needed to fulfill it.
GET /users/123 → fetch user
POST /users → create user
PUT /users/123 → replace user
PATCH /users/123 → partial update
DELETE /users/123 → delete user
Pros:
- Simple, well-understood, cacheable by default (GET requests).
- Works with any HTTP client — browsers, curl, every language’s HTTP library.
- Stateless design maps naturally to horizontal scaling and load balancers.
- Responses are human-readable JSON; easy to debug.
- Enormous ecosystem: OpenAPI/Swagger, Postman, API gateways, rate limiting middleware.
Cons:
- Over-fetching: `GET /users/123` returns the entire user object even if the client only needs the name.
- Under-fetching: Getting a user's posts and comments requires multiple round trips (`GET /users/123`, `GET /users/123/posts`, `GET /posts/456/comments`).
- No standard for real-time; you bolt on WebSockets or Server-Sent Events separately.
- Versioning is awkward: `/v1/users`, `/v2/users` means multiple versions to maintain.
When to use REST: Public APIs (third-party developers need predictability), CRUD-heavy services, browser-facing APIs, anything that benefits from HTTP caching. Most company APIs are REST.
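The resource/verb mapping above can be sketched as a tiny dispatcher. This is an illustrative in-memory sketch, not a real framework: the `handle()` signature, the `users` store, and the sample data are all assumptions made for the example.

```python
# Minimal sketch of REST routing semantics over an in-memory store.
import json
import re

users = {"123": {"id": "123", "name": "Ada", "email": "ada@example.com"}}
_next_id = 124

def handle(method, path, body=None):
    """Dispatch (method, path) to CRUD operations, REST-style."""
    global _next_id
    m = re.fullmatch(r"/users/(\w+)", path)
    if path == "/users" and method == "POST":
        user = {"id": str(_next_id), **json.loads(body)}
        users[user["id"]] = user
        _next_id += 1
        return 201, user          # 201 Created
    if m:
        uid = m.group(1)
        if uid not in users:
            return 404, {"error": "not found"}
        if method == "GET":
            return 200, users[uid]
        if method == "PUT":       # full replacement
            users[uid] = {"id": uid, **json.loads(body)}
            return 200, users[uid]
        if method == "PATCH":     # partial update
            users[uid].update(json.loads(body))
            return 200, users[uid]
        if method == "DELETE":
            del users[uid]
            return 204, None      # 204 No Content
    return 405, {"error": "method not allowed"}
```

Note how each request carries everything needed to fulfill it: the dispatcher keeps no per-client session, which is the statelessness that makes horizontal scaling straightforward.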
GraphQL
A query language for APIs developed by Facebook in 2012, open-sourced in 2015. Clients specify exactly what data they need in a single request; the server returns exactly that — nothing more, nothing less.
# Client sends this query
query {
user(id: "123") {
name
email
posts(last: 3) {
title
publishedAt
comments(first: 5) {
body
author { name }
}
}
}
}
# Server returns exactly this shape — no over or under-fetching
Pros:
- Eliminates over-fetching and under-fetching — clients get exactly what they ask for.
- Single endpoint (`/graphql`); no URL proliferation.
- Strongly typed schema: the schema is documentation. Clients and servers both validate against it.
- Excellent for mobile clients with limited bandwidth — request only the fields you display.
- Introspection — clients can query the API to discover its own schema.
Cons:
- N+1 problem: Naïvely resolving nested queries hits the database once per object. Must use a DataLoader pattern (batching) to avoid. Easy to miss in a code review.
- HTTP caching doesn’t work out of the box — all queries are POST requests to the same endpoint.
- Complex queries can be expensive — a malicious client can craft a deeply nested query that DoSes your DB. Requires query depth limiting and complexity analysis.
- Steeper learning curve and more infrastructure (schema, resolvers, DataLoader).
- Overkill for simple CRUD APIs.
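The N+1 fix mentioned above can be sketched in a few lines. This is a simplified, synchronous sketch of the DataLoader batching idea (real DataLoaders use promises/futures and per-request caching); `batch_load_authors`, the fake `AUTHORS` table, and the deferred-lambda trick are all illustrative assumptions.

```python
db_calls = 0  # counts round trips to the "database"
AUTHORS = {1: "Ada", 2: "Grace", 3: "Barbara"}

def batch_load_authors(ids):
    """One query for many keys: SELECT ... WHERE id IN (ids)."""
    global db_calls
    db_calls += 1
    return {i: AUTHORS[i] for i in ids}

class DataLoader:
    """Collects keys during a resolution pass, then loads them in one batch."""
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.queue = []
    def load(self, key):
        self.queue.append(key)
        return lambda results: results[key]  # deferred lookup
    def dispatch(self):
        return self.batch_fn(set(self.queue))

loader = DataLoader(batch_load_authors)
# Resolving 3 comments naively would be 3 DB calls; batched it is 1.
deferred = [loader.load(i) for i in (1, 2, 3)]
results = loader.dispatch()
names = [d(results) for d in deferred]
```

The point to make in an interview: without batching, resolving N comment authors costs N queries; the loader collapses them into one `IN (...)` query per resolution pass.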
When to use GraphQL: Frontend-heavy applications where multiple clients (web, iOS, Android) consume the same data differently. Aggregation layer over multiple microservices (BFF, the Backend for Frontend pattern). Facebook, GitHub, and Shopify use GraphQL for their main APIs.
gRPC
Google’s Remote Procedure Call framework. Defines services in Protocol Buffers (protobuf) — a binary, strongly-typed schema. The framework auto-generates client and server code in any language.
# users.proto
syntax = "proto3";
service UserService {
rpc GetUser (GetUserRequest) returns (User);
rpc ListUsers (ListUsersRequest) returns (stream User); // server-side streaming
}
message GetUserRequest {
string user_id = 1;
}
message User {
string id = 1;
string name = 2;
string email = 3;
}
Pros:
- Performance: Protobuf binary serialization is 5–10× smaller and faster to parse than JSON. HTTP/2 multiplexing means multiple RPC calls share one TCP connection.
- Streaming: Native support for client streaming, server streaming, and bidirectional streaming — built into the protocol, not bolted on.
- Code generation: From the `.proto` file, gRPC generates type-safe clients in Go, Python, Java, C++, etc. No manual HTTP client code.
- Strongly typed: a schema mismatch is a compile error, not a runtime surprise.
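Part of protobuf's compactness comes from varint encoding: integers take as few bytes as their magnitude needs, and fields are identified by 1-byte numeric tags rather than repeated JSON key names. A minimal varint encoder (pure Python, for illustration; real code would use the `protobuf` library) shows the idea:

```python
import json

def encode_varint(n):
    """Protobuf base-128 varint: 7 data bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(bits)         # final byte
            return bytes(out)

# The integer 300 costs 2 bytes as a varint vs 3 bytes as JSON text,
# and a protobuf field adds only a small tag byte, with no key names at all.
wire = encode_varint(300)
text = json.dumps(300).encode()
```

The gap widens in real messages, where JSON repeats every field name in every object while protobuf never sends names over the wire.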
Cons:
- Not browser-native. Browsers don't expose the HTTP/2 framing and trailers that gRPC requires, so browser clients need a translation proxy (grpc-web). Poor choice for public-facing browser APIs.
- Protobuf binary is not human-readable — harder to debug without tooling.
- Requires protobuf toolchain and generated code in each language. More setup than REST.
- Less familiar to frontend developers; steeper adoption curve.
When to use gRPC: Internal service-to-service communication (microservices). High-throughput, low-latency internal APIs. Streaming scenarios (real-time data, bidirectional chat at the service layer). Polyglot microservices where generated clients eliminate boilerplate. Netflix, Square, Lyft, and most large microservices architectures use gRPC internally.
Side-by-Side Comparison
| Aspect | REST | GraphQL | gRPC |
|--------|------|---------|------|
| Protocol | HTTP/1.1 or 2 | HTTP/1.1 or 2 | HTTP/2 |
| Data format | JSON (usually) | JSON | Protobuf (binary) |
| Schema | OpenAPI (optional) | Required (SDL) | Required (proto) |
| Caching | Easy (GET is cacheable) | Hard (all POST) | Hard |
| Streaming | Bolted on (SSE/WS) | Subscriptions (WS) | Native (4 modes) |
| Browser support | ✓ | ✓ | Limited (grpc-web) |
| Performance | Good | Good | Best |
| Learning curve | Low | Medium | Medium-high |
| Best for | Public APIs, CRUD | Flexible client queries | Internal services |
API Versioning
REST versioning strategies, since interviewers ask:
- URL versioning: `/v1/users`, `/v2/users`. Simple, explicit. Most common.
- Header versioning: `Accept: application/vnd.company.v2+json`. Cleaner URLs, but less visible.
- Query param: `/users?version=2`. Easy to test, but messy at scale.
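Header versioning needs a small negotiation step on the server. A sketch of parsing the vendor media type above (the `vnd.company` vendor string and the fallback default are assumptions for the example):

```python
import re

DEFAULT_VERSION = 1

def negotiate_version(accept_header):
    """Extract the API version from a vendor media type like
    application/vnd.company.v2+json; fall back to the default."""
    m = re.search(r"application/vnd\.company\.v(\d+)\+json", accept_header or "")
    return int(m.group(1)) if m else DEFAULT_VERSION
```

Falling back to a default (rather than rejecting the request) is what keeps old clients working when they send a plain `Accept: application/json`.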
GraphQL and gRPC handle versioning differently: you evolve the schema by adding fields (never removing or renaming) and deprecating old ones. No version numbers in URLs.
Rate Limiting at the API Layer
Always worth mentioning. Common strategies:
- Token bucket: Each client gets a bucket of N tokens. Each request consumes one token. Tokens refill at a fixed rate. Allows bursts.
- Fixed window: Count requests per window (e.g. per minute). Simple, but allows a burst at the window boundary: with a limit of 100/minute, a client can send 100 requests at :59 and 100 more at :01, i.e. 200 requests within 2 seconds.
- Sliding window: More accurate. Count requests in the last 60 seconds using a rolling timestamp log.
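The token bucket above is short enough to write out in an interview. A minimal single-process sketch (the injectable `now` clock is an assumption added to make it testable; a production version would keep the state in Redis, as noted below):

```python
import time

class TokenBucket:
    """Token-bucket limiter: `capacity` tokens, refilled at `rate` per second.
    Each request spends one token; bursts up to `capacity` are allowed."""

    def __init__(self, capacity, rate, now=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.now = now                   # injectable clock for testing
        self.tokens = float(capacity)    # start full
        self.last = now()

    def allow(self):
        t = self.now()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A usage sketch with a fake clock: a bucket of capacity 2 absorbs a 2-request burst, rejects the third, and admits one more after a second of refill.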
Store rate limit counters in Redis for shared state across load-balanced API servers.
Summary
Use REST for public APIs and simple CRUD — it’s universal, cacheable, and well-understood. Use GraphQL when multiple clients consume the same data differently and over/under-fetching is a real problem. Use gRPC for internal service-to-service communication where performance and strong typing matter. In a system design interview, state your choice and defend it against the requirements — don’t just default to REST without thinking.
Related System Design Topics
- SQL vs NoSQL — your API will need a backing store; the API style influences which DB model fits best.
- Load Balancing — REST APIs are stateless and horizontally scalable behind a load balancer; gRPC uses HTTP/2 which changes how the LB handles connections.
- Caching Strategies — REST GET requests are cacheable at the HTTP layer; GraphQL and gRPC require application-level caching.
- Design a URL Shortener — a worked example that uses a REST API with the design decisions covered here.