WebSockets are easy to demo and hard to run at scale. Senior frontend interviews probe whether you understand the production realities — connection management, server-side scaling, auth, and the dozen ways a real-time system can fail.
The basic flow
- Client opens a WebSocket: `new WebSocket('wss://...')` triggers an HTTP upgrade handshake
- The upgraded connection is a persistent TCP connection
- Both sides can send messages at any time
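A minimal client-side sketch of that flow (the `/live` endpoint is hypothetical). Deriving the scheme from the page origin mirrors the http/https split: `wss` is the TLS variant.

```javascript
// Derive the WebSocket URL from the page origin: https → wss, http → ws.
function socketUrl(origin, path) {
  const scheme = origin.startsWith('https') ? 'wss' : 'ws';
  return `${scheme}://${new URL(origin).host}${path}`;
}

// In the browser (sketch):
// const ws = new WebSocket(socketUrl(location.origin, '/live'));
// ws.onopen = () => ws.send('hello');        // either side can send
// ws.onmessage = (e) => console.log(e.data); // messages arrive as events
```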
Authentication
HTTP cookies sent during upgrade work for same-origin. Cross-origin needs:
- Token in URL (logged in server logs — avoid)
- First-message auth: client sends auth message after connect; server validates before processing
- Custom subprotocol header: auth token in `Sec-WebSocket-Protocol`
Token expiry handling: when the token expires, the server closes the connection with a specific close code; the client refreshes the token and reconnects.
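A sketch of first-message auth on the server side. The message shape and `verifyToken` callback are illustrative, not a specific library's API; the 4xxx close codes use the application-defined range (4000–4999) from the WebSocket spec.

```javascript
// First-message auth: the client's first frame must be {type: 'auth', token}.
// Returns either {ok: true} or {ok: false, close: code} so the caller can
// close the socket with an application-defined code (4000–4999 range).
function handleFirstMessage(raw, verifyToken) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { ok: false, close: 4400 }; // malformed first frame
  }
  if (msg.type !== 'auth' || !verifyToken(msg.token)) {
    return { ok: false, close: 4401 }; // unauthenticated
  }
  return { ok: true };
}
```

On expiry, the server can close with the same 4401-style code; the client treats it as "refresh token, then reconnect" rather than plain backoff.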
Reconnection
Networks drop. Standard reconnect:
- Exponential backoff (1s, 2s, 4s, 8s)
- Cap at 30s
- Indicate to user: “Reconnecting…”
- Reset timer on successful reconnect
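The backoff schedule above can be computed as a pure function. The jitter term is an addition beyond the list (it spreads out reconnects so a server restart doesn't trigger a thundering herd):

```javascript
// Exponential backoff: 1s, 2s, 4s, 8s, ... capped at 30s.
// "Equal jitter" randomizes the second half of the delay so clients
// don't all reconnect in lockstep (jitter is an assumption, not from the list).
function backoffDelay(attempt, base = 1000, cap = 30000) {
  const exp = Math.min(base * 2 ** attempt, cap);
  return exp / 2 + Math.random() * (exp / 2);
}
```

On a successful reconnect, reset `attempt` to 0 so the next drop starts back at 1s.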
Libraries handle this: reconnecting-websocket, socket.io-client.
Message ordering on reconnect
The hard part. Strategies:
- Client tracks last received message ID
- On reconnect, sends “give me messages since X”
- Server replays missed messages
Without this, users miss messages during reconnect.
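A server-side sketch of the replay strategy: messages get monotonically increasing IDs, and the server keeps a bounded in-memory window it can replay from. The bound is an assumption; once a client falls behind the window, real systems fetch history from a datastore instead.

```javascript
// Bounded replay buffer: append assigns IDs, since(lastId) replays the gap.
class ReplayBuffer {
  constructor(limit = 1000) {
    this.limit = limit;
    this.messages = [];
    this.nextId = 1;
  }
  append(payload) {
    const msg = { id: this.nextId++, payload };
    this.messages.push(msg);
    if (this.messages.length > this.limit) this.messages.shift(); // evict oldest
    return msg;
  }
  since(lastId) {
    return this.messages.filter((m) => m.id > lastId);
  }
}
```

On reconnect the client sends its last received ID, and the server responds with `since(lastId)` before resuming live delivery.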
Heartbeat / ping
Networks may silently drop connections. Detect with heartbeat:
- Client sends ping every 30s
- Server responds with pong
- If no pong within timeout, declare connection dead and reconnect
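The heartbeat bookkeeping can be isolated from timers, which makes it testable. The 30s interval matches the list above; the 10s pong timeout is an assumption.

```javascript
// Heartbeat state machine with time passed in explicitly.
class Heartbeat {
  constructor(intervalMs = 30000, timeoutMs = 10000) {
    this.intervalMs = intervalMs;
    this.timeoutMs = timeoutMs;
    this.lastPingAt = -Infinity;
    this.awaitingPong = false;
  }
  // Time to send a ping? Only when not already waiting on a pong.
  shouldPing(now) {
    return !this.awaitingPong && now - this.lastPingAt >= this.intervalMs;
  }
  recordPing(now) {
    this.lastPingAt = now;
    this.awaitingPong = true;
  }
  recordPong() {
    this.awaitingPong = false;
  }
  // Dead if a ping went unanswered past the timeout → tear down and reconnect.
  isDead(now) {
    return this.awaitingPong && now - this.lastPingAt > this.timeoutMs;
  }
}
```

A driver would call `shouldPing`/`isDead` on a timer tick and `recordPong` from the message handler.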
The WebSocket protocol has built-in ping/pong frames; server-side libraries typically expose them, but the browser API does not, so client-initiated heartbeats are usually application-level messages.
Server-side scaling
WebSockets are stateful. Each connection ties to a server instance. Scaling concerns:
- Connection limits: each connection consumes a file descriptor; the Linux default soft limit is often 1024, commonly raised to 65K+ per process
- Memory: ~10–50KB per connection, depending on framework
- Sticky sessions: load balancer must route the same client to the same instance
For 100K+ concurrent connections, plan capacity carefully.
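A back-of-envelope capacity check, using roughly the midpoint of the 10–50KB range above (connection state only, before message buffers):

```javascript
// Rough memory estimate for connection state at 100K concurrent connections,
// assuming ~30 KB per connection (midpoint of the 10–50 KB range).
const perConnKB = 30;
const connections = 100_000;
const memoryGB = (connections * perConnKB) / (1024 * 1024);
console.log(memoryGB.toFixed(2)); // ≈ 2.86 GB before buffers
```

That is just baseline state; slow clients with full send buffers can multiply it, which is why backpressure (below) matters.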
Pub/sub for fanout
Broadcasting messages to many users:
- Server instances subscribe to a Redis pubsub channel
- App publishes to Redis
- All instances receive; broadcast to their connected clients
This pattern (or NATS, Kafka) is standard for chat, live updates, etc.
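The fanout shape can be sketched with an in-memory bus standing in for Redis, purely so the example is self-contained; in production the `Bus` would be a Redis client and `subscribe`/`publish` would be its pub/sub commands.

```javascript
// In-memory stand-in for Redis pub/sub (assumption: real code uses a Redis client).
class Bus {
  constructor() {
    this.subs = new Map();
  }
  subscribe(channel, fn) {
    if (!this.subs.has(channel)) this.subs.set(channel, new Set());
    this.subs.get(channel).add(fn);
  }
  publish(channel, msg) {
    for (const fn of this.subs.get(channel) ?? []) fn(msg);
  }
}

// Each server instance subscribes once, then fans out to its own clients.
class ServerInstance {
  constructor(bus) {
    this.clients = []; // stand-ins for connected sockets
    bus.subscribe('chat', (m) => this.broadcast(m));
  }
  broadcast(msg) {
    for (const c of this.clients) c.push(msg);
  }
}
```

The key property: a publish from any instance (or any backend process) reaches every instance, so no instance needs to know which clients are connected elsewhere.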
Channel / room management
Users care about specific topics (chat rooms, document IDs). Pattern:
- Client joins channel after connecting
- Server tracks which connections are in which channels
- Broadcast only to relevant channels
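The tracking structure behind that pattern is just a map from channel to the set of member connections (connection objects here are minimal stand-ins):

```javascript
// Room membership: Map<channel, Set<connection>>, broadcast only to members.
class Rooms {
  constructor() {
    this.rooms = new Map();
  }
  join(channel, conn) {
    if (!this.rooms.has(channel)) this.rooms.set(channel, new Set());
    this.rooms.get(channel).add(conn);
  }
  leave(channel, conn) {
    this.rooms.get(channel)?.delete(conn);
  }
  broadcast(channel, msg) {
    for (const conn of this.rooms.get(channel) ?? []) conn.send(msg);
  }
}
```

Remember to call `leave` (for every channel the connection joined) in the socket's close handler, or the sets leak dead connections.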
Backpressure
If a slow client cannot keep up:
- Its send buffer fills
- Left unchecked, the server runs out of memory
Implement explicit backpressure: drop non-critical messages, close clients that fall too far behind (rather than letting one connection affect the others), or signal “you fell behind, reconnect.”
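A policy sketch using `bufferedAmount` (the byte count of queued-but-unsent data, available on both the browser WebSocket and Node's ws library). The thresholds are assumptions to tune per workload:

```javascript
// Decide what to do before each send, based on queued-but-unsent bytes.
// Thresholds (1 MiB / 8 MiB) are illustrative, not from the source.
function backpressureAction(bufferedAmount, { dropAbove = 1 << 20, closeAbove = 8 << 20 } = {}) {
  if (bufferedAmount > closeAbove) return 'close'; // hopelessly behind: disconnect
  if (bufferedAmount > dropAbove) return 'drop';   // behind: skip non-critical messages
  return 'send';
}
```

A 'close' here pairs naturally with message replay: the client reconnects and asks for everything since its last received ID.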
Proxy and load balancer issues
- Some proxies idle-timeout WebSocket connections after 30–60 seconds
- HTTP/1.1 proxies may not support WebSocket upgrades
- nginx and HAProxy support WebSockets natively but need explicit config
Test with your actual deployment topology.
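For nginx, the explicit config is the Upgrade/Connection header pass-through plus a read timeout longer than the idle gaps you expect (values here are illustrative):

```nginx
location /ws/ {
    proxy_pass http://backend;
    proxy_http_version 1.1;                      # upgrades require HTTP/1.1
    proxy_set_header Upgrade $http_upgrade;      # forward the upgrade handshake
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;                    # don't idle-kill quiet connections
}
```

A heartbeat shorter than every timeout in the path (proxy, load balancer, NAT) keeps connections from being idle-killed regardless.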
Mobile-specific
- Backgrounded apps lose their WebSocket connection
- iOS app suspended? Connection dies. Reconnect on foreground.
- Cellular handoff (Wi-Fi → LTE) drops the connection
- For critical real-time, use push notifications as backup signal
Common mistakes
- No reconnection logic
- Token expires; connection silently dies
- No heartbeat; zombie connections accumulate
- No message replay; users miss messages on reconnect
- Single-server architecture; cannot scale
Frequently Asked Questions
Should I use Socket.io or native WebSocket?
Native WebSocket is leaner. Socket.io adds reconnection, long-polling fallback, and namespaces; that overhead is often worth it.
Can I run WebSockets on serverless?
Yes: API Gateway WebSocket APIs on AWS and Cloudflare Durable Objects both support it. Higher latency than dedicated servers, but easier ops.
How many concurrent WebSocket connections can a Node.js server handle?
Tens of thousands per instance with proper tuning. Beyond that, scale horizontally.