System Design Interview: IoT Platform (Connected Devices at Scale)

IoT system design questions are increasingly common at Amazon (AWS IoT), Google (Nest), Tesla, and other companies building connected hardware. The core challenges are managing millions of persistent device connections, ingesting high-frequency telemetry data, and pushing commands and firmware updates reliably to constrained devices.

What Makes IoT Different

  • Device constraints: microcontrollers with 256KB RAM, 2G/3G connectivity, intermittent connections, battery-powered
  • Volume: 10 million devices each sending telemetry every 30 seconds ≈ 333,000 messages/second
  • Bidirectional: devices send data up and receive commands down (unlike web APIs where clients always initiate)
  • Reliability: devices must not lose important commands (firmware updates, safety alerts) even if temporarily offline
  • Security: devices in the field cannot be easily patched — security must be robust from day one

Architecture Overview

[IoT Devices] ←→ MQTT ←→ [MQTT Broker Cluster]
                                    ↓
                          [Message Router / Rules Engine]
                          /          |          \
                [Telemetry       [Command       [Alert
                 Ingestion]       Handler]       Engine]
                    ↓                ↓               ↓
              [Kafka Stream]    [Device DB]    [PagerDuty/
                    ↓           (Cassandra)     Slack]
              [Time-Series DB]
              (InfluxDB/TimescaleDB)
                    ↓
              [Analytics / Dashboard]

MQTT Protocol

MQTT (Message Queuing Telemetry Transport) is the de facto standard messaging protocol for IoT. It uses a publish-subscribe model over persistent TCP connections, with very low overhead: the fixed header can be as small as 2 bytes.
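
To make the 2-byte minimum concrete, here is a minimal sketch of the MQTT fixed header: one byte carrying the packet type and flags, followed by a variable-length "remaining length" field of 1 to 4 bytes. The function names are illustrative and do not come from any MQTT library.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode the 'remaining length' field: 7 bits per byte, top bit = more bytes follow."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length must fit in 4 encoded bytes")
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80  # continuation bit
        out.append(byte)
        if n == 0:
            return bytes(out)


def fixed_header(packet_type: int, flags: int, remaining_length: int) -> bytes:
    """Packet type goes in the high nibble, flags in the low nibble."""
    return bytes([(packet_type << 4) | flags]) + encode_remaining_length(remaining_length)


PINGREQ = 12  # keep-alive ping: no variable header, no payload
PUBLISH = 3

print(fixed_header(PINGREQ, 0, 0).hex())    # c000 (the whole packet is 2 bytes)
print(len(fixed_header(PUBLISH, 0, 200)))   # 3 (lengths >= 128 need a second length byte)
```

A PUBLISH packet additionally carries the topic name and payload, so real telemetry messages are larger than 2 bytes; the point is that the protocol framing itself adds very little.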

Key MQTT concepts:

Topics: hierarchical namespace for messages
  devices/{device_id}/telemetry     -- device sends sensor data
  devices/{device_id}/status        -- device sends heartbeat
  devices/{device_id}/commands      -- server sends commands to device
  devices/{device_id}/firmware      -- server pushes OTA updates
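
Subscribers address this hierarchy with two wildcards: `+` matches exactly one level and `#` matches all remaining levels. The sketch below shows how a filter is compared against a concrete topic. It is a simplified illustration (it ignores the special handling of topics starting with `$`), and `topic_matches` is a hypothetical helper name.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# The ingestion service can receive telemetry from every device with one subscription.
print(topic_matches("devices/+/telemetry", "devices/dev_12345/telemetry"))  # True
# A device can listen to everything addressed to it.
print(topic_matches("devices/dev_12345/#", "devices/dev_12345/commands"))   # True
print(topic_matches("devices/+/telemetry", "devices/dev_12345/status"))     # False
```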

Quality of Service (QoS) levels:
  QoS 0: at-most-once (fire and forget), no acknowledgment
          Use for telemetry where occasional loss is acceptable
  QoS 1: at-least-once: receiver acknowledges with PUBACK; sender retransmits until acknowledged, so duplicates are possible
          Use for commands and alerts (handlers should be idempotent)
  QoS 2: exactly-once: four-packet handshake (PUBLISH, PUBREC, PUBREL, PUBCOMP)
          Use for billing events, firmware updates
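
Because QoS 1 can deliver the same message more than once, the receiving side usually has to tolerate duplicates. One common approach, sketched here under the assumption that every command carries an application-level `command_id` (a field not defined in the text above), is to remember recently seen IDs and ignore repeats.

```python
from collections import OrderedDict


class CommandHandler:
    """Applies each command at most once, even if it is delivered several times."""

    def __init__(self, max_remembered: int = 1000):
        self._seen = OrderedDict()  # command_id -> True, oldest first
        self._max = max_remembered
        self.applied = []           # actions actually executed, for illustration

    def handle(self, command: dict) -> bool:
        command_id = command["command_id"]
        if command_id in self._seen:
            return False            # duplicate redelivery: acknowledge but do nothing
        self._seen[command_id] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # forget the oldest ID to bound memory
        self.applied.append(command["action"])
        return True


handler = CommandHandler()
reboot = {"command_id": "cmd-001", "action": "reboot"}
print(handler.handle(reboot))  # True  (first delivery is applied)
print(handler.handle(reboot))  # False (retransmission is ignored)
print(handler.applied)         # ['reboot']
```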

Retained messages:
  Broker stores the last message published with the retain flag on each topic and delivers it to new subscribers
  A device that subscribes after an outage immediately receives the latest retained command

Last Will and Testament (LWT):
  Device registers a "will" message on connect
  If device disconnects unexpectedly, broker publishes the will
  Use for device_status: {status: "offline", timestamp: ...}
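
The interaction between retained messages and the last will is easier to see in a toy model. The class below is an in-memory simulation written for illustration only: it is not a real broker, it supports exact-match topics only, and publishing the will as a retained message is a design choice made in this sketch.

```python
from collections import defaultdict


class ToyBroker:
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.retained = {}                    # topic -> last retained payload
        self.wills = {}                       # client_id -> (topic, payload)

    def connect(self, client_id, will_topic, will_payload):
        # The will is registered at connect time and held by the broker.
        self.wills[client_id] = (will_topic, will_payload)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)
        if topic in self.retained:
            callback(topic, self.retained[topic])  # new subscriber gets the last value

    def publish(self, topic, payload, retain=False):
        if retain:
            self.retained[topic] = payload
        for callback in self.subscribers[topic]:
            callback(topic, payload)

    def disconnect(self, client_id, clean):
        will = self.wills.pop(client_id, None)
        if will is not None and not clean:
            # Unexpected disconnect: the broker publishes on the device's behalf.
            self.publish(will[0], will[1], retain=True)


broker = ToyBroker()
seen = []
status_topic = "devices/dev_12345/status"
broker.connect("dev_12345", status_topic, '{"status": "offline"}')
broker.publish(status_topic, '{"status": "online"}', retain=True)
broker.subscribe(status_topic, lambda topic, payload: seen.append(payload))
broker.disconnect("dev_12345", clean=False)  # simulate a dropped connection
print(seen)  # ['{"status": "online"}', '{"status": "offline"}']
```

The dashboard subscribes after the device came online, yet still sees "online" first (retained), then "offline" once the connection drops (will).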

MQTT Broker at Scale

A single-node broker such as Eclipse Mosquitto is typically sized for on the order of 100,000 connections, depending on hardware and message rate. At 10M devices, you need a clustered broker:

Options:
  1. EMQ X (EMQX): open-source, handles 10M connections on a cluster
  2. AWS IoT Core: managed service, scales automatically, ~$0.08/device/month (rough estimate; pricing is usage-based, driven by message count and connection time)
  3. HiveMQ: enterprise MQTT broker with clustering

Sharding connections across brokers:
  - Route devices by device_id hash to a specific broker node
  - Brokers share subscriptions and route cross-broker messages via internal Kafka
  - Device always reconnects to same broker node (sticky routing via DNS/LB)
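
One way to get a stable device-to-broker mapping is rendezvous (highest-random-weight) hashing, a variant of plain hash-based routing in which removing a node only moves the devices that were on that node. This is a sketch of that technique, not a description of how any particular broker product routes connections; the broker names are made up.

```python
import hashlib


def pick_broker(device_id: str, brokers: list[str]) -> str:
    """Choose the broker with the highest hash score for this device."""

    def score(broker: str) -> int:
        digest = hashlib.sha256(f"{broker}|{device_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(brokers, key=score)


brokers = ["broker-a", "broker-b", "broker-c"]
device_ids = [f"dev_{i}" for i in range(10_000)]

before = {d: pick_broker(d, brokers) for d in device_ids}
after = {d: pick_broker(d, ["broker-a", "broker-b"]) for d in device_ids}  # broker-c removed

moved = sum(before[d] != after[d] for d in device_ids)
on_removed_node = sum(before[d] == "broker-c" for d in device_ids)
print(moved == on_removed_node)  # True: only devices from the removed node were remapped
```

With a simple `hash(device_id) % N` scheme, changing N would instead reshuffle most of the fleet and trigger a large reconnect wave.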

Connection state (session state):
  - Which topics device is subscribed to
  - Offline messages queued while device was disconnected
  - QoS 1/2 message acknowledgment state
  - Store in Redis (shared across broker cluster) or Cassandra

Telemetry Ingestion Pipeline

MQTT Broker → Kafka "telemetry" topic → [Stream Processors]
                                              ↓
                                  [Time-Series Database (InfluxDB)]
                                         measurements:
                                           temperature{device=A} 72.3 t=1234
                                           humidity{device=A}    58.0 t=1234
                                              ↓
                                  [Rollup Job (Spark/Flink)]
                                    raw: every 30s
                                    1-min avg: rolling window
                                    1-hr avg: for dashboards
                                    24-hr avg: for long-term storage
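
The rollup step reduces raw points to per-window averages. The sketch below uses tumbling (non-overlapping) windows, which is the simplest variant; a production job in Spark or Flink would add watermarking and late-data handling. The tuple layout `(device_id, epoch_seconds, value)` is an assumption made for this example.

```python
from collections import defaultdict


def rollup_avg(points, window_s: int) -> dict:
    """Average values per (device_id, window_start) using tumbling windows."""
    accumulators = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for device_id, ts, value in points:
        window_start = ts - ts % window_s  # align timestamp to the window boundary
        acc = accumulators[(device_id, window_start)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in accumulators.items()}


raw = [
    ("A", 0, 70.0),   # one reading every 30 s
    ("A", 30, 72.0),
    ("A", 60, 74.0),
    ("B", 30, 58.0),
]
print(rollup_avg(raw, window_s=60))
# {('A', 0): 71.0, ('A', 60): 74.0, ('B', 0): 58.0}
```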

Kafka partitioning: by device_id — all messages from one device
  land on the same partition, preserving ordering

Throughput: 333,000 messages/s × 500 bytes ≈ 167 MB/s (assuming ~500 bytes per message), which a multi-broker Kafka cluster handles comfortably
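
The back-of-envelope numbers can be reproduced with a few lines, which also makes it easy to try other fleet sizes or reporting intervals. The daily volume is an extra derived figure (before replication and compression), and `telemetry_load` is just an illustrative helper.

```python
def telemetry_load(devices: int, interval_s: float, msg_bytes: int) -> dict:
    """Estimate message rate, bandwidth, and daily raw volume for a fleet."""
    msgs_per_s = devices / interval_s
    bytes_per_s = msgs_per_s * msg_bytes
    return {
        "msgs_per_s": msgs_per_s,
        "mb_per_s": bytes_per_s / 1e6,
        "tb_per_day": bytes_per_s * 86_400 / 1e12,
    }


load = telemetry_load(devices=10_000_000, interval_s=30, msg_bytes=500)
print({name: round(value, 1) for name, value in load.items()})
# {'msgs_per_s': 333333.3, 'mb_per_s': 166.7, 'tb_per_day': 14.4}
```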

Time-Series Database Choice

Database            Write Throughput            Retention                 Use Case
InfluxDB            ~1M points/s                Automatic (configurable)  IoT telemetry, metrics
TimescaleDB         High (Postgres extension)   Manual or policies        SQL queries on time series
Cassandra           Very high                   Manual TTL                Wide column, device state history
Amazon Timestream   High (managed)              Automatic tiers           Serverless, AWS integration

Device Management

-- Device registry (PostgreSQL)
CREATE TABLE devices (
    device_id       VARCHAR(64) PRIMARY KEY,
    owner_user_id   BIGINT,
    device_type     VARCHAR(50),  -- sensor, camera, thermostat
    firmware_version VARCHAR(20),
    last_seen_at    TIMESTAMP,
    status          VARCHAR(20) CHECK (status IN ('active', 'offline', 'deprovisioned')),
    metadata        JSONB          -- flexible per-device attributes
);

-- Device shadow / digital twin (Redis or DynamoDB)
-- Stores desired state + reported state for each device
{
  "device_id": "dev_12345",
  "desired": {
    "target_temperature": 72,
    "mode": "cooling"
  },
  "reported": {
    "target_temperature": 68,
    "mode": "heating",
    "firmware": "2.4.1"
  },
  "delta": {
    "target_temperature": 72,   // delta = desired fields whose values differ from reported
    "mode": "cooling"
  },
  "last_updated": "2026-01-15T12:00:00Z"
}

-- Device connects: receives delta and applies it
-- After applying: reports new state, delta clears
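
The delta computation itself is small. A minimal sketch, assuming flat (non-nested) state documents like the example above; `compute_delta` is an illustrative name, not a cloud provider API.

```python
def compute_delta(desired: dict, reported: dict) -> dict:
    """Return the desired fields that the device has not yet reported as applied."""
    return {
        key: value
        for key, value in desired.items()
        if key not in reported or reported[key] != value
    }


desired = {"target_temperature": 72, "mode": "cooling"}
reported = {"target_temperature": 68, "mode": "heating", "firmware": "2.4.1"}

delta = compute_delta(desired, reported)
print(delta)  # {'target_temperature': 72, 'mode': 'cooling'}

# The device applies the delta and reports its new state; the delta is now empty.
reported.update(delta)
print(compute_delta(desired, reported))  # {}
```

Fields that appear only in `reported` (such as `firmware`) never show up in the delta, because the cloud has not expressed a desired value for them.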

Over-the-Air (OTA) Firmware Updates

OTA Update Flow:
1. Engineer uploads new firmware binary to S3
2. Creates update campaign: target all devices running firmware < 2.5.0
3. Campaign service queries device registry for matching devices
4. Sends firmware update command in batches:
   - Batch 1: 1% of fleet (canary) — monitor error rate for 24h
   - Batch 2: 10% — monitor
   - Batch 3: 100% — full rollout
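
A simple way to pick the devices for each batch is to hash the device ID into a stable bucket and compare it with the current rollout percentage. Devices selected at 1% then remain selected at 10% and 100%, so each stage is a superset of the previous one. This is one possible selection scheme; the campaign ID `fw-2.5.0` and the function names are hypothetical.

```python
import hashlib

STAGES = [1, 10, 100]  # cumulative percent of the fleet per batch


def rollout_bucket(device_id: str, campaign_id: str) -> int:
    """Map a device to a stable bucket in [0, 10000) for this campaign."""
    digest = hashlib.sha256(f"{campaign_id}:{device_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def in_stage(device_id: str, campaign_id: str, percent: float) -> bool:
    """True if the device is targeted once the rollout reaches `percent`."""
    return rollout_bucket(device_id, campaign_id) < percent * 100


fleet = [f"dev_{i}" for i in range(10_000)]
for percent in STAGES:
    targeted = sum(in_stage(d, "fw-2.5.0", percent) for d in fleet)
    print(f"{percent:>3}% stage targets {targeted} of {len(fleet)} devices")
```

Including the campaign ID in the hash means a different set of devices serves as the canary for each campaign, so the same unlucky 1% is not always first.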

Device-side update flow:
1. Device receives command: {action: "update", url: "s3://firmware/v2.5.0.bin", checksum: "sha256:..."}
2. Device downloads firmware (chunked, resumable)
3. Verifies checksum
4. Writes to secondary flash partition
5. Reboots into new firmware
6. If boot fails 3 times, reverts to previous firmware (fail-safe)
7. Reports new firmware version to broker
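
Steps 2, 3, and 6 can be sketched as follows: hash the image incrementally as chunks arrive (so the full binary never has to sit in RAM), compare against the expected checksum, and fall back to the old partition after repeated boot failures. Real devices would implement this in C inside the bootloader and updater; the Python below only illustrates the logic, and the slot names "A" and "B" are assumptions.

```python
import hashlib
import hmac
from typing import Iterable


def verify_firmware(chunks: Iterable[bytes], expected: str) -> bool:
    """Check downloaded chunks against a checksum of the form 'sha256:<hex digest>'."""
    algo, _, hex_digest = expected.partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported checksum algorithm: {algo}")
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)  # incremental hashing, one chunk at a time
    return hmac.compare_digest(hasher.hexdigest(), hex_digest.lower())


def choose_boot_slot(new_slot: str, old_slot: str, failed_boots: int,
                     max_failures: int = 3) -> str:
    """Revert to the previous partition once the new image has failed too often."""
    return old_slot if failed_boots >= max_failures else new_slot


image = b"\x7fFIRMWARE" * 1024
chunks = [image[i:i + 4096] for i in range(0, len(image), 4096)]
expected = "sha256:" + hashlib.sha256(image).hexdigest()

print(verify_firmware(chunks, expected))            # True
print(verify_firmware(chunks[:-1], expected))       # False (truncated download)
print(choose_boot_slot("B", "A", failed_boots=3))   # A
```

A checksum only detects corruption. To detect tampering, the image or its digest also needs a signature that the device verifies against an embedded public key.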

Security

  • Device authentication: X.509 client certificates (per-device unique certificate) stored in secure element on hardware
  • Transport security: MQTT over TLS 1.3 (port 8883)
  • Message authorization: a device may publish and subscribe only under its own topic prefix (devices/{its_own_id}/*)
  • Command signing: commands signed with server private key; device verifies with embedded server public key
  • Certificate rotation: automate certificate renewal before expiry via OTA
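
The message-authorization rule can be expressed as a small check that the broker runs on every publish and subscribe, using the device ID taken from the authenticated client certificate. The split between publish and subscribe topics follows the topic layout described earlier; the function itself is a hypothetical sketch rather than any broker's actual policy engine.

```python
# Which leaf topics a device may use, per action.
ALLOWED = {
    "publish": {"telemetry", "status"},      # device -> cloud
    "subscribe": {"commands", "firmware"},   # cloud -> device
}


def is_authorized(cert_device_id: str, action: str, topic: str) -> bool:
    """Allow only devices/{own_id}/{leaf} with a leaf permitted for this action."""
    levels = topic.split("/")
    if len(levels) != 3 or levels[0] != "devices":
        return False
    if levels[1] != cert_device_id:
        return False  # never allow access to another device's topics
    return levels[2] in ALLOWED.get(action, set())


print(is_authorized("dev_12345", "publish", "devices/dev_12345/telemetry"))   # True
print(is_authorized("dev_12345", "publish", "devices/dev_99999/telemetry"))   # False
print(is_authorized("dev_12345", "publish", "devices/dev_12345/commands"))    # False
print(is_authorized("dev_12345", "subscribe", "devices/+/commands"))          # False
```

Denying a device the right to publish on its own commands topic prevents a compromised device from forging commands that look as if they came from the server.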

Interview Discussion Points

  • How do you handle millions of devices reconnecting simultaneously after a regional outage? Rate-limit reconnections (exponential backoff with jitter on device side); pre-scale broker cluster before expected reconnection wave; prioritize critical devices
  • What is the device shadow pattern? Decouple desired state (what you want) from reported state (what the device says it has) — allows commands to be queued even when device is offline; device syncs on reconnect
  • How do you detect anomalous device behavior? ML model on telemetry stream (Spark Streaming or Flink): statistical deviation from device historical baseline, sudden value spikes, absence of expected heartbeats
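
For the reconnection-storm question, the device-side half of the answer is exponential backoff with jitter. A minimal sketch of the "full jitter" variant follows: the delay is drawn uniformly between zero and an exponentially growing ceiling, which spreads reconnect attempts out instead of synchronizing them. The base and cap values are illustrative assumptions.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 300.0,
                    rng: random.Random = random) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based)."""
    attempt = min(attempt, 32)  # keep the exponent bounded
    ceiling = min(cap_s, base_s * 2 ** attempt)
    return rng.uniform(0, ceiling)  # full jitter: anywhere in [0, ceiling]


rng = random.Random(42)  # seeded only to make the demo repeatable
for attempt in range(6):
    ceiling = min(300.0, 1.0 * 2 ** attempt)
    delay = reconnect_delay(attempt, rng=rng)
    print(f"attempt {attempt}: wait {delay:6.2f}s (ceiling {ceiling:.0f}s)")
```

Without jitter, every device that lost its connection at the same moment would retry at the same moments too (1 s, 2 s, 4 s, and so on), recreating the spike on each round.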

Frequently Asked Questions

How does MQTT differ from HTTP for IoT device communication?

MQTT is a publish/subscribe protocol designed for constrained devices and unreliable networks. Key differences from HTTP: MQTT maintains a persistent TCP connection (no per-message handshake overhead), uses lightweight binary framing (a fixed header as small as 2 bytes, versus HTTP headers that can exceed 1KB), and supports QoS levels (0=fire-and-forget, 1=at-least-once, 2=exactly-once). HTTP is request/response: the server cannot push data to the device without workarounds such as long-polling, whereas an MQTT broker pushes messages to subscribers as soon as they are published. For a device sending a temperature reading every 30 seconds, MQTT can use roughly a tenth of the bandwidth of HTTP, though the exact ratio depends on header and payload sizes. MQTT also supports Last Will and Testament (LWT): the broker publishes a configured message on behalf of a device when it disconnects unexpectedly, enabling automatic offline detection.

How do you handle OTA firmware updates for millions of IoT devices?

OTA (Over-The-Air) updates require a staged rollout to avoid bricking millions of devices simultaneously. The architecture: (1) Firmware binary is uploaded to object storage (S3) and signed with a private key. (2) An update campaign targets a device group — first 0.1% of devices, then 1%, 10%, 100%, with automatic rollback if error rates exceed a threshold. (3) The device shadow or twin stores desired_firmware_version. When a device connects, it compares its current version to the desired version and downloads the delta patch (not the full binary) from a pre-signed URL. (4) The device verifies the binary signature before applying, then reboots and reports its new version. (5) A watchdog timer detects failed boots and rolls back to the previous firmware automatically.

What is the device shadow pattern and why is it used?

A device shadow (also called a device twin) is a JSON document stored in the cloud that represents the last known and desired state of a physical device. It has two sections: reported (what the device last told the cloud its state is) and desired (what the cloud wants the device to do). The pattern solves the problem of intermittent connectivity: if a device is offline when a command is sent, the command is persisted in the desired state. When the device reconnects, it reads the delta between desired and reported, executes the command, then updates reported. This decouples cloud applications from device connectivity. AWS IoT (Device Shadow) and Azure IoT Hub (device twins) both implement this pattern. The shadow is also the authoritative state for dashboards and automation rules, which read from the shadow rather than polling devices directly.


