Requirements
An IoT data platform ingests telemetry from millions of connected devices — sensors, smart meters, industrial equipment, wearables — and provides storage, querying, and real-time alerting. Core challenges: massive write throughput (millions of devices sending readings every few seconds), efficient time-series storage (data is append-only and queried by time range), real-time anomaly detection (alerting within seconds of a threshold breach), and long-term data retention with cost optimization. Scale: AWS IoT Core connects 500 million+ devices. Industrial IoT platforms may ingest 1 million readings per second. Smart city deployments can have 100K+ sensors per city block.
Device Connectivity and Ingestion
Protocol selection: MQTT (originally "MQ Telemetry Transport") is the standard for IoT ingestion. It uses a pub/sub model over TCP with a 2-byte minimum header, three QoS levels (0 = fire-and-forget, 1 = at-least-once, 2 = exactly-once), and persistent sessions (the broker queues messages for offline devices). Alternatives: CoAP (Constrained Application Protocol) for extremely resource-constrained devices (microcontrollers), and HTTP/REST for devices with sufficient resources where simplicity and compatibility matter more than efficiency.

Architecture: devices connect to an MQTT broker cluster (EMQX, HiveMQ, AWS IoT Core). Each device publishes to a topic such as devices/{device_id}/telemetry. The broker routes messages to subscribers — in this case, the ingestion pipeline.

Ingestion pipeline: MQTT broker → Kafka (device_telemetry topic) → stream processor (Flink/Spark Streaming) → time-series DB + real-time alert engine. Kafka serves as a buffer and replay log; a modest cluster handles 1M+ messages/second.

Message schema: {device_id, timestamp (ISO 8601 or Unix epoch), readings: {temperature: 23.4, humidity: 60, battery: 87}, metadata: {firmware: "v2.1.3"}}.

Device registry: maintain a device registry (PostgreSQL) with device_id, owner_id, type, model, firmware_version, provisioning_date, location, is_active. It is used to validate incoming messages and enrich them with metadata.
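The validate-and-enrich step against the device registry can be sketched in Python. This is a minimal illustration, not a production implementation: the in-memory registry dict stands in for the PostgreSQL table, and the helper name and exact field checks are assumptions.

```python
import time

# In-memory stand-in for the PostgreSQL device registry (illustrative only).
DEVICE_REGISTRY = {
    "dev-001": {"owner_id": "acct-42", "type": "sensor", "model": "TH-200",
                "firmware_version": "v2.1.3", "location": "building-7",
                "is_active": True},
}

REQUIRED_FIELDS = {"device_id", "timestamp", "readings"}

def validate_and_enrich(msg):
    """Validate an incoming telemetry message against the registry and
    attach registry metadata. Returns None for rejected messages."""
    if not REQUIRED_FIELDS.issubset(msg):
        return None                      # malformed payload
    device = DEVICE_REGISTRY.get(msg["device_id"])
    if device is None or not device["is_active"]:
        return None                      # unknown or deactivated device
    if not isinstance(msg["readings"], dict) or not msg["readings"]:
        return None                      # readings must be a non-empty map
    return {**msg, "device": device, "ingested_at": int(time.time())}

msg = {"device_id": "dev-001", "timestamp": 1700000000,
       "readings": {"temperature": 23.4, "humidity": 60, "battery": 87}}
enriched = validate_and_enrich(msg)
print(enriched["device"]["location"])    # building-7
```

In the real pipeline this logic would run in the stream processor, with the registry cached locally and refreshed periodically rather than queried per message.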
Time-Series Storage
Time-series data has a distinct access pattern: high write throughput, queries that always include a time range, and recent data accessed far more than old data. Purpose-built time-series databases outperform general-purpose RDBMS by 10-100x for this workload.

InfluxDB: purpose-built for metrics and events. Measurements (analogous to tables) are organized by tags (indexed metadata: device_id, location, type) and fields (values: temperature, humidity). Retention policies auto-expire old data (e.g., raw 1-second data kept 30 days; 1-minute aggregates kept 1 year; 1-hour aggregates kept forever). Continuous queries run aggregations in the background, populating downsampled measurements.

TimescaleDB: a PostgreSQL extension. Hypertables auto-partition data by time (e.g., one chunk per week). Older chunks are compressed 90%+, and standard SQL queries work on all chunks transparently. Best of both worlds: time-series optimization + full SQL.

Compression: time-series data compresses extremely well (delta encoding for timestamps, Gorilla XOR encoding for floating-point values). InfluxDB and TimescaleDB achieve 10-50x compression; a raw 100GB dataset compresses to 2-10GB.

Data tiering: keep recent data (last 7 days) on fast NVMe SSDs. Move older data to object storage (S3) in Parquet format for cheap long-term analytics, queried via Athena or BigQuery.
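The timestamp delta encoding mentioned above can be illustrated with a toy delta-of-delta encoder: devices reporting on a fixed interval produce nearly identical deltas, so the second-order differences are mostly zero and bit-pack to almost nothing. This is a simplified sketch of the idea, not the actual codec used by InfluxDB or TimescaleDB.

```python
def delta_of_delta(timestamps):
    """Encode timestamps as (first, first_delta, delta-of-deltas).
    On a regular reporting interval, almost every delta-of-delta is 0,
    which a bit-packer can store in about one bit."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return timestamps[0], deltas[0], dods

def decode(first, first_delta, dods):
    """Reverse the encoding back to the original timestamps."""
    deltas = [first_delta]
    for d in dods:
        deltas.append(deltas[-1] + d)
    ts = [first]
    for d in deltas:
        ts.append(ts[-1] + d)
    return ts

# Readings every 10 s, with one second of jitter in the middle.
ts = [1700000000, 1700000010, 1700000020, 1700000031, 1700000041]
first, fd, dods = delta_of_delta(ts)
print(dods)                        # [0, 1, -1] — mostly zeros
assert decode(first, fd, dods) == ts
```

The same intuition applies to Gorilla float compression: consecutive sensor values XOR to words with long runs of zero bits, which encode compactly.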
Real-Time Alerting
Alert rules: device owners configure threshold-based alerts, e.g., temperature > 80°C for 5 consecutive readings → CRITICAL alert. Rules are stored in an AlertRule table: rule_id, device_id (or device_group), metric, operator, threshold, sustained_periods, severity, notification_channels.

Alert evaluation: Flink stateful stream processing evaluates rules against the live data stream. A per-device state machine tracks the count of consecutive threshold breaches; if count >= sustained_periods, fire the alert. Deduplication: suppress repeated alerts for the same condition — fire once when the condition is met, then fire a resolution alert when it clears.

Alert routing: alerts are published to a Kafka alert topic → notification service → PagerDuty, SMS, email, webhook, Slack.

Anomaly detection: ML-based detection (isolation forest, LSTM autoencoder) for devices where the threshold is not known in advance. A baseline is learned from 2-4 weeks of historical data per device, and an anomaly score is published alongside readings; users configure alerts on anomaly_score > 0.8. Deployed as a separate Flink job consuming from the same Kafka topic.
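The per-device state machine with deduplication can be sketched in plain Python. In Flink this state would live in keyed managed state; here it is an ordinary object so the logic runs standalone, and the class and function names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class ThresholdRule:
    # Field names mirror the AlertRule table sketched above.
    metric: str
    threshold: float
    sustained_periods: int
    severity: str = "CRITICAL"

@dataclass
class DeviceAlertState:
    """Per-device state a Flink keyed operator would hold."""
    breach_count: int = 0
    alert_active: bool = False

def evaluate(rule, state, reading):
    """Process one reading; return 'FIRE', 'RESOLVE', or None."""
    if reading[rule.metric] > rule.threshold:
        state.breach_count += 1
        if state.breach_count >= rule.sustained_periods and not state.alert_active:
            state.alert_active = True
            return "FIRE"               # condition sustained for the first time
    else:
        state.breach_count = 0
        if state.alert_active:
            state.alert_active = False
            return "RESOLVE"            # condition cleared
    return None                         # no breach, or duplicate suppressed

rule = ThresholdRule(metric="temperature", threshold=80.0, sustained_periods=3)
state = DeviceAlertState()
events = [evaluate(rule, state, {"temperature": t}) for t in [85, 86, 87, 88, 70]]
print(events)  # [None, None, 'FIRE', None, 'RESOLVE']
```

Note the fourth reading (88°C) produces no event: deduplication suppresses it because the alert is already active.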
Query and Visualization
Common query patterns: time-range queries (temperature of device X from 9am to 5pm), aggregation (hourly average temperature across all devices in building Y), downsampling (1-second raw → 1-minute averages for charting), and latest value (current status of all devices in a fleet).

Query API: REST/GraphQL with time range, device filter, metric, and aggregation parameters. The backend translates requests into InfluxDB Flux queries or TimescaleDB SQL.

Dashboard: Grafana connects natively to InfluxDB and TimescaleDB. Ship pre-built IoT dashboards with auto-refresh; customers can build custom dashboards. Live streaming: a WebSocket connection pushes new readings to the dashboard every second without polling.

Device shadow: maintain a "shadow" (last known state) for each device in Redis: HSET device:{id}:shadow temperature 23.4 humidity 60 updated_at 1700000000. This provides instant "current state" queries without scanning the time-series DB, and is updated by the stream processor on every new reading.
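The downsampling pattern above (1-second raw → 1-minute averages) can be sketched as a small bucketing function. This is a toy version of what a continuous query or stream job computes; real systems typically also emit min, max, and count per bucket, and the function name is an assumption.

```python
from collections import defaultdict

def downsample(points, bucket_seconds=60):
    """Group (timestamp, value) points into fixed-width time buckets
    keyed by bucket start, and return the average value per bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(buckets.items())}

# 90 one-second readings spanning two 1-minute buckets.
raw = [(1700000000 + i, 20.0 + i) for i in range(90)]
minute_avgs = downsample(raw)
print(minute_avgs)  # two bucket-start keys, each mapped to its average
```

Serving charts from pre-computed buckets like these is what keeps dashboard queries fast even when the raw data runs to billions of rows.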
FAQ

Why is MQTT preferred over HTTP for IoT device telemetry? MQTT uses a 2-byte fixed header and persistent TCP connections, consuming far less bandwidth and battery than HTTP's stateless request-response with large headers. It supports QoS levels for reliability, persistent sessions that queue messages while a device is offline, and a pub/sub model that decouples devices from consumers. At 1 million devices sending data every 10 seconds, MQTT's efficiency advantage over HTTP translates to significant infrastructure cost savings.

How does a device shadow work in an IoT platform? A device shadow (also called a device twin) is a cached copy of the device's last known state, stored in Redis. Every time a device sends a telemetry reading, the stream processor updates the shadow: HSET device:{id}:shadow temperature 23.4 updated_at 1700000000. Applications querying "current temperature of device X" read from the shadow in under 1ms instead of scanning the time-series database. Crucially, the shadow is readable even when the device is offline.

How do you handle sustained threshold alerts without firing on every single reading? Use a Flink stateful stream processor. Per-device state tracks the count of consecutive readings that exceed the threshold. Only fire the alert when the count reaches the configured sustained_periods (e.g., 5 consecutive readings). Also implement deduplication: once an alert fires, suppress subsequent alerts until the condition clears, then fire a resolution alert when the count drops back to zero. This prevents alert storms while still catching genuine sustained faults.

How does data tiering reduce IoT storage costs? Recent data (last 7-30 days) stays on fast NVMe SSDs in the time-series database for low-latency queries. Older data is automatically downsampled (1-second raw → 1-minute aggregates → 1-hour aggregates) by continuous queries or stream jobs, reducing data volume 60-3600x. Archived raw data is exported to object storage (S3) in Parquet format, queryable via Athena or BigQuery at much lower cost than SSD storage. The combination can reduce storage cost by 95% compared to keeping all raw data on SSD indefinitely.

How do you handle device connectivity at massive scale (millions of concurrent devices)? Deploy a horizontally scalable MQTT broker cluster (EMQX or HiveMQ). Each broker node handles ~100K concurrent connections, so 1 million devices need roughly 10 nodes. Route devices to brokers via a load balancer with sticky sessions (the same device always reconnects to the same broker for session persistence). The broker cluster maintains a distributed session store (Redis), so if a broker fails, devices reconnect to another node and their queued messages are preserved. Kafka provides backpressure isolation — if processing slows, messages queue in Kafka rather than backing up to the brokers.