Low Level Design: Stream Processing Windows

Stream processing applies computations to unbounded data streams in real time. Windowing divides the infinite stream into finite chunks so aggregations (counts, sums, averages) can be computed. The three window types — tumbling, sliding, and session — address different temporal grouping requirements. Correct windowing requires handling out-of-order events, late data, and watermarks.

Tumbling Windows

A tumbling window divides time into fixed-size, non-overlapping intervals. Example: a 5-minute tumbling window produces one result per 5-minute period, with no overlap between windows. Each event belongs to exactly one window. Use cases: hourly revenue totals, daily active user counts, per-minute request rate histograms. Tumbling windows are simple and memory-efficient — the window state is discarded after the result is emitted.
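Because tumbling windows never overlap, assigning an event to its window reduces to integer division. A minimal sketch, assuming integer epoch-second timestamps (the function name is illustrative):

```python
def tumbling_window(ts: int, size: int) -> tuple[int, int]:
    """Return the (start, end) of the tumbling window of length `size`
    (seconds) that contains timestamp ts. End is exclusive."""
    start = (ts // size) * size
    return start, start + size

# A 5-minute (300 s) tumbling window: each event maps to exactly one window.
assert tumbling_window(659, size=300) == (600, 900)
assert tumbling_window(900, size=300) == (900, 1200)  # boundary starts a new window
```

Since no two windows share events, the processor can hold one accumulator per active window and drop it the moment the result is emitted.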

Sliding Windows

A sliding window has a fixed size but advances by a step smaller than the window size. Example: a 10-minute window that slides every 1 minute produces a result every minute, each covering the previous 10 minutes. Events can belong to multiple windows. Use cases: moving averages, rolling 95th percentile latency over the last 5 minutes. Sliding windows use more memory than tumbling windows because each event is held in every overlapping window that covers it — roughly window_size / slide_interval times the state of a single window.
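The overlap can be seen by enumerating every window an event falls into. A sketch under the same epoch-second assumption (the function name is illustrative):

```python
def sliding_windows(ts: int, size: int, slide: int) -> list[tuple[int, int]]:
    """Return every (start, end) window of length `size`, advancing by
    `slide` seconds, that contains timestamp ts. End is exclusive."""
    windows = []
    start = (ts // slide) * slide  # latest window start at or before ts
    while start > ts - size and start >= 0:
        windows.append((start, start + size))
        start -= slide
    return list(reversed(windows))  # oldest window first

# A 10-minute window sliding every minute: each event lands in
# size / slide = 10 overlapping windows.
wins = sliding_windows(659, size=600, slide=60)
assert len(wins) == 10
assert wins[0] == (60, 660) and wins[-1] == (600, 1200)
```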

Session Windows

A session window groups events by activity periods with gaps between them. A session closes when no new events arrive within a gap duration (e.g., 30 minutes). Events within a session are merged into one window of variable size. Use cases: user session analytics (all page views in one visit), click stream analysis (group clicks into a continuous engagement period). Session windows have variable size and close lazily — the processor must buffer state until the gap timeout confirms the session is over.
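Session assignment for a single key can be sketched as a merge over sorted timestamps, assuming epoch-second values and a gap in seconds (the function name is illustrative):

```python
def sessionize(timestamps: list[int], gap: int) -> list[list[int]]:
    """Group a key's event timestamps into sessions: a new session starts
    whenever the gap since the previous event is at least `gap` seconds."""
    sessions: list[list[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)  # continues the current session
        else:
            sessions.append([ts])    # gap exceeded: open a new session
    return sessions

# A 30-minute (1800 s) gap: two bursts of activity become two sessions.
events = [10, 200, 400, 5000, 5100]
assert sessionize(events, gap=1800) == [[10, 200, 400], [5000, 5100]]
```

In a real processor this runs incrementally per key, and the last session stays buffered until the gap timeout (driven by the watermark) confirms it is closed.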

Event Time vs Processing Time

Processing time is when the event arrives at the stream processor. Event time is when the event actually occurred (embedded in the event payload). Processing time is simple but produces incorrect results when events arrive out of order or after delays. Event time processing correctly handles late events — a page view that happened at 10:59 but arrived at the processor at 11:02 is counted in the 10:00-11:00 window, not the 11:00-12:00 window. Apache Flink and Spark Structured Streaming both support event-time processing.
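The page-view example can be checked with a simple event-time bucketing helper, using timestamps as seconds since midnight (`hour_bucket` is an illustrative name):

```python
def hour_bucket(ts: int) -> int:
    """Start of the one-hour tumbling window containing timestamp ts."""
    return ts - ts % 3600

# A page view that happened at 10:59 but reached the processor at 11:02.
event_time = 10 * 3600 + 59 * 60
processing_time = 11 * 3600 + 2 * 60

assert hour_bucket(event_time) == 10 * 3600       # correct: 10:00-11:00 window
assert hour_bucket(processing_time) == 11 * 3600  # processing time misassigns it
```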

Watermarks

A watermark is an assertion that no events with timestamps before T will arrive in the future. It signals that windows ending at or before T can be safely closed and their results emitted. The watermark advances as new events arrive, typically computed as max_event_time_seen – max_expected_delay, where the expected delay bounds how far out of order events may arrive (e.g., 10 seconds for near-real-time streams, minutes for batch-uploaded event logs). A conservative watermark (large expected delay) produces more correct results but adds latency. An aggressive watermark emits results quickly but may miss late events.
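A minimal sketch of this watermark strategy, assuming integer epoch-second timestamps (the class name and `max_delay` parameter are illustrative):

```python
class BoundedDelayWatermark:
    """Watermark = max event time seen so far minus an expected maximum delay."""

    def __init__(self, max_delay: int):
        self.max_delay = max_delay
        self.max_seen = float("-inf")

    def observe(self, event_time: int) -> float:
        # Out-of-order events never move max_seen backwards, so the
        # watermark only advances (monotonically non-decreasing).
        self.max_seen = max(self.max_seen, event_time)
        return self.max_seen - self.max_delay

wm = BoundedDelayWatermark(max_delay=10)
assert wm.observe(100) == 90
assert wm.observe(95) == 90    # out-of-order event does not regress the watermark
assert wm.observe(120) == 110
```

A window covering [start, end) can be closed once `observe` returns a value at or past `end`.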

Late Data Handling

Events arriving after their window's watermark are “late data.” Options: discard late events (simple, accept minor inaccuracy), update already-emitted window results (emit corrections — requires downstream systems to handle corrections), or extend the window to wait for late events (increases latency). Flink supports allowed lateness: keep window state open for an additional period after the watermark, accepting late events and emitting updated results. After the extended period, late events are either discarded or routed to a side output for separate handling.
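The options above can be sketched as a single routing decision, modeled loosely on Flink-style allowed lateness (function and label names are illustrative, not a real API):

```python
def route_event(window_end: int, watermark: int, allowed_lateness: int) -> str:
    """Decide how to handle an event destined for a window ending at
    window_end, given the watermark at the time the event arrives."""
    if watermark < window_end:
        return "accumulate"    # window still open: add to its state
    if watermark < window_end + allowed_lateness:
        return "update"        # late but tolerated: emit a corrected result
    return "side_output"       # too late: discard or route for separate handling

assert route_event(window_end=900, watermark=890, allowed_lateness=10) == "accumulate"
assert route_event(window_end=900, watermark=905, allowed_lateness=10) == "update"
assert route_event(window_end=900, watermark=950, allowed_lateness=10) == "side_output"
```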

Stateful Stream Processing

Windows require state: the processor accumulates events within the window and emits the result when the window closes. State must be fault-tolerant: if the processor crashes mid-window, it must resume without losing partial aggregations. Flink uses distributed snapshots (a variant of the Chandy-Lamport algorithm) to checkpoint state to durable storage (HDFS, S3) periodically. On recovery, the processor restores from the latest checkpoint and reprocesses events from the corresponding Kafka offset — guaranteeing exactly-once stateful processing.
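A toy illustration of the checkpoint-plus-offset idea (not Flink's actual snapshotting mechanism; all names are illustrative): window state is snapshotted together with the input position, so recovery replays from that offset without losing or double-counting events.

```python
import copy

class WindowedCounter:
    """Counts events per window; checkpoints state together with the
    input-log offset so recovery can replay deterministically."""

    def __init__(self):
        self.counts: dict[int, int] = {}  # window_start -> event count
        self.offset = 0                   # position in the input log (e.g. a Kafka offset)

    def process(self, window_start: int) -> None:
        self.counts[window_start] = self.counts.get(window_start, 0) + 1
        self.offset += 1

    def checkpoint(self) -> dict:
        # Snapshot state and offset atomically (deep copy stands in for
        # writing to durable storage such as HDFS or S3).
        return copy.deepcopy({"counts": self.counts, "offset": self.offset})

    def restore(self, snapshot: dict) -> None:
        self.counts = dict(snapshot["counts"])
        self.offset = snapshot["offset"]

log = [0, 0, 300, 300, 300, 600]   # each entry: the event's window start
live = WindowedCounter()
for w in log[:3]:
    live.process(w)
snap = live.checkpoint()           # periodic checkpoint after 3 events
for w in log[3:]:
    live.process(w)

recovered = WindowedCounter()      # simulate a crash and restart
recovered.restore(snap)
for w in log[recovered.offset:]:   # replay the log from the saved offset
    recovered.process(w)
assert recovered.counts == live.counts == {0: 2, 300: 3, 600: 1}
```

Because the offset is saved in the same snapshot as the counts, replayed events are counted exactly once, mirroring the checkpoint-and-rewind recovery described above.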
