“Design an audiobook app” is the long-form cousin of the music-streaming and podcast prompts. Audible, Libby, Spotify Audiobooks, and Apple Books are the reference products. The interview tests whether you understand long-form audio playback, cross-device position sync, the ergonomics of the listening UX, and the constraints that mobile operating systems place on background playback.
Clarify scope
- Library + listening, or also purchase / borrowing?
- Streaming, offline downloads, or both?
- Speed control, sleep timer, chapter navigation?
- Cross-device sync (phone, tablet, watch, web)?
- CarPlay and Android Auto?
Audio playback architecture
- iOS: AVAudioSession + AVPlayer (or AVAudioEngine, or a wrapper such as AudioKit, for more control)
- Android: Media3 ExoPlayer with MediaSession integration
- Background-capable audio category configured at app launch (the playback category on iOS)
- Lock-screen controls (and the Dynamic Island on iOS): MediaSession on Android; on iOS, MPNowPlayingInfoCenter for metadata plus MPRemoteCommandCenter for the transport buttons
Background playback constraints
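- iOS: enable the “Audio, AirPlay, and Picture in Picture” background mode (the audio value under UIBackgroundModes in Info.plist); the OS keeps the process alive while audio plays
- Android: a foreground service (declared with the mediaPlayback type on recent Android versions) is required for playback while the screen is locked; its ongoing notification is mandatory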
- Both: when audio stops, the OS may suspend or kill the process within seconds
- Be ready to resume after a Bluetooth headset reconnects, headphones are plugged back in, or an alarm finishes
Audio interruption handling
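A phone call comes in, an alarm fires, or a navigation announcement plays. The OS issues an interruption event (AVAudioSession.interruptionNotification on iOS, an audio focus change on Android). Your app should:
- Pause cleanly on interruption begin
- Resume on interruption end if the interruption was brief and the user was actively listening (on iOS, also check the shouldResume interruption option)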
- Save current position regardless
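The three rules above can be sketched as platform-neutral logic. This is a minimal illustration, not a platform API: the names MiniPlayer, beginInterruption, and shouldAutoResume are hypothetical, and the two-minute cutoff for a "brief" interruption is an assumption you would tune.

```typescript
// Hypothetical minimal player state; a real player wraps AVPlayer or ExoPlayer.
interface MiniPlayer {
  playing: boolean;
  positionSec: number;
}

// Snapshot taken when the interruption begins.
interface InterruptionContext {
  wasPlaying: boolean;   // was the user actively listening?
  startedAtMs: number;   // when the interruption began
  positionSec: number;   // position that was persisted
}

interface InterruptionEnd {
  endedAtMs: number;
  systemSuggestsResume: boolean; // e.g. the iOS shouldResume option
}

// Assumed threshold for a "brief" interruption.
const MAX_AUTO_RESUME_GAP_MS = 2 * 60 * 1000;

function beginInterruption(
  player: MiniPlayer,
  nowMs: number,
  savePosition: (positionSec: number) => void
): InterruptionContext {
  const ctx: InterruptionContext = {
    wasPlaying: player.playing,
    startedAtMs: nowMs,
    positionSec: player.positionSec,
  };
  player.playing = false;            // pause cleanly
  savePosition(player.positionSec);  // persist regardless of what happens next
  return ctx;
}

function shouldAutoResume(ctx: InterruptionContext, end: InterruptionEnd): boolean {
  const gapMs = end.endedAtMs - ctx.startedAtMs;
  return ctx.wasPlaying && end.systemSuggestsResume && gapMs <= MAX_AUTO_RESUME_GAP_MS;
}
```

Keeping the decision in a pure function makes it easy to unit-test separately from the platform callbacks that feed it.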
Position sync — the senior signal
Cross-device sync is the killer feature. State to sync:
- Current book, chapter, and time offset (prefer time over a byte offset, which changes with bitrate and encoding)
- Last-played timestamp (for conflict resolution)
- Listening speed, sleep timer, bookmark list
Sync strategy:
- Optimistic local update; periodic upload (every 30s of playback)
- On app foreground: pull latest from server; if remote is newer than local, ask user “Continue from device X at 2:34:15?”
- Conflict resolution: last-write-wins is fine for a single-user product (prefer server-assigned timestamps where possible, since device clocks drift)
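A rough sketch of the client side of this strategy follows, assuming the player reports elapsed playback in small ticks. SyncedPosition, UploadThrottle, and mergeLastWriteWins are illustrative names, not part of any SDK.

```typescript
// Hypothetical record synced between devices.
interface SyncedPosition {
  bookId: string;
  chapterIndex: number;
  offsetSec: number;    // time offset within the chapter
  updatedAtMs: number;  // last-played timestamp used for conflict resolution
  deviceId: string;
}

const UPLOAD_INTERVAL_SEC = 30; // "every 30s of playback"

// Counts seconds of actual playback (not wall-clock time) since the last upload.
class UploadThrottle {
  private playedSinceUploadSec = 0;

  // Call on each player tick; returns true when an upload is due.
  tick(playedSec: number): boolean {
    this.playedSinceUploadSec += playedSec;
    if (this.playedSinceUploadSec >= UPLOAD_INTERVAL_SEC) {
      this.playedSinceUploadSec = 0;
      return true;
    }
    return false;
  }
}

// Timestamp-based conflict rule: the record with the newer timestamp wins.
// On a tie the local record is kept.
function mergeLastWriteWins(local: SyncedPosition, remote: SyncedPosition): SyncedPosition {
  return remote.updatedAtMs > local.updatedAtMs ? remote : local;
}
```

Counting playback seconds rather than wall-clock time means a paused player generates no uploads, which keeps idle devices from overwriting a newer position.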
Downloads
- Per-chapter downloads, not whole-book — shorter chunks resume better
- Background download via OS background-transfer APIs (a background URLSession configuration on iOS, WorkManager on Android)
- Automatic download of next chapter on Wi-Fi, configurable
- Storage management: show usage, allow the user to delete downloaded books
- Encrypt at rest when the content is licensed (DRM)
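The auto-download rule could look roughly like the following. It is a sketch under assumptions: chapters are sorted by index, the "configurable" part is modeled as two flags, and all type and function names are hypothetical.

```typescript
type NetworkKind = "wifi" | "cellular" | "offline";

interface ChapterFile {
  index: number;
  downloaded: boolean;
}

// Assumed shape of the user-facing setting.
interface AutoDownloadSettings {
  enabled: boolean;
  wifiOnly: boolean;
}

// Returns the index of the next chapter worth downloading, or null if none.
// Assumes `chapters` is sorted by ascending index.
function nextChapterToDownload(
  chapters: ChapterFile[],
  currentIndex: number,
  network: NetworkKind,
  settings: AutoDownloadSettings
): number | null {
  if (!settings.enabled || network === "offline") return null;
  if (settings.wifiOnly && network !== "wifi") return null;
  for (const chapter of chapters) {
    if (chapter.index > currentIndex && !chapter.downloaded) {
      return chapter.index;
    }
  }
  return null;
}
```

The actual transfer would then be handed to the OS background-transfer API so it survives the app being suspended.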
Sleep timer
- Fixed durations (15, 30, 45, 60 min) plus “end of chapter”
- Fade the audio out smoothly over the last 5–10 seconds
- Timer state persists when the user backgrounds the app and returns
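One way to get both the fade and the persistence is to store an absolute deadline and derive everything else from it. The sketch below assumes an 8-second fade window and a squared ramp; both are illustrative choices, and the function names are hypothetical.

```typescript
// Assumed fade length, inside the 5 to 10 second range mentioned above.
const FADE_WINDOW_SEC = 8;

// Storing an absolute deadline (rather than a countdown) means the remaining
// time can be recomputed after the app is backgrounded and brought back.
function secondsUntil(deadlineMs: number, nowMs: number): number {
  return Math.max(0, (deadlineMs - nowMs) / 1000);
}

// Gain in the range 0..1 for the given time left on the timer.
function sleepFadeGain(remainingSec: number, fadeWindowSec: number = FADE_WINDOW_SEC): number {
  if (remainingSec <= 0) return 0;
  if (remainingSec >= fadeWindowSec) return 1;
  const linear = remainingSec / fadeWindowSec;
  // Squaring the ramp softens the tail; an assumption, not a tuned curve.
  return linear * linear;
}
```

For example, with the default window, 4 seconds before the deadline the gain is (4 / 8)² = 0.25. The “end of chapter” option fits the same model by setting the deadline to the chapter's projected end time.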
Speed control
- 0.5× to 3.0× typical
- Pitch correction (preserves voice naturalness): AVPlayerItem's audioTimePitchAlgorithm or AVAudioEngine's AVAudioUnitTimePitch on iOS; ExoPlayer's PlaybackParameters on Android
- Per-book preference saved (some users speed up only certain narrators)
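A small sketch of the preference lookup, assuming a global default speed with optional per-book overrides. SpeedPrefs and the helper names are hypothetical.

```typescript
const MIN_SPEED = 0.5;
const MAX_SPEED = 3.0;

// Keeps any requested speed inside the supported range.
function clampSpeed(requested: number): number {
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, requested));
}

// Assumed storage shape: one default plus overrides keyed by book ID.
interface SpeedPrefs {
  defaultSpeed: number;
  perBook: { [bookId: string]: number };
}

function speedForBook(prefs: SpeedPrefs, bookId: string): number {
  const saved = prefs.perBook[bookId];
  return clampSpeed(saved !== undefined ? saved : prefs.defaultSpeed);
}

// Real listening time left, useful for a "time remaining" label.
function wallClockRemainingSec(remainingAudioSec: number, speed: number): number {
  return remainingAudioSec / clampSpeed(speed);
}
```

At 1.5×, an hour of remaining audio takes 3600 / 1.5 = 2400 seconds, or 40 minutes, which is the figure most listeners expect the UI to show.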
Bookmarks and notes
- Audio bookmark = (book ID, time offset, optional text note)
- Sync alongside position
- Export to user’s notes app or share by URL
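The bookmark tuple and a shareable link might be modeled as below. The URL layout (a /book/ path with t and note query parameters) is purely an assumption for illustration, as are the names.

```typescript
// Audio bookmark = (book ID, time offset, optional text note).
interface AudioBookmark {
  bookId: string;
  offsetSec: number;
  note?: string;
}

// Builds a share link. The path and parameter names are hypothetical.
function bookmarkShareUrl(bookmark: AudioBookmark, baseUrl: string): string {
  let url =
    baseUrl +
    "/book/" + encodeURIComponent(bookmark.bookId) +
    "?t=" + Math.floor(bookmark.offsetSec);
  if (bookmark.note) {
    url += "&note=" + encodeURIComponent(bookmark.note);
  }
  return url;
}
```

Because the bookmark stores a time offset rather than a byte offset, the same link can resolve correctly on a device that streams a different bitrate.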
CarPlay and Android Auto
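- iOS: CarPlay audio entitlement, CarPlay framework UI templates (list and Now Playing), and MPRemoteCommandCenter for transport controls
- Android Auto: a media browser service plus MediaSession; Android Auto renders the Now Playing and library screens from the browse tree you expose
- Voice commands (“Play the next chapter”) via Siri and Google Assistant integrations (for example, SiriKit media intents on iOS)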
- Constrained UI — no scroll lists past N items, no inputs requiring keyboard
Whispersync-style “last position from any device”
Audible’s Whispersync is the gold standard. Pattern:
- On each playback start, fetch latest server position
- If server position is newer than local last-known and meaningfully different, prompt the user to jump
- Otherwise resume locally
- On every position upload, server stamps the device that made the update
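The decision in this pattern reduces to a small pure function. The sketch below assumes positions are measured from the start of the book and treats a gap of 60 seconds or more as "meaningfully different"; that threshold and all names are assumptions.

```typescript
// Hypothetical position record, stamped with the device that wrote it.
interface StampedPosition {
  totalOffsetSec: number; // offset from the start of the book
  updatedAtMs: number;
  deviceName: string;
}

type ResumeDecision =
  | { kind: "resume-local" }
  | { kind: "prompt-jump"; targetSec: number; message: string };

// Assumed threshold for "meaningfully different".
const MIN_MEANINGFUL_DIFF_SEC = 60;

// Formats seconds as H:MM:SS.
function formatHms(totalSec: number): string {
  const s = Math.floor(totalSec);
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  const pad = (n: number) => (n < 10 ? "0" + n : "" + n);
  return hours + ":" + pad(minutes) + ":" + pad(seconds);
}

function decideResume(local: StampedPosition, server: StampedPosition): ResumeDecision {
  const serverIsNewer = server.updatedAtMs > local.updatedAtMs;
  const diffSec = Math.abs(server.totalOffsetSec - local.totalOffsetSec);
  if (serverIsNewer && diffSec >= MIN_MEANINGFUL_DIFF_SEC) {
    return {
      kind: "prompt-jump",
      targetSec: server.totalOffsetSec,
      message: "Continue from " + server.deviceName + " at " + formatHms(server.totalOffsetSec) + "?",
    };
  }
  return { kind: "resume-local" };
}
```

The threshold keeps the app from prompting when two devices differ by only a few seconds, which would otherwise happen on nearly every launch.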
Performance
- Buffer enough to sustain 30+ minutes of playback if the network drops; at 64–128 kbps that is roughly 15–30 MB, so hold it in an on-disk cache rather than in memory
- Prefetch the next chapter when the current one is 80% complete
- Memory budget is tight; release decoded buffers behind the playback head
- Battery: avoid heavy DSP effects on the CPU; rely on hardware-accelerated decoding where available
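The buffer and prefetch targets above come down to simple arithmetic. The sketch assumes a constant bitrate (64 kbps is used in the example, a typical spoken-word figure chosen here for illustration); the function names are hypothetical.

```typescript
// Bytes needed to hold `minutes` of audio at a constant bitrate.
function bufferBytes(bitrateKbps: number, minutes: number): number {
  const bytesPerSecond = (bitrateKbps * 1000) / 8;
  return bytesPerSecond * minutes * 60;
}

const PREFETCH_THRESHOLD = 0.8; // "80% complete"

// True once the listener has passed the threshold in the current chapter
// and the next chapter is not already cached.
function shouldPrefetchNext(
  positionSec: number,
  chapterDurationSec: number,
  nextAlreadyCached: boolean
): boolean {
  if (nextAlreadyCached || chapterDurationSec <= 0) return false;
  return positionSec / chapterDurationSec >= PREFETCH_THRESHOLD;
}
```

At 64 kbps, 30 minutes works out to 8,000 bytes/s × 1,800 s = 14,400,000 bytes, about 14.4 MB. That is small for disk but large for a decoded in-memory buffer, which is why buffered audio is usually cached in compressed form.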
Edge cases interviewers love
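- Audio session reactivation after repeated interruptions, such as an alarm that is snoozed and fires again
- User puts on AirPods after starting playback on speaker: follow the route change without restarting playback
- Headphones unplugged: pause (industry convention); iOS reports a route change with reason oldDeviceUnavailable, Android broadcasts ACTION_AUDIO_BECOMING_NOISY
- Bluetooth disconnect: pause and show a clear message
- App killed while listening: restore on relaunch with a “Continue listening” CTA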
Frequently Asked Questions
Audiobook vs podcast — what differs?
Audiobooks have stable position-of-record (you do not skip around as much), longer chapters, and stronger speed-control expectations. Podcasts have RSS-driven catalog updates and ad insertion. UX overlap is large; backend differs.
How do I handle DRM?
FairPlay on iOS, Widevine on Android. Decrypt at the playback boundary; never write decrypted audio to the file system. Licenses are renewed periodically, so handle expired-license errors gracefully.
What about variable narrator audio quality?
Encode at multiple bitrates; let the user pick, or auto-select based on Wi-Fi vs cellular. Apply LUFS-based loudness normalization so chapters do not vary in volume.