Suno is a leading AI music generation platform that produces full songs from a text prompt, including vocals, instrumentation, and lyrics. It raised a Series B in 2024. The interview emphasizes generative-audio model engineering, real-time inference, and the product engineering particular to a creative-tools platform.
Process
Recruiter screen → 60-minute coding screen (Python preferred for ML roles, TypeScript for product roles) → virtual onsite: 2 coding, 1 ML system design, 1 craft deep-dive, 1 behavioral. ML-research candidates get a research deep-dive. Typical cycle: 3–5 weeks.
What they actually ask
- Design a streaming music-generation API with chunked output
- Design a multi-tenant inference platform optimized for audio models
- Design an audio fingerprinting system for similarity / IP protection
- Coding: medium-hard DSA, often audio or pipeline framing
- Behavioral: ownership, taste, fast-moving creative startup
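For the streaming question in the list above, it can help to have a concrete picture of what "chunked output" means. The following is a minimal sketch, not Suno's actual API: the `AudioChunk` shape, the `fake_model` sine-wave stand-in, the 16 kHz sample rate, and the 250 ms chunk size are all assumptions made for illustration.

```python
import math
import struct
from dataclasses import dataclass
from typing import Iterator

SAMPLE_RATE = 16_000  # assumed sample rate for this sketch


@dataclass
class AudioChunk:
    seq: int       # position of the chunk in the stream, for ordering and resume
    pcm: bytes     # 16-bit little-endian mono PCM
    is_last: bool  # end-of-stream marker so the client knows when to stop


def fake_model(duration_s: float) -> Iterator[float]:
    """Hypothetical stand-in for a generative model: yields a 440 Hz sine."""
    total = int(duration_s * SAMPLE_RATE)
    for n in range(total):
        yield 0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)


def stream_chunks(samples: Iterator[float], chunk_ms: int = 250) -> Iterator[AudioChunk]:
    """Emit fixed-duration chunks as soon as each one is full."""
    chunk_len = SAMPLE_RATE * chunk_ms // 1000
    buf = []
    seq = 0
    for s in samples:
        # Clamp to [-1, 1] and convert to a signed 16-bit sample.
        buf.append(int(max(-1.0, min(1.0, s)) * 32767))
        if len(buf) == chunk_len:
            yield AudioChunk(seq, struct.pack(f"<{len(buf)}h", *buf), False)
            seq += 1
            buf = []
    # The final chunk carries any remainder (possibly empty) plus the end marker.
    yield AudioChunk(seq, struct.pack(f"<{len(buf)}h", *buf), True)


if __name__ == "__main__":
    for chunk in stream_chunks(fake_model(1.1)):
        print(chunk.seq, len(chunk.pcm), chunk.is_last)
```

In an interview answer, each chunk would typically be written to a chunked HTTP response, a server-sent event, or a WebSocket frame. The sequence number and explicit end marker are the parts worth discussing, since they are what make reconnects and ordering checks possible on the client.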
Levels and comp (2026)
- SE: $190K–$260K total (cash + meaningful equity)
- Senior SE: $270K–$370K total
- Staff / ML Research: $385K–$580K+ total at top of band
Prep priorities
- Be fluent in Python (research and serving); C++/CUDA is helpful for inference roles
- Understand audio model architectures (diffusion in latent audio space, autoregressive token models)
- Brush up on streaming inference, audio codecs (EnCodec, SoundStream), and DSP basics
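Both codecs named in the list above are built around residual vector quantization (RVQ): each stage quantizes whatever error the previous stage left behind. The toy sketch below illustrates only that idea. The two hand-picked codebooks and the 2-D input are assumptions for the example; real codecs learn their codebooks and apply them to encoder latents, not raw values.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest(codebook: Codebook, x: Vector) -> int:
    """Index of the codeword with the smallest squared distance to x."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)),
    )


def rvq_encode(x: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize x stage by stage; each stage encodes the remaining residual."""
    residual = list(x)
    codes = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes


def rvq_decode(codes: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct by summing the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out


# Hypothetical codebooks: a coarse stage followed by a finer correction stage.
COARSE = [[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
FINE = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, 0.0], [0.0, -0.25]]

if __name__ == "__main__":
    codes = rvq_encode([1.2, 0.9], [COARSE, FINE])
    print(codes, rvq_decode(codes, [COARSE, FINE]))
```

The point to be able to explain is the bitrate trade-off: dropping later stages lowers the bitrate at the cost of a coarser reconstruction, which is why the number of codebooks is a natural quality knob in a streaming setting.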
Frequently Asked Questions
Is Suno remote-friendly?
Hubs in Cambridge, MA (HQ), with remote hiring across the US. Many engineering and research roles are remote.
How does Suno compare to Udio or Stable Audio?
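Suno is the most consumer-facing of the three, with broad genre coverage. Udio, founded by former Google DeepMind researchers, is a close competitor. Stability AI's Stable Audio is available with open weights in its Stable Audio Open release. Suno's compensation is competitive among AI startups.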
What is the engineering culture?
Small, technically dense, taste-driven, fast-shipping. Strong music-domain knowledge among the team.