ElevenLabs raises $500M at $11B valuation, bets voice will replace keyboards

Voice AI infrastructure provider ElevenLabs closed a $500 million Series D at an $11 billion valuation, tripling its value in under a year. CEO Mati Staniszewski argues voice is becoming the primary interface for AI, not a feature. The company now powers conversational systems at Deutsche Telekom, Revolut, and Cisco.

The Deal

ElevenLabs raised $500 million at an $11 billion valuation this week, with ICONIQ Capital leading a round that tripled the voice AI company's previous valuation. Speaking at Web Summit in Doha, co-founder and CEO Mati Staniszewski positioned the funding around a specific thesis: voice is replacing keyboards and screens as the default interface for AI systems.

"Hopefully all our phones will go back in our pockets, and we can immerse ourselves in the real world around us, with voice as the mechanism that controls technology," Staniszewski said.

What They've Shipped

ElevenLabs runs a text-to-speech infrastructure platform supporting 100 languages, roughly 70 more than competing models. The company's v3 model handles emotional nuance and contextual understanding. Its Scribe v2 Realtime product provides high-accuracy transcription. The Agents Platform is already deployed at Deutsche Telekom, Revolut, and Cisco for customer-facing conversational systems.

The company also operates a Voice Marketplace where creators can license their voices; it has generated over $10 million in creator payouts. This dual model (creator tools plus enterprise infrastructure) creates a data flywheel, though whether the company can balance consumer virality against enterprise reliability remains to be proven at scale.

The Stack Evolution

Staniszewski described the company's technical roadmap as moving from a "cascade" model (speech-to-text, LLM processing, text-to-speech running sequentially) to an integrated duplex system handling real-time conversation with natural interruptions. He estimates this transition could happen by 2026.

This matters because latency remains the key barrier to deploying voice agents in production. Enterprises testing OpenAI's Realtime API and ElevenLabs streaming endpoints are treating sub-500 ms response times as table stakes for natural conversation. The company that solves turn-taking and interruption handling at scale is positioned to win the enterprise conversational AI infrastructure market.
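
To make the latency argument concrete, here is a minimal sketch of why a cascade struggles against a 500 ms target: the stages run one after another, so their delays add up. The per-stage numbers and the names `Stage`, `CASCADE`, and `within_budget` are hypothetical, chosen only for illustration; they are not measurements of ElevenLabs, OpenAI, or any other vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of a voice pipeline with an assumed latency in milliseconds."""
    name: str
    latency_ms: float


# Hypothetical latencies for a cascade pipeline (illustrative values only).
CASCADE = [
    Stage("speech-to-text", 150.0),
    Stage("LLM processing", 300.0),
    Stage("text-to-speech", 120.0),
]

BUDGET_MS = 500.0  # the sub-500 ms target discussed in the article


def cascade_latency_ms(stages):
    """Stages run sequentially, so end-to-end latency is the sum of all stages."""
    return sum(stage.latency_ms for stage in stages)


def within_budget(stages, budget_ms=BUDGET_MS):
    """Return True when the end-to-end latency stays under the budget."""
    return cascade_latency_ms(stages) < budget_ms


total = cascade_latency_ms(CASCADE)
print(f"cascade total: {total:.0f} ms, within budget: {within_budget(CASCADE)}")
```

With these assumed numbers the cascade totals 570 ms and misses the budget even though each stage looks fast on its own, which is the kind of pressure that would push a vendor toward an integrated system.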

Pattern Recognition

We've seen this movie before. OpenAI and Google both positioned voice as central to their next models. Apple is quietly acquiring voice-adjacent technology (Q.ai). The difference: ElevenLabs is infrastructure-first, not application-first. They're selling API access and concurrent request capacity to enterprises building their own voice agents, not competing with them.

The $11 billion valuation assumes voice becomes ubiquitous across enterprise applications. History suggests interfaces change slowly, then suddenly. Worth watching which enterprises move from pilots to production deployments in the next 12 months.