ElevenLabs Development History: From Voice AI Beta to Multilingual v2, Turbo, and Beyond

Voice recording and synthesis workflow central to ElevenLabs text-to-speech and voice cloning

ElevenLabs was founded in 2022 with a mission to make content universally accessible in any language and voice. After opening a public beta in early 2023, it quickly became known for natural, expressive AI speech that avoided the robotic tone of earlier TTS. The company then shipped Eleven Multilingual v2, Turbo models for low-latency streaming, voice cloning, sound effects, speech-to-text, and later music and conversational agents. This article traces ElevenLabs' development from beta to a full-stack voice and audio platform.

Timeline Overview

Date	Milestone	Details
2022	Founding	ElevenLabs co-founded by Piotr Dąbkowski and Mati Staniszewski; focus on deep learning speech synthesis
Jan 2023	Public beta	Public beta launched and drew attention for natural inflection, emotion, and expressiveness; $2M pre-seed round
Jun 2023	Series A	$19M Series A; platform growth and model improvements
Aug 2023	Out of beta / Multilingual v2	Official exit from beta; release of Eleven Multilingual v2 as a foundational model for ~29 languages with consistent voice and accent across languages
Jan 2024	Series B	$80M Series B; scaling infrastructure and product breadth
2024–2025	Turbo v2 / Turbo v2.5	Low-latency Turbo v2, followed by Turbo v2.5 with ~250–300 ms latency and 32 languages (adding Vietnamese, Hungarian, and Norwegian); up to ~3× faster than Turbo v2 in some cases
2025	Agents, Music, Scribe	Eleven v3 API; Agents platform (widgets, Twilio, knowledge base); Music Generation API; Scribe (speech-to-text); Global TTS preview
Later	Funding and scale	Series D $500M at ~$11B valuation; reported $330M+ ARR; expansion into ElevenAgents, ElevenCreative, ElevenAPI

Core Models and Products

Text-to-Speech

Eleven Multilingual v2: Foundational model for ~29 languages; automatic language detection; consistent voice and accent across languages; used for high-quality narration and dubbing
Eleven Turbo v2.5: Low-latency TTS in 32 languages (~250–300 ms); up to ~3× faster than Turbo v2 in some cases; added Vietnamese, Hungarian, and Norwegian
Eleven Flash v2.5: Ultra-low latency (~75ms), 32 languages, lower cost per character for real-time use cases
Eleven v3: Newer API model with improved quality and Text-to-Voice Design (custom voices from text descriptions)
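
The latency figures above suggest a simple way to choose a model for a given use case. The following is a minimal sketch, not an official selection rule: the latency numbers are the approximate ones quoted in this article, the `model_id` strings are assumed identifiers that should be checked against the current ElevenLabs documentation, and `pick_model` is a hypothetical helper. It also assumes that a slower tier is preferable whenever it still fits the latency budget. Eleven v3 is left out because no latency figure is quoted for it here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TtsModel:
    name: str
    model_id: str               # assumed identifier; verify against current docs
    languages: int              # approximate language count quoted in this article
    latency_ms: Optional[int]   # upper end of the quoted range; None if not quoted


# Figures taken from the model list above (all approximate).
MODELS = [
    TtsModel("Eleven Flash v2.5", "eleven_flash_v2_5", 32, 75),
    TtsModel("Eleven Turbo v2.5", "eleven_turbo_v2_5", 32, 300),
    TtsModel("Eleven Multilingual v2", "eleven_multilingual_v2", 29, None),
]


def pick_model(latency_budget_ms: Optional[int]) -> TtsModel:
    """Pick a model for a latency budget (hypothetical helper).

    With no budget (offline narration, dubbing), return Multilingual v2.
    Otherwise return the slowest model whose quoted latency still fits,
    on the assumption that slower tiers trade latency for quality.
    """
    if latency_budget_ms is None:
        return next(m for m in MODELS if m.latency_ms is None)
    candidates = [
        m for m in MODELS
        if m.latency_ms is not None and m.latency_ms <= latency_budget_ms
    ]
    if not candidates:
        raise ValueError(f"no listed model fits a {latency_budget_ms} ms budget")
    return max(candidates, key=lambda m: m.latency_ms)
```

For example, `pick_model(100)` selects Flash v2.5, `pick_model(500)` selects Turbo v2.5, and `pick_model(None)` falls back to Multilingual v2. Real-world latency also depends on network distance and text length, so these thresholds are only a starting point.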

Content creation and audiobook-style workflows powered by ElevenLabs voice AI

Voice Cloning and Design

ElevenLabs offers instant and professional voice cloning so users can create digital voices that speak in nearly 30 languages. Voice design and a library of 3,000+ community voices support both custom and pre-made options for ads, audiobooks, and video.

Beyond TTS

Speech-to-text (Scribe): Transcription with diarization and background noise reduction
Sound effects: AI-generated sound effects for video and games
Music generation: AI music composition and streaming API for paid users
ElevenAgents: Conversational AI with customizable widgets, Twilio outbound calling, knowledge bases

Ecosystem and Access

ElevenLabs is available through the web app, the API, and integrations (e.g. Twilio and various no-code tools). Developers use the API for TTS and voice cloning and, on paid plans, for music and agents. Platforms like FuseAI Tools bundle ElevenLabs models for text-to-speech, speech-to-text, sound effects, and audio isolation so users can try them without managing API keys.
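
To make the API route concrete, here is a minimal sketch of how a text-to-speech request might be assembled with only the Python standard library. It assumes the public REST endpoint `https://api.elevenlabs.io/v1/text-to-speech/{voice_id}`, the `xi-api-key` header, and a JSON body with `text` and `model_id`; all of these should be confirmed against the current API reference. `build_tts_request` is a hypothetical helper, and the API key and voice ID used with it are placeholders. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed base URL of the ElevenLabs REST API; verify before use.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(
    api_key: str,
    voice_id: str,
    text: str,
    model_id: str = "eleven_multilingual_v2",  # assumed model identifier
) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech POST request."""
    url = f"{API_BASE}/text-to-speech/{voice_id}"
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,            # assumed auth header name
            "Content-Type": "application/json",
        },
    )
```

Sending the request would then look like `urllib.request.urlopen(req).read()`, with the returned bytes written to an audio file. The response format and available options depend on the plan and endpoint parameters, so treat this as a starting point rather than a complete client.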

Summary

ElevenLabs grew from a 2022 founding and 2023 beta into one of the leading voice AI companies. Eleven Multilingual v2 and the Turbo line established high-quality, low-latency TTS in dozens of languages; voice cloning and sound effects extended the stack; Scribe, Music, and Agents turned it into a broad audio and conversation platform. With significant funding and reported nine-figure ARR, ElevenLabs continues to push voice and audio AI for global, multilingual content.

Key Takeaways

Founded 2022; public beta January 2023; exited beta with Eleven Multilingual v2 (≈29 languages).
Turbo v2.5 and Flash v2.5 deliver low-latency TTS in 32 languages (about 250–300 ms for Turbo, about 75 ms for Flash).
Voice cloning, sound effects, Scribe (STT), Music, and Agents expand beyond core TTS.
Backed by a16z, Sequoia, and others; Series D $500M at ~$11B valuation; $330M+ ARR reported.