🌱 Founding Story

ElevenLabs was co-founded in April 2022 by Mateusz Staniszewski (CEO) and Piotr Dąbkowski (CTO). Both founders are originally from Poland: Dąbkowski was formerly a machine learning engineer at Google, and Staniszewski was previously a deployment strategist at Palantir. They were motivated to start ElevenLabs by their experiences with poorly dubbed American films, which lacked the liveliness and authenticity of the original voices.

In a January 2023 interview with Concept Ventures, Staniszewski and Dąbkowski described the company's origin as follows:

“We grew up in Poland where we watched too many voiceover (VO) movies as kids. VOs were flat and monotonous, and even though you could scarcely hear the actors beneath, we’ve always been struck by how lively their actual voices seemed in comparison. Then dubbing came, still often poorly produced, but we kept imagining the actors actually being able to speak our language. Fast forward some 15 years and we both found ourselves quitting our jobs at Google and Palantir, and founded Eleven to solve this problem once and for all.”

Mati (left) & Piotr (right). Source: Concept Ventures

<aside> 💥 ElevenLabs is a voice technology research company, developing the most compelling AI speech software for publishers and creators. In the same interview with Concept Ventures, Staniszewski and Dąbkowski gave an example of how and when their product would be used.

</aside>

“Imagine you’re a YouTuber who creates videos on astronomy. You record in your native language, let’s say in English, and this limits your audience to English-speakers. To solve this problem, we automatically dub your video and produce one where you speak native-grade Spanish, in your own voice and with your emotions preserved, matching the original editing of the visuals - all without sounding robotic. Now imagine a future where all audio content is accessible in any language, in high production quality - across movies, TV, advertising, podcasts, audiobooks, streaming, gaming or real-time conversation - that’s what we're tackling.”

Staniszewski and Dąbkowski’s approach was to develop tools that preserve the distinctive features of a speaker’s voice and tone across different languages. The technology aimed to overcome the limitations of traditional dubbing methods and give audiences a more authentic and engaging experience. Both co-founders envisioned a future where all audio content, including movies, TV shows, and podcasts, could be accessible in any language while preserving the speaker's emotions. After coming out of beta in August, the team enhanced its deep learning model, enabling the tool to recognize text and generate speech in more than 25 languages.

**ElevenLabs secured $80 million in a Series B funding round** co-led by investors including Andreessen Horowitz, Nat Friedman (former CEO of GitHub), and Daniel Gross (co-founder of Cue).

Products

Since launch, ElevenLabs has rapidly gained a reputation as a leading AI text-to-speech generator. The platform offers both free and premium services, allowing users to produce realistic-sounding speech, create personalized AI voices, and clone existing voices.

🗣️ Speech Synthesis:

Speech Synthesis is ElevenLabs' browser-based, AI-assisted text-to-speech software, capable of producing lifelike speech by synthesizing vocal emotion and intonation. The software adjusts the intonation and pacing of delivery based on the context of the input text, using algorithms to analyze that context and detect emotions like anger, sadness, happiness, or alarm, which results in more realistic, human-like speech inflection. The product offers high-quality voices in different languages and accents, so users can select the voices that best fit their needs, and it also supports custom voice generation and real-time processing.

https://x.com/elevenlabsio/status/1651697236533735424?s=20

📚 Voice Library:

Through Voice Design technology and VoiceLab, users can clone voices from short audio snippets or create entirely new synthetic voices. ElevenLabs offers several speech synthesis models, including English v1, multilingual v1 (experimental), and multilingual v2. Each model has its own strengths and is designed for different priorities, such as stability, language diversity, or low latency.

The ElevenLabs website states:

“Voice Library is not just a repository; it's a vibrant community platform for discovery and sharing. You can equally browse and use synthetic voices shared by others to uncover possibilities for your own use-cases. Whether you're crafting an audiobook, designing a video game character, or adding a new dimension to your content, Voice Library offers unbounded potential for discovery. Hear a voice you like? Simply add it to your VoiceLab.”