News

"VibeVoice is a novel framework designed for generating expressive, long-form, multi-speaker conversational audio, such as ...
Text-to-speech models from ElevenLabs, Hume AI, and Descript are all pushing the limits of AI-generated voice technology.
In its initial announcement, Google didn't say if and when the feature would make its way to the Google Docs app. Code sleuth ...
Creating voice agents just got a whole lot easier, thanks to OpenAI's latest speech-to-speech model, GPT-Realtime.
VibeVoice is a new open-source AI tool that can generate a full 90-minute audio podcast recording with multiple speakers from ...
The software company ElevenLabs has launched an AI text-to-speech app for audiobooks, enabling writers to sell audiobooks ...
Aura-2 Beats ElevenLabs, Cartesia, and OpenAI in Preference Testing for Conversational Enterprise Use Cases, Delivering Natural, Context-Aware Speech Synthesis with Unmatched Clarity, Speed, and ...
"We used a pretrained text-to-speech model to generate audio and simulate a target," said Cho. "And we also used Ann's pre-injury voice, so when we decode the output, it sounds more like her." ...
Deepgram, the leading voice AI platform for enterprise use cases, today announced Aura-2, its next-generation text-to-speech (TTS) model purpose-built for re ...
The Realtime API supports multimodal text and speech experiences, including natural speech-to-speech conversations using preset voices already supported in the API.