ElevenLabs has introduced Scribe, a speech-to-text model that transcribes audio with high accuracy across 99 languages. It provides word-level timestamps, speaker identification, and tagging of non-verbal sounds such as laughter or music. Scribe is built to handle the challenges of real-world audio, which makes it useful for subtitles, searchable podcasts, and multilingual transcription.
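To make the subtitle use case concrete, here is a minimal sketch of how word-level timestamps and speaker labels could be grouped into SubRip (SRT) cues. The `Word` record, its field names, and the `speaker_0` style labels are assumptions made for illustration; they are not Scribe's actual response format, and no ElevenLabs API call is shown.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record; field names are assumed, not Scribe's schema."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    speaker: str


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group words into cues, starting a new cue when the speaker
    changes or the current cue already holds max_words words."""
    cues: list[list[Word]] = []
    current: list[Word] = []
    for w in words:
        if current and (len(current) >= max_words
                        or w.speaker != current[-1].speaker):
            cues.append(current)
            current = []
        current.append(w)
    if current:
        cues.append(current)

    blocks = []
    for i, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        blocks.append(
            f"{i}\n"
            f"{srt_time(cue[0].start)} --> {srt_time(cue[-1].end)}\n"
            f"[{cue[0].speaker}] {text}\n"
        )
    return "\n".join(blocks)


# Made-up sample data, including a non-verbal event tagged as a "word".
sample = [
    Word("Hello", 0.0, 0.4, "speaker_0"),
    Word("there", 0.5, 0.9, "speaker_0"),
    Word("(laughter)", 1.0, 1.8, "speaker_1"),
]
print(words_to_srt(sample))
```

Splitting on speaker changes keeps each cue attributable to one voice; the seven-word limit is an arbitrary readability choice and would likely need tuning per language.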
Scribe costs $0.40 per hour of pre-recorded audio transcribed, and a real-time version is coming soon. ElevenLabs also aims to improve accessibility for languages that currently have few speech recognition options. Learn more on the ElevenLabs blog.
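As a rough illustration of the quoted price, the sketch below estimates the cost of a transcription job from its audio duration. It assumes the $0.40 per hour rate is prorated by the second, which the announcement does not state; the function name `estimate_cost` is hypothetical, and actual billing may use different rounding or plan-based pricing.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_HOUR = Decimal("0.40")  # USD per hour of audio, as quoted above


def estimate_cost(duration_seconds: int) -> Decimal:
    """Estimate the cost in USD, assuming per-second proration (an assumption)."""
    cost = RATE_PER_HOUR * Decimal(duration_seconds) / Decimal(3600)
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost(90 * 60))     # a 90-minute podcast episode
print(estimate_cost(100 * 3600))  # a 100-hour archive
```

Under that assumption, a 90-minute episode comes to $0.60 and a 100-hour archive to $40.00.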