Hertz-dev is an open-source, first-of-its-kind base model for full-duplex conversational audio.
say is always on, recording and transcribing your voice 24/7. Whenever inspiration strikes, just say it.
A distilled variant of Whisper for speech recognition: 6x faster and 50% smaller, with a word error rate within 1% of the original model.
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is a multitask model that can perform multilingual speech recognition, speech translation, and language identification.
Audio and video transcription (speech-to-text).
Smarter subtitling and transcription.
We combine artificial and human intelligence to bring you accurate and fast transcripts, captions, and translated subtitles with ease.