Search: [speech-recognition] - Biapy Web Directory

Hertz-dev https://github.com/Standard-Intelligence/hertz-dev

Tue Nov 5 15:30:34 2024

Hertz-dev is an open-source, first-of-its-kind base model for full-duplex conversational audio.

🍓 Ichigo https://github.com/homebrewltd/ichigo

Fri Oct 18 14:22:00 2024

Llama3.1 learns to Listen. Local real-time voice AI (Formerly llama3-s).

🍓 Ichigo is an open, ongoing research experiment to give a text-based LLM native "listening" ability. Think of it as an open-data, open-weight, on-device Siri.

say https://github.com/8ta4/say

Fri Aug 2 15:10:39 2024

say is always on, recording and transcribing your voice 24/7. Whenever inspiration strikes, just say it.

Distil-Whisper https://github.com/huggingface/distil-whisper

Thu Dec 7 10:09:15 2023

Distilled variant of Whisper for speech recognition: 6x faster, 50% smaller, and within 1% word error rate (WER) of the original model.

AI Transcriptions by Riverside https://riverside.fm/transcription

Mon Jul 17 08:42:58 2023

Accurate AI Transcriptions in Minutes.

Web service that uses AI to transcribe video and audio content.

Whisper https://openai.com/index/whisper/

Wed Mar 15 08:26:45 2023

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

Whisper @ GitHub.

Amberscript https://www.amberscript.com/en/

Mon Jan 16 08:28:16 2023

Audio & Video Transcription | Speech-to-text.
Smarter subtitling and transcription.
We combine artificial and human intelligence to bring you accurate and fast transcripts, captions, and translated subtitles with ease.

Whisper https://github.com/openai/whisper

Fri Sep 23 11:32:29 2022

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
