GitHub - borayil/stream2whisper: lightweight low latency URL to transcription process with yt-dlp, ffmpeg and OpenAI gpt-realtime-whisper

Disclaimer: This software is provided for educational and research purposes only. Users are solely responsible for ensuring their use complies with all relevant rules and regulations.

install & use

pip install -r requirements.txt

import asyncio
from stream2whisper import Stream2Whisper

OPENAI_API_KEY = "your-api-key-here"
app = Stream2Whisper(OPENAI_API_KEY)
asyncio.run(app.run(stream_url="https://www.youtube.com/@Markets/live", language="en"))

For flexibility and transcribing multiple streams, use the worker.py multiprocess approach.

export OPENAI_API_KEY="your-key-here";
python3 worker.py --stream-url https://www.youtube.com/@TheBurntPeanut/live
python3 worker.py --stream-url https://www.youtube.com/@YahooFinance/live --language en
python3 worker.py --stream-url https://www.youtube.com/@Halktvkanali/live --language tr

Save the transcription outputs to files

python3 worker.py --stream-url https://www.youtube.com/@Markets/live >> markets.txt # appends
python3 worker.py --stream-url https://www.youtube.com/@Markets/live >> log.txt 2>&1 # combine stdout and stderr

You can monitor the stdout, the file you wrote to or event.delta in the code (edit stream2whisper.py)

what is stream2whisper?

it is a lightweight, high level yet low latency Python implementation that uses yt-dlp, ffmpeg and OpenAI gpt-realtime-whisper model for getting transcriptions of livestreams on supported sites such as YouTube, Twitch, Kick and more.

requirements

internet connection
OPENAI_API_KEY
python3.11+
$0.017 per minute of audio transcribed

why?

this is for lightweight, high level use cases without bothering with CUDA, ML libraries, all the different whisper engines, audio libraries and whatnot.

To self host whisper yourself, see openai/whisper and ggml-org/whisper.cpp. If you do not have CUDA GPUs, there are CPU and Mac alternatives of course. For example, on Apple Silicon M1+ Macs, mlx-whisper is quite efficient and works great.

Many household GPUs can handle certain whisper models and workloads but for certain use cases, highest accuracy with the lowest latency at scale is needed. This is an idea in that direction but at the cost of credits.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
stream2whisper.py		stream2whisper.py
worker.py		worker.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

install & use

what is stream2whisper?

requirements

why?

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

install & use

what is stream2whisper?

requirements

why?

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages