Paste a YouTube URL and the bot finds highlight-worthy moments, renders them as 9:16 Shorts/Reels/TikTok clips with speech-only subtitles, and sends MP4 files back.
- AI-powered clip scoring (optional) β when an LLM API key is configured, uses GPT-4o-mini (or any OpenAI-compatible model) to evaluate transcript segments for teaching value, engagement, and standalone clarity. Falls back to keyword-based scoring when LLM is unavailable.
- Topic segmentation β splits transcript at natural boundaries (pauses, transitions, semantic shifts) so each clip is a coherent topic unit.
- Speaker-aware face tracking β detects faces during speech segments, uses mouth motion analysis to identify the active speaker, and crops vertically to frame the talking person. Avoids following silent listeners.
- Bilingual support β optimized for Indonesian + English captions with extensive phrase detection for both languages.
- Smart download β downloads only the relevant video sections, not the full video.
python3 -m venv .venv
. .venv/bin/activate
pip install -e '.[dev,llm]' # llm extra enables AI scoring
cp .env.example .env
# Edit .env β set BOT_TOKEN (required) and LLM_API_KEY (optional)
python -m youtube_clipper_bot.botCopy .env.example to .env and configure:
| Variable | Required | Description |
|---|---|---|
BOT_TOKEN |
Yes | Telegram bot token |
LLM_API_KEY |
No | OpenAI-compatible API key. Set to enable AI clip scoring. |
LLM_BASE_URL |
No | API base URL (default: https://api.openai.com/v1) |
LLM_MODEL |
No | Model name (default: gpt-4o-mini) |
CLIP_COUNT |
No | Number of clips to extract (default: 3) |
MIN_CLIP_SECONDS |
No | Minimum clip duration (default: 35) |
MAX_CLIP_SECONDS |
No | Maximum clip duration (default: 120) |
OUTPUT_FORMAT |
No | vertical (default), horizontal, or both |
BURN_SUBTITLES |
No | Burn subtitles into video (default: true) |
FACE_TRACKING |
No | Enable speaker-aware face tracking (default: true) |
When LLM_API_KEY is set, the bot:
- Sends all transcript segments to the LLM in one batch call (~$0.002 per video)
- Gets back quality scores (1-10), titles, summaries, and reasons
- Uses LLM scores for clip selection instead of keyword counting
- Generates compelling titles via LLM instead of extracting sentences
When LLM is unavailable (no key, API error, missing package), falls back to keyword-based scoring seamlessly.
- OpenAI (GPT-4o-mini recommended for cost/speed)
- Any OpenAI-compatible API (set
LLM_BASE_URL) - Local models via vLLM, Ollama, etc.
Source videos are deleted after rendering. Each clip is deleted after Telegram confirms upload. The whole job folder is removed after the job finishes.
. .venv/bin/activate
pip install -e '.[dev,llm]'
pytest tests/ -v