lukesorvik/brainrotinator

Brainrotinator

Podcast Clip Automation Using AI

  • Edits long form content into clips with subtitles using FFmpeg (libass for burn-in)
  • Web UI built with Gradio for editing — CLI still works for headless / cron use
  • Transcribes audio using Vosk or Whisper Models (your choice)
  • Mutes audio where profanity is detected using FFmpeg's volume filter driven by SRT timestamps
  • Uses the TinyLlama LLM to generate titles from the transcription for YouTube and Instagram.
  • Automatically uploads to YouTube, Instagram, and TikTok on the schedule given in the config file, using Selenium with Firefox.
  • Downloads videos from YouTube from a given URL using yt-dlp
  • Thank you Timofei for the inspiration and name of the project.
Table of Contents
  1. About The Project
  2. Architecture
  3. Getting Started
  4. Running the Program
  5. Things to note
  6. Vosk or Whisper
  7. License
  8. Contact
  9. Acknowledgments

About The Project

Video Created and Uploaded Using Brainrotinator

Tuckers.Takes.You.Inside.the.Mind.of.Howard.Vice.The.Trapdoor.That.Was.Part.of.the.Sphinx.s.Drill.mp4

Watch the demo on youtube

Original Video

I initially made this as a joke. I have been editing YouTube videos myself for around 10 years, and I wanted to see if I could use Python to automate the horrible podcast clips I see on YouTube Shorts.

It was really fun working with AI models to build some cool features for this project.

New Gradio Layout:

image

Architecture

The editor is FFmpeg-only as of the v2 rewrite. moviepy, ImageMagick, and cleanvid have all been removed. Setup is dramatically simpler — pip install -r requirements.txt and a working ffmpeg binary is enough to edit videos.
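
To make the "FFmpeg-only" idea concrete, here is a minimal sketch of how a single FFmpeg invocation could center-crop a clip to 9:16 and burn in an ASS subtitle file. The function name `build_edit_command` and the example paths are hypothetical, not the actual wrappers in ffmpeg_ops.py. It also assumes plain relative paths that need no filtergraph escaping.

```python
def build_edit_command(src: str, ass_file: str, dst: str,
                       fonts_dir: str = "fonts/") -> list[str]:
    """Build an ffmpeg argument list: center-crop to 9:16, then burn in subtitles."""
    video_filters = ",".join([
        # crop=w:h with x/y omitted centers the crop window by default.
        "crop=ih*9/16:ih",
        # The libass-backed `ass` filter; fontsdir points it at bundled fonts.
        f"ass={ass_file}:fontsdir={fonts_dir}",
    ])
    return [
        "ffmpeg", "-y",          # overwrite the output if it already exists
        "-i", src,
        "-vf", video_filters,
        "-c:a", "copy",          # leave the audio stream untouched in this sketch
        dst,
    ]


cmd = build_edit_command("to_split/in.mp4", "subtitles/in.ass", "done_split/out.mp4")
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Paths containing colons or quotes (common on Windows) would need escaping before being placed in the filter string.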

Application Flow

flowchart TD
    subgraph Input
        A1[Upload MP4] 
        A2[YouTube URL\nyt-dlp download]
        A3[Existing file\nin to_split/]
    end

    subgraph UI["Entry Points"]
        B1[app.py\nGradio Web UI]
        B2[main.py\nCLI]
    end

    subgraph Editor["brainrotinator/ — VideoEditor"]
        C1[Split into chunks\nvideo_editor.py]
        C2{Blur mode?}
        C3[Blur letterbox\nffmpeg_ops.py]
        C4[Center crop 9:16\nffmpeg_ops.py]
        C5[Transcribe audio\ntranscribe.py]
        C6{Whisper\nor Vosk?}
        C7[Whisper model]
        C8[Vosk model]
        C9[Generate SRT\nsubtitles.py]
        C10[Detect profanity\nprofanity.py / swears.txt]
        C11[Burn subtitles\nlibass / ffmpeg_ops.py]
        C12[Mute profanity\nFFmpeg volume filter]
        C13[Generate title\nTinyLlama LLM]
    end

    subgraph Output["done_split/"]
        D1[Final MP4 clips\nwith subtitles]
    end

    subgraph Uploaders
        E1[YouTube\nyoutube_uploader_selenium]
        E2[Instagram\nInstagram_Uploader]
        E3[TikTok\nupload_tiktok.py]
    end

    A1 & A2 & A3 --> B1
    A1 & A2 & A3 --> B2
    B1 & B2 --> C1
    C1 --> C2
    C2 -->|Yes| C3
    C2 -->|No| C4
    C3 & C4 --> C5
    C5 --> C6
    C6 -->|Whisper| C7
    C6 -->|Vosk| C8
    C7 & C8 --> C9
    C9 --> C10
    C10 --> C11
    C10 --> C12
    C11 & C12 --> C13
    C13 --> D1
    D1 --> E1 & E2 & E3
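
The first step in the flow, splitting into chunks, can be sketched as follows. This is an illustration only: `plan_chunks` and `cut_args` are hypothetical names, and keeping a shorter final chunk is an assumption about behavior that the real video_editor.py may handle differently.

```python
def plan_chunks(total_seconds: float, chunk_seconds: float = 58) -> list[tuple[float, float]]:
    """Return (start, length) pairs that cover the whole video."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    chunks = []
    start = 0.0
    while start < total_seconds:
        # The last chunk is shortened so it does not run past the end.
        length = min(chunk_seconds, total_seconds - start)
        chunks.append((start, length))
        start += chunk_seconds
    return chunks


def cut_args(src: str, dst: str, start: float, length: float) -> list[str]:
    """ffmpeg arguments for one chunk; -ss/-t before -i apply to the input."""
    return ["ffmpeg", "-y", "-ss", f"{start:.3f}", "-t", f"{length:.3f}", "-i", src, dst]


for start, length in plan_chunks(130, 58):
    print(start, length)
```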

Repo Layout

brainrotinator/                # Editor package — pure FFmpeg
  ffmpeg_ops.py                # Cut, crop, blur, burn-in, mute wrappers
  subtitles.py                 # SRT → styled ASS, profanity → mute-range list
  transcribe.py                # Vosk / Whisper / TinyLlama (lazy, resumable)
  video_editor.py              # Per-chunk orchestration
  profanity.py                 # Text-profanity censor
uploaders/                     # Selenium-based uploaders
  login.py, uploader_selenium.py, upload_tiktok.py
  Instagram_Uploader/, youtube_uploader_selenium/
downloader/                    # yt-dlp downloader module
  downloadVid.py, combineAudioVideo.py
assets/                        # Static files
  fonts/, swears.txt, title.txt
to_split/, done_split/, subtitles/   # Runtime media directories
models/                        # Persistent AI model storage (Vosk/TinyLlama/Whisper)
app.py                         # Gradio Web UI entrypoint
main.py                        # CLI entrypoint
config.py / config.json        # Pydantic configuration model
Dockerfile / docker-compose.yml # Containerization setup
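
As an illustration of the "mute-range list" that subtitles.py hands to the mute wrapper, the sketch below turns a list of (start, end) ranges into one FFmpeg `volume` filter using timeline editing (`enable=`). The helper name `build_mute_filter` is hypothetical, and the real ffmpeg_ops.py may build its filter differently.

```python
def build_mute_filter(ranges: list[tuple[float, float]]) -> str:
    """Return an -af filter string that silences audio inside each range."""
    if not ranges:
        # anull is FFmpeg's pass-through audio filter.
        return "anull"
    # between(t,a,b) is 1 inside [a, b]; summing the terms acts as a logical OR.
    windows = "+".join(f"between(t,{start:.3f},{end:.3f})" for start, end in ranges)
    # The single quotes keep the commas from being read as filter separators.
    return f"volume=enable='{windows}':volume=0"


print(build_mute_filter([(1.0, 1.5), (3.25, 3.6)]))
```

The resulting string would be passed as the value of `-af` in an argument list, so no shell quoting is involved.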

(back to top)

Getting Started

Editing requires only Python, FFmpeg (with libass), and ~8 GB of disk for models on first run. The selenium uploaders additionally need Firefox + geckodriver and per-platform cookies.

Install with Docker

The best way to run Brainrotinator with Docker is using Docker Compose. This automatically handles mounting the to_split, done_split, and models directories so your files and AI models are saved locally on your machine, working seamlessly across Windows, Mac, and Linux without complex path variables.

  1. Ensure your config.json points the AI models to the synced models folder:

    "voskModelDir": "models",
    "tinyLlamaDir": "models"
  2. Build and start the container in the background. (Use --build the first time you run this, or after pulling in new code updates):

    docker compose up -d --build

    Note: On subsequent runs, you can just use docker compose up -d to start the container without Docker checking for build updates.

Then open http://localhost:7860.

To view logs or access the container shell:

  • Logs: docker compose logs -f
  • Shell: docker compose exec brainrotinator bash

If you'll use the uploader, run python login.py on your host first (it needs a GUI) so cookies are present in the mounted volume before the container starts.

Save output locally without Gradio

You can also run the CLI editor directly via Docker Compose (bypassing the web UI). Finished clips land in done_split/ on your host machine, so no browser download is needed:

docker compose run --rm brainrotinator python main.py -e

Drop your source mp4s into to_split/ before running.

(back to top)

Install without Docker

Prerequisites

  • Python 3.10+
  • FFmpeg with libass (ffmpeg -filters | grep " ass " should list it; most distro packages and the official Windows builds include it)
  • ~10 GB VRAM if you'll use Whisper; CPU is fine for Vosk
  • ~4 GB for the TinyLlama model, ~4 GB for the Vosk model (downloaded automatically on first use)
  • Firefox + geckodriver — only if you'll use the uploader
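
The FFmpeg prerequisite can also be checked from Python. The following is a rough sketch, not part of the repo: `has_ass_filter` and `check_ffmpeg` are hypothetical helpers, and the parsing assumes the usual `flags name in->out description` layout of `ffmpeg -filters` output.

```python
import shutil
import subprocess


def has_ass_filter(filters_output: str) -> bool:
    """Return True if the `ass` filter appears in `ffmpeg -filters` output."""
    for line in filters_output.splitlines():
        fields = line.split()
        # Each filter row looks like: <flags> <name> <in->out> <description...>
        if len(fields) >= 2 and fields[1] == "ass":
            return True
    return False


def check_ffmpeg() -> bool:
    """Return True if ffmpeg is on PATH and reports the libass-backed filter."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return False
    result = subprocess.run([exe, "-hide_banner", "-filters"],
                            capture_output=True, text=True, check=False)
    return has_ass_filter(result.stdout)


print("ffmpeg with libass available:", check_ffmpeg())
```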

Steps

  1. Clone the repo and cd in.
  2. Install Python deps:
    pip install -r requirements.txt
  3. Make sure ffmpeg is on your PATH. No IMAGEMAGICK_BINARY / FFMPEG_BINARY env vars are needed anymore.
  4. (Uploader only) install geckodriver v0.32.0 and put it on your PATH, then run python login.py once on a machine with a GUI to capture cookies.
  5. Launch the Gradio UI or the CLI as described under Running the Program.

Running the Program

Gradio UI

python app.py

Tabs:

  • Edit — upload an mp4 or paste a YouTube URL, set chunk length / blur / Vosk-vs-Whisper / profanity filter, watch logs stream as the splitter runs.
  • Library — list everything in done_split/. Click a filename to download it to your computer. Delete clips you don't want.
  • Settings — edit config.json in-browser, validated against the pydantic schema before save.
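
The validate-before-save step in the Settings tab could look roughly like the sketch below. The project uses a pydantic model for this. To keep the example dependency-free it substitutes a stdlib dataclass, covers only four of the keys, and takes its defaults from the sample config shown later in this README, so `EditorSettings` and `validate_settings` are illustrative names rather than the real config.Config.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class EditorSettings:
    # Defaults copied from the sample config; the real defaults may differ.
    chunkDuration: int = 58
    blurTopBottomOfClip: bool = True
    useWhisperForTranscription: bool = False
    filterProfanityInSubtitles: bool = False


def validate_settings(text: str) -> dict:
    """Parse JSON text, type-check known keys, and fill in missing ones."""
    raw = json.loads(text)  # malformed JSON raises a ValueError subclass
    if not isinstance(raw, dict):
        raise ValueError("config must be a JSON object")
    defaults = EditorSettings()
    merged = asdict(defaults)
    for f in fields(EditorSettings):
        if f.name not in raw:
            continue  # keep the default
        value = raw[f.name]
        expected = type(getattr(defaults, f.name))
        # Exact type check, because bool is a subclass of int in Python.
        if type(value) is not expected:
            raise ValueError(f"{f.name} must be {expected.__name__}")
        merged[f.name] = value
    if merged["chunkDuration"] <= 0:
        raise ValueError("chunkDuration must be positive")
    return merged


print(validate_settings('{"chunkDuration": 30}'))
```

Only a result that passes validation would be written back to config.json. Keys this sketch does not know about are ignored.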

CLI

python main.py        # default loop: edit one video → upload from done_split → repeat
python main.py -e     # edit only (consume to_split/, write to done_split/)
python main.py -u     # upload only (consume done_split/ on the schedule in config.json)

When the editor runs out of videos in to_split/, the CLI prompts for a YouTube URL and downloads it via yt-dlp.
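
For readers curious how the three modes might be selected, here is a small argparse sketch. It is hypothetical: the function name `parse_mode` and the choice to make -e and -u mutually exclusive are assumptions, and main.py may parse its flags differently.

```python
import argparse


def parse_mode(argv: list[str]) -> str:
    """Map command-line flags to one of: 'loop', 'edit', 'upload'."""
    parser = argparse.ArgumentParser(prog="main.py")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-e", action="store_true", help="edit only")
    group.add_argument("-u", action="store_true", help="upload only")
    args = parser.parse_args(argv)
    if args.e:
        return "edit"
    if args.u:
        return "upload"
    return "loop"  # default: edit one video, upload, repeat


print(parse_mode(["-e"]))
```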

Config

config.json is now validated by config.Config (config.py). Defaults are filled in for any missing keys.

{
    "tags": ["chuckle Sandwich", "jschlatt", "ted nivison", "slimecicle", "gaming", "comedy"],
    "description": "#shorts",

    "howManyUploads": 1,
    "howManyHoursBetweenSchedule": 0,
    "howManyMinsBetweenUpload": 5,
    "howManyHoursLongToSleep": 23,
    "sleepXMinsBeforeStartingUploader": 0,

    "chunkDuration": 58,
    "blurTopBottomOfClip": true,
    "useWhisperForTranscription": false,
    "filterProfanityInSubtitles": false,

    "uploadToYoutube": true,
    "uploadToInstagram": true,
    "uploadToTiktok": false,
    "firefoxHeadless": true,

    "voskModelDir": "",
    "tinyLlamaDir": ""
}
| Key | Meaning |
| --- | --- |
| tags | YouTube tags; also used as #hashtags appended to the IG/TikTok caption |
| description | YouTube description; prepended before tags for IG/TikTok |
| howManyUploads | Uploads per cycle before sleeping howManyHoursLongToSleep |
| howManyHoursBetweenSchedule | Hours between each scheduled upload (YouTube/TikTok only; IG ignores) |
| howManyMinsBetweenUpload | Base delay between uploads, plus a 0–5 min jitter |
| chunkDuration | Clip length in seconds |
| blurTopBottomOfClip | true = blurred letterbox; false = center-crop to 9:16 |
| useWhisperForTranscription | true = Whisper, false = Vosk |
| filterProfanityInSubtitles | Censor swears in burned-in subtitles (audio is muted regardless) |
| firefoxHeadless | Must be true inside Docker (no display) |
| voskModelDir, tinyLlamaDir | Where to cache models. Empty = current working directory |
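
Two of the keys above describe small computations: the jittered delay between uploads and the IG/TikTok caption assembled from description and tags. The sketch below shows one way these might work. The function names are hypothetical, the uniform jitter is an assumption, and so is stripping spaces from multi-word tags, since the README does not say how such tags become hashtags.

```python
import random
from typing import Optional


def upload_delay_seconds(base_minutes: float, rng: Optional[random.Random] = None) -> float:
    """Base delay plus a random 0-5 minute jitter, returned in seconds."""
    rng = rng or random.Random()
    jitter_minutes = rng.uniform(0, 5)
    return (base_minutes + jitter_minutes) * 60


def build_caption(description: str, tags: list[str]) -> str:
    """Description first, then each tag as a #hashtag (spaces removed)."""
    hashtags = " ".join("#" + tag.replace(" ", "") for tag in tags)
    return f"{description} {hashtags}".strip()


print(build_caption("#shorts", ["chuckle Sandwich", "gaming"]))
print(round(upload_delay_seconds(5)), "seconds until the next upload")
```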

(back to top)

Things to note

  • Editor vs uploader: the editor is FFmpeg-only and should keep working indefinitely. The selenium uploaders depend on YouTube/IG/TikTok DOM layout and will break when those sites change. I am not maintaining them.
  • swears.txt is the source of truth for what gets muted. Add or remove words to taste. Matching is word-bounded so ass won't match class.
  • TikTok uploads require cookies from https://github.com/wkaisertexas/tiktok-uploader — and you will hit captchas. A paid solver like sadcaptcha can fix it; this repo doesn't include one.
  • Headless selenium: cookies must already exist or the upload will crash. Run login.py on a machine with a GUI first.
  • Models download lazily on first transcription. Vosk shows a tqdm progress bar; TinyLlama uses huggingface_hub.snapshot_download (resumable).
  • libass fonts: the burn-in filter is invoked with fontsdir=fonts/, so any .ttf you drop in fonts/ is available. Default is Bangers.ttf.
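
The word-bounded matching mentioned for swears.txt can be illustrated with a regular expression that uses `\b` anchors. This is a sketch, not the code in profanity.py: the function names, the case-insensitive matching, and the asterisk replacement are all assumptions.

```python
import re


def compile_swear_pattern(words: list[str]) -> re.Pattern:
    """Build one regex that matches any listed word on word boundaries only."""
    cleaned = [re.escape(w.strip()) for w in words if w.strip()]
    if not cleaned:
        return re.compile(r"(?!x)x")  # a pattern that can never match
    return re.compile(r"\b(?:" + "|".join(cleaned) + r")\b", re.IGNORECASE)


def censor(text: str, pattern: re.Pattern) -> str:
    """Replace each matched word with asterisks of the same length."""
    return pattern.sub(lambda m: "*" * len(m.group()), text)


pattern = compile_swear_pattern(["ass"])
print(censor("that class is a pain in the ass", pattern))
```

In practice the word list would be read from assets/swears.txt, one entry per line.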

Vosk or Whisper

Whisper

Pros:

  • Really accurate
  • Better profanity filter due to accuracy

Cons:

  • Subtitles linger/timing is bad
  • Late/early profanity mute due to timing

Vosk

Pros:

  • Timing is really good
  • Timing of muting profanity very good

Cons:

  • Not very accurate, so words might not get filtered
  • A lot of the words are not accurate

Notes

A possible improvement is adapting the Whisper path to use https://github.com/m-bain/whisperX for better timing.

  • I used Vosk for the example video in the README
  • A Vosk vs Whisper comparison is in the demo_and_images folder
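
Timing matters for muting because the mute window is only as good as the timestamps behind it. The sketch below assumes word-level entries shaped like the ones Vosk can emit when word output is enabled (dicts with "word", "start", and "end" in seconds). The function name and the small padding value are hypothetical, and the project itself describes its muting as driven by SRT timestamps.

```python
def mute_ranges_from_words(words: list[dict], swears: set[str],
                           pad: float = 0.05) -> list[tuple[float, float]]:
    """Return (start, end) windows, slightly padded, for every flagged word."""
    ranges = []
    for entry in words:
        if entry["word"].lower() in swears:
            start = max(0.0, entry["start"] - pad)  # never go below zero
            end = entry["end"] + pad
            ranges.append((start, end))
    return ranges


words = [
    {"word": "hello", "start": 0.0, "end": 0.4},
    {"word": "damn", "start": 0.5, "end": 0.9},
]
print(mute_ranges_from_words(words, {"damn"}))
```

If a transcriber reports a word late or early, the window shifts with it, which is the late/early mute problem listed under the Whisper cons.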

(back to top)

License

Do not sell this program, and do not use it to run a paid cloud service of your own. Other than that, do what you like with it.

Acknowledgments

Thank you to the following projects for making this possible.

TODO

  • Add support for different models (current ones are outdated)
  • Change preview of subtitles, maybe run subtitles through llm to get emojis or custom colors per line
  • Color param for text
  • CV to focus crop around face/person talking

(back to top)
