🎬 Autonomous AI Video Production Pipeline
Features • Installation Guide • Pipeline • Dashboard • Cost Optimization • Scheduling
🤖 Automate AI-powered stickman animation production, from script to YouTube
| Feature | Description |
|---|---|
| 🤖 AI Scriptwriting | Gemini 2.5 Flash generates viral hooks + scene-by-scene storyboard |
| 🎨 AI Image Generation | Imagen 3.0 creates consistent stickman characters across scenes |
| 🎬 Slideshow Mode | Ken Burns zoom effect — $0 cost for video generation |
| 🎥 Veo Mode | Google Veo 2.0 AI video clips (premium, allow-listed) |
| 🗣️ Free TTS | edge-tts — neural voiceovers 100% free, no API key |
| 🎵 Background Music | Loop + duck BGM from assets/bgm.mp3 |
| 📝 Subtitles | Lower-thirds with drop-shadow via MoviePy |
| 📊 Streamlit Dashboard | Full GUI: generate, preview, download, publish |
| 📅 Autonomous Scheduler | 30-day monthly plan with randomised 5–7h intervals |
| 🚀 YouTube Upload | OAuth 2.0 → auto-publish as public |
| 💰 Cost-Efficient | Slideshow + edge-tts add $0; the only API cost is Gemini + Imagen (~$0.02 per image) |
| 🔄 Local Caching | Re-runs skip all completed phases — 0s on cache hit |
| Requirement | Minimum | Recommended |
|---|---|---|
| OS | Windows 10, macOS 12+, Linux (Ubuntu 20.04+) | Any 64-bit OS |
| Python | 3.10 | 3.11+ |
| RAM | 4 GB | 8 GB+ |
| Disk Space | 500 MB (project) + 2 GB (cached videos) | 10 GB free |
| Internet | Broadband (for API calls) | 10+ Mbps |
| FFmpeg | v4.0 | v6.0+ |
Windows
- Go to python.org/downloads
- Download Python 3.11 or 3.12
- IMPORTANT: Check ✅ "Add Python to PATH" during installation
- Click Install Now
- Verify:
python --version
pip --version
macOS
# Using Homebrew (recommended)
brew install python@3.11
# Verify
python3 --version
pip3 --version
Linux (Ubuntu/Debian)
sudo apt update
sudo apt install python3 python3-pip python3-venv -y
python3 --version
FFmpeg is required for video assembly and the slideshow effect.
Windows
- Download from gyan.dev/ffmpeg/builds → ffmpeg-release-full.7z
- Extract to `C:\ffmpeg`
- Add to PATH:
- Search → "Environment Variables"
- Under System Variables → Path → Edit
- Add: `C:\ffmpeg\bin`
- Click OK in all open windows
- Verify:
ffmpeg -version
Alternative: install via winget:
winget install "FFmpeg (Essentials Build)"
macOS
brew install ffmpeg
ffmpeg -version
Linux
sudo apt install ffmpeg -y
ffmpeg -version
git clone https://github.com/saiedpod-bot/Stickman-Studio.git
cd Stickman-Studio
# Create virtual environment
python -m venv .venv
# Activate it:
# Windows:
.venv\Scripts\activate
# macOS / Linux:
source .venv/bin/activate
# Upgrade pip
pip install --upgrade pip
# Install all dependencies
pip install -r requirements.txt
pip install edge-tts # Free local TTS (neural voices)
You need a Google Cloud account to use Gemini (AI scripts) and Imagen (AI images). The $300 free-trial credit is valid for 90 days, which is enough to produce thousands of videos.
- Go to cloud.google.com/free
- Click "Get started for free"
- Sign in with your Google account (Gmail)
- Fill in:
- Country
- Name & address (billing info — your card will NOT be charged, used only for verification)
- Credit/Debit card (Google does a temporary $1 hold and refunds it)
- ✅ You now have $300 in free credits + 90-day trial
⚠️ No charges without your consent. The free tier also includes many always-free products. You can set budgets and alerts in the console.
- Go to console.cloud.google.com
- At the top bar, click the project dropdown → New Project
- Enter a project name (e.g. `stickman-studio`)
- Note the Project ID (e.g. `stickman-studio-123456`); you'll need it later
- Click Create
Enable these APIs for your project:
| API | Purpose | Link |
|---|---|---|
| Vertex AI API | Gemini (scripts) + Imagen (images) | Enable |
| Cloud Storage | Store generated assets (optional) | Enable |
| YouTube Data API v3 | Upload videos (optional) | Enable |
To enable each:
- Click the Enable link above
- Make sure your project is selected (top bar)
- Click Enable
This is how the project authenticates with Google Cloud:
- Go to Service Accounts
- Click + Create Service Account
- Name: `stickman-studio-sa`
- Click Create and Continue
- Under Grant access → Add roles:
  - Vertex AI User (`roles/aiplatform.user`)
  - Storage Object Admin (`roles/storage.objectAdmin`)
- Click Done
- In the service accounts list, click on the email of your new account
- Go to Keys tab → Add Key → Create New Key
- Choose JSON → Create
- A `.json` file will download automatically. Keep it safe!
- Rename it if you like (e.g. `stickman-studio-key.json`)
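Before wiring the downloaded key into the project, it can help to confirm that the file really is a service account key. The following is a minimal sketch using only the standard library; `check_service_account_key` is a hypothetical helper, not part of this repository, and it only checks for a handful of fields that Google-issued service account key files normally contain.

```python
import json
from pathlib import Path

# Fields a service account key file is expected to carry (a subset, by assumption).
REQUIRED_FIELDS = ("type", "project_id", "client_email", "private_key")


def check_service_account_key(path):
    """Return a list of human-readable problems; an empty list means the file looks fine."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"file not found: {key_path}"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    key_type = data.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected key type: {key_type}")
    return problems


if __name__ == "__main__":
    for problem in check_service_account_key("stickman-studio-key.json"):
        print("WARNING:", problem)
```

An empty result does not prove the key is authorised for Vertex AI; it only rules out the most common mix-ups, such as downloading an OAuth client file instead of a service account key.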
- Move the service account JSON key to the project root folder (`Stickman-Studio/`)
- Copy the example env file:
cp .env.example .env
- Open `.env` in any text editor and fill in:
# Your GCP project ID (from Step 7)
GCP_PROJECT_ID=stickman-studio-123456
# Path to the service account key (from Step 9)
GOOGLE_APPLICATION_CREDENTIALS=stickman-studio-key.json
# GCP region (keep default)
GCP_LOCATION=us-central1
- Save the file
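To illustrate what the three settings above amount to, here is a small standard-library sketch that parses `.env`-style text and reports which required keys are still empty. It is not the project's `config.py`; `parse_env` and `missing_settings` are hypothetical names, and the parser deliberately handles only full-line comments and simple `KEY=value` pairs.

```python
from pathlib import Path

# The three keys named in the .env instructions above.
REQUIRED_KEYS = ("GCP_PROJECT_ID", "GOOGLE_APPLICATION_CREDENTIALS", "GCP_LOCATION")


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and lines starting with '#'."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values):
    """Return the required keys that are absent or empty."""
    return [name for name in REQUIRED_KEYS if not values.get(name)]


if __name__ == "__main__":
    env_file = Path(".env")
    settings = parse_env(env_file.read_text(encoding="utf-8")) if env_file.is_file() else {}
    print("Missing settings:", missing_settings(settings) or "none")
```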
Why install gcloud CLI?
The gcloud CLI is helpful for:
- Debugging authentication issues
- Managing Google Cloud resources from the terminal
- Setting up Application Default Credentials (if not using a service account)
Not required for basic usage — the service account key is sufficient.
Windows
# Download installer
curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/GoogleCloudSDKInstaller.exe
# Run the installer (follow GUI prompts)
# After installation, authenticate:
gcloud auth application-default login
macOS / Linux
# Install
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
# Authenticate
gcloud auth application-default login
# Make sure virtual environment is activated
# Windows: .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
# Generate a video with default slideshow mode:
python orchestrator.py "How black holes work" --video-mode slideshow
# Generate with Veo animation (premium, costs credits):
python orchestrator.py "Why the sky is blue" --video-mode animation
What to expect:
[1/4] ✍️ Script → Gemini generates storyboard (5-10 sec)
[2/4] 🎨 Images → Imagen generates scene images (30-60 sec)
[3/4] 🎬 Video → ffmpeg Ken Burns zoom (10-20 sec)
[3.5] 🗣️ TTS → edge-tts generates narration (10-30 sec)
[4/4] 🎞️ Assembly → ffmpeg combines everything (10-20 sec)
✅ Video saved to: projects/how_black_holes_work/final.mp4
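The output path above suggests that the project folder name is derived from the topic string. One plausible way to do that is sketched below; `topic_to_slug` and `final_video_path` are hypothetical helpers, and the real orchestrator may normalise topics differently.

```python
import re
from pathlib import Path


def topic_to_slug(topic):
    """Lower-case the topic and collapse every run of non-alphanumerics into '_'."""
    slug = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    return slug or "untitled"


def final_video_path(topic, root="projects"):
    """Build the expected location of the assembled video for a topic."""
    return Path(root) / topic_to_slug(topic) / "final.mp4"


if __name__ == "__main__":
    print(final_video_path("How black holes work").as_posix())
```

Under this scheme, topics that differ only in punctuation or capitalisation map to the same folder, which is what lets a re-run find its earlier outputs.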
streamlit run app.py
Opens in your browser at http://localhost:8501
To enable automatic publishing to YouTube:
- Go to Google Cloud Console → Credentials
- Click + Create Credentials → OAuth Client ID
- Application type: Desktop app
- Name: `Stickman Studio YouTube Uploader`
- Click Create
- Click Download JSON and rename it to `client_secrets.json`
- Move `client_secrets.json` to the project root folder
- First upload will open your browser for OAuth consent:
- Sign in with your YouTube channel's Google account
- Click Advanced → Go to App (unsafe) → Allow
- Token is cached in `youtube_token.json` for future runs
- ✅ Done! All future uploads will be automatic.
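Once OAuth is set up, an upload boils down to sending video metadata plus the file to the YouTube Data API v3. The sketch below builds only the metadata part, as a plain dict; `build_upload_body` is a hypothetical helper and is not taken from `uploader.py`. The 100-character title cap reflects YouTube's documented title limit.

```python
ALLOWED_PRIVACY = {"public", "unlisted", "private"}
MAX_TITLE_LENGTH = 100  # YouTube rejects longer titles


def build_upload_body(title, description, tags, privacy="public"):
    """Return the snippet/status metadata for a video upload request."""
    if privacy not in ALLOWED_PRIVACY:
        raise ValueError(f"privacy must be one of {sorted(ALLOWED_PRIVACY)}")
    clean_title = title.strip()[:MAX_TITLE_LENGTH]
    if not clean_title:
        raise ValueError("title must not be empty")
    return {
        "snippet": {
            "title": clean_title,
            "description": description,
            "tags": list(tags),
        },
        "status": {"privacyStatus": privacy},
    }


if __name__ == "__main__":
    print(build_upload_body("How black holes work", "A stickman explainer.", ["science"]))
```

With `google-api-python-client`, a dict of this shape is what would be passed as `body` to `videos().insert(part="snippet,status", ...)` together with the media file.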
Topic Input
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 1: Script (Gemini 2.5 Flash) │
│ → Viral hook + scene-by-scene storyboard.json │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 2: Images (Imagen 3.0) │
│ → Character reference + consistent scene images │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 3: Video │
│ ┌─ slideshow: ffmpeg Ken Burns zoom (FREE) ──────┐│
│ └─ animation: Veo 2.0 image-to-video (premium) ┘│
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 3.5: Narration (edge-tts — FREE) │
│ → Per-scene MP3 with neural voices │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 4: Assembly (ffmpeg — local) │
│ → Concatenate + overlay audio + BGM → final.mp4 │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 5: YouTube Upload (OAuth 2.0) │
│ → Auto-publish as public with title, desc, tags │
└─────────────────────────────────────────────────────┘
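The phase order in the diagram, combined with the local caching mentioned in the feature list, can be sketched as a runner that skips any phase whose output file already exists. This is an illustration under assumptions, not the code in `orchestrator.py`: `run_phase`, `run_pipeline`, and the one-output-file-per-phase layout are all hypothetical.

```python
from pathlib import Path


def run_phase(name, output, produce, log=print):
    """Run one phase unless its output file already exists and is non-empty.

    Returns True if the phase ran, False if the cached output was reused.
    """
    output = Path(output)
    if output.is_file() and output.stat().st_size > 0:
        log(f"[cache] {name}: reusing {output}")
        return False
    output.parent.mkdir(parents=True, exist_ok=True)
    produce(output)  # the callable is expected to write the file at `output`
    log(f"[done]  {name}: wrote {output}")
    return True


def run_pipeline(project_dir, phases, log=print):
    """Run (name, filename, produce) phases in order; return the names that actually ran."""
    project_dir = Path(project_dir)
    ran = []
    for name, filename, produce in phases:
        if run_phase(name, project_dir / filename, produce, log=log):
            ran.append(name)
    return ran
```

A second call with the same project directory returns an empty list, which matches the "re-runs skip completed phases" behaviour. A real implementation would also need to invalidate later phases when an earlier output changes.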
Pre-generated example videos showing the pipeline output (also available on the Releases page):
| Slideshow Mode — "How Gravity Works" (2.3 MB) | Slideshow Mode — "How Magnets Work" (0.6 MB) |
|---|---|
| 3 scenes each — AI script (Gemini) + AI images (Imagen) + Ken Burns zoom (ffmpeg) + neural TTS (edge-tts) + subtitles (MoviePy) | |
All samples are in the samples/ directory. Generate your own with:
python orchestrator.py "Your topic here" --video-mode slideshow
Launch the full GUI:
streamlit run app.py
Tabs:
| Tab | Purpose |
|---|---|
| Studio | Select an idea, generate video, preview, publish |
| Gallery | Browse all previously generated projects |
| Ready to Upload | Videos from batch production, ready for YouTube |
| Schedule | Autonomous mode toggle, random 5–7h intervals |
| Monthly | 30-day production plan with kill switch & health monitor |
- Live logs: real-time pipeline output in `st.status`
- Progress bars: per-phase, per-video tracking
- System Health: 🟢 Green / 🟡 Yellow / 🔴 Red indicator
- Kill Switch: immediately halt all autonomous processes
- Estimated API Usage: remaining Gemini/Imagen calls
- System Logs: expandable panel with last 100 log lines
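A "last 100 log lines" panel needs only a bounded tail of the log. A minimal sketch with `collections.deque` follows; `tail_lines` and `tail_file` are hypothetical helpers, and how the dashboard actually reads its logs is not shown in this README.

```python
from collections import deque


def tail_lines(lines, limit=100):
    """Keep only the last `limit` items of any iterable of lines."""
    return list(deque(lines, maxlen=limit))


def tail_file(path, limit=100):
    """Read a text log and return its last `limit` lines without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=limit)]
```

Because the deque discards old entries as it goes, memory use stays bounded by `limit` even for very large log files.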
| Feature | Cost | Notes |
|---|---|---|
| Slideshow Mode | $0 | --video-mode slideshow — ffmpeg Ken Burns zoom |
| edge-tts | $0 | Free neural TTS, no API key needed |
| Local Caching | $0 | Re-runs skip completed phases entirely |
| Gemini 2.5 Flash | ~$0.0005/call | Script generation |
| Imagen 3.0 | ~$0.02/image | Image generation |
| Veo 2.0 | ~$0.05/clip | Only when using --video-mode animation |
| YouTube Upload | $0 | Free via OAuth 2.0 |
Default mode is slideshow to minimize costs.
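The per-call prices in the table can be combined into a rough per-video estimate. The sketch below assumes one Gemini call per video, one Imagen image per scene (plus optional reference images), and one Veo clip per scene in animation mode; those multipliers are assumptions for illustration, and `estimate_cost` is a hypothetical helper.

```python
# Approximate unit prices in USD, copied from the cost table above.
PRICES = {"gemini_call": 0.0005, "imagen_image": 0.02, "veo_clip": 0.05}


def estimate_cost(scenes, video_mode="slideshow", reference_images=0):
    """Estimate the API cost of one video in USD, rounded to 4 decimals."""
    if video_mode not in ("slideshow", "animation"):
        raise ValueError("video_mode must be 'slideshow' or 'animation'")
    if scenes < 0 or reference_images < 0:
        raise ValueError("counts must be non-negative")
    cost = PRICES["gemini_call"]
    cost += (scenes + reference_images) * PRICES["imagen_image"]
    if video_mode == "animation":
        cost += scenes * PRICES["veo_clip"]  # slideshow video generation is free
    return round(cost, 4)


if __name__ == "__main__":
    for mode in ("slideshow", "animation"):
        print(mode, estimate_cost(3, mode))
```

Under these assumptions a three-scene slideshow video comes to roughly $0.06, almost all of it image generation.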
# Via dashboard: Schedule tab → Toggle On
Randomised intervals avoid YouTube pattern detection.
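A randomised 5–7 hour interval can be drawn with a uniform random delay. This is a sketch of the idea only; `next_run_time` is a hypothetical function and `scheduler.py` may pick its intervals differently.

```python
import random
from datetime import datetime, timedelta


def next_run_time(now, rng=random, min_hours=5.0, max_hours=7.0):
    """Return `now` plus a uniformly random delay between min_hours and max_hours."""
    if min_hours > max_hours:
        raise ValueError("min_hours must not exceed max_hours")
    return now + timedelta(hours=rng.uniform(min_hours, max_hours))


if __name__ == "__main__":
    print("Next run at:", next_run_time(datetime.now()))
```

Passing a seeded `random.Random` instance as `rng` makes the schedule reproducible in tests.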
# Via dashboard: Monthly tab → Start Monthly Plan
- 1–3 videos/day, randomly assigned
- 5-day blocks with rotating publishing windows (08:00, 10:00, 14:00, 16:00, 20:00)
- Persists to `system_state.json`, so the plan survives server reboots
- Kill switch available in the UI
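The monthly plan described above could be generated and persisted roughly as follows. This is a sketch under stated assumptions: the README does not say how windows rotate, so the code assumes each 5-day block shifts the starting window by one, and `build_monthly_plan`, `save_state`, and `load_state` are hypothetical names rather than functions from `scheduler.py`.

```python
import json
import os
import random
from pathlib import Path

WINDOWS = ["08:00", "10:00", "14:00", "16:00", "20:00"]  # from the list above


def build_monthly_plan(days=30, rng=random):
    """Assign 1-3 videos per day, rotating the first window every 5-day block."""
    plan = []
    for day in range(days):
        block = day // 5  # assumed rotation: one step per 5-day block
        count = rng.randint(1, 3)
        slots = sorted(WINDOWS[(block + i) % len(WINDOWS)] for i in range(count))
        plan.append({"day": day + 1, "videos": count, "windows": slots})
    return plan


def save_state(state, path="system_state.json"):
    """Write state to a temp file first, then swap it in, so a crash leaves the old file intact."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)


def load_state(path="system_state.json"):
    """Return the saved state, or None if nothing has been saved yet."""
    path = Path(path)
    if not path.is_file():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Reloading the file on startup is what would let a plan resume after a reboot; the kill switch could be modelled as one more flag in the same state dict.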
# CLI: process all ideas in daily_plan.json
python -c "from scheduler import start_batch_production; start_batch_production()"
# Full autonomous cycle (plan → produce → upload)
python -c "from scheduler import run_autonomous_cycle; run_autonomous_cycle('Science')"
stickman_studio/
├── app.py # Streamlit dashboard
├── orchestrator.py # CLI + importable pipeline runner
├── content_planner.py # Gemini → viral video ideas
├── scheduler.py # Batch + autonomous + monthly scheduler
├── uploader.py # YouTube OAuth 2.0 upload + auto-publish
├── tts_engine.py # edge-tts (free local TTS)
├── storage.py # GCS upload/download
├── ai_engine.py # Module-level wrappers for all phases
├── requirements.txt
├── .env.example → .env # Configuration
├── client_secrets.json # YouTube OAuth (user-provided)
├── samples/ # Pre-generated example videos
├── assets/
│ └── bgm.mp3 # Optional background music
├── stickman_studio/
│ ├── config.py # .env loading + Vertex AI init
│ ├── logging_setup.py # Console + file logging
│ ├── models.py # Scene / StoryBoard dataclasses
│ ├── retry.py # Tenacity retry decorator
│ └── phases/
│ ├── phase1_script.py # Gemini storyboard
│ ├── phase2_images.py # Imagen 3.0 images
│ ├── phase3_video.py # Veo 2.0 video
│ ├── phase3_slideshow.py # Ken Burns zoom
│ ├── phase4_assembly.py # ffmpeg concat + audio
│ └── phase4_subtitles.py # MoviePy subtitles
| Dependency | Version | Purpose |
|---|---|---|
| Python | 3.10+ | Runtime |
| `google-cloud-aiplatform` | ≥1.158 | Vertex AI SDK |
| `google-genai` | ≥2.9 | Veo client |
| `edge-tts` | ≥7.2 | Free TTS |
| `streamlit` | ≥1.28 | Dashboard |
| `moviepy` | ≥2.1 | Subtitles |
| `google-api-python-client` | — | YouTube API |
| `google-auth-oauthlib` | — | YouTube OAuth |
| `ffmpeg` | ≥4.0 | Video assembly |
stickman-studio ai-video-generation google-vertex-ai
gemini imagen veo youtube-automation content-creator
python streamlit edge-tts text-to-video
free-tts ai-animation video-pipeline
Add these to your GitHub repo → Settings → Topics for discoverability.
MIT © 2026 — Free to use, modify, and distribute.
Made with ❤️ and 🤖
Automate your content. Own your audience.