Intelligent sports highlight generation with agentic AI
ArenaVision is an AI-powered system that automatically analyzes sports videos, detects key moments, creates highlight reels, and generates commentary using Google Cloud's advanced AI services.
Watch the full demonstration: ArenaVision Demo
- Multi-Input Support: YouTube URLs, file uploads, and live streams
- Agentic AI Architecture: Specialized AI agents for each processing stage
- Dual Vision Analysis: Combines Google Video Intelligence API + Gemini Vision
- Intelligent Highlight Detection: Automatically ranks and selects best moments
- Smooth Video Editing: Creates highlight reels with fade transitions
- AI Commentary: Generates text and audio commentary for highlights
- Interactive Editing: Chatbot-powered iterative video refinement
- Logo & Intro Generation: Creates custom logos (Imagen 3) and intro videos (Veo 3.1)
- Social Media Integration: Direct posting to X (Twitter)
Primary screen showing video input options and processing controls
View and download individual highlight segments with descriptions and timestamps
Iterative editing with chatbot assistance: remove segments, trim clips, and refine highlights
Generate custom logos using Imagen 3 and create intro videos with Veo 3.1
Review final highlight reel, download, and post directly to X (Twitter)
ArenaVision uses a multi-agent pipeline architecture where specialized agents process video through sequential stages:
┌─────────────┐
│ Input Agent │ → Handles video input (YouTube/Upload/Live Stream)
└─────────────┘
       ↓
┌─────────────┐
│Vision Agent │ → Analyzes video content (detects plays, events)
└─────────────┘
       ↓
┌─────────────┐
│Planner Agent│ → Ranks moments, creates highlight segments
└─────────────┘
       ↓
┌─────────────┐
│Editor Agent │ → Extracts & compiles highlight reel
└─────────────┘
       ↓
┌──────────────┐
│ Commentator  │ → Generates text & audio commentary
│    Agent     │
└──────────────┘
- Modularity: Each agent has a single responsibility
- Extensibility: Easy to add new agents or modify existing ones
- Error Handling: Each agent handles errors gracefully with fallbacks
- State Management: Pipeline orchestrator manages data flow between agents
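The four properties above can be illustrated with a minimal sketch. It is not the project's actual `base_agent.py` or `pipeline.py`: the names `BaseAgent`, `Pipeline`, `process`, and `fallback`, and the idea of passing a shared state dictionary between stages, are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseAgent(ABC):
    """One pipeline stage with a single responsibility (hypothetical interface)."""

    name = "base"

    @abstractmethod
    def process(self, state: Dict[str, Any]) -> Dict[str, Any]:
        """Return new keys to merge into the shared pipeline state."""

    def fallback(self, state: Dict[str, Any], error: Exception) -> Dict[str, Any]:
        """Default fallback: contribute nothing when the stage fails."""
        return {}


class Pipeline:
    """Runs agents in order and owns the state passed between them."""

    def __init__(self, agents: List[BaseAgent]):
        self.agents = agents

    def run(self, initial: Dict[str, Any]) -> Dict[str, Any]:
        state = dict(initial)
        state.setdefault("errors", [])
        for agent in self.agents:
            try:
                update = agent.process(state)
            except Exception as exc:  # a failing stage should not abort the run
                state["errors"].append(f"{agent.name}: {exc}")
                update = agent.fallback(state, exc)
            state.update(update)
        return state


class EchoInputAgent(BaseAgent):
    """Toy input stage: passes the source through as the video path."""

    name = "input"

    def process(self, state):
        return {"video_path": state["source"]}


class FlakyVisionAgent(BaseAgent):
    """Toy vision stage that always fails, to exercise the fallback path."""

    name = "vision"

    def process(self, state):
        raise RuntimeError("analysis backend unavailable")

    def fallback(self, state, error):
        return {"moments": []}


result = Pipeline([EchoInputAgent(), FlakyVisionAgent()]).run({"source": "game.mp4"})
```

Adding a stage in this design means writing one more subclass and appending it to the list, which is one way to get the extensibility described above.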
- Python 3.9+: Main programming language
- Streamlit: Web UI framework for interactive frontend
- MoviePy: Video editing and manipulation
- yt-dlp: YouTube video downloading
- OpenCV: Video processing utilities
- Video Intelligence API: Detects sports events, shot changes, objects
- Gemini 2.0 Flash: Vision analysis and text generation
- Veo 3.1: Video generation (intro videos)
- Imagen 3 (WHISK): Logo/image generation
- Text-to-Speech (gTTS): Audio commentary generation
- google-cloud-videointelligence: Video analysis
- google-generativeai: Gemini API access
- google-cloud-aiplatform: Vertex AI services
- pydub: Audio processing
- pillow: Image processing
pip install -r requirements.txt
Key Dependencies:
- streamlit>=1.28.0
- google-cloud-videointelligence>=2.17.0
- google-generativeai>=0.3.0
- moviepy>=1.0.3,<2.0
- decorator>=4.4.1,<4.4.2 (compatibility with MoviePy)
- yt-dlp>=2023.10.7
- gtts>=2.4.0
Create a .env file in the project root:
# Google Cloud
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_APPLICATION_CREDENTIALS=./service-account-key.json
GOOGLE_API_KEY=your-api-key
# Twitter (optional, for posting)
TWITTER_API_KEY=...
TWITTER_API_SECRET=...
TWITTER_ACCESS_TOKEN=...
TWITTER_ACCESS_SECRET=...
- Create a service account in Google Cloud Console
- Grant roles:
  - Vertex AI User
  - Video Intelligence API User
- Download JSON key file
- Set GOOGLE_APPLICATION_CREDENTIALS path in .env
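As a rough illustration of how these variables could be validated at startup, here is a minimal sketch. It is not the project's `config.py`: the `Settings` class and `load_settings` function are hypothetical, and it assumes the `.env` file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_API_KEY")
TWITTER = ("TWITTER_API_KEY", "TWITTER_API_SECRET",
           "TWITTER_ACCESS_TOKEN", "TWITTER_ACCESS_SECRET")


@dataclass(frozen=True)
class Settings:
    project: str
    credentials_path: str
    api_key: str
    twitter_enabled: bool  # True only when all four optional Twitter values are set


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ) and fail fast."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        project=env["GOOGLE_CLOUD_PROJECT"],
        credentials_path=env["GOOGLE_APPLICATION_CREDENTIALS"],
        api_key=env["GOOGLE_API_KEY"],
        twitter_enabled=all(env.get(key) for key in TWITTER),
    )
```

Failing on the first run with a list of missing names tends to be easier to debug than a credentials error raised later from inside a Google Cloud client.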
Enable in Google Cloud Console:
- Video Intelligence API
- Vertex AI API
- Generative AI API
python test_keys.py
streamlit run app.py
The app will open at http://localhost:8501
- YouTube Mode: Downloads video using yt-dlp with custom headers
- Upload Mode: Validates and processes uploaded video files
- Live Stream Mode: Connects to RTSP streams for real-time processing
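One way to route a source string to the three modes above is sketched below. The function names, the host list, and the file-extension allow-list are assumptions, not the project's actual handlers. The `format`, `outtmpl`, `http_headers`, and `quiet` keys are standard yt-dlp options, but the header value shown is only a placeholder.

```python
from pathlib import Path
from urllib.parse import urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi", ".mkv", ".webm"}  # assumed allow-list


def classify_source(source: str) -> str:
    """Return 'youtube', 'upload', or 'live_stream' for a source string."""
    parsed = urlparse(source)
    scheme = parsed.scheme.lower()
    if scheme in ("rtsp", "rtsps"):
        return "live_stream"
    if scheme in ("http", "https") and (parsed.hostname or "").lower() in YOUTUBE_HOSTS:
        return "youtube"
    if scheme in ("", "file") and Path(parsed.path).suffix.lower() in VIDEO_SUFFIXES:
        return "upload"
    raise ValueError(f"Unsupported video source: {source!r}")


def ytdlp_options(output_dir: str) -> dict:
    """Build an options dict that could be passed to yt_dlp.YoutubeDL(...)."""
    return {
        "format": "best[ext=mp4]/best",
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        "http_headers": {"User-Agent": "Mozilla/5.0"},  # placeholder header
        "quiet": True,
    }
```

With yt-dlp installed, a download would then look like `yt_dlp.YoutubeDL(ytdlp_options("downloads")).download([url])`. Note that this sketch does not handle Windows drive-letter paths, which `urlparse` reads as a URL scheme.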
- Video Intelligence API: Detects sports events, shot changes, and objects
- Gemini Vision: Analyzes key frames for context, player visibility, and crowd reactions
- Dual Analysis: Combines structured detection with contextual understanding
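The dual analysis step can be pictured as joining two timelines. The sketch below attaches the nearest contextual frame observation to each structured detection when the two fall within a tolerance window. The data shapes (`Detection`, `FrameNote`, `Moment`) and the 2-second tolerance are illustrative assumptions; the real API responses are richer than this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:  # hypothetical shape of a structured detection
    time_s: float
    label: str
    confidence: float


@dataclass
class FrameNote:  # hypothetical shape of a per-frame contextual observation
    time_s: float
    description: str
    crowd_excitement: float  # 0.0 to 1.0


@dataclass
class Moment:
    time_s: float
    label: str
    confidence: float
    description: Optional[str] = None
    crowd_excitement: float = 0.0


def merge_analyses(detections: List[Detection], notes: List[FrameNote],
                   tolerance_s: float = 2.0) -> List[Moment]:
    """Enrich each detection with the closest frame note, if close enough."""
    moments = []
    for det in detections:
        nearest = min(notes, key=lambda n: abs(n.time_s - det.time_s), default=None)
        moment = Moment(det.time_s, det.label, det.confidence)
        if nearest is not None and abs(nearest.time_s - det.time_s) <= tolerance_s:
            moment.description = nearest.description
            moment.crowd_excitement = nearest.crowd_excitement
        moments.append(moment)
    return sorted(moments, key=lambda m: m.time_s)
```

Detections with no nearby frame note are kept with default context rather than dropped, so a gap in one analysis source does not remove a candidate moment.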
- Moment Collection: Gathers all potential highlights from multiple sources
- Intelligent Ranking: Scores moments based on:
  - Success (made vs missed shots)
  - Crowd reaction (excitement level)
  - Timing (endings prioritized)
  - Action detection
- Segment Creation: Creates video segments with proper pre/post buffers
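The ranking and segment steps above might look roughly like this. The weights, the linear timing bonus, and the 4-second pre-buffer and 3-second post-buffer are all assumed values chosen for illustration; the source does not state the actual scoring formula.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed weights for the four ranking criteria; they sum to 1.0.
WEIGHTS = {"success": 0.4, "crowd": 0.3, "timing": 0.2, "action": 0.1}


@dataclass
class Candidate:
    time_s: float
    made_shot: bool
    crowd_excitement: float  # 0.0 to 1.0
    action_score: float  # 0.0 to 1.0


def score(c: Candidate, video_duration_s: float) -> float:
    """Weighted sum of the criteria; later moments get a larger timing bonus."""
    timing = c.time_s / video_duration_s
    return (WEIGHTS["success"] * (1.0 if c.made_shot else 0.0)
            + WEIGHTS["crowd"] * c.crowd_excitement
            + WEIGHTS["timing"] * timing
            + WEIGHTS["action"] * c.action_score)


def plan_segments(candidates: List[Candidate], video_duration_s: float,
                  top_k: int = 3, pre_s: float = 4.0,
                  post_s: float = 3.0) -> List[Tuple[float, float]]:
    """Pick the top_k moments, pad them with buffers, and merge overlaps."""
    best = sorted(candidates, key=lambda c: score(c, video_duration_s),
                  reverse=True)[:top_k]
    windows = sorted((max(0.0, c.time_s - pre_s),
                      min(video_duration_s, c.time_s + post_s)) for c in best)
    merged: List[Tuple[float, float]] = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlapping windows become one segment so footage is not repeated.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Clamping to the video bounds and merging overlapping windows keeps two plays that happen seconds apart from producing duplicated footage in the reel.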
- Segment Extraction: Extracts individual highlight clips
- Reel Compilation: Combines segments with smooth fade transitions
- Quality Optimization: Ensures proper timing and flow
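A minimal sketch of extraction and compilation with MoviePy 1.x (the version range pinned in the dependencies) follows. The function names and the 0.5-second default fade are assumptions, and this is not the project's `video_editor.py`. MoviePy is imported inside `build_reel` so that the small helper can be used without it.

```python
from typing import List, Tuple


def safe_fade(clip_duration_s: float, requested_fade_s: float = 0.5) -> float:
    """Cap the fade so fade-in plus fade-out never exceeds the clip length."""
    return max(0.0, min(requested_fade_s, clip_duration_s / 2))


def build_reel(source_path: str, segments: List[Tuple[float, float]],
               output_path: str, fade_s: float = 0.5) -> None:
    """Cut each (start, end) segment, fade it in and out, and write one reel."""
    # Lazy import: requires MoviePy 1.x (moviepy.editor was removed in 2.0).
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    source = VideoFileClip(source_path)
    try:
        clips = []
        for start, end in segments:
            fade = safe_fade(end - start, fade_s)
            clips.append(source.subclip(start, end).fadein(fade).fadeout(fade))
        reel = concatenate_videoclips(clips, method="compose")
        reel.write_videofile(output_path, codec="libx264", audio_codec="aac")
    finally:
        source.close()
```

This fades each clip to and from black rather than crossfading between clips, so the reel length is simply the sum of the segment lengths.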
- Text Generation: Creates exciting commentary using Gemini AI
- Audio Synthesis: Converts to speech using Google Text-to-Speech
- Synchronization: Matches commentary to video timestamps
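Synchronization can be sketched as computing where each segment starts inside the finished reel and scheduling one commentary line there. The `Cue` type and function names are hypothetical, and the sketch assumes segments are joined back to back with no overlap. The `gTTS(...).save(...)` call is the gTTS library's basic usage and needs network access.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cue:
    reel_start_s: float  # where the line should start within the reel
    max_duration_s: float  # length of the segment it accompanies
    text: str


def schedule_commentary(segments: List[Tuple[float, float]],
                        lines: List[str]) -> List[Cue]:
    """Place one commentary line at the start of each segment in the reel."""
    if len(segments) != len(lines):
        raise ValueError("expected one commentary line per segment")
    cues, offset = [], 0.0
    for (start, end), text in zip(segments, lines):
        cues.append(Cue(offset, end - start, text))
        offset += end - start  # next segment starts where this one ends
    return cues


def synthesize(cue: Cue, output_path: str) -> None:
    """Write the cue's text to an MP3 file using gTTS."""
    from gtts import gTTS  # lazy import; only needed when audio is generated

    gTTS(text=cue.text, lang="en").save(output_path)
```

Carrying `max_duration_s` on each cue gives a later step what it needs to detect commentary that would run past the end of its clip.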
- Purple Gradient Design: Professional sports aesthetic
- Oswald & Montserrat Fonts: Modern, bold typography
- Animated Backgrounds: Subtle particle effects
- Smooth Transitions: Polished user experience
- Click-Anywhere Navigation: Landing page supports full-screen clicking
- Iterative Editing: Chatbot-powered video refinement
- Real-time Progress: Visual progress bars during processing
- Responsive Design: Works on various screen sizes
hack/
βββ agents/ # Agent implementations
β βββ base_agent.py # Abstract base class
β βββ input_agent.py # Video input handling
β βββ vision_agent.py # Video analysis
β βββ planner_agent.py # Highlight planning
β βββ editor_agent.py # Video editing
β βββ commentator_agent.py # Commentary generation
β βββ chatbot_agent.py # Interactive editing
βββ handlers/ # Input handlers
β βββ youtube_handler.py
β βββ live_stream_handler.py
βββ utils/ # Utility functions
β βββ video_utils.py
β βββ video_editor.py
β βββ image_generator.py
β βββ veo_generator.py
βββ app.py # Streamlit frontend
βββ pipeline.py # Pipeline orchestrator
βββ config.py # Configuration management
βββ requirements.txt # Python dependencies
βββ .env # Environment variables (not in repo)
You need 3 things:
- GOOGLE_API_KEY - Get from Google AI Studio
- GOOGLE_CLOUD_PROJECT - Your project ID from Google Cloud Console
- GOOGLE_APPLICATION_CREDENTIALS - Service account JSON file
Detailed instructions: See API_KEYS_GUIDE.md for complete setup guide.
Set up your Google Cloud project and enable:
- Video Intelligence API
- Vertex AI
- Generative AI APIs (Gemini, Veo, Imagen)
- Skips Video Intelligence API for faster processing
- Uses only Gemini Vision analysis
- Ideal for short videos or quick demos
- Chatbot-powered video refinement
- Natural language commands:
  - "Remove the second segment"
  - "Trim 5 seconds from the end"
  - "Reorder clips by excitement"
- Maintains edit history for undo/redo
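The undo/redo behavior could be built on two stacks of segment lists, as sketched below. `EditHistory`, `remove_segment`, and `trim_end` are hypothetical names. Translating a natural-language command into one of these operations is the chatbot's job and is not shown here.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_s, end_s) in the source video


class EditHistory:
    """Tracks the current segment list plus undo and redo stacks."""

    def __init__(self, segments: List[Segment]):
        self._undo: List[List[Segment]] = []
        self._redo: List[List[Segment]] = []
        self.segments = list(segments)

    def apply(self, new_segments: List[Segment]) -> None:
        self._undo.append(self.segments)
        self._redo.clear()  # a new edit invalidates the redo branch
        self.segments = list(new_segments)

    def undo(self) -> None:
        if self._undo:
            self._redo.append(self.segments)
            self.segments = self._undo.pop()

    def redo(self) -> None:
        if self._redo:
            self._undo.append(self.segments)
            self.segments = self._redo.pop()


def remove_segment(segments: List[Segment], index: int) -> List[Segment]:
    """Return a copy without the segment at the given zero-based index."""
    return [s for i, s in enumerate(segments) if i != index]


def trim_end(segments: List[Segment], index: int, seconds: float) -> List[Segment]:
    """Return a copy with `seconds` cut from the end of one segment."""
    start, end = segments[index]
    if end - seconds <= start:
        raise ValueError("trim would remove the whole segment")
    return segments[:index] + [(start, end - seconds)] + segments[index + 1:]


history = EditHistory([(0.0, 5.0), (20.0, 30.0), (111.0, 120.0)])
history.apply(remove_segment(history.segments, 1))  # "Remove the second segment"
history.apply(trim_end(history.segments, 1, 5.0))   # "Trim 5 seconds from the end"
```

Each operation returns a new list instead of mutating in place, which is what lets the history store earlier versions safely.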
- Logo Generation: Uses Imagen 3 (WHISK) to create custom logos
- Intro Videos: Uses Veo 3.1 to generate 5-second intro videos
- Customization: Text overlays and background descriptions
- Direct posting to X (Twitter)
- Automatic video upload and caption
- OAuth authentication
- MoviePy Import Error
  - Solution: Install decorator==4.4.1 (compatibility fix)
- Video Intelligence API Slow
  - Solution: Use Fast Mode for quicker processing
- Gemini Rate Limits
  - Solution: Switch to gemini-1.5-flash for higher quotas
- Python 3.9 Compatibility
  - Solution: Uses compat_fix.py shim for Google Cloud libraries
- TECHNICAL_README.md: Comprehensive technical documentation
- AGENTS.md: Detailed agent documentation
- API_KEYS_GUIDE.md: Step-by-step API setup
- PIPELINE_FLOW.md: Visual pipeline flow diagrams
- ARCHITECTURE.md: System architecture details
5-Minute Demo Structure:
- Landing Page (10s): Animated welcome screen
- Input (30s): Paste YouTube URL or upload video
- Processing (60s): Show progress bar, explain agent pipeline
- Results (90s): Display highlight reel, segments, commentary
- Editor (60s): Show iterative editing with chatbot
- Logo/Intro (30s): Generate logo and intro video
- Final (30s): Download and post to X
- Service account keys stored in .env (not committed to repo)
- API keys loaded from environment variables
- No hardcoded credentials
- .gitignore excludes sensitive files
This project is part of a hackathon submission.
This is a hackathon project. For questions or issues, please refer to the documentation files.
Built with ❤️ using Google Cloud AI Services
For detailed technical information, see TECHNICAL_README.md