A comprehensive computer vision project that combines person detection, tracking, and emotion recognition using state-of-the-art AI models. Features a modern desktop GUI application with OpenCV visualization and a mobile-first web interface.
- Person Detection & Tracking - YOLOv8n with ByteTrack for multi-person tracking with persistent IDs
- Face Detection - Specialized YOLOv8n-face-lindevs model for accurate face detection within person bounding boxes
- Emotion Recognition - Real-time facial emotion analysis using FER (optimized 1-second intervals)
- Multi-Source Input - Supports webcam and YouTube livestreams (desktop app)
- Track History - Visual trails showing person movement paths with polylines (max 30 points)
- Mobile Web Interface - WebRTC-based web app with server-side AI processing
- ONNX Optimization - Faster inference with optimized ONNX models (automatic fallback to PyTorch)
- Live Statistics - Real-time FPS counter and person count display
- Modern GUI Controls - OpenCV-based desktop interface with keyboard shortcuts
- Glass Morphism UI - Modern overlay design with semi-transparent elements
PersonTracker/
├── README.md                        # Main project documentation
├── engine.py                        # Core tracking engine with PersonTrackerEngine class
├── gui.py                           # Modern OpenCV-based desktop interface
├── ONNX_Conversion.ipynb            # Model optimization notebook
├── requirements.txt                 # Python dependencies (16 packages)
├── models/                          # AI Models storage
│   ├── yolov8n.onnx                 # Optimized person detection (primary)
│   ├── yolov8n-face-lindevs.onnx    # Optimized face detection (primary)
│   └── archive/                     # Original PyTorch models (.pt files)
│       ├── yolov8n.pt               # Person detection (fallback)
│       ├── yolov8n-face-lindevs.pt  # Face detection (fallback)
│       └── yolo11n.pt               # Alternative model
└── WebApp/                          # Web-based mobile interface
    ├── server.py                    # Flask backend (76 lines) with AI processing
    ├── index.html                   # Mobile-optimized frontend with WebRTC
    ├── styles.css                   # Modern gradient UI styling
    ├── script.js                    # Interactive JavaScript for camera
    ├── README.md                    # WebApp documentation (411 lines)
    └── DEPLOYMENT_GUIDE.md          # Production deployment guide
# Clone the repository
git clone <repository-url>
cd PersonTracker
# Install dependencies
pip install -r requirements.txt
python gui.py
# Use webcam (default)
python gui.py --source webcam --webcam_id 0 --conf 0.4
# Use YouTube stream
python gui.py --source youtube --youtube_url "https://youtu.be/su33E1lreMc" --conf 0.4
cd WebApp
python server.py
- Local Testing: http://localhost:8080
- Mobile Access: http://YOUR_IP:8080 (find your IP using ipconfig on Windows)
- Tap "Start Tracking" to begin camera capture
- Allow camera permissions when prompted
- View real-time AI detection with emotion analysis
python gui.py [options]
Options:
--source {webcam,youtube} Source type (default: webcam)
--youtube_url URL YouTube URL (default: provided demo URL)
--webcam_id INT Webcam device ID (default: 0)
--conf FLOAT Confidence threshold (default: 0.4)
- ONNX Models (Preferred): Automatically used if available in the models/ directory
- PyTorch Fallback: Uses .pt files from models/archive/ if ONNX not found
- Image Processing: 1280x720 display resolution, 416x416 for ONNX inference
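The ONNX-first, PyTorch-fallback rule above can be sketched as a small path lookup. This is only an illustration of the lookup order under the directory layout described in this README; `resolve_model_path` is a hypothetical helper, not a function from engine.py.

```python
from pathlib import Path
from typing import Optional


def resolve_model_path(stem: str, models_dir: Path) -> Optional[Path]:
    """Return the ONNX file for `stem` if present, else the archived .pt file, else None."""
    onnx_path = models_dir / f"{stem}.onnx"
    if onnx_path.is_file():
        return onnx_path  # preferred: optimized ONNX model
    pt_path = models_dir / "archive" / f"{stem}.pt"
    if pt_path.is_file():
        return pt_path  # fallback: original PyTorch weights
    return None  # neither file exists; the caller decides how to report it
```

For example, `resolve_model_path("yolov8n", Path("models"))` would return `models/yolov8n.onnx` when that file exists and `models/archive/yolov8n.pt` otherwise.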
- 0.3 - More detections, some false positives
- 0.4 - Balanced (recommended default)
- 0.5 - Higher precision, fewer detections
- 0.6+ - Very strict, minimal false positives
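To make the trade-off in the threshold values above concrete, here is a minimal sketch that filters a list of detections by score. The sample scores are invented for illustration, and `filter_by_confidence` is a hypothetical helper; in the real application the threshold is passed to the detector through `--conf`.

```python
from typing import List, Tuple

Detection = Tuple[str, float]  # (label, confidence score)


def filter_by_confidence(detections: List[Detection], conf: float) -> List[Detection]:
    """Keep only detections whose score is at least `conf`."""
    return [d for d in detections if d[1] >= conf]


# Invented scores, purely to show how the count shrinks as the threshold rises.
sample = [("person", 0.92), ("person", 0.55), ("person", 0.41), ("person", 0.33)]
for threshold in (0.3, 0.4, 0.5, 0.6):
    print(threshold, len(filter_by_confidence(sample, threshold)))
```

With these sample scores, each step up in the threshold drops one more low-confidence detection.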
# Default settings (webcam, confidence 0.4)
python gui.py
# Specific webcam with custom confidence
python gui.py --webcam_id 1 --conf 0.5
# Use default demo YouTube URL
python gui.py --source youtube
# Use custom YouTube URL
python gui.py --source youtube --youtube_url "https://youtu.be/YOUR_VIDEO_ID"
cd WebApp
python server.py
# Access at http://localhost:8080
# Find your IP address
ipconfig # Windows
ifconfig # Mac/Linux
# Access from mobile: http://YOUR_IP:8080
# Example: http://192.168.1.100:8080
import cv2
from engine import PersonTrackerEngine
# ... (see Advanced Usage section for complete example)
- Python 3.8+
- 4GB RAM
- CPU: Intel i5 or AMD Ryzen 5
- Storage: 2GB free space
- Camera: For webcam mode
- 8GB+ RAM
- GPU: NVIDIA GTX 1060 or better (CUDA support)
- CPU: Intel i7 or AMD Ryzen 7
- SSD Storage
ultralytics>=8.0.0 # YOLO models and ByteTrack tracking
opencv-python>=4.5.0 # Computer vision operations and GUI
numpy>=1.21.0 # Numerical computations
yt-dlp>=2023.1.0 # YouTube stream extraction (desktop only)
fer>=22.4.0 # Facial emotion recognition
flask>=2.0.0 # Web framework (WebApp only)
Total Package Count: 16 dependencies (see requirements.txt for complete list)
- Video Input - Captures frames from webcam (OpenCV) or YouTube stream (yt-dlp)
- Person Detection - YOLOv8n detects persons with ByteTrack for ID persistence
- Face Detection - YOLOv8n-face-lindevs finds faces within person bounding boxes
- Emotion Analysis - FER analyzes facial expressions (1-second intervals for performance)
- Visualization - Draws bounding boxes, track history polylines, and emotion labels
- Statistics - Real-time FPS calculation and person count display
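The FPS calculation in the last step can be done in several ways. The sketch below shows one common approach, a rolling average over recent frame timestamps. `FpsCounter` and its 30-frame window are assumptions made for illustration and are not taken from engine.py.

```python
import time
from collections import deque
from typing import Optional


class FpsCounter:
    """Rolling FPS estimate over the most recent `window` frame timestamps."""

    def __init__(self, window: int = 30) -> None:
        self._stamps = deque(maxlen=window)

    def tick(self, now: Optional[float] = None) -> float:
        """Record one frame and return the current FPS estimate (0.0 until two frames exist)."""
        self._stamps.append(time.perf_counter() if now is None else now)
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0
```

Calling `tick()` once per processed frame yields a value that can be drawn on the overlay. Averaging over a window keeps the displayed number from jumping on every frame.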
- Person Detection: YOLOv8n (optimized for speed) - Red bounding boxes
- Face Detection: YOLOv8n-face-lindevs (specialized) - Green bounding boxes
- Emotion Recognition: FER library (7 emotions: angry, disgust, fear, happy, sad, surprise, neutral)
- Tracking: ByteTrack algorithm with unique ID persistence
- Track History: Gray polylines showing movement paths (limited to 30 points)
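A 30-point limit on track history maps naturally onto a bounded deque per track ID. The sketch below assumes the trail stores the centre of each person box, which this README does not specify; `update_trail` and `trails` are hypothetical names.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Point = Tuple[int, int]
MAX_TRAIL_POINTS = 30  # matches the 30-point limit described above

# One bounded trail per track ID; the oldest point drops off automatically.
trails: Dict[int, Deque[Point]] = defaultdict(lambda: deque(maxlen=MAX_TRAIL_POINTS))


def update_trail(track_id: int, box: Tuple[int, int, int, int]) -> List[Point]:
    """Append the box centre (an assumption) for `track_id` and return its current trail."""
    x1, y1, x2, y2 = box
    trails[track_id].append(((x1 + x2) // 2, (y1 + y2) // 2))
    return list(trails[track_id])
```

The returned list of points is in the order a polyline drawing call would need, oldest first.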
- Engine Class: PersonTrackerEngine in engine.py handles all AI processing
- GUI Interface: OpenCV-based viewer with modern overlay design
- Web Interface: Flask server with WebRTC frontend for mobile access
- Automatic Fallback: ONNX → PyTorch model loading with error handling
- [F] - Toggle fullscreen mode (with fallback to window resize)
- [SPACE] - Pause/Resume video processing (preserves last frame)
- [Q] or [ESC] - Quit application gracefully
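The shortcuts above can be handled with a small lookup from key code to action. This is a sketch of one way to do it, not the code in gui.py; the action names are hypothetical, and the key code is assumed to come from a call such as `cv2.waitKey`, which returns -1 when no key was pressed.

```python
from typing import Optional

KEY_ACTIONS = {
    ord("f"): "toggle_fullscreen",
    ord(" "): "toggle_pause",
    ord("q"): "quit",
    27: "quit",  # ESC
}


def action_for_key(key_code: int) -> Optional[str]:
    """Map a raw key code to an action name, or None if the key is unbound."""
    if key_code < 0:
        return None  # no key pressed during the wait
    key = key_code & 0xFF  # keep the low byte; some platforms set higher bits
    if ord("A") <= key <= ord("Z"):
        key += 32  # treat upper-case letters like lower-case
    return KEY_ACTIONS.get(key)
```

Keeping the mapping in one dictionary makes it easy to show the same bindings in the on-screen help overlay.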
- Modern Header Overlay - Semi-transparent dark background with key information
- Title Display - "PERSON TRACKER - AI VISION" with orange glow effect
- Source Indicator - Shows "Webcam" or "YouTube" source type
- Live Statistics - FPS counter (green) and person count display
- Pause Indicator - "|| PAUSED" message when video is paused
- Control Help - Bottom overlay showing available keyboard shortcuts
- Person Boxes - Red rectangles around detected persons with ID numbers
- Face Boxes - Green rectangles around detected faces with "Face" label
- Emotion Labels - Displayed near faces with confidence percentage
- Track History - Gray polylines showing movement paths (max 30 points)
- Glass Morphism UI - Modern semi-transparent overlays with proper alpha blending
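An emotion label with a confidence percentage can be built from per-emotion scores. The sketch below assumes a dictionary mapping emotion names to scores between 0 and 1, which is the general shape FER reports for each face; `top_emotion` and `emotion_label` are hypothetical helpers, and the exact label format used by the application may differ.

```python
from typing import Dict, Tuple


def top_emotion(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (emotion, score) pair with the highest score."""
    name = max(scores, key=scores.get)
    return name, scores[name]


def emotion_label(scores: Dict[str, float]) -> str:
    """Format the dominant emotion as text such as 'happy 87%'."""
    name, score = top_emotion(scores)
    return f"{name} {score * 100:.0f}%"
```

The resulting string is what would be drawn next to the green face box.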
- Start Tracking - Begin WebRTC camera capture and AI processing
- Stop - End tracking session and release camera resources
- Live Dashboard - Real-time FPS, person count, and connection status
- Responsive Design - Adapts to any screen size and orientation
- Touch-Optimized - Large buttons designed for mobile interaction
| Document | Description | Target Audience |
|---|---|---|
| WebApp README | Web interface features, mobile usage | End users, demo presenters |
| Deployment Guide | Production deployment with Cloudflare | DevOps, system admins |
| ONNX Conversion | Model optimization tutorial | ML engineers |
- Models are automatically loaded from the models/ directory if available
- Fallback to PyTorch models in models/archive/ if ONNX files are missing
- No manual configuration required - the engine handles model selection
The application attempts multiple quality levels automatically:
- 1080p → 720p60 → 720p → 480p → 360p → 240p → 144p
- Uses yt-dlp with 30-second timeout per format attempt
- Selects best available MP4 format for OpenCV compatibility
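The quality ladder above amounts to "take the first preferred quality that the stream offers". The sketch below shows only that ordering logic on plain strings. It does not call yt-dlp or apply the 30-second timeout, and `pick_quality` is a hypothetical helper.

```python
from typing import Iterable, Optional

# Preference order taken from the list above, best quality first.
QUALITY_ORDER = ["1080p", "720p60", "720p", "480p", "360p", "240p", "144p"]


def pick_quality(available: Iterable[str]) -> Optional[str]:
    """Return the first entry of QUALITY_ORDER that the stream offers, or None."""
    offered = set(available)
    for quality in QUALITY_ORDER:
        if quality in offered:
            return quality
    return None
```

For a stream that only offers 360p and 720p, `pick_quality(["360p", "720p"])` selects 720p.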
- Use ONNX models for ~2x faster inference than PyTorch
- Adjust confidence threshold based on accuracy vs. speed needs
- Close other applications to free up system resources
- Use SSD storage for faster model loading
- Enable GPU acceleration if CUDA-compatible GPU available
- Low FPS → Check if ONNX models are being used, close other apps
- High CPU usage → Consider raising the confidence threshold so fewer detections reach the face and emotion stages
- Memory issues → Restart application, check available RAM
- Camera issues → Try different webcam_id values (0, 1, 2, etc.)
| Hardware | Resolution | FPS | Person Detection | Face Detection |
|---|---|---|---|---|
| RTX 3070 | 1280x720 | 25-30 | Excellent | Excellent |
| GTX 1060 | 1280x720 | 15-20 | Good | Good |
| Intel i7 (CPU) | 1280x720 | 8-12 | Acceptable | |
| Intel i5 (CPU) | 640x480 | 10-15 | Good | Acceptable |
- Processing Interval: 1 second (optimized for performance)
- Accuracy: ~85% on standard datasets
- Latency: <100ms per face on GPU, <500ms on CPU
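Running emotion analysis at most once per interval is a simple time-based throttle. The sketch below keeps one timestamp per track ID, which is an assumption: the engine may use a single shared timer instead. `EmotionThrottle` is a hypothetical class, not part of engine.py.

```python
import time
from typing import Dict, Optional


class EmotionThrottle:
    """Decide per track ID whether enough time has passed to re-run emotion analysis."""

    def __init__(self, interval: float = 1.0) -> None:
        self.interval = interval  # seconds between analyses for the same ID
        self._last_run: Dict[int, float] = {}

    def should_run(self, track_id: int, now: Optional[float] = None) -> bool:
        """Return True (and record the time) if `track_id` is due for analysis."""
        now = time.monotonic() if now is None else now
        last = self._last_run.get(track_id)
        if last is None or now - last >= self.interval:
            self._last_run[track_id] = now
            return True
        return False
```

Between analyses, the most recent label for each ID can simply be redrawn, so the display stays stable while the expensive model runs only about once per second.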
# Try different camera IDs
python gui.py --webcam_id 1 # or 2, 3, etc.
# Check available cameras on Windows
# Device Manager → Cameras
# Update yt-dlp to latest version
pip install -U yt-dlp
# Try with a different YouTube URL
python gui.py --source youtube --youtube_url "NEW_URL"
Error: No face detection model found!
Solution: Ensure required files exist:
- models/yolov8n.onnx (or models/archive/yolov8n.pt)
- models/yolov8n-face-lindevs.onnx (or models/archive/yolov8n-face-lindevs.pt)
# Check if ONNX models are being used (console output shows model type)
# Run ONNX conversion notebook if needed
# Close other applications to free resources
# Raise the confidence threshold to reduce face and emotion processing: --conf 0.5
# Reinstall all dependencies
pip install -r requirements.txt --force-reinstall
# Check Python version (3.8+ required)
python --version
# Check if port 8080 is available
netstat -an | findstr 8080 # Windows
lsof -i :8080 # Mac/Linux
# Try different port
python server.py # Modify port in server.py if needed
- Console shows model loading status and file paths
- FPS counter indicates performance issues (should be >10 for smooth operation)
- Error messages provide specific failure details
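The port 8080 check from the web application steps above can also be done from Python, which behaves the same on Windows, Mac, and Linux. This is a minimal sketch using only the standard library; `port_is_free` is a hypothetical helper and is not part of server.py.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if `port` can be bound on `host` right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
    return True
```

Running `port_is_free(8080)` before starting the server gives a quick yes/no answer. The result can go stale if another process grabs the port afterwards.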
from engine import PersonTrackerEngine
from ultralytics import YOLO
from fer import FER
# Initialize models
person_model = YOLO('models/yolov8n.onnx')
face_model = YOLO('models/yolov8n-face-lindevs.onnx')
emotion_detector = FER()
# Create engine instance
engine = PersonTrackerEngine(
person_model=person_model,
face_model=face_model,
emotion_detector=emotion_detector,
conf=0.4,
emotion_interval=1.0
)
# Process frame
processed_frame, emotions, last_emotion_time, fps, person_count = engine.process_frame(frame)
- Train custom YOLO models using Ultralytics framework
- Convert to ONNX format using the provided notebook
- Place in models/ directory with appropriate naming
- Application will automatically detect and use new models
The desktop application uses a fallback system for YouTube streams:
- Attempts highest quality first (1080p)
- Falls back through: 720p60 → 720p → 480p → 360p → 240p → 144p
- Automatically selects best available format for real-time processing
The PersonTrackerEngine class can be imported and used in other Python projects:
- Real-time video analysis pipelines
- Security camera systems
- Interactive art installations
- Research applications
- Desktop OpenCV GUI - Modern overlay interface with keyboard controls (Complete)
- Mobile Web Interface - WebRTC-based mobile app with Flask backend (Complete)
- Configuration File - Save/load user preferences and settings
- Recording Feature - Save processed video output to file
- Screenshot Capture - Save current frame with detections
- Enhanced GUI Controls - Settings panel within OpenCV interface
- Multiple Camera Support - Process multiple webcam streams simultaneously
- Data Export - CSV logging of tracking data and emotion history
- Real-time Dashboard - Separate statistics and analytics window
- Model Hot-swapping - Change models without restarting application
- Custom Emotion Models - Train domain-specific emotion recognition
- 3D Pose Estimation - Extended person analysis capabilities
- Age/Gender Detection - Additional demographic analysis
- Behavior Analysis - Activity and gesture recognition
- Multi-target Re-identification - Prevent ID switching and improve tracking
# Clone repository
git clone <repository-url>
cd PersonTracker
# Install all dependencies
pip install -r requirements.txt
# Run desktop application
python gui.py
# Or run mobile web application
cd WebApp && python server.py
# Create virtual environment (recommended)
python -m venv venv
.\venv\Scripts\Activate.ps1
# Install dependencies
pip install -r requirements.txt
# Verify installation
python gui.py --help
# Create isolated environment
python -m venv venv
# Activate environment
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
# Install in development mode
pip install -r requirements.txt
# Optional: Install Jupyter for ONNX conversion
pip install jupyter notebook
- ONNX Models (Recommended): Run ONNX_Conversion.ipynb to generate optimized models
- PyTorch Fallback: Application will download YOLOv8n.pt automatically if needed
- Face Model: Ensure yolov8n-face-lindevs.pt is in the models/archive/ directory
- Professional Display - Full-screen mode with modern overlay design
- Real-time Processing - Live AI detection with minimal latency
- Interactive Controls - Easy keyboard shortcuts for live demonstrations
- Live Statistics - Professional FPS and detection count display
- Reliable Performance - ONNX optimization for consistent frame rates
- Zero Installation - Works instantly on any mobile device
- Modern UI Design - Professional gradient interface with smooth animations
- Live Dashboard - Real-time statistics and connection status
- Network Access - Easy sharing via IP address for multiple devices
- Touch-Optimized - Large buttons designed for public interaction
Ready for public deployment with Cloudflare Tunnel:
# Quick public deployment (free tier)
cloudflared tunnel --url http://localhost:8080
For permanent deployment with custom domains, see DEPLOYMENT_GUIDE.md
- Use ONNX models for best performance
- Set confidence to 0.4 for balanced detection
- Test camera lighting before exhibition
- Provide clear instructions for mobile users
- Monitor server performance during high traffic
Ready to get started? Try the desktop interface: python gui.py
Want the mobile experience? Check out the WebApp README!
We welcome contributions! Please feel free to submit issues, feature requests, or pull requests.
- Follow existing code style and structure
- Test on both desktop and mobile interfaces
- Update documentation for new features
- Ensure ONNX and PyTorch model compatibility
This project is licensed under the MIT License - see the LICENSE file for details.
- Ultralytics for the excellent YOLOv8 implementation and ByteTrack integration
- FER Library for robust emotion recognition capabilities
- OpenCV for comprehensive computer vision operations and GUI framework
- yt-dlp for reliable YouTube stream extraction
- Flask for lightweight web framework enabling mobile interface
- The open-source AI community for inspiration and collaborative development
Made with ❤️ for the AI and Computer Vision community