🎯 AI Person Tracker with Emotion Detection

A computer vision project that combines person detection, multi-person tracking, and facial emotion recognition. It includes a desktop GUI application with OpenCV visualization and a mobile-first web interface.

✨ Key Features

🤖 Person Detection & Tracking - YOLOv8n with ByteTrack for multi-person tracking with persistent IDs
👤 Face Detection - Specialized YOLOv8n-face-lindevs model for accurate face detection within person bounding boxes
😊 Emotion Recognition - Real-time facial emotion analysis using FER (run at 1-second intervals to keep the frame rate up)
📺 Multi-Source Input - Supports webcam and YouTube livestreams (desktop app)
🛤️ Track History - Visual trails showing person movement paths with polylines (max 30 points)
📱 Mobile Web Interface - WebRTC-based web app with server-side AI processing
⚡ ONNX Optimization - Faster inference with optimized ONNX models (automatic fallback to PyTorch)
📊 Live Statistics - Real-time FPS counter and person count display
🎮 Modern GUI Controls - OpenCV-based desktop interface with keyboard shortcuts
🎨 Glass Morphism UI - Modern overlay design with semi-transparent elements

🏗️ Project Architecture

📂 PersonTracker/
├── 📄 README.md                     # Main project documentation
├── ⚙️ engine.py                    # Core tracking engine with PersonTrackerEngine class
├── 🖥️ gui.py                       # Modern OpenCV-based desktop interface
├── 📊 ONNX_Conversion.ipynb         # Model optimization notebook
├── 📋 requirements.txt              # Python dependencies (16 packages)
├── 📂 models/                       # AI Models storage
│   ├── 🤖 yolov8n.onnx             # Optimized person detection (primary)
│   ├── 👤 yolov8n-face-lindevs.onnx # Optimized face detection (primary)
│   └── 📂 archive/                  # Original PyTorch models (.pt files)
│       ├── yolov8n.pt              # Person detection (fallback)
│       ├── yolov8n-face-lindevs.pt # Face detection (fallback)
│       └── yolo11n.pt              # Alternative model
└── 📂 WebApp/                       # Web-based mobile interface
    ├── 🌐 server.py                 # Flask backend (76 lines) with AI processing
    ├── 📱 index.html                # Mobile-optimized frontend with WebRTC
    ├── 🎨 styles.css                # Modern gradient UI styling
    ├── ⚡ script.js                 # Interactive JavaScript for camera
    ├── 📄 README.md                 # WebApp documentation (411 lines)
    └── 🚀 DEPLOYMENT_GUIDE.md       # Production deployment guide

🚀 Quick Start Guide

🖥️ Desktop Application (Primary Interface)

1. Setup Environment

# Clone the repository
git clone <repository-url>
cd PersonTracker

# Install dependencies
pip install -r requirements.txt

2. Run Desktop GUI Application (Recommended)

python gui.py

3. Optional: Specify Source and Parameters

# Use webcam (default)
python gui.py --source webcam --webcam_id 0 --conf 0.4

# Use YouTube stream
python gui.py --source youtube --youtube_url "https://youtu.be/su33E1lreMc" --conf 0.4

📱 Mobile Web Application

1. Start the Flask Server

cd WebApp
python server.py

2. Access on Mobile Device

Local Testing: http://localhost:8080
Mobile Access: http://YOUR_IP:8080 (find your IP with ipconfig on Windows or ifconfig on Mac/Linux)

3. Use the Interface

Tap "🚀 Start Tracking" to begin camera capture
Allow camera permissions when prompted
View real-time AI detection with emotion analysis

⚙️ Configuration & Usage

Desktop Application Arguments:

python gui.py [options]

Options:
  --source {webcam,youtube}     Source type (default: webcam)
  --youtube_url URL            YouTube URL (default: provided demo URL)
  --webcam_id INT              Webcam device ID (default: 0)
  --conf FLOAT                 Confidence threshold (default: 0.4)
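
As a rough illustration of how the options above map onto Python's standard argparse module, here is a minimal sketch. It is not taken from gui.py: the function name build_parser is hypothetical, and the default YouTube URL is assumed to be the demo link from the Quick Start example.

```python
import argparse

# Assumption: the default demo URL is the one shown in the Quick Start example.
DEMO_URL = "https://youtu.be/su33E1lreMc"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(description="Person tracker desktop GUI")
    parser.add_argument("--source", choices=["webcam", "youtube"],
                        default="webcam", help="Source type")
    parser.add_argument("--youtube_url", default=DEMO_URL, help="YouTube URL")
    parser.add_argument("--webcam_id", type=int, default=0,
                        help="Webcam device ID")
    parser.add_argument("--conf", type=float, default=0.4,
                        help="Confidence threshold")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--source", "youtube", "--conf", "0.5"])
print(args.source, args.webcam_id, args.conf)  # prints: youtube 0 0.5
```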

Model Selection & Performance:

ONNX Models (Preferred): Automatically used if available in models/ directory
PyTorch Fallback: Uses .pt files from models/archive/ if ONNX not found
Image Processing: 1280x720 display resolution, 416x416 for ONNX inference
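
The ONNX-first lookup described above can be sketched with pathlib alone. This is an illustration of the idea, not the engine's actual code: resolve_model_path is a hypothetical helper, and only the directory layout is taken from the project tree.

```python
from pathlib import Path


def resolve_model_path(stem: str, models_dir: Path = Path("models")) -> Path:
    """Return the ONNX file for `stem` if present, else the archived .pt file."""
    onnx_path = models_dir / f"{stem}.onnx"
    if onnx_path.is_file():
        return onnx_path

    pt_path = models_dir / "archive" / f"{stem}.pt"
    if pt_path.is_file():
        return pt_path

    raise FileNotFoundError(
        f"No model found for '{stem}': looked for {onnx_path} and {pt_path}"
    )
```

Calling resolve_model_path("yolov8n") from the project root would return models/yolov8n.onnx when that file exists and models/archive/yolov8n.pt otherwise.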

Confidence Threshold Guide:

0.3 - More detections, some false positives
0.4 - Balanced (recommended default)
0.5 - Higher precision, fewer detections
0.6+ - Very strict, minimal false positives
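
To make the trade-off concrete, the sketch below filters a handful of made-up detections at each threshold from the guide. The scores are invented for illustration and filter_by_confidence is a hypothetical helper; in the application the threshold is applied by the detector itself.

```python
from typing import List, Tuple

Detection = Tuple[str, float]  # (label, confidence score)


def filter_by_confidence(detections: List[Detection], conf: float) -> List[Detection]:
    """Keep only detections whose score is at least `conf`."""
    return [det for det in detections if det[1] >= conf]


# Invented scores, chosen so each threshold drops one more detection.
sample = [("person", 0.92), ("person", 0.55), ("person", 0.42), ("person", 0.33)]

for threshold in (0.3, 0.4, 0.5, 0.6):
    kept = filter_by_confidence(sample, threshold)
    print(f"conf={threshold}: {len(kept)} detections kept")
```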

🎮 Usage Examples

Desktop Application Examples:

Basic Webcam Usage:

# Default settings (webcam, confidence 0.4)
python gui.py

# Specific webcam with custom confidence
python gui.py --webcam_id 1 --conf 0.5

YouTube Stream Processing:

# Use default demo YouTube URL
python gui.py --source youtube

# Use custom YouTube URL
python gui.py --source youtube --youtube_url "https://youtu.be/YOUR_VIDEO_ID"

Web Application Examples:

Local Development:

cd WebApp
python server.py
# Access at http://localhost:8080

Network Access Setup:

# Find your IP address
ipconfig  # Windows
ifconfig  # Mac/Linux

# Access from mobile: http://YOUR_IP:8080
# Example: http://192.168.1.100:8080

Integration Examples:

Using the Engine Directly:

import cv2
from engine import PersonTrackerEngine
# ... (see Advanced Usage section for complete example)

🛠️ System Requirements

Minimum Requirements:

Python 3.8+
4GB RAM
CPU: Intel i5 or AMD Ryzen 5
Storage: 2GB free space
Camera: For webcam mode

Recommended for Optimal Performance:

8GB+ RAM
GPU: NVIDIA GTX 1060 or better (CUDA support)
CPU: Intel i7 or AMD Ryzen 7
SSD Storage

Core Dependencies:

ultralytics>=8.0.0    # YOLO models and ByteTrack tracking
opencv-python>=4.5.0  # Computer vision operations and GUI
numpy>=1.21.0         # Numerical computations
yt-dlp>=2023.1.0      # YouTube stream extraction (desktop only)
fer>=22.4.0           # Facial emotion recognition
flask>=2.0.0          # Web framework (WebApp only)

Total Package Count: 16 dependencies (see requirements.txt for complete list)
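
A quick way to see which of the core dependencies are missing from the current environment is to query the installed distributions through importlib.metadata (available from Python 3.8). The helper name missing_packages is hypothetical, and the sketch only checks presence, not the minimum versions listed above.

```python
from importlib import metadata
from typing import Iterable, List

# Distribution names as listed under Core Dependencies.
CORE_PACKAGES = ["ultralytics", "opencv-python", "numpy", "yt-dlp", "fer", "flask"]


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


print("Missing:", missing_packages(CORE_PACKAGES) or "none")
```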

🧠 How It Works

Core Processing Pipeline:

📹 Video Input - Captures frames from webcam (OpenCV) or YouTube stream (yt-dlp)
🔍 Person Detection - YOLOv8n detects persons with ByteTrack for ID persistence
👤 Face Detection - YOLOv8n-face-lindevs finds faces within person bounding boxes
😊 Emotion Analysis - FER analyzes facial expressions (1-second intervals for performance)
🎨 Visualization - Draws bounding boxes, track history polylines, and emotion labels
📊 Statistics - Real-time FPS calculation and person count display
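
The 1-second emotion interval in the pipeline can be pictured as a small time gate: detection and tracking run on every frame, while the expensive emotion step runs only when the gate opens. The class below is a sketch of that idea with a hypothetical name (IntervalGate); it is not the engine's implementation.

```python
import time
from typing import Optional


class IntervalGate:
    """Lets an expensive step run at most once per `interval` seconds."""

    def __init__(self, interval: float = 1.0) -> None:
        self.interval = interval
        self._last_run: Optional[float] = None

    def ready(self, now: Optional[float] = None) -> bool:
        """Return True (and restart the timer) if the interval has elapsed."""
        now = time.monotonic() if now is None else now
        if self._last_run is None or now - self._last_run >= self.interval:
            self._last_run = now
            return True
        return False


gate = IntervalGate(interval=1.0)
# Simulated frame timestamps in seconds; a real loop would call gate.ready().
timestamps = [0.0, 0.3, 0.6, 1.0, 1.5, 2.1]
print([t for t in timestamps if gate.ready(t)])  # prints: [0.0, 1.0, 2.1]
```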

Model Architecture:

Person Detection: YOLOv8n (optimized for speed) - Red bounding boxes
Face Detection: YOLOv8n-face-lindevs (specialized) - Green bounding boxes
Emotion Recognition: FER library (7 emotions: angry, disgust, fear, happy, sad, surprise, neutral)
Tracking: ByteTrack algorithm with unique ID persistence
Track History: Gray polylines showing movement paths (limited to 30 points)
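
One simple way to cap a trail at 30 points is a fixed-length deque per track ID, which discards the oldest point automatically. The sketch below assumes the trail stores box centers; update_track and the sample boxes are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

MAX_POINTS = 30  # trail length stated above
Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # x1, y1, x2, y2

# One bounded trail per track ID; old points fall off the left end.
track_history: Dict[int, Deque[Point]] = defaultdict(lambda: deque(maxlen=MAX_POINTS))


def update_track(track_id: int, box: Box) -> None:
    """Append the center of `box` to the trail of `track_id`."""
    x1, y1, x2, y2 = box
    track_history[track_id].append(((x1 + x2) // 2, (y1 + y2) // 2))


# Simulate 40 frames of a person moving right; only the last 30 centers remain.
for step in range(40):
    update_track(7, (step * 10, 100, step * 10 + 50, 300))

print(len(track_history[7]), track_history[7][0], track_history[7][-1])
# prints: 30 (125, 200) (415, 200)
```

The resulting list of points is the kind of input that OpenCV's cv2.polylines expects once converted to an integer array.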

Technical Implementation:

Engine Class: PersonTrackerEngine in engine.py handles all AI processing
GUI Interface: OpenCV-based viewer with modern overlay design
Web Interface: Flask server with WebRTC frontend for mobile access
Automatic Fallback: ONNX → PyTorch model loading with error handling

🎮 Controls & Interface

Desktop Application (OpenCV GUI):

[F] - Toggle fullscreen mode (with fallback to window resize)
[SPACE] - Pause/Resume video processing (preserves last frame)
[Q] or [ESC] - Quit application gracefully
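
The shortcuts above amount to a small dispatch on the key code that cv2.waitKey returns. The sketch below shows one way to structure it; ViewerState and handle_key are hypothetical names, and gui.py may organize this differently.

```python
from dataclasses import dataclass

KEY_ESC = 27  # ASCII code of the Escape key


@dataclass
class ViewerState:
    paused: bool = False
    fullscreen: bool = False
    running: bool = True


def handle_key(key: int, state: ViewerState) -> ViewerState:
    """Update `state` for one key code (for example from cv2.waitKey)."""
    key &= 0xFF  # keep the low byte; "no key" (-1) becomes 255 and is ignored
    if key in (ord("q"), ord("Q"), KEY_ESC):
        state.running = False
    elif key == ord(" "):
        state.paused = not state.paused
    elif key in (ord("f"), ord("F")):
        state.fullscreen = not state.fullscreen
    return state


state = ViewerState()
for pressed in (ord(" "), ord("f"), ord("q")):
    handle_key(pressed, state)
print(state)  # prints: ViewerState(paused=True, fullscreen=True, running=False)
```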

Visual Interface Elements:

Modern Header Overlay - Semi-transparent dark background with key information
Title Display - "PERSON TRACKER - AI VISION" with orange glow effect
Source Indicator - Shows "Webcam" or "YouTube" source type
Live Statistics - FPS counter (green) and person count display
Pause Indicator - "|| PAUSED" message when video is paused
Control Help - Bottom overlay showing available keyboard shortcuts
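
A semi-transparent overlay boils down to a per-channel weighted sum of the overlay color and the frame pixel underneath. The sketch below works on a single pixel in plain Python to show the arithmetic; blend_pixel is a hypothetical helper. On whole frames, OpenCV's cv2.addWeighted computes the same kind of weighted sum.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]  # B, G, R channel values in 0..255


def blend_pixel(background: Pixel, overlay: Pixel, alpha: float) -> Pixel:
    """Return alpha * overlay + (1 - alpha) * background, per channel."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    b, g, r = (
        round(alpha * o + (1.0 - alpha) * bg)
        for o, bg in zip(overlay, background)
    )
    return (b, g, r)


# A dark header (black at 60% opacity) drawn over a bright pixel.
print(blend_pixel((200, 100, 50), (0, 0, 0), alpha=0.6))  # prints: (80, 40, 20)
```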

Detection Visualization:

Person Boxes - Red rectangles around detected persons with ID numbers
Face Boxes - Green rectangles around detected faces with "Face" label
Emotion Labels - Displayed near faces with confidence percentage
Track History - Gray polylines showing movement paths (max 30 points)
Glass Morphism UI - Modern semi-transparent overlays with proper alpha blending
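
If faces are detected on a crop of each person box (an assumption; the README only states that faces are found within person bounding boxes), the face coordinates are relative to that crop and must be shifted back before drawing on the full frame. A minimal sketch, with face_box_to_frame as a hypothetical helper:

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # x1, y1, x2, y2


def face_box_to_frame(person_box: Box, face_box_in_crop: Box) -> Box:
    """Translate a face box from person-crop coordinates to frame coordinates."""
    px1, py1, px2, py2 = person_box
    fx1, fy1, fx2, fy2 = face_box_in_crop

    def clamp(value: int, low: int, high: int) -> int:
        return max(low, min(value, high))

    # Shift by the crop origin, then keep the result inside the person box.
    return (
        clamp(px1 + fx1, px1, px2),
        clamp(py1 + fy1, py1, py2),
        clamp(px1 + fx2, px1, px2),
        clamp(py1 + fy2, py1, py2),
    )


print(face_box_to_frame((300, 120, 520, 700), (60, 20, 150, 130)))
# prints: (360, 140, 450, 250)
```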

Web Interface (Mobile):

🚀 Start Tracking - Begin WebRTC camera capture and AI processing
🛑 Stop - End tracking session and release camera resources
📊 Live Dashboard - Real-time FPS, person count, and connection status
📱 Responsive Design - Adapts to any screen size and orientation
👆 Touch-Optimized - Large buttons designed for mobile interaction

📚 Documentation

| Document | Description | Target Audience |
| --- | --- | --- |
| WebApp README | Web interface features, mobile usage | End users, demo presenters |
| Deployment Guide | Production deployment with Cloudflare | DevOps, system admins |
| ONNX Conversion | Model optimization tutorial | ML engineers |

🚀 Performance Optimization

ONNX Model Usage (Automatic):

Models are automatically loaded from models/ directory if available
Fallback to PyTorch models in models/archive/ if ONNX files missing
No manual configuration required - the engine handles model selection

YouTube Stream Quality (Desktop Only):

The application attempts multiple quality levels automatically:

1080p → 720p60 → 720p → 480p → 360p → 240p → 144p
Uses yt-dlp with 30-second timeout per format attempt
Selects best available MP4 format for OpenCV compatibility
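
The quality fallback is essentially "walk the preference list and take the first entry the stream offers". The sketch below shows only that selection step on plain strings; pick_quality is a hypothetical helper, and the actual yt-dlp format queries and timeout handling are not reproduced here.

```python
from typing import Iterable, Optional, Sequence

# Preference order as documented above, best first.
QUALITY_ORDER = ["1080p", "720p60", "720p", "480p", "360p", "240p", "144p"]


def pick_quality(available: Iterable[str],
                 order: Sequence[str] = QUALITY_ORDER) -> Optional[str]:
    """Return the first quality in `order` that the stream offers, else None."""
    offered = set(available)
    for quality in order:
        if quality in offered:
            return quality
    return None


print(pick_quality(["360p", "720p", "144p"]))  # prints: 720p
```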

Performance Tips:

Use ONNX models for ~2x faster inference than PyTorch
Adjust confidence threshold based on accuracy vs. speed needs
Close other applications to free up system resources
Use SSD storage for faster model loading
Enable GPU acceleration if CUDA-compatible GPU available

Troubleshooting Performance:

Low FPS → Check if ONNX models are being used, close other apps
High CPU usage → Consider raising the confidence threshold (fewer detections mean fewer face and emotion passes)
Memory issues → Restart application, check available RAM
Camera issues → Try different webcam_id values (0, 1, 2, etc.)

📊 Performance Benchmarks

Typical Performance (YOLOv8n + ONNX):

| Hardware | Resolution | FPS | Person Detection | Face Detection |
| --- | --- | --- | --- | --- |
| RTX 3070 | 1280x720 | 25-30 | ✅ Excellent | ✅ Excellent |
| GTX 1060 | 1280x720 | 15-20 | ✅ Good | ✅ Good |
| Intel i7 (CPU) | 1280x720 | 8-12 | ✅ Acceptable | ⚠️ Slow |
| Intel i5 (CPU) | 640x480 | 10-15 | ✅ Good | ✅ Acceptable |

Emotion Recognition Performance:

Processing Interval: 1 second (optimized for performance)
Accuracy: ~85% on standard datasets
Latency: <100ms per face on GPU, <500ms on CPU

🐛 Troubleshooting

Common Issues & Solutions:

1. Camera Not Detected:

# Try different camera IDs
python gui.py --webcam_id 1  # or 2, 3, etc.

# Check available cameras on Windows
# Device Manager → Cameras
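
Instead of trying IDs by hand, a short script can probe a range of IDs and report which ones open. The sketch below uses cv2.VideoCapture for the real probe and accepts a replacement probe function so the logic can be exercised without a camera; find_cameras and the range of four IDs are assumptions, not part of the project.

```python
from typing import Callable, Iterable, List


def _opencv_probe(camera_id: int) -> bool:
    """Return True if OpenCV can open the camera with this ID."""
    import cv2  # imported lazily so the helper below works without a camera

    capture = cv2.VideoCapture(camera_id)
    try:
        return capture.isOpened()
    finally:
        capture.release()


def find_cameras(candidate_ids: Iterable[int] = range(4),
                 probe: Callable[[int], bool] = _opencv_probe) -> List[int]:
    """Return the candidate IDs for which `probe` reports a working camera."""
    return [camera_id for camera_id in candidate_ids if probe(camera_id)]


# Demonstration with a fake probe that pretends only camera 1 exists.
print(find_cameras(range(4), probe=lambda camera_id: camera_id == 1))  # prints: [1]
```

Calling find_cameras() with no arguments probes IDs 0 to 3 through OpenCV; any ID it returns can be passed to --webcam_id.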

2. YouTube Stream Fails:

# Update yt-dlp to latest version
pip install -U yt-dlp

# Try with a different YouTube URL
python gui.py --source youtube --youtube_url "NEW_URL"

3. Models Not Found:

❌ Error: No face detection model found!
✅ Solution: Ensure required files exist:
- models/yolov8n.onnx (or models/archive/yolov8n.pt)
- models/yolov8n-face-lindevs.onnx (or models/archive/yolov8n-face-lindevs.pt)

4. Low Performance:

# Check if ONNX models are being used (console output shows model type)
# Run ONNX conversion notebook if needed
# Close other applications to free resources
# Raise the confidence threshold to reduce the number of detections: --conf 0.5

5. Import/Dependency Errors:

# Reinstall all dependencies
pip install -r requirements.txt --force-reinstall

# Check Python version (3.8+ required)
python --version

6. Web App Issues:

# Check if port 8080 is available
netstat -an | findstr 8080  # Windows
lsof -i :8080              # Mac/Linux

# Try different port
python server.py  # Modify port in server.py if needed
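
The same port check can be done from Python on any platform by attempting to bind the port. This is a sketch with hypothetical helper names (is_port_free, first_free_port); it assumes the server listens on all interfaces, which the README does not state explicitly.

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 8080, attempts: int = 20) -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    raise RuntimeError(f"No free port found in {start}-{start + attempts - 1}")


print("Port 8080 free:", is_port_free(8080))
```

If 8080 is taken, first_free_port() suggests an alternative to set in server.py.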

Debug Information:

Console shows model loading status and file paths
FPS counter indicates performance issues (should be >10 for smooth operation)
Error messages provide specific failure details
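
For reference, an FPS figure like the one shown in the overlay can be computed as a rolling average over recent frame timestamps. The class below is a sketch of that calculation, not the engine's own counter; FpsCounter and the 30-frame window are assumptions.

```python
import time
from collections import deque
from typing import Deque, Optional


class FpsCounter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window: int = 30) -> None:
        self._stamps: Deque[float] = deque(maxlen=window)

    def tick(self, now: Optional[float] = None) -> float:
        """Record one frame and return the current FPS estimate."""
        self._stamps.append(time.perf_counter() if now is None else now)
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        # N timestamps span N - 1 frame intervals.
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0


counter = FpsCounter(window=30)
fps = 0.0
for frame_index in range(10):
    fps = counter.tick(now=frame_index * 0.05)  # simulated 50 ms per frame
print(f"{fps:.1f} FPS")  # prints: 20.0 FPS
```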

🎯 Advanced Usage

Using the PersonTrackerEngine Class Directly:

import cv2
from engine import PersonTrackerEngine
from ultralytics import YOLO
from fer import FER

# Initialize models
person_model = YOLO('models/yolov8n.onnx')
face_model = YOLO('models/yolov8n-face-lindevs.onnx')
emotion_detector = FER()

# Create engine instance
engine = PersonTrackerEngine(
    person_model=person_model,
    face_model=face_model,
    emotion_detector=emotion_detector,
    conf=0.4,
    emotion_interval=1.0
)

# Grab a single frame from the default webcam
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Could not read a frame from webcam 0")

# Process frame
processed_frame, emotions, last_emotion_time, fps, person_count = engine.process_frame(frame)

Custom Model Integration:

Train custom YOLO models using Ultralytics framework
Convert to ONNX format using the provided notebook
Place in models/ directory with appropriate naming
Application will automatically detect and use new models

YouTube Stream Quality Selection:

The desktop application uses a fallback system for YouTube streams:

Attempts highest quality first (1080p)
Falls back through: 720p60 → 720p → 480p → 360p → 240p → 144p
Automatically selects best available format for real-time processing

Integration with Other Applications:

The PersonTrackerEngine class can be imported and used in other Python projects:

Real-time video analysis pipelines
Security camera systems
Interactive art installations
Research applications

🔮 Planned Enhancements

Short Term:

Desktop OpenCV GUI - Modern overlay interface with keyboard controls (✅ Complete)
Mobile Web Interface - WebRTC-based mobile app with Flask backend (✅ Complete)
Configuration File - Save/load user preferences and settings
Recording Feature - Save processed video output to file
Screenshot Capture - Save current frame with detections

Medium Term:

Enhanced GUI Controls - Settings panel within OpenCV interface
Multiple Camera Support - Process multiple webcam streams simultaneously
Data Export - CSV logging of tracking data and emotion history
Real-time Dashboard - Separate statistics and analytics window
Model Hot-swapping - Change models without restarting application

Long Term:

Custom Emotion Models - Train domain-specific emotion recognition
3D Pose Estimation - Extended person analysis capabilities
Age/Gender Detection - Additional demographic analysis
Behavior Analysis - Activity and gesture recognition
Multi-target Re-identification - Prevent ID switching and improve tracking

📦 Installation

Quick Installation:

# Clone repository
git clone <repository-url>
cd PersonTracker

# Install all dependencies
pip install -r requirements.txt

# Run desktop application
python gui.py

# Or run mobile web application
cd WebApp && python server.py

Windows PowerShell Setup:

# Create virtual environment (recommended)
python -m venv venv
.\venv\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

# Verify installation
python gui.py --help

Development Setup:

# Create isolated environment
python -m venv venv

# Activate environment
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

# Install in development mode
pip install -r requirements.txt

# Optional: Install Jupyter for ONNX conversion
pip install jupyter notebook

Model Setup:

ONNX Models (Recommended): Run ONNX_Conversion.ipynb to generate optimized models
PyTorch Fallback: Application will download yolov8n.pt automatically if needed
Face Model: Ensure yolov8n-face-lindevs.pt is in models/archive/ directory

🎪 Perfect for Exhibitions & Demos

Desktop Application Features:

🖥️ Professional Display - Full-screen mode with modern overlay design
⚡ Real-time Processing - Live AI detection with minimal latency
🎮 Interactive Controls - Easy keyboard shortcuts for live demonstrations
📊 Live Statistics - Professional FPS and detection count display
🚀 Reliable Performance - ONNX optimization for consistent frame rates

Mobile Web App Features:

📱 Zero Installation - Works instantly on any mobile device
🎨 Modern UI Design - Professional gradient interface with smooth animations
📊 Live Dashboard - Real-time statistics and connection status
🌐 Network Access - Easy sharing via IP address for multiple devices
👆 Touch-Optimized - Large buttons designed for public interaction

Production Deployment:

Ready for public deployment with Cloudflare Tunnel:

# Quick public deployment (free tier)
cloudflared tunnel --url http://localhost:8080

For permanent deployment with custom domains, see DEPLOYMENT_GUIDE.md

Exhibition Setup Tips:

Use ONNX models for best performance
Set confidence to 0.4 for balanced detection
Test camera lighting before exhibition
Provide clear instructions for mobile users
Monitor server performance during high traffic

🎯 Ready to get started? Try the desktop interface: python gui.py
📱 Want the mobile experience? Check out the WebApp README!

🤝 Contributing

We welcome contributions! Please feel free to submit issues, feature requests, or pull requests.

Development Guidelines:

Follow existing code style and structure
Test on both desktop and mobile interfaces
Update documentation for new features
Ensure ONNX and PyTorch model compatibility

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Ultralytics for the excellent YOLOv8 implementation and ByteTrack integration
FER Library for robust emotion recognition capabilities
OpenCV for comprehensive computer vision operations and GUI framework
yt-dlp for reliable YouTube stream extraction
Flask for lightweight web framework enabling mobile interface
The open-source AI community for inspiration and collaborative development

Made with ❤️ for the AI and Computer Vision community
