Detect human emotions (calm, happy, fearful, disgust) from audio in real time using a multi-model ML pipeline, a FastAPI backend, and a React + Tailwind frontend.
- Overview
- Architecture
- Tech Stack
- Quick Start (Docker)
- Local Development
- Dataset Setup
- Training the Model
- API Reference
- Database Schema
- Deployment
- Testing
- Contributing
EchoEmotion is an end-to-end, production-ready Speech Emotion Recognition system built on the RAVDESS dataset. It compares five classifiers (MLP, Random Forest, SVM, XGBoost, LightGBM) and automatically selects the best one by weighted F1 score.
Accuracy: ~72–78%, depending on model selection and dataset size.
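Selection by weighted F1 can be illustrated with a small, dependency-free sketch. It is not the project's trainer code: the function names `weighted_f1` and `select_best` and the toy predictions are assumptions made for illustration. Weighted F1 is the per-class F1 averaged with each class weighted by its number of true samples.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score


def select_best(y_true, predictions_by_model):
    """Return (best_model_name, all_scores) ranked by weighted F1."""
    scores = {
        name: weighted_f1(y_true, preds)
        for name, preds in predictions_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy held-out labels and predictions (hypothetical, not real results).
y_true = ["calm", "happy", "fearful", "disgust", "calm", "happy"]
candidates = {
    "svm": ["calm", "happy", "fearful", "calm", "calm", "happy"],
    "random_forest": ["calm", "calm", "fearful", "disgust", "happy", "happy"],
}
best_name, scores = select_best(y_true, candidates)
print(best_name, scores)
```

With scikit-learn installed, `sklearn.metrics.f1_score(y_true, y_pred, average="weighted")` computes the same metric.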
- Browser microphone recording
- Drag-and-drop audio upload (WAV, MP3, OGG, FLAC, M4A)
- Probability chart + radar profile per prediction
- Model comparison table with cross-validation
- Full dashboard with emotion distribution analytics
- JWT authentication (register / login)
- PostgreSQL persistence for all predictions
- Docker Compose for one-command startup
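The cross-validated model comparison listed above relies on splitting the data into folds and scoring each model on every held-out fold. The sketch below shows only the fold mechanics for k = 5, under stated assumptions: `k_fold_indices`, `cross_val_accuracy`, and the majority-class baseline are hypothetical helpers, the folds are contiguous and unshuffled, and the data is a toy list rather than RAVDESS features.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_accuracy(fit_predict, X, y, k=5):
    """Return one accuracy score per fold for a fit-and-predict callable."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        hits = sum(pred == y[i] for pred, i in zip(preds, test))
        scores.append(hits / len(test))
    return scores


def majority_baseline(X_train, y_train, X_test):
    """Stand-in model: always predict the most frequent training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * len(X_test)


# Toy data (hypothetical): 12 samples, 4 emotions.
X = list(range(12))
y = ["calm", "happy", "fearful", "disgust"] * 3
fold_scores = cross_val_accuracy(majority_baseline, X, y, k=5)
mean_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_score)
```

In practice a shuffled, stratified split is usually preferable for class-labelled audio; scikit-learn provides `StratifiedKFold` and `cross_val_score` for this.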
┌─────────────────────────────────────────────────────┐
│                    React + Vite                     │
│   LandingPage / PredictPage / DashboardPage         │
│   Axios + TanStack Query + Framer Motion            │
└────────────────────┬────────────────────────────────┘
                     │  HTTP (REST)
┌────────────────────▼────────────────────────────────┐
│                FastAPI (Python 3.11)                │
│   /api/v1/predict  /train  /dashboard  /auth        │
│   JWT Auth · Rate Limiting · Swagger UI             │
│                                                     │
│  ┌───────────────────┐   ┌─────────────────────┐    │
│  │  ML Pipeline      │   │ PostgreSQL (ORM)    │    │
│  │ FeatureExtractor  │   │ User / Prediction   │    │
│  │ ModelTrainer      │   │ EmotionStat         │    │
│  │ EmotionPredictor  │   │ ModelRegistry       │    │
│  └───────────────────┘   └─────────────────────┘    │
└─────────────────────────────────────────────────────┘
| Layer | Technology |
|---|---|
| Frontend | React 18 · Vite · TailwindCSS · Framer Motion |
| Charts | Recharts |
| State | TanStack Query · Zustand |
| Backend | FastAPI · Uvicorn · Pydantic v2 |
| Auth | JWT (python-jose) · bcrypt (passlib) |
| ML | scikit-learn · librosa · XGBoost · LightGBM |
| Database | PostgreSQL 16 · SQLAlchemy 2 (async) |
| DevOps | Docker · Docker Compose · GitHub Actions |
git clone https://github.com/your-username/echoemotion.git
cd echoemotion
# 1. Add RAVDESS dataset (see Dataset Setup below)
# 2. Copy environment files
cp backend/.env.example backend/.env
# 3. Start everything
docker compose up --build
# API docs: http://localhost:8000/docs
# Frontend: http://localhost:3000
# PostgreSQL: localhost:5432

cd backend
python -m venv venv && source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Create .env from example
cp .env.example .env
# Edit .env: at minimum, set DATABASE_URL to your local Postgres
# Create tables
python scripts/init_db.py
# Run dev server
uvicorn app.main:app --reload --port 8000

cd frontend
npm install
# Create .env.local
echo "VITE_API_URL=http://localhost:8000/api/v1" > .env.local
npm run dev  # → http://localhost:5173

- Download the RAVDESS dataset (speech-only files).
- Place the extracted `Actor_*` folders inside `backend/dataset/`:
backend/
└── dataset/
    ├── Actor_01/
    │   ├── 03-01-01-01-01-01-01.wav
    │   └── ...
    ├── Actor_02/
    └── ...
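RAVDESS file names encode their labels as seven hyphen-separated two-digit fields (modality, vocal channel, emotion, intensity, statement, repetition, actor), so the emotion label can be read from the third field. The sketch below illustrates that convention; the helper names and the `RAVDESS_EMOTIONS` dictionary are assumptions for illustration and may differ from the project's `EMOTIONS_MAP`.

```python
from pathlib import Path

# Emotion codes from the public RAVDESS naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS file name."""
    parts = Path(path).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"not a RAVDESS file name: {path}")
    return RAVDESS_EMOTIONS[parts[2]]


def is_observed(path, observed=("calm", "happy", "fearful", "disgust")):
    """True if the file's emotion is one of the emotions being trained on."""
    return emotion_from_filename(path) in observed


sample = "backend/dataset/Actor_01/03-01-01-01-01-01-01.wav"
print(emotion_from_filename(sample), is_observed(sample))
```

The sample file from the tree above carries emotion code 01 (neutral), so it would be skipped when training only on calm, happy, fearful, and disgust.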
curl -X POST http://localhost:8000/api/v1/train \
-H "Content-Type: application/json" \
  -d '{"observed_emotions": ["calm","happy","fearful","disgust"], "compare_models": true}'

cd backend
python -c "
from app.ml.trainer import ModelTrainer
from app.core.config import EMOTIONS_MAP
t = ModelTrainer('dataset', 'models', EMOTIONS_MAP)
result = t.train(['calm','happy','fearful','disgust'])
print(result['best_model'], result['best_accuracy'])
"

The pipeline will train MLP, Random Forest, SVM, XGBoost, and LightGBM, run 5-fold cross-validation on each, and save the best model automatically.
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/health` | Health check + model status |
| GET | `/api/v1/emotions` | List supported emotions |
| POST | `/api/v1/predict` | Upload audio → emotion + confidence |
| POST | `/api/v1/train` | Train / retrain model |
| GET | `/api/v1/model-info` | Active model metadata |
| GET | `/api/v1/metrics` | Full training metrics + comparison |
| GET | `/api/v1/dashboard` | Prediction statistics |
| POST | `/api/v1/auth/register` | Create account |
| POST | `/api/v1/auth/login` | Get JWT token |
| GET | `/api/v1/auth/me` | Current user info |
Full interactive docs: http://localhost:8000/docs
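A client call to the predict endpoint might look like the standard-library sketch below. Several details are assumptions not confirmed by this README: the multipart field name `file`, the optional `Authorization: Bearer` header, and the helper name `build_predict_request`. Check the interactive docs for the actual request shape before relying on it.

```python
import urllib.request
import uuid

API_BASE = "http://localhost:8000/api/v1"


def build_predict_request(audio_bytes, filename, token=None, field_name="file"):
    """Build (but do not send) a multipart POST for /predict.

    field_name="file" and the bearer token header are assumptions.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{API_BASE}/predict",
        data=head + audio_bytes + tail,
        headers=headers,
        method="POST",
    )


# Placeholder bytes stand in for real audio content.
request = build_predict_request(b"RIFF", "clip.wav", token="example-token")
print(request.get_method(), request.full_url)

# To send it against a running backend:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Building the request separately from sending it keeps the example runnable without a live server.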
users (id UUID PK, email, username, hashed_password, is_active, created_at)
predictions (id UUID PK, user_id FK, filename, predicted_emotion, confidence,
all_probabilities JSON, audio_duration_s, created_at)
emotion_stats (id, emotion UNIQUE, total_count, avg_confidence, last_updated)
model_registry (id, version, algorithm, accuracy, metrics JSON, is_active, created_at)

Build command: pip install -r requirements.txt
Start command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
Environment: set all vars from .env.example
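Before deploying, it can help to confirm that every variable named in `.env.example` is actually set on the host. The following is a minimal sketch, assuming a simple `KEY=value` file format; the helper names are hypothetical, and `SECRET_KEY` in the sample is a placeholder rather than a variable confirmed by this README.

```python
import os


def required_keys(example_text):
    """Extract variable names from KEY=value lines, skipping comments and blanks."""
    keys = []
    for line in example_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.append(line.split("=", 1)[0].strip())
    return keys


def missing_vars(example_text, environ=os.environ):
    """Return the keys from the example file that are unset or empty."""
    return [key for key in required_keys(example_text) if not environ.get(key)]


# Sample content; SECRET_KEY is a hypothetical entry for illustration.
sample_example = "# Database\nDATABASE_URL=\nSECRET_KEY=\n"
print(missing_vars(sample_example, {"DATABASE_URL": "set-on-host"}))
```

To check the real file, pass `open("backend/.env.example").read()` as `example_text`.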
cd frontend
npm run build
# Deploy /dist to Vercel
# Set VITE_API_URL to your backend URL

Free-tier PostgreSQL: just update DATABASE_URL in your env.
# Backend
cd backend
pytest tests/ -v
# Frontend
cd frontend
npm test

MIT License © 2025 Manoj Kharkar