
manojk909/EchoEmotion


🎙️ EchoEmotion: Speech Emotion Recognition

Detect human emotions (calm, happy, fearful, disgust) from audio in real time using a multi-model ML pipeline, a FastAPI backend, and a React + Tailwind frontend.



📖 Table of Contents

  1. Overview
  2. Architecture
  3. Tech Stack
  4. Quick Start (Docker)
  5. Local Development
  6. Dataset Setup
  7. Training the Model
  8. API Reference
  9. Database Schema
  10. Deployment
  11. Testing
  12. Contributing

Overview

EchoEmotion is an end-to-end, production-ready Speech Emotion Recognition system built on the RAVDESS dataset. It compares five classifiers (MLP, Random Forest, SVM, XGBoost, LightGBM) and automatically selects the best one by weighted F1 score.

Accuracy: roughly 72–78%, depending on model selection and dataset size.
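To make the selection criterion concrete, here is a minimal, dependency-free sketch of picking a model by weighted F1. It is not the project's ModelTrainer code: the function names `weighted_f1` and `select_best` are hypothetical, and the labels are toy values invented for illustration, not project results.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Weighted F1: per-class F1 averaged with class support as weights."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * count / total
    return score


def select_best(y_true, predictions_by_model):
    """Return (model_name, score) for the highest weighted F1."""
    scores = {
        name: weighted_f1(y_true, preds)
        for name, preds in predictions_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy labels, invented for illustration only.
y_true = ["calm", "happy", "fearful", "disgust", "calm", "happy"]
candidates = {
    "svm": ["calm", "happy", "fearful", "disgust", "calm", "calm"],
    "mlp": ["calm", "calm", "calm", "disgust", "calm", "happy"],
}
best_name, best_score = select_best(y_true, candidates)
print(best_name, round(best_score, 3))
```

Weighting by support keeps a rare class from dominating the score, which matters when only a subset of emotions is selected for training.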

Features

  • 🎤 Browser microphone recording
  • 📁 Drag-and-drop audio upload (WAV, MP3, OGG, FLAC, M4A)
  • 📊 Probability chart + radar profile per prediction
  • 🏆 Model comparison table with cross-validation
  • 📈 Full dashboard with emotion distribution analytics
  • 🔐 JWT authentication (register / login)
  • 🐘 PostgreSQL persistence for all predictions
  • 🐳 Docker Compose for one-command startup

Architecture

┌─────────────────────────────────────────────────────┐
│                   React + Vite                      │
│  LandingPage / PredictPage / DashboardPage          │
│  Axios + TanStack Query + Framer Motion             │
└────────────────────┬────────────────────────────────┘
                     │ HTTP (REST)
┌────────────────────▼────────────────────────────────┐
│              FastAPI  (Python 3.11)                 │
│  /api/v1/predict   /train   /dashboard   /auth      │
│  JWT Auth · Rate Limiting · Swagger UI              │
│                                                     │
│  ┌───────────────────┐   ┌─────────────────────┐    │
│  │   ML Pipeline     │   │   PostgreSQL (ORM)  │    │
│  │  FeatureExtractor │   │  User / Prediction  │    │
│  │  ModelTrainer     │   │  EmotionStat        │    │
│  │  EmotionPredictor │   │  ModelRegistry      │    │
│  └───────────────────┘   └─────────────────────┘    │
└─────────────────────────────────────────────────────┘

Tech Stack

| Layer    | Technology                                    |
|----------|-----------------------------------------------|
| Frontend | React 18 · Vite · TailwindCSS · Framer Motion |
| Charts   | Recharts                                      |
| State    | TanStack Query · Zustand                      |
| Backend  | FastAPI · Uvicorn · Pydantic v2               |
| Auth     | JWT (python-jose) · bcrypt (passlib)          |
| ML       | scikit-learn · librosa · XGBoost · LightGBM   |
| Database | PostgreSQL 16 · SQLAlchemy 2 (async)          |
| DevOps   | Docker · Docker Compose · GitHub Actions      |

Quick Start (Docker)

git clone https://github.com/your-username/echoemotion.git
cd echoemotion

# 1. Add RAVDESS dataset (see Dataset Setup below)
# 2. Copy environment files
cp backend/.env.example backend/.env

# 3. Start everything
docker compose up --build

# API docs:     http://localhost:8000/docs
# Frontend:     http://localhost:3000
# PostgreSQL:   localhost:5432

Local Development

Backend

cd backend
python -m venv venv && source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Create .env from example
cp .env.example .env
# Edit .env: at minimum, set DATABASE_URL to your local Postgres

# Create tables
python scripts/init_db.py

# Run dev server
uvicorn app.main:app --reload --port 8000

Frontend

cd frontend
npm install
# Create .env.local
echo "VITE_API_URL=http://localhost:8000/api/v1" > .env.local
npm run dev   # → http://localhost:5173

Dataset Setup

  1. Download the RAVDESS dataset (speech-only files).
  2. Place the extracted Actor_* folders inside backend/dataset/:
backend/
└── dataset/
    ├── Actor_01/
    │   ├── 03-01-01-01-01-01-01.wav
    │   └── ...
    ├── Actor_02/
    └── ...
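RAVDESS encodes each clip's metadata in its filename as seven two-digit fields, with the emotion in the third field. The sketch below shows how a name such as 03-01-01-01-01-01-01.wav can be decoded. The helper `parse_ravdess_name` and the dictionary `RAVDESS_EMOTIONS` are hypothetical names used for illustration; the project's own mapping (EMOTIONS_MAP in app.core.config) may be structured differently.

```python
from pathlib import Path

# RAVDESS emotion codes (third field of the filename).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def parse_ravdess_name(path):
    """Split a RAVDESS filename into its seven identifier fields."""
    parts = Path(path).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"not a RAVDESS filename: {path}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    return {
        "modality": modality,        # 03 = audio-only
        "vocal_channel": channel,    # 01 = speech, 02 = song
        "emotion": RAVDESS_EMOTIONS[emotion],
        "intensity": intensity,      # 01 = normal, 02 = strong
        "statement": statement,
        "repetition": repetition,
        "actor": int(actor),
    }


info = parse_ravdess_name("backend/dataset/Actor_01/03-01-01-01-01-01-01.wav")
print(info["emotion"], info["actor"])
```

Files whose vocal channel is 02 are song recordings, so a quick check on that field is one way to confirm that only the speech files were extracted.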

Training the Model

Via API (recommended)

curl -X POST http://localhost:8000/api/v1/train \
  -H "Content-Type: application/json" \
  -d '{"observed_emotions": ["calm","happy","fearful","disgust"], "compare_models": true}'
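The same request can be built from Python with only the standard library. This is a minimal sketch: the helper name `build_train_request` is hypothetical, the URL and payload are copied from the curl command above, and the response format is not documented here, so the sending step is left commented out.

```python
import json
import urllib.request


def build_train_request(base_url, emotions, compare_models=True):
    """Build (but do not send) the POST /train request from the curl example."""
    payload = {
        "observed_emotions": list(emotions),
        "compare_models": compare_models,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/train",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_train_request(
    "http://localhost:8000/api/v1", ["calm", "happy", "fearful", "disgust"]
)
print(request.get_method(), request.full_url)

# To start training against a running backend:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```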

Via Python script

cd backend
python -c "
from app.ml.trainer import ModelTrainer
from app.core.config import EMOTIONS_MAP
t = ModelTrainer('dataset', 'models', EMOTIONS_MAP)
result = t.train(['calm','happy','fearful','disgust'])
print(result['best_model'], result['best_accuracy'])
"

The pipeline trains MLP, Random Forest, SVM, XGBoost, and LightGBM, runs 5-fold cross-validation on each, and saves the best model automatically.
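For readers unfamiliar with k-fold cross-validation, the sketch below shows how five train/test splits can be derived from a dataset of a given size. It is an illustration only: `k_fold_indices` is a hypothetical helper, it does no shuffling or stratification, and this README does not say which splitting strategy the project's trainer uses.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(12, k=5))
print([len(test) for _, test in folds])
```

Each sample lands in exactly one test fold, so every model is scored on data it did not see during that fold's training.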


API Reference

| Method | Endpoint              | Description                         |
|--------|-----------------------|-------------------------------------|
| GET    | /api/v1/health        | Health check + model status         |
| GET    | /api/v1/emotions      | List supported emotions             |
| POST   | /api/v1/predict       | Upload audio → emotion + confidence |
| POST   | /api/v1/train         | Train / retrain model               |
| GET    | /api/v1/model-info    | Active model metadata               |
| GET    | /api/v1/metrics       | Full training metrics + comparison  |
| GET    | /api/v1/dashboard     | Prediction statistics               |
| POST   | /api/v1/auth/register | Create account                      |
| POST   | /api/v1/auth/login    | Get JWT token                       |
| GET    | /api/v1/auth/me       | Current user info                   |

Full interactive docs: http://localhost:8000/docs
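As a rough illustration of calling the predict endpoint from a script, the sketch below builds a multipart upload with the standard library. Several details are assumptions rather than documented facts: the form field name `file`, the `audio/wav` content type, and whether a Bearer token is required. Check the Swagger UI for the actual request schema. The helper name `build_predict_request` is hypothetical.

```python
import uuid
import urllib.request


def build_predict_request(base_url, filename, audio_bytes,
                          field_name="file", token=None):
    """Build a multipart/form-data POST for /predict without sending it."""
    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
    )
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        disposition.encode(),
        b"Content-Type: audio/wav\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if token:
        # Assumed: JWT from /auth/login sent as a Bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/predict",
        data=body, headers=headers, method="POST",
    )


# Placeholder bytes, not a valid WAV file.
example = build_predict_request(
    "http://localhost:8000/api/v1", "clip.wav", b"RIFF"
)
print(example.get_method(), example.full_url)

# To send it to a running backend:
# with urllib.request.urlopen(example) as response:
#     print(response.read().decode())
```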


Database Schema

users (id UUID PK, email, username, hashed_password, is_active, created_at)
predictions (id UUID PK, user_id FK, filename, predicted_emotion, confidence,
             all_probabilities JSON, audio_duration_s, created_at)
emotion_stats (id, emotion UNIQUE, total_count, avg_confidence, last_updated)
model_registry (id, version, algorithm, accuracy, metrics JSON, is_active, created_at)
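The emotion_stats columns total_count and avg_confidence suggest a per-emotion running aggregate. One common way to maintain such an average without rereading every prediction is an incremental mean, sketched below. This is an assumption about how the table could be updated, not a description of the project's code; the `EmotionStat` dataclass here is a plain stand-in for the ORM model, and `record` is a hypothetical method.

```python
from dataclasses import dataclass


@dataclass
class EmotionStat:
    """In-memory stand-in for one emotion_stats row."""
    emotion: str
    total_count: int = 0
    avg_confidence: float = 0.0

    def record(self, confidence):
        """Fold one prediction's confidence into the running mean."""
        self.total_count += 1
        self.avg_confidence += (
            (confidence - self.avg_confidence) / self.total_count
        )


stat = EmotionStat("happy")
for confidence in (0.9, 0.7, 0.8):
    stat.record(confidence)
print(stat.total_count, round(stat.avg_confidence, 3))
```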

Deployment

Backend → Render

Build command: pip install -r requirements.txt
Start command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
Environment: set all vars from .env.example

Frontend → Vercel

cd frontend
npm run build
# Deploy /dist to Vercel
# Set VITE_API_URL to your backend URL

Database → Neon

Free-tier PostgreSQL: just update DATABASE_URL in your environment.


Testing

# Backend
cd backend
pytest tests/ -v

# Frontend
cd frontend
npm test

MIT License © 2025 Manoj Kharkar
