DriverPulse is an end-to-end data analytics and machine learning pipeline that analyzes ward-level ride data from Namma Yatri (Bengaluru). It uncovers business insights, quantifies driver reliability, and segments operational areas based on performance metrics.
.
├── backend/
│ ├── data/
│ │ ├── namma_yatri_bengaluru_ward_wise_all_time_data.csv # Raw data
│ │ └── cleaned.csv # Cleaned + engineered data
│ ├── models/ # Saved ML models & artifacts
│ ├── analysis.ipynb # Data Cleaning & EDA
│ ├── train.ipynb # Model Training Pipeline
│ └── main.py # FastAPI Server
├── frontend/
│ └── App.jsx # React UI
└── README.md # Project Overview
- Parses Indian-style numbers (comma-grouped digits) and strips currency and percentage symbols.
- Engineers 4 actionable business metrics:
  - earnings_per_km: A measure of driver revenue efficiency.
  - supply_gap: Unmet demand (Searches missing Quotes).
  - revenue_leakage: Potential revenue lost (Searches vs Completed Trips).
  - reliability_score: A composite score derived from low cancellation and high quote acceptance.
- Conducts EDA with histograms, correlation matrices, IQR-based outlier detection, and multivariate scatter plots of behavior vs earnings.
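The cleaning and feature-engineering steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the row keys (`searches`, `quotes`, `completed_trips`, `earnings`, `distance_km`, `avg_fare`, `cancellation_rate`, `acceptance_rate`) are hypothetical, rates are assumed to be fractions between 0 and 1, and the formulas, in particular the equal weighting inside `reliability_score`, are assumptions.

```python
import re


def parse_number(raw: str) -> float:
    """Convert strings such as '₹1,23,456.50' or '45.2%' to a float."""
    # Keep only digits, the decimal point, and a minus sign; this drops
    # Indian-style grouping commas as well as currency and percent symbols.
    cleaned = re.sub(r"[^0-9.\-]", "", raw)
    return float(cleaned)


def engineer_metrics(row: dict) -> dict:
    """Derive the four metrics for one ward (column names are hypothetical)."""
    searches = row["searches"]
    quotes = row["quotes"]
    completed = row["completed_trips"]
    return {
        # Revenue earned per kilometre driven.
        "earnings_per_km": row["earnings"] / row["distance_km"],
        # Searches that never received a quote.
        "supply_gap": searches - quotes,
        # Assumed formula: searches that did not end in a trip, priced at the average fare.
        "revenue_leakage": (searches - completed) * row["avg_fare"],
        # Assumed formula: equal-weight blend of low cancellation and high acceptance.
        "reliability_score": 0.5 * (1 - row["cancellation_rate"])
        + 0.5 * row["acceptance_rate"],
    }


ward = {
    "searches": parse_number("500"),
    "quotes": parse_number("400"),
    "completed_trips": parse_number("300"),
    "earnings": parse_number("₹1,000"),
    "distance_km": 200.0,
    "avg_fare": 100.0,
    "cancellation_rate": 0.2,
    "acceptance_rate": 0.6,
}
metrics = engineer_metrics(ward)
```

A real pipeline would also need to guard against wards with zero distance before dividing.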
Three models are trained and persisted in the models/ directory for downstream use.
- Ward Risk Classifier (Random Forest)
  - Automatically labels wards as High, Medium, or Low risk using percentile thresholds computed from composite cancellation & conversion rates.
  - Evaluated using Accuracy, F1 (macro), and Confusion Matrices.
- Acceptance Predictor (RF Regressor)
  - Predicts the driver_quote_acceptance_rate dynamically given metrics like supply_gap, avg_fare, and avg_distance.
- Driver Behavior Clustering (KMeans, k=3)
  - Segments wards into exactly 3 distinct operational zones based on driver reliability and efficiency.
  - Features used: driver_cancellation_rate, driver_quote_acceptance_rate, earnings_per_km, and avg_distance.
  - Produces business-ready labels mapping wards to profiles: ⭐ Reliable Performers, ⚠️ High-Risk Zones, and 🔄 Average Balanced.
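The percentile-based labelling behind the Ward Risk Classifier can be sketched with the standard library alone. This is an illustration under stated assumptions: the composite `risk_score` formula and the choice of tercile cut points are hypothetical, and the training notebook may use different percentiles or weights.

```python
from statistics import quantiles


def risk_score(cancellation_rate: float, conversion_rate: float) -> float:
    """Assumed composite: more cancellations and lower conversion mean higher risk."""
    return cancellation_rate + (1.0 - conversion_rate)


def label_risk(scores: list[float]) -> list[str]:
    """Label each ward High, Medium, or Low by where its score falls."""
    # Two cut points splitting the scores into three equal-probability groups
    # (an assumed stand-in for the project's percentile thresholds).
    low_cut, high_cut = quantiles(scores, n=3)
    labels = []
    for score in scores:
        if score >= high_cut:
            labels.append("High")
        elif score >= low_cut:
            labels.append("Medium")
        else:
            labels.append("Low")
    return labels


# Toy scores for nine wards; real scores would come from risk_score().
labels = label_risk([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Labels produced this way would then serve as the target column for the Random Forest classifier.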
(Note: Model 1 and Model 2 also incorporate SHAP explainer visualizations to outline feature importance).
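For the Driver Behavior Clustering model, the step from raw cluster ids to named profiles could look like the sketch below. It assumes scikit-learn and NumPy are installed; the six sample wards are made up, and ranking clusters by mean cancellation rate is an assumed naming rule, not necessarily the one used in train.ipynb.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Columns: driver_cancellation_rate, driver_quote_acceptance_rate,
# earnings_per_km, avg_distance (values are invented for illustration).
X = np.array([
    [0.05, 0.90, 22.0, 6.0],
    [0.06, 0.88, 21.0, 6.5],
    [0.40, 0.45, 12.0, 9.0],
    [0.42, 0.40, 11.5, 9.5],
    [0.20, 0.65, 16.0, 7.5],
    [0.22, 0.68, 16.5, 7.0],
])

# Scale features so no single unit dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X_scaled)

# Cluster ids are arbitrary, so rank clusters by mean cancellation rate:
# lowest becomes "Reliable Performers", highest becomes "High-Risk Zones".
mean_cancel = [X[kmeans.labels_ == c, 0].mean() for c in range(3)]
order = np.argsort(mean_cancel)
names = {
    int(order[0]): "Reliable Performers",
    int(order[1]): "Average Balanced",
    int(order[2]): "High-Risk Zones",
}
profiles = [names[int(c)] for c in kmeans.labels_]
```

Naming clusters from their statistics rather than their ids keeps the labels stable when the model is retrained with a different seed.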
Run the following commands to install dependencies using uv and replicate the environment locally:
cd backend
uv sync
The project consists of a FastAPI backend and a React/Vite frontend.
Create a .env file in the backend/ directory and add your OpenRouter API key for the live LLM insights (streaming uses the qwen/qwen3-coder:free model):
OPENROUTER_API=sk-or-v1-...
Run the following command to start the backend API server on port 8000:
cd backend
uv run uvicorn main:app --reload --port 8000
Open a new terminal and run the frontend development server (typically on port 5173):
cd frontend
npm install
npm run dev