BLR Traffic Demand Prediction

A compact demand-prediction pipeline for Bengaluru traffic, built on CatBoost and a small set of engineered features.

Overview

Goal: Predict demand for geohash/time combinations using engineered time features, target encodings, and a CatBoost regressor.
Main script: traffic.py, which handles preprocessing, model validation, final training, and submission generation (submission.csv).
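The feature step named in the goal can be sketched roughly as follows. This is a minimal illustration, not the code in traffic.py: the column names (geohash, timestamp, demand), the "H:M" timestamp format, and the helper names time_features and fit_target_encoding are all assumptions made for the example. It uses only the standard library so it runs without extra dependencies.

```python
import math
from collections import defaultdict


def time_features(timestamp: str) -> dict:
    """Turn an 'H:M' string (assumed format) into numeric and cyclical features."""
    hour, minute = (int(part) for part in timestamp.split(":"))
    minute_of_day = hour * 60 + minute
    # Map time of day onto a circle so 23:45 and 0:00 end up close together.
    angle = 2 * math.pi * minute_of_day / (24 * 60)
    return {
        "hour": hour,
        "minute_of_day": minute_of_day,
        "tod_sin": math.sin(angle),
        "tod_cos": math.cos(angle),
    }


def fit_target_encoding(rows, key, target, smoothing=10.0):
    """Return (mapping, global_mean) for a smoothed mean target encoding."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[key]] += row[target]
        counts[row[key]] += 1
    global_mean = sum(sums.values()) / sum(counts.values())
    # Shrink each category mean toward the global mean; rare keys shrink more.
    mapping = {
        k: (sums[k] + smoothing * global_mean) / (counts[k] + smoothing)
        for k in sums
    }
    return mapping, global_mean


# Hypothetical rows; the real train.csv schema may differ.
train_rows = [
    {"geohash": "tdr1w", "timestamp": "8:15", "demand": 0.8},
    {"geohash": "tdr1w", "timestamp": "9:0", "demand": 0.6},
    {"geohash": "tdr1x", "timestamp": "23:45", "demand": 0.1},
]

encoding, fallback = fit_target_encoding(train_rows, "geohash", "demand", smoothing=1.0)
features = [
    {**time_features(r["timestamp"]), "geohash_te": encoding.get(r["geohash"], fallback)}
    for r in train_rows
]
```

Fitting the encoding on training rows only, and falling back to the global mean for unseen keys, keeps target information from validation or test rows out of the features.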

Requirements

Run (local)

Place train.csv and test.csv in the project root (the same directory as traffic.py).
Create and activate a virtual environment, then install dependencies:

python -m venv venv
venv\Scripts\activate    # Windows
source venv/bin/activate    # macOS / Linux
pip install -r requirements.txt

python traffic.py

What to expect:

The script will validate a model on a hold-out split, train a final CatBoost model on the full training set, and write submission.csv containing Index and predicted demand.
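The validate, retrain, and write sequence described above can be sketched as below. To stay dependency-free, a per-geohash mean model (the hypothetical MeanByKeyModel) stands in for the CatBoost regressor, the hold-out split is simply the last 20% of rows, and the output column name demand is an assumption. The CSV is written to an in-memory buffer here; a real run would write to submission.csv.

```python
import csv
import io
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


class MeanByKeyModel:
    """Stand-in for the CatBoost regressor: predicts mean demand per geohash."""

    def fit(self, rows):
        totals = {}
        for row in rows:
            total, count = totals.get(row["geohash"], (0.0, 0))
            totals[row["geohash"]] = (total + row["demand"], count + 1)
        self.means = {k: total / count for k, (total, count) in totals.items()}
        # Global mean is the fallback for geohashes not seen during fit.
        self.default = sum(r["demand"] for r in rows) / len(rows)
        return self

    def predict(self, rows):
        return [self.means.get(r["geohash"], self.default) for r in rows]


# Hypothetical data; the real files have their own schema.
train = [
    {"geohash": "a", "demand": 0.2},
    {"geohash": "a", "demand": 0.4},
    {"geohash": "b", "demand": 0.9},
    {"geohash": "b", "demand": 0.7},
    {"geohash": "a", "demand": 0.3},
]
test = [{"Index": 0, "geohash": "a"}, {"Index": 1, "geohash": "c"}]

# 1. Validate on a hold-out split (assumed here: the last 20% of rows).
cut = int(len(train) * 0.8)
fit_rows, holdout_rows = train[:cut], train[cut:]
val_model = MeanByKeyModel().fit(fit_rows)
val_rmse = rmse([r["demand"] for r in holdout_rows], val_model.predict(holdout_rows))

# 2. Retrain on the full training set.
final_model = MeanByKeyModel().fit(train)

# 3. Write Index and predicted demand in CSV form.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Index", "demand"])
for row, pred in zip(test, final_model.predict(test)):
    writer.writerow([row["Index"], round(pred, 6)])
submission_text = buffer.getvalue()
```

Retraining on all rows after validation lets the final model use the hold-out data too, at the cost of having no untouched data left to score it on.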

CLI usage:

python traffic.py --train train.csv --test test.csv --output submission.csv

Defaults: --train defaults to ./train.csv, --test defaults to ./test.csv, and --output defaults to submission.csv.
You can also tune iterations used by CatBoost at runtime:

python traffic.py --iterations 3000 --val-iterations 1500
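For readers who want to see how such flags are typically wired up, here is a minimal argparse sketch. The flag names and file-path defaults come from the usage above; the function name build_parser and the numeric defaults for --iterations and --val-iterations are placeholders, since the actual defaults in traffic.py are not stated here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Train a demand model and write a submission file."
    )
    parser.add_argument("--train", default="./train.csv", help="path to the training CSV")
    parser.add_argument("--test", default="./test.csv", help="path to the test CSV")
    parser.add_argument("--output", default="submission.csv", help="where to write predictions")
    # Placeholder defaults: the real values used by traffic.py are not documented here.
    parser.add_argument("--iterations", type=int, default=1000,
                        help="boosting iterations for the final model")
    parser.add_argument("--val-iterations", type=int, default=500,
                        help="boosting iterations for the validation model")
    return parser


# Parse an explicit argument list so the example runs outside a shell.
args = build_parser().parse_args(["--iterations", "3000", "--val-iterations", "1500"])
print(args.train, args.test, args.output, args.iterations, args.val_iterations)
```

argparse converts the dash in --val-iterations to an underscore, so the value is read as args.val_iterations.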

Notes & Recommendations

The original notebook was developed in Colab and read /content/train.csv and /content/test.csv. When running it locally, change those paths to ./train.csv and ./test.csv (or to their full paths).
traffic.py currently reads/writes CSVs directly and uses CatBoost categorical features. If you plan to productionize, consider parameterizing file paths and hyperparameters.
Suggested reading for better understanding - Presentation

Files

Contact / Origin

The code was exported from a Colab notebook; author contact details are in the script header.
