Enterprise-grade ML pipeline with automated training, deployment, and monitoring
Features • Architecture • Performance • Quick Start • Deployment
Build a machine learning model to predict whether a bank customer will subscribe to a term deposit based on their demographics, banking history, and campaign interaction data.
- Source: Kaggle Playground Series - Season 5, Episode 8
- Type: Synthetically generated data (similar to UCI Bank Marketing dataset)
- Training Samples: 45,211 records
- Features: 16 features (numerical + categorical)
- Target: Binary classification (subscription: yes/no)
The bank aims to optimize its marketing campaign efficiency by identifying customers most likely to subscribe to term deposits. This reduces marketing costs and improves campaign ROI by targeting high-probability prospects.
- Primary: ROC-AUC Score ≥ 0.95
- Secondary: F1-Score ≥ 0.90
- Tertiary: Model latency < 100ms for API inference
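A minimal sketch of how the ROC-AUC and latency targets could be checked offline, assuming plain Python lists of labels and scores. The helper names `roc_auc` and `mean_latency_ms` are illustrative and not part of this repository. The pairwise AUC is quadratic in the sample count, so it only suits small validation samples.

```python
import time
from typing import Callable, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_latency_ms(predict: Callable[[], object], repeats: int = 100) -> float:
    """Average wall-clock time of one call to `predict`, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        predict()
    return (time.perf_counter() - start) * 1000.0 / repeats


# Toy labels and scores, for illustration only (not project results).
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ranked correctly
```

In practice `predict` would wrap a single call to the deployed model or API, and the resulting average would be compared against the 100 ms budget.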
Binary Classification: Predict whether each customer will subscribe, and output a probability estimate so the decision threshold can be tuned.
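Threshold tuning could look roughly like the sketch below, which sweeps candidate cut-offs and keeps the one with the highest F1. The function names and the 0.01 to 0.99 grid are assumptions made for illustration; the project may tune its threshold differently.

```python
from typing import Optional, Sequence, Tuple


def f1_at_threshold(y_true: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """F1 of the positive class when predicting 1 for probs >= threshold."""
    tp = fp = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int],
    probs: Sequence[float],
    candidates: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (threshold, f1) with the highest F1; ties go to the lowest threshold."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # assumed grid: 0.01 .. 0.99
    scored = [(f1_at_threshold(y_true, probs, t), t) for t in candidates]
    best_f1, best_t = max(scored, key=lambda pair: pair[0])
    return best_t, best_f1


# Toy validation data, for illustration only.
threshold, f1 = best_threshold([0, 0, 1, 1, 1], [0.2, 0.6, 0.4, 0.7, 0.9])
```

The threshold should be chosen on a validation split rather than the test set, otherwise the reported F1 is optimistic.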
| Model | AUC-ROC | Accuracy | F1-Score |
|---|---|---|---|
| LightGBM | 0.97 | 94.2% | 0.93 |
| XGBoost | 0.96 | 93.8% | 0.92 |
| PyTorch FNN | 0.95 | 93.1% | 0.91 |
All models tracked with MLflow including:
- ROC/AUC curves for each model
- Confusion matrices with precision/recall
- Feature importance visualizations
- Hyperparameter tuning history
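As a rough illustration of what sits behind the confusion-matrix and precision/recall artifacts listed above, the sketch below counts the four outcomes for binary labels. It assumes 1 means "subscribed"; the function names are hypothetical and the labels are toy values, not results from this project.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = subscribed)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts


def precision_recall(counts: Dict[str, int]) -> Tuple[float, float]:
    """Precision and recall of the positive class; 0.0 when undefined."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy labels, for illustration only.
counts = confusion_counts([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
precision, recall = precision_recall(counts)
```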
MLflow dashboard showing model metrics, parameters, and experiment history
Airflow DAG visualization with 6-task pipeline and execution logs
Automated CI/CD pipeline with lint, test, build, and deployment stages
- Data Versioning with DVC for reproducible datasets
- Experiment Tracking with MLflow (metrics, parameters, artifacts)
- Workflow Orchestration with Apache Airflow for scheduled retraining
- CI/CD Pipeline with GitHub Actions (lint, test, build, deploy)
- XGBoost: Gradient boosting (50 estimators, optimized)
- LightGBM: Fast gradient boosting (baseline)
- PyTorch Neural Network: 3-layer FNN with dropout & early stopping
- Hyperparameter Tuning: Optuna-based optimization
- Docker: Containerized Flask application
- AWS EC2: Scalable compute instances
- AWS ECR: Private Docker registry
- GitHub Actions: Self-hosted runners for CI/CD
- Flask REST API for real-time predictions
- Model serving with automatic versioning
- Health checks and monitoring endpoints
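The request handling behind the prediction and health endpoints might be sketched as below. This is not the code in `application.py`: the handler names, the error messages, and the assumption that a request carries exactly 16 already-encoded numeric features are all illustrative. The handlers return a `(body, status)` pair, a shape that a Flask view function can return directly.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

N_FEATURES = 16  # feature count from the dataset description; order and encoding are assumed

Response = Tuple[Dict[str, Any], int]


def handle_health() -> Response:
    """Liveness probe: succeeds whenever the process can answer."""
    return {"status": "ok"}, 200


def handle_predict(
    payload: Any,
    predict_proba: Callable[[Sequence[float]], float],
    threshold: float = 0.5,
) -> Response:
    """Validate a JSON payload and return a prediction with its probability.

    `predict_proba` stands in for the loaded model (hypothetical signature):
    it maps one feature vector to the probability of subscription.
    """
    features = payload.get("features") if isinstance(payload, dict) else None
    if not isinstance(features, list) or len(features) != N_FEATURES:
        return {"error": f"'features' must be a list of {N_FEATURES} values"}, 400
    numeric = all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in features
    )
    if not numeric:
        return {"error": "all feature values must be numeric"}, 400
    probability = float(predict_proba(features))
    return {"prediction": int(probability >= threshold), "probability": probability}, 200


# Usage with a stub model that always returns 0.97 (illustrative only).
body, status = handle_predict({"features": [0.0] * N_FEATURES}, lambda f: 0.97)
```

Keeping validation separate from the web framework makes the handlers easy to unit test without starting a server.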
graph LR
A[Raw Data] -->|DVC| B[Data Ingestion]
B --> C[Data Transformation]
C --> D[Model Training]
D -->|XGBoost| E[MLflow Tracking]
D -->|LightGBM| E
D -->|PyTorch FNN| E
E --> F[Model Registry]
F --> G[Flask API]
G -->|Docker| H[AWS ECR]
H -->|Deploy| I[AWS EC2]
J[Airflow] -.->|Schedule| D
K[GitHub Actions] -.->|CI/CD| H
| Component | Purpose | Technology |
|---|---|---|
| Data Pipeline | Ingestion, validation, transformation | Pandas, Scikit-learn |
| Model Training | Train & evaluate ML models | XGBoost, LightGBM, PyTorch |
| Experiment Tracking | Log metrics, params, artifacts | MLflow 3.8.1 |
| Orchestration | Schedule & monitor pipelines | Apache Airflow 3.0.6 |
| Version Control | Track data & model versions | DVC + Git |
| API Server | Serve predictions via REST | Flask |
| Containerization | Package application | Docker |
| Registry | Store Docker images | AWS ECR |
| Compute | Host production services | AWS EC2 |
| CI/CD | Automated testing & deployment | GitHub Actions |
# Python 3.14+
python --version
# Docker installed
docker --version
# AWS CLI configured
aws --version
# Clone repository
git clone https://github.com/Amit95688/ml-ops.git
cd ml-ops
# Install dependencies
pip install -r requirements.txt
# Initialize DVC
dvc pull # Download tracked data
python scripts/launch_mlflow_ui.py
# Access at: http://localhost:5000
AIRFLOW_HOME=$(pwd)/airflow python scripts/start_airflow.py
# Access at: http://localhost:8080
# Login: admin / YHKfeaptbhGrBCke
python src/pipeline/train_pipeline.py
# Results logged to MLflow automatically
python application.py
# API running at: http://localhost:5000
docker build -t ml-ops-api:latest .
docker run -d \
-p 5000:5000 \
--name ml-ops-api \
ml-ops-api:latest
# Start MLflow UI and Airflow (API server + scheduler)
docker compose up -d mlflow airflow
# Stop everything
docker compose down
- MLflow UI: http://localhost:5000
- Airflow UI: http://localhost:8080 (local dev auth is disabled; admin/admin also works)
- DAGs mount the local dags/ folder; MLflow artifacts persist in the mlflow-artifacts volume.
Airflow orchestration view (sample):

# Authenticate
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
# Tag
docker tag ml-ops-api:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/ml-ops-api:latest
# Push
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/ml-ops-api:latest
# SSH to EC2 instance
ssh -i key.pem ec2-user@<public-ip>
# Pull from ECR
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker pull <account-id>.dkr.ecr.us-east-1.amazonaws.com/ml-ops-api:latest
# Run on EC2
docker run -d -p 80:5000 \
--restart always \
--name ml-ops-api \
<account-id>.dkr.ecr.us-east-1.amazonaws.com/ml-ops-api:latest
- Inbound: Port 80 (HTTP), 443 (HTTPS)
- Outbound: All traffic
name: MLOps CI/CD
on:
  push:
    branches: [main]
jobs:
  continuous-integration:
    runs-on: ubuntu-latest
    steps:
      - Lint & format check
      - Run unit tests
      - Data validation
  continuous-delivery:
    needs: continuous-integration
    steps:
      - Build Docker image
      - Push to AWS ECR
  continuous-deployment:
    needs: continuous-delivery
    runs-on: self-hosted # AWS EC2 runner
    steps:
      - Pull latest image
      - Deploy to production
      - Health check validation
# Configure GitHub Actions runner on EC2
./config.sh --url https://github.com/Amit95688/ml-ops \
--token <YOUR_TOKEN>
./run.sh
# Start MLflow server
mlflow server \
--backend-store-uri sqlite:///mlflow.db \
--default-artifact-root ./mlruns \
--host 0.0.0.0 \
--port 5000
import mlflow

with mlflow.start_run(run_name="XGBoost_Training"):
    mlflow.log_params({"n_estimators": 50, "max_depth": 5})
    mlflow.log_metric("auc_roc", 0.96)
    mlflow.log_artifact("confusion_matrix.png")
    # `model` is the fitted estimator produced by the training step
    mlflow.sklearn.log_model(model, "model")
# View pipeline DAG
dvc dag
# Reproduce entire pipeline
dvc repro
# Push data to remote storage
dvc push
# Pull data from remote
dvc pull
stages:
  data_ingestion:
    cmd: python -m src.components.data_ingestion
    outs:
      - artifacts/train.csv
      - artifacts/test.csv
  model_trainer:
    cmd: python src/pipeline/train_pipeline.py
    deps:
      - artifacts/train.csv
    outs:
      - artifacts/model.pkl
curl http://localhost:5000/health
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{
"features": [5.1, 3.5, 1.4, 0.2, ...]
}'
Response:
{
"prediction": 1,
"probability": 0.97,
"model": "LightGBM",
"version": "v1.2.3"
}
ml-ops/
├── .github/
│   └── workflows/
│       └── workflow.yml          # CI/CD pipeline
├── artifacts/                    # Model artifacts & data
├── dags/
│   └── ml_training_pipeline.py   # Airflow DAG
├── src/
│   ├── components/
│   │   ├── data_ingestion.py
│   │   ├── data_transformation.py
│   │   ├── model_trainer.py
│   │   └── pytorch_model.py
│   └── pipeline/
│       ├── train_pipeline.py
│       └── predict_pipeline.py
├── scripts/
│   ├── start_airflow.py
│   └── launch_mlflow_ui.py
├── tests/                        # Unit & integration tests
├── application.py                # Flask API
├── Dockerfile                    # Container definition
├── dvc.yaml                      # DVC pipeline
├── requirements.txt              # Dependencies
└── README.md
# Run all tests
pytest tests/
# With coverage
pytest --cov=src tests/
# Integration tests
pytest tests/integration/
- URL: http://localhost:8080
- Monitor DAG runs, task status, execution logs
- Schedule automatic retraining (weekly)
- URL: http://localhost:5000
- Compare model versions
- View experiment metrics & artifacts
- Download trained models
tail -f logs/app_$(date +%Y-%m-%d).log
# .env file
AWS_REGION=us-east-1
ECR_REPOSITORY=ml-ops-api
MLFLOW_TRACKING_URI=http://localhost:5000
AIRFLOW_HOME=/path/to/airflow
Edit dvc.yaml to tune hyperparameters:
params:
  xgboost:
    n_estimators: 100
    max_depth: 7
  lightgbm:
    num_leaves: 50
# Fork the repository
# Create feature branch
git checkout -b feature/amazing-feature
# Commit changes
git commit -m "Add amazing feature"
# Push to branch
git push origin feature/amazing-feature
# Open Pull Request
This project is licensed under the MIT License - see LICENSE file.
Amit Dubey
- MLflow for experiment tracking
- Apache Airflow for orchestration
- DVC for data versioning
- AWS for cloud infrastructure
- GitHub Actions for CI/CD automation
⭐ Star this repo if you find it useful!
Made with ❤️ for the ML community