This project is a production-grade Machine Learning Microservice designed to detect fraudulent credit card transactions in real-time.
Unlike standard data science notebooks, this system is architected for reliability and observability. It includes a serving API with strict schema validation, a containerized deployment environment, and an automated monitoring pipeline to detect model degradation (Data Drift) due to shifting fraud patterns.
- Real-Time Inference: Sub-100ms latency using FastAPI.
- Containerization: Fully Dockerized application for consistent deployment across environments.
- Data Validation: Strict type enforcement using Pydantic to reject malformed requests.
- Model Governance: Experiment tracking and artifact versioning via MLflow.
- Observability: Automated drift detection pipeline using Evidently AI.
```mermaid
graph LR
    A[Historical Data] -->|Train| B(Random Forest Model)
    B -->|Log & Version| C{MLflow Registry}
    C -->|Load Artifact| D[FastAPI Service]
    E[Live Transaction] -->|HTTP POST| D
    D -->|Prediction| F[Fraud / Safe]
    E -->|Batch Log| G[Monitoring Service]
    G -->|Drift Check| H[Evidently AI]
    H -->|Alert| I[Retrain Trigger]
```
The pipeline consists of three distinct stages:
- Training Pipeline: Data ingestion, preprocessing, and model training (Random Forest), logged to the MLflow Registry.
- Serving Layer: A REST API that loads the production model artifact and serves predictions.
- Monitoring Layer: A background process that compares live traffic against reference data to flag distributional shifts.
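As a rough illustration of the schema enforcement in the serving layer, the sketch below uses Pydantic's strict float type to reject malformed payloads before they reach the model. The field names (`V1`, `V2`, `Amount`) and the `parse_transaction` helper are assumptions made for this example; the project's actual request schema may differ.

```python
from typing import Optional

from pydantic import BaseModel, StrictFloat, ValidationError


class Transaction(BaseModel):
    # Hypothetical feature names; the real schema depends on the training data.
    V1: StrictFloat
    V2: StrictFloat
    Amount: StrictFloat


def parse_transaction(payload: dict) -> Optional[Transaction]:
    """Return a validated Transaction, or None if the payload is malformed."""
    try:
        return Transaction(**payload)
    except ValidationError:
        return None


# A well-formed payload passes validation.
valid = parse_transaction({"V1": -1.36, "V2": 0.07, "Amount": 149.62})

# A string where a float is expected is rejected rather than coerced.
malformed = parse_transaction({"V1": "not-a-number", "V2": 0.07, "Amount": 149.62})

# Missing required fields are rejected as well.
missing = parse_transaction({"V1": -1.36})
```

When a Pydantic model is used as a FastAPI request body, a failed validation is returned to the client as an HTTP 422 response, so a helper like this is mainly useful outside the request path, for example in batch jobs.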
| Component | Technology | Role |
|---|---|---|
| Model | Scikit-Learn (Random Forest) | Classification Engine |
| API Framework | FastAPI | High-performance Async API |
| Containerization | Docker | Environment Isolation |
| Experiment Tracking | MLflow | Model Versioning & Registry |
| Monitoring | Evidently AI | Data Drift & Target Drift Detection |
| Data Validation | Pydantic | Schema Enforcement |
In financial fraud, fraudsters constantly adapt their tactics, so a static model loses accuracy as the patterns it learned go stale and becomes a liability.
This system uses Evidently AI to monitor covariate shift. Below is a generated report showing a simulated attack in which the distribution of feature V1 deviated significantly from the training baseline, triggering a retraining alert.
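To make the idea of a drift check concrete, here is a minimal standard-library sketch that compares a live sample of one feature against a reference sample using the two-sample Kolmogorov-Smirnov statistic. It is a stand-in for illustration only and does not reproduce Evidently AI's API or its choice of statistical test. The `drift_detected` helper, the 0.1 threshold, and the synthetic Gaussian data are all assumptions.

```python
import bisect
import random


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The maximum gap is always reached at one of the observed sample points.
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_cur = bisect.bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, current, threshold=0.1):
    # The threshold is an illustrative value, not one taken from this project.
    return ks_statistic(reference, current) > threshold


rng = random.Random(0)
# Synthetic stand-ins for feature V1: training baseline, ordinary live traffic,
# and a simulated attack whose mean has shifted.
baseline = [rng.gauss(0.0, 1.0) for _ in range(2000)]
normal_traffic = [rng.gauss(0.0, 1.0) for _ in range(2000)]
attack_traffic = [rng.gauss(1.5, 1.0) for _ in range(2000)]

print("normal traffic drifted:", drift_detected(baseline, normal_traffic))
print("attack traffic drifted:", drift_detected(baseline, attack_traffic))
```

In a setup like the one described here, a positive result for a monitored feature would be the signal that raises the retraining alert.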
The entire application is packaged into a single Docker image.
```bash
# 1. Build the image
docker build -t fraud-detection-api .
# 2. Run the container (maps port 8000 on the host to port 80 in the container)
docker run -p 8000:80 fraud-detection-api
```

To run locally without Docker:

```bash
# 1. Install dependencies
pip install -r requirements.txt
# 2. Train the model
python src/train.py
# 3. Start the server (uvicorn listens on port 8000 by default)
uvicorn app.main:app --reload
```