mayankdas2005/Fraud-detection-system

🛡️ Real-Time Fraud Detection System (End-to-End MLOps)

Python FastAPI Docker MLflow Evidently

📋 Executive Summary

This project is a production-grade Machine Learning Microservice designed to detect fraudulent credit card transactions in real-time.

Unlike standard data science notebooks, this system is architected for reliability and observability. It includes a serving API with strict schema validation, a containerized deployment environment, and an automated monitoring pipeline to detect model degradation (Data Drift) due to shifting fraud patterns.

🚀 Key Features

  • Real-Time Inference: Sub-100ms latency using FastAPI.
  • Containerization: Fully Dockerized application for consistent deployment across environments.
  • Data Validation: Strict type enforcement using Pydantic to reject malformed requests.
  • Model Governance: Experiment tracking and artifact versioning via MLflow.
  • Observability: Automated drift detection pipeline using Evidently AI.
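The data validation feature can be illustrated with a minimal Pydantic sketch. The `Transaction` schema and the `parse_transaction` helper are hypothetical: only the feature name `V1` appears in this README, and the `Amount` field and its non-negative constraint are assumptions made for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class Transaction(BaseModel):
    # Hypothetical schema: only V1 is named in this README.
    V1: float
    Amount: float = Field(ge=0)  # assumed rule: amounts cannot be negative


def parse_transaction(payload: dict):
    """Return (transaction, None) on success or (None, error_text) on failure."""
    try:
        return Transaction(**payload), None
    except ValidationError as exc:
        return None, str(exc)


ok, err = parse_transaction({"V1": -1.36, "Amount": 149.62})
bad, bad_err = parse_transaction({"V1": "not-a-number", "Amount": 149.62})
```

When a model like this is declared as the request body of a FastAPI route, a payload that fails validation is rejected with an HTTP 422 response before any model code runs.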

🏗️ System Architecture

graph LR
    A[Historical Data] -->|Train| B(Random Forest Model)
    B -->|Log & Version| C{MLflow Registry}
    C -->|Load Artifact| D[FastAPI Service]
    E[Live Transaction] -->|HTTP POST| D
    D -->|Prediction| F[Fraud / Safe]
    E -->|Batch Log| G[Monitoring Service]
    G -->|Drift Check| H[Evidently AI]
    H -->|Alert| I[Retrain Trigger]

The pipeline consists of three distinct stages:

  1. Training Pipeline: Data ingestion, preprocessing, and model training (Random Forest), logged to the MLflow Registry.
  2. Serving Layer: A REST API that loads the production model artifact and serves predictions.
  3. Monitoring Layer: A background process that compares live traffic against reference data to flag distributional shifts.
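The first two stages can be sketched as a train, persist, reload, predict round trip. This is not the project's `src/train.py`: the data is synthetic, the hyperparameters are assumptions, and a pickle file in a temporary directory stands in for the MLflow Registry so the example runs without a tracking server.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for historical transactions (about 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

# Stage 1: train the classifier (hyperparameters are illustrative assumptions).
model = RandomForestClassifier(
    n_estimators=50, class_weight="balanced", random_state=42
)
model.fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for "log and version": write the fitted model to disk.
    artifact = Path(tmp) / "model.pkl"
    artifact.write_bytes(pickle.dumps(model))

    # Stage 2: the serving layer loads the artifact and scores new rows.
    served_model = pickle.loads(artifact.read_bytes())
    fraud_probability = served_model.predict_proba(X_test)[:, 1]
```

Keeping training and serving connected only through a stored artifact is what lets the API load a specific model version without importing any training code.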

🛠️ Tech Stack

| Component | Technology | Role |
| --- | --- | --- |
| Model | Scikit-Learn (Random Forest) | Classification Engine |
| API Framework | FastAPI | High-performance Async API |
| Containerization | Docker | Environment Isolation |
| Experiment Tracking | MLflow | Model Versioning & Registry |
| Monitoring | Evidently AI | Data Drift & Target Drift Detection |
| Data Validation | Pydantic | Schema Enforcement |

📊 Monitoring & Data Drift

Fraudsters constantly adapt their tactics, so a model trained once on historical data becomes a liability as live transactions drift away from what it learned.

This system uses Evidently AI to monitor covariate shift, that is, changes in the distribution of input features between training and live traffic. A generated report shows a simulated attack in which the distribution of feature V1 deviated significantly from the training baseline, triggering a retraining alert.
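To make the idea of a drift check concrete, here is a minimal standard-library sketch of a two-sample Kolmogorov-Smirnov test on a single feature. It is a stand-in for illustration, not Evidently's API and not necessarily the test this project configures. The function names, the 0.05 significance level, and the simulated V1 values are all assumptions.

```python
import bisect
import math
import random


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    gap = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / n
        cdf_cur = bisect.bisect_right(cur, x) / m
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, current, alpha=0.05):
    """Compare the KS statistic with its asymptotic critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


rng = random.Random(0)
v1_reference = [rng.gauss(0.0, 1.0) for _ in range(500)]  # training baseline
v1_attack = [rng.gauss(3.0, 1.0) for _ in range(500)]     # simulated shift

alert = drift_detected(v1_reference, v1_attack)
```

A monitoring job would run a check like this per feature over each batch of logged transactions and raise the retraining alert when enough features are flagged.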


⚡ How to Run

Option 1: Using Docker (Recommended)

The entire application is packaged into a single image.

# 1. Build the image
docker build -t fraud-detection-api .

# 2. Run the container (Maps port 8000 on host to 80 in container)
docker run -p 8000:80 fraud-detection-api

Option 2: Local Development

# 1. Install dependencies
pip install -r requirements.txt

# 2. Train the model 
python src/train.py

# 3. Start the server
uvicorn app.main:app --reload
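Once the service is listening on port 8000, a transaction can be scored with a plain HTTP POST. The sketch below uses only the standard library. The `/predict` route and the payload fields are hypothetical, since this README does not list the API's routes or request schema, so check the running service's documentation for the real ones.

```python
import json
import urllib.request

# Assumption: the route name and payload fields below are hypothetical.
API_URL = "http://localhost:8000/predict"


def build_request(transaction: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(transaction).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(transaction: dict, url: str = API_URL, timeout: float = 5.0) -> dict:
    """Send the transaction to the running service and decode the JSON reply."""
    with urllib.request.urlopen(build_request(transaction, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request({"V1": -1.36, "Amount": 149.62})
# With the server running: result = score({"V1": -1.36, "Amount": 149.62})
```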
