Add anomaly detection project by rashidrao-pk · Pull Request #18 · Prodigy-InfoTech/Data-Science-Projects

rashidrao-pk · 2026-06-10T13:27:25Z

Credit Card Fraud Detection using Anomaly Detection

Issue

Fixes #18

Overview

This project applies anomaly detection techniques to identify fraudulent credit card transactions. Because fraudulent transactions are extremely rare compared with legitimate ones, anomaly detection can flag unusual patterns without depending on balanced labeled data.

Dataset

The project uses the Credit Card Fraud Detection dataset, which contains transactions made by European cardholders in September 2013.

Dataset Features:

284,807 transactions
492 fraudulent transactions (about 0.17% of all transactions)
Highly imbalanced dataset
PCA-transformed features (V1–V28)
Additional columns:
- Time
- Amount
- Class (0 = Normal, 1 = Fraud)

Dataset Source:
https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
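
To make the imbalance concrete, the short sketch below works out the fraud rate from the two counts quoted above. It also computes the accuracy of a trivial baseline that labels every transaction as normal. That baseline is an illustration added here, not something taken from the notebook.

```python
# Counts quoted in the dataset description above.
total_transactions = 284_807
fraud_transactions = 492

# Share of transactions that are fraudulent.
fraud_rate = fraud_transactions / total_transactions

# Accuracy of a trivial model that predicts "normal" for every transaction.
# It is wrong only on the fraudulent rows.
baseline_accuracy = (total_transactions - fraud_transactions) / total_transactions

print(f"Fraud rate: {fraud_rate:.4%}")
print(f"All-normal baseline accuracy: {baseline_accuracy:.4%}")
```

Because the baseline scores so highly while catching no fraud at all, accuracy on its own says little about model quality on this dataset.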

Objectives

Explore and understand class imbalance
Apply preprocessing and feature scaling
Train anomaly detection models
Compare different anomaly detection approaches
Evaluate performance using classification metrics

Models Implemented

1. Isolation Forest

Isolation Forest isolates observations by recursively partitioning the data on randomly selected features and split values. Anomalies typically need fewer splits to isolate, so a short average path length across the trees signals an outlier.
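
A minimal sketch of this idea with scikit-learn's `IsolationForest` is shown below. It runs on synthetic two-dimensional data rather than the credit card dataset, and the `contamination` value and random seeds are assumptions chosen for illustration, not settings taken from the notebook.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic stand-in data: a dense cluster plus a few far-away points.
normal = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
outliers = rng.uniform(low=6.0, high=8.0, size=(10, 2))
X = np.vstack([normal, outliers])

# contamination is the assumed share of anomalies (illustrative value).
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)

# fit_predict returns +1 for inliers and -1 for outliers.
labels = model.fit_predict(X)

# score_samples returns lower values for more anomalous points.
scores = model.score_samples(X)

# Convert to the dataset's convention: 1 = fraud (anomaly), 0 = normal.
y_pred = (labels == -1).astype(int)
print("Flagged as anomalous:", int(y_pred.sum()))
```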

2. Local Outlier Factor (LOF)

LOF identifies anomalies by comparing the local density of a sample with the density of its neighbors. Samples whose density is substantially lower than that of their neighbors are flagged as outliers.
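
The density comparison can be sketched with scikit-learn's `LocalOutlierFactor`. The data below is synthetic, and `n_neighbors` and `contamination` are illustrative assumptions rather than the notebook's settings.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Synthetic stand-in data: one dense cluster and three isolated points.
dense = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
isolated = np.array([[8.0, 8.0], [-9.0, 7.0], [10.0, -8.0]])
X = np.vstack([dense, isolated])

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.01)

# In the default (non-novelty) mode, LOF scores the same data it is fit on.
labels = lof.fit_predict(X)  # +1 = inlier, -1 = outlier

# negative_outlier_factor_ is the negated LOF; flip the sign so that
# higher values mean "more anomalous".
lof_scores = -lof.negative_outlier_factor_

print("Flagged indices:", np.where(labels == -1)[0])
```

Note that in this default mode the estimator has no `predict` method for unseen data. Scoring new transactions would require `novelty=True`.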

3. One-Class SVM

One-Class SVM learns the boundary of normal transactions and identifies samples outside this boundary as anomalies.
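
A rough sketch of boundary learning with scikit-learn's `OneClassSVM` follows. It assumes the model is fit only on samples believed to be normal, which the description implies but does not state. The data is synthetic and the `nu` and `gamma` values are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)

# Synthetic "normal" training data (assumption: fit on normal samples only).
X_normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# Two new points: one near the training cluster, one far outside it.
X_new = np.array([[0.1, -0.2], [7.0, 7.0]])

# One-Class SVM is sensitive to feature scale, so standardize first.
scaler = StandardScaler().fit(X_normal)

# nu is an upper bound on the fraction of training points treated as outliers.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
ocsvm.fit(scaler.transform(X_normal))

# predict returns +1 inside the learned boundary and -1 outside it.
pred = ocsvm.predict(scaler.transform(X_new))
print(pred)
```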

Evaluation Metrics

The following metrics are used for evaluation:

Accuracy
Precision
Recall
F1-Score
ROC-AUC
Precision-Recall Curve
Confusion Matrix
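
The sketch below shows one way these metrics might be computed with scikit-learn. The labels, predictions, and scores are made-up toy values, not results from the notebook. The one step that is easy to miss is the label mapping: scikit-learn's anomaly detectors return -1 for outliers and +1 for inliers, while the dataset uses 1 for fraud and 0 for normal.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy ground truth in the dataset's convention: 1 = fraud, 0 = normal.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

# Toy detector output in scikit-learn's convention: -1 = outlier, +1 = inlier.
raw_pred = np.array([1, 1, 1, 1, 1, 1, 1, -1, -1, 1])

# Toy anomaly scores where higher means more anomalous.
anomaly_score = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.7, 0.9, 0.4])

# Map -1/+1 to 1/0 so the predictions line up with the Class column.
y_pred = np.where(raw_pred == -1, 1, 0)

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    # Ranking metrics use the continuous scores, not the hard labels.
    "roc_auc": roc_auc_score(y_true, anomaly_score),
    "average_precision": average_precision_score(y_true, anomaly_score),
}
cm = confusion_matrix(y_true, y_pred)  # rows = true class, columns = predicted

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(cm)
```

When using `score_samples` or `decision_function` from a scikit-learn detector, the sign needs to be flipped first, since those return lower values for more anomalous points.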

Project Structure

Anomaly Detection/
│
├── anomaly_detection.ipynb
├── README.md
└── dataset/

Workflow

Load Dataset
Perform Exploratory Data Analysis (EDA)
Analyze Class Distribution
Scale Features
Train Anomaly Detection Models
Generate Predictions
Evaluate Results
Compare Models
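
The first few steps of this workflow (loading, checking the class distribution, scaling) might look roughly like the sketch below. The CSV path is hypothetical, since the PR shows only a dataset/ folder and not a file name, and the small in-memory frame uses made-up values with the dataset's column names. Scaling only Time and Amount is an assumption based on V1–V28 already being PCA outputs.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical path; the actual file name is not given in the PR.
# df = pd.read_csv("dataset/creditcard.csv")

# Tiny stand-in frame with made-up values and the dataset's column names.
df = pd.DataFrame(
    {
        "Time": [0.0, 1.0, 2.0, 10.0, 12.0, 20.0],
        "V1": [-1.2, 1.1, -0.4, 0.3, -0.9, 2.5],
        "Amount": [150.0, 2.5, 380.0, 120.0, 70.0, 1.0],
        "Class": [0, 0, 0, 0, 0, 1],
    }
)

# Analyze class distribution.
class_counts = df["Class"].value_counts().sort_index()
print(class_counts)

# Separate features from the label before scaling.
features = df.drop(columns="Class")
labels = df["Class"]

# Standardize only the raw columns (assumption: V1..V28 are left as-is).
scaled = features.copy()
scaled[["Time", "Amount"]] = StandardScaler().fit_transform(
    features[["Time", "Amount"]]
)
print(scaled.round(3))
```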

Results

The notebook provides a side-by-side comparison of all implemented anomaly detection methods, highlighting their strengths and limitations when dealing with highly imbalanced fraud detection datasets.

Requirements

pip install pandas numpy matplotlib seaborn scikit-learn jupyter

Running the Notebook

jupyter notebook anomaly_detection.ipynb

Learning Outcomes

After completing this project, users will understand:

The challenges of imbalanced datasets
Fundamentals of anomaly detection
Differences between Isolation Forest, LOF, and One-Class SVM
Evaluation techniques for anomaly detection systems

Author

Muhammad Rashid

GitHub:
https://github.com/rashidrao-pk

Add anomaly detection project

3e12607

rashidrao-pk wants to merge 1 commit into Prodigy-InfoTech:main from rashidrao-pk:add-anomaly-detection-project
