ryanjgani/fraud-detection

Fraud Detection

SDSC 2001 Python for Data Science
Final Course Project
Taught by Prof. LI Xinyue, City University of Hong Kong
Final Course Grade: A+

The goal of this project is to classify transactions as fraudulent or non-fraudulent, using a dataset that was taken from Kaggle and modified by the professor.

The following files have been posted:

  • Project instructions: project-instructions.ipynb
  • Notebook: fraud-detection.ipynb
  • Dataset: creditcard_train.csv, creditcard_test.csv

Project Information

Language: Python
Technology: Pandas, Matplotlib, Seaborn, scikit-learn

Short Summary

There are 5 main modules:

  • Data Exploration
    We start by exploring the dataset, handling missing values and outliers.
  • Data Visualization
    We continue exploring the dataset with visualizations and use them to explain our findings.
  • Dimension Reduction
    We apply unsupervised learning to reduce the dimensionality of the data, using Principal Component Analysis (PCA) as our dimensionality reduction algorithm.
  • Classification
    We use the dataset to train three models: Gaussian Naive Bayes, Decision Tree Classifier, and Logistic Regression. We prepare separate data for training and testing. For each model, we compute the accuracy score, plot a confusion matrix, and run 5-fold cross-validation.
  • Summary
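
The dimension reduction and classification modules above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code. It assumes a synthetic, imbalanced dataset from make_classification in place of the credit card CSV files, an arbitrary choice of 5 principal components, and a pipeline that feeds the PCA output into each classifier (the notebook may treat the two modules separately). The names models and results are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the credit card data (about 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three model families named in the summary above.
models = {
    "Gaussian NB": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

results = {}
for name, clf in models.items():
    # Scale, project onto 5 principal components (arbitrary), then classify.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=5), clf)
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        # 5-fold cross-validation on the training split only.
        "cv_scores": cross_val_score(pipe, X_train, y_train, cv=5),
    }
    print(name, results[name]["accuracy"], results[name]["cv_scores"].mean())
```

Wrapping the scaler and PCA in the same pipeline as the classifier means each cross-validation fold refits them on its own training portion, so no information from the held-out fold leaks into the projection.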
