Exploratory Data Analysis Demo

This repository contains a simple exploratory data analysis workflow using Python. The project demonstrates how to inspect, clean, and visualize a structured dataset using pandas, NumPy, seaborn, and matplotlib.

The notebook was developed early in my data analysis and machine learning studies and is maintained as a simple demonstration of foundational exploratory data analysis skills.

Project Overview

Exploratory data analysis (EDA) is an important step in any data science or machine learning workflow. It helps analysts understand the structure of a dataset, detect missing values, identify outliers, explore relationships between variables, and generate insights before model development.

In this project, a marketing/customer campaign dataset is analyzed to explore customer attributes and response patterns. The analysis includes data cleaning, missing value handling, univariate analysis, bivariate analysis, categorical analysis, and basic visualization.

Objectives

The main objectives of this project are to:

load and inspect a structured dataset
clean unnecessary or redundant columns
handle missing values
separate combined columns into meaningful features
explore numerical and categorical variables
visualize feature distributions
examine relationships between customer attributes and response outcomes
demonstrate basic exploratory data analysis using Python

Repository Structure

EDA_Demo/
├── EDA.ipynb
├── Marketing_Analysis.csv
├── README.md
├── requirements.txt
├── .gitignore
└── LICENSE

The dataset file may not be included in every version of the repository. If it is missing, place Marketing_Analysis.csv in the root directory before running the notebook.

Files Description

`EDA.ipynb`

This Jupyter notebook contains the exploratory data analysis workflow. It includes data loading, data cleaning, missing value handling, feature separation, and visual exploration of numerical and categorical variables.

`Marketing_Analysis.csv`

This is the dataset used in the notebook. The notebook expects this file to be available in the repository root directory.

`requirements.txt`

This file lists the Python packages required to run the notebook.

`README.md`

This file provides an overview of the project, usage instructions, limitations, and future improvement ideas.

Analysis Workflow

The notebook follows a basic EDA workflow.

1. Importing Libraries

The project uses common Python data analysis and visualization libraries, including:

pandas
NumPy
seaborn
matplotlib

2. Loading the Dataset

The dataset is loaded using pandas. The notebook uses skiprows=2 because the first two rows of the original file are not needed for analysis.
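As a minimal sketch of this loading step, the snippet below reads an in-memory stand-in for the file so that it runs without the real dataset. The sample rows and column names are assumptions made for illustration; only the `skiprows=2` argument comes from the notebook description above.

```python
import io

import pandas as pd

# Hypothetical stand-in for Marketing_Analysis.csv: two leading rows that are
# not part of the table, then the header row, then the data rows.
sample_csv = io.StringIO(
    "Marketing campaign export,,\n"
    "Customer details,,Outcome\n"
    "age,salary,response\n"
    "58,100000,no\n"
    "44,60000,yes\n"
)

# skiprows=2 discards the first two lines, so the third line becomes the header.
df = pd.read_csv(sample_csv, skiprows=2)

print(df.shape)
print(df.columns.tolist())
```

With the real file in the repository root, the equivalent call would be `pd.read_csv("Marketing_Analysis.csv", skiprows=2)`.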

3. Data Cleaning

The cleaning steps include:

removing unnecessary columns
separating combined columns into individual variables
checking missing values
handling missing values in selected columns
preparing the dataset for analysis
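The cleaning steps listed above can be sketched as follows. The column names (`customerid`, `jobedu`) and the comma separator are hypothetical and chosen only for illustration; the actual names and format depend on the dataset.

```python
import pandas as pd

# Hypothetical sample; column names and values are illustrative assumptions.
df = pd.DataFrame(
    {
        "customerid": [1, 2, 3],
        "age": [58, 44, 33],
        "jobedu": [
            "management,tertiary",
            "technician,secondary",
            "entrepreneur,secondary",
        ],
    }
)

# Remove an identifier column that carries no analytical information.
df = df.drop(columns=["customerid"])

# Split the combined column into two separate features, then drop the original.
df[["job", "education"]] = df["jobedu"].str.split(",", expand=True)
df = df.drop(columns=["jobedu"])

print(df)
```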

4. Missing Value Handling

The notebook checks for missing values and handles them based on the nature of the affected variables. For example, missing values in categorical columns may be filled using the mode, while rows with missing target response values may be removed.
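One way to express this approach in pandas is shown below. It is a sketch on a small made-up frame; the column names `month` and `response` are assumptions, and the choice of which columns to fill or drop should follow the actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing categorical value and one missing target.
df = pd.DataFrame(
    {
        "month": ["may", "may", np.nan, "jun", "may"],
        "response": ["no", "yes", "no", np.nan, "yes"],
    }
)

# Count missing values per column before deciding how to handle them.
print(df.isnull().sum())

# Fill the categorical column with its most frequent value (the mode).
month_mode = df["month"].mode()[0]
df["month"] = df["month"].fillna(month_mode)

# Drop rows where the target is missing; they cannot be analyzed by outcome.
df = df.dropna(subset=["response"]).reset_index(drop=True)

print(df)
```

Filling with the mode keeps the row count intact but slightly inflates the most common category, so it is best suited to columns with few missing values.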

5. Univariate Analysis

Univariate analysis is performed to understand individual variables.

Examples include:

job category distribution
education distribution
salary summary statistics
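The examples above might look like this in pandas. The data is invented for illustration, and the column names `job` and `salary` are assumptions.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "job": ["management", "technician", "management", "services", "management"],
        "salary": [100000, 60000, 100000, 70000, 100000],
    }
)

# Share of each job category (proportions sum to 1).
job_share = df["job"].value_counts(normalize=True)

# Count, mean, standard deviation, min, quartiles, and max for salary.
salary_summary = df["salary"].describe()

print(job_share)
print(salary_summary)
```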

6. Bivariate Analysis

Bivariate analysis is used to explore relationships between two variables.

Examples include:

salary versus balance
age versus balance
salary grouped by response
response differences across customer groups
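A short sketch of two of these comparisons follows: salary grouped by response, and pairwise correlations among numeric columns. The values and column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "age": [25, 35, 45, 55],
        "salary": [20000, 50000, 60000, 100000],
        "balance": [100, 900, 1500, 3000],
        "response": ["no", "no", "yes", "yes"],
    }
)

# Numeric variable grouped by a categorical one: mean and median salary per response.
salary_by_response = df.groupby("response")["salary"].agg(["mean", "median"])

# Numeric versus numeric: Pearson correlation matrix.
corr = df[["age", "salary", "balance"]].corr()

print(salary_by_response)
print(corr)
```

Comparing the mean with the median per group is a quick check for skew: a large gap between the two suggests that a few extreme values dominate the mean.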

7. Categorical Analysis

Categorical analysis is performed to examine how response rates vary across groups such as marital status and loan status.
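A response rate per group can be computed by converting the response to a 0/1 flag and averaging it within each group. The sketch below assumes the response is coded as "yes"/"no" and uses a hypothetical `marital` column; both are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "marital": ["married", "single", "married", "single", "divorced", "married"],
        "response": ["no", "yes", "yes", "yes", "no", "no"],
    }
)

# Encode the response as 1 for "yes" and 0 otherwise.
df["response_flag"] = (df["response"] == "yes").astype(int)

# The mean of a 0/1 flag within a group is that group's response rate.
response_rate = (
    df.groupby("marital")["response_flag"].mean().sort_values(ascending=False)
)

print(response_rate)
```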

8. Visualization

The notebook includes visualizations such as:

bar plots
pie charts
scatter plots
pair plots
heatmaps
box plots
count plots
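Three of these plot types are sketched below using seaborn on a small made-up frame. The column names are assumptions, and the `Agg` backend line is only there so the snippet runs outside a notebook; inside Jupyter it can be removed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "age": [25, 35, 45, 55, 30, 60],
        "salary": [20000, 50000, 60000, 100000, 55000, 70000],
        "balance": [100, 900, 1500, 3000, 400, 2500],
        "marital": ["single", "married", "married", "divorced", "single", "married"],
        "response": ["no", "no", "yes", "yes", "no", "yes"],
    }
)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Count plot: number of customers in each marital category.
sns.countplot(data=df, x="marital", ax=axes[0])

# Box plot: salary distribution for each response outcome.
sns.boxplot(data=df, x="response", y="salary", ax=axes[1])

# Heatmap: correlations among the numeric columns.
sns.heatmap(df[["age", "salary", "balance"]].corr(), annot=True, ax=axes[2])

fig.tight_layout()
```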

How to Run the Project

Clone the repository:

git clone https://github.com/CodeeSam/EDA_Demo.git
cd EDA_Demo

Install the required dependencies:

pip install -r requirements.txt

Open the notebook:

jupyter notebook EDA.ipynb

Run the notebook cells in order.

Requirements

The main Python packages used in this project include:

pandas
numpy
seaborn
matplotlib
jupyter

A typical requirements.txt file may include:

pandas
numpy
seaborn
matplotlib
jupyter

Important Note on Dataset Availability

The notebook expects a file named:

Marketing_Analysis.csv
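A small check like the following could be run before loading the data, so that a missing file produces a clear message. The helper `dataset_path` is hypothetical and not part of the notebook, and the temporary directory only stands in for the repository root.

```python
import tempfile
from pathlib import Path

DATASET_NAME = "Marketing_Analysis.csv"


def dataset_path(root: Path) -> Path:
    """Return the dataset path under `root`, or raise if the file is missing."""
    path = root / DATASET_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"{DATASET_NAME} not found in {root}; place it in the repository root."
        )
    return path


# Demonstration in a temporary directory standing in for the repository root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # First attempt: the file does not exist yet.
    try:
        dataset_path(root)
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True

    # After creating a placeholder file, the lookup succeeds.
    (root / DATASET_NAME).write_text("placeholder\n")
    found_name = dataset_path(root).name

print(missing_detected, found_name)
```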

Project Note

This repository is one of my early exploratory data analysis practice projects. It is maintained as part of my data science and machine learning archive.

The project is intended to demonstrate foundational EDA skills rather than advanced statistical modeling or machine learning.

Limitations

Some limitations of this project include:

The repository currently focuses on exploratory analysis only.
No predictive machine learning model is included.
The workflow is notebook-based and not modularized into scripts.
The analysis depends on the availability and structure of the original dataset.
Some visualizations may require further formatting for publication-level presentation.

Applications

This type of project can be useful as a starting point for:

exploratory data analysis practice
marketing analytics
customer behavior analysis
data visualization learning
beginner-level data science training
preparing datasets for machine learning workflows

Author

Samson Ayorinde Oni
Data Science | Machine Learning | Computational Research

License

This repository is released under the MIT License.
