Rose-Armstrong/EDA-Project

📊 Exploratory Data Analysis (EDA) Project

πŸ› οΈ Tech Stack & Libraries

  • Language: Python 3.12 | R 4.4
  • Core Libraries: pandas, numpy, matplotlib, seaborn
  • Optional/Advanced: plotly, ydata-profiling

🧹 Data Preprocessing & Cleaning

Before diving into analysis, the data was prepared using the following steps:

  • Handling Missing Values: [e.g., dropped nulls, imputed with median, filled via KNN]
  • Duplicate Removal: [e.g., removed X duplicated rows]
  • Outlier Treatment: [e.g., capped extreme values or removed anomalies]
  • Data Type Conversion: [e.g., string dates cast to datetime]
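
The four steps above could be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `clean` helper, the column names `order_date` and `amount`, the toy data, and the choice of median imputation with 1.5 × IQR capping are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean(df: pd.DataFrame, date_col: str, value_col: str) -> pd.DataFrame:
    """Apply the four preprocessing steps to a copy of df (hypothetical helper)."""
    out = df.copy()

    # Data type conversion: string dates -> datetime; unparseable values become NaT.
    out[date_col] = pd.to_datetime(out[date_col], errors="coerce")

    # Duplicate removal: drop rows that exactly repeat an earlier row.
    out = out.drop_duplicates()

    # Missing values: impute the numeric column with its median.
    out[value_col] = out[value_col].fillna(out[value_col].median())

    # Outlier treatment: cap values lying more than 1.5 * IQR beyond the quartiles.
    q1, q3 = out[value_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    out[value_col] = out[value_col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    return out.reset_index(drop=True)


# Hypothetical toy data, not taken from the project.
raw = pd.DataFrame({
    "order_date": ["2024-01-05", "2024-01-05", "2024-02-10",
                   "2024-03-15", "2024-04-20", "2024-05-25"],
    "amount": [10.0, 10.0, 12.0, np.nan, 11.0, 500.0],
})
cleaned = clean(raw, "order_date", "amount")
print(cleaned)
```

Capping keeps every row while limiting the influence of extreme values; dropping the rows instead would be the alternative if the anomalies are known to be data errors.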

📈 Key Findings & Insights

Here are the primary observations uncovered during the exploration process:

  • Insight 1: [e.g., Sales peak during Q4, specifically in November.]
  • Insight 2: [e.g., Strong positive correlation (r = 0.85) between feature A and feature B.]
  • Insight 3: [e.g., Uneven distribution in the target variable; data is imbalanced.]
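
Observations like the correlation and class-imbalance examples above can be checked numerically before they are written up. The sketch below assumes synthetic data and hypothetical column names (`feature_a`, `feature_b`, `target`); it shows one way to compute a Pearson coefficient and the class shares, not the analysis actually run in this project.

```python
import numpy as np
import pandas as pd

# Synthetic, hypothetical data: feature_b is built from feature_a plus noise,
# and the target is deliberately imbalanced (180 vs. 20 rows).
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
df = pd.DataFrame({
    "feature_a": a,
    "feature_b": a + rng.normal(scale=0.5, size=n),
    "target": np.where(np.arange(n) < 180, 0, 1),
})

# Pearson correlation coefficient between the two features.
r = df["feature_a"].corr(df["feature_b"])

# Share of each class in the target; a heavily skewed split signals imbalance.
class_share = df["target"].value_counts(normalize=True)

print(f"r = {r:.2f}")
print(class_share)
```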

📊 Visualizations

Below is a summary of the visualizations used to understand the data's distribution and relationships:

  • Univariate Analysis: Histograms and box plots to check individual feature distributions and skewness.
  • Bivariate Analysis: Scatter plots and violin plots to identify relationships between the target variable and features.
  • Multivariate Analysis: Correlation heatmaps to detect multicollinearity among numerical variables.
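
As a rough sketch of these plot types, the example below draws a histogram, a box plot, a scatter plot, and a correlation heatmap on one figure. The data and column names are synthetic assumptions, and the example uses plain matplotlib to keep dependencies small; seaborn's `heatmap` and `violinplot` would be natural substitutes.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic, hypothetical data: feature_c is right-skewed, feature_b tracks feature_a.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
df = pd.DataFrame({
    "feature_a": x,
    "feature_b": 0.8 * x + rng.normal(scale=0.6, size=n),
    "feature_c": rng.exponential(size=n),
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Univariate: a histogram and a box plot expose skewness and outliers.
axes[0, 0].hist(df["feature_c"], bins=30)
axes[0, 0].set_title("Histogram of feature_c")
axes[0, 1].boxplot(df["feature_c"])
axes[0, 1].set_title("Box plot of feature_c")

# Bivariate: a scatter plot shows the relationship between two features.
axes[1, 0].scatter(df["feature_a"], df["feature_b"], s=10)
axes[1, 0].set_xlabel("feature_a")
axes[1, 0].set_ylabel("feature_b")
axes[1, 0].set_title("feature_a vs. feature_b")

# Multivariate: a heatmap of the correlation matrix highlights collinear pairs.
corr = df.corr()
im = axes[1, 1].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
axes[1, 1].set_xticks(range(len(corr.columns)))
axes[1, 1].set_xticklabels(corr.columns, rotation=45, ha="right")
axes[1, 1].set_yticks(range(len(corr.columns)))
axes[1, 1].set_yticklabels(corr.columns)
axes[1, 1].set_title("Correlation heatmap")
fig.colorbar(im, ax=axes[1, 1])

fig.tight_layout()
```

Calling `fig.savefig` with a path of your choice writes the figure to disk.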
