A collection of data analysis projects and practice notebooks using Python, Pandas, and Scikit-Learn.
This repository focuses on exploring datasets, handling missing values, and building a strong foundation in data preprocessing and analysis.

Topics covered:
- Data cleaning and preprocessing
- Handling missing values
- Exploratory Data Analysis (EDA)
- Working with real-world datasets
- Basic machine learning using Scikit-Learn
- Pandas operations and data manipulation
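
As a rough illustration of the missing-value handling listed above, the sketch below counts gaps per column and fills them with simple defaults. The toy data and column names are assumptions made up for this example; they are not taken from the notebooks, and median/mode filling is only one of several reasonable strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "price": [100.0, np.nan, 250.0, 80.0],
    "room_type": ["Private room", None, "Entire home/apt", "Private room"],
})

# Count missing values per column before deciding how to treat them.
missing_counts = df.isna().sum()

# Work on a copy so the raw data stays available for comparison.
cleaned = df.copy()

# Numeric column: fill gaps with the median, which is robust to outliers.
cleaned["price"] = cleaned["price"].fillna(cleaned["price"].median())

# Categorical column: fill gaps with the most frequent value (the mode).
cleaned["room_type"] = cleaned["room_type"].fillna(cleaned["room_type"].mode()[0])

print(missing_counts)
print(cleaned)
```

Dropping incomplete rows with `dropna()` is a common alternative when only a few rows are affected and the gaps look random.
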

Datasets used:
- Airbnb NYC Dataset (AB_NYC_2019.csv)
- Google Play Store Dataset
- Custom datasets for practice
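
A first look at any of these datasets usually starts with a per-column summary. The sketch below is one possible way to do that; `summarize_columns` is a hypothetical helper, and the in-memory CSV with its column names is a made-up stand-in so the example runs without the real files.

```python
import io

import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, missing count, and missing percentage for each column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
    })


# Stand-in data; the columns here are assumptions for illustration only.
sample_csv = io.StringIO(
    "name,price,reviews_per_month\n"
    "Cozy loft,120,0.5\n"
    ",95,\n"
    "Studio,,1.2\n"
)
sample = pd.read_csv(sample_csv)

summary = summarize_columns(sample)
print(summary)
```

To run the same summary on a file from this repository, replace the stand-in with something like `pd.read_csv("AB_NYC_2019.csv")`, assuming the file sits in the working directory.
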

Tools and libraries:
- Python 🐍
- Pandas
- NumPy
- Scikit-Learn
- Jupyter Notebook

Repository contents:
- 📓 DataSetWalkthrough.ipynb – Dataset exploration
- 📓 NullValueFileHandeling.ipynb – Handling missing values
- 📓 Pandas.ipynb – Pandas operations
- 📓 sklearn.ipynb – Basic ML with Scikit-Learn
- 🐍 demo.py – Python script example
- 📊 Datasets – Airbnb NYC, Google Play Store, and custom data
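
For readers new to Scikit-Learn, the sketch below shows the general shape of a basic workflow: split the data, fit a small pipeline, and score it on held-out rows. It is not the content of `sklearn.ipynb`; the iris dataset and the logistic regression model are assumptions chosen only because they ship with the library and need no extra files.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset as feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Chain scaling and the classifier so the scaler is fit on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Wrapping preprocessing and the model in one pipeline avoids leaking information from the test split into the scaler.
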

This repository was created to:
- Practice data analysis concepts
- Build a strong foundation in data science
- Work with real-world datasets
- Improve problem-solving skills using Python

Planned improvements:
- Add data visualizations (Matplotlib, Seaborn)
- Include advanced EDA projects
- Build end-to-end ML pipelines
- Add project-based case studies

Author: Anupam Singh, aspiring data analyst and developer