This repository is a regression learning series in Python that walks through the fundamentals of:
- mean squared error
- simple linear regression
- gradient descent
It serves as both a code companion to the accompanying YouTube series and a practical mini-course for understanding core regression concepts step by step.
Watch the series here: Regression Video Playlist
- Project Overview
- Repository Structure
- Topics Covered
- Datasets Used
- Visual Insight
- Code Example
- Why This Project Matters
- How to Run
- Future Improvements
- Author
The goal of this repository is to build intuition around regression by moving from basic error measurement to full linear models and optimization techniques.
Instead of jumping directly into advanced machine learning, the repo breaks regression into smaller pieces:
- understanding prediction error through Mean Squared Error
- fitting a simple linear regression model
- understanding how gradient descent supports learning model parameters
This structure makes the project especially useful for beginners learning both the math and the implementation side of regression.
- #1-Mean Squared Error.ipynb: Mean Squared Error concept walkthrough
- #2-Simple Linear Regression/#2-Simple Linear Regression.ipynb: simple linear regression using height and weight data
- #2-Simple Linear Regression/Height_Weight.csv: dataset for simple linear regression
- #3-Gradient Descent/#3 Gradient Descent.ipynb: gradient descent demonstration
- #3-Gradient Descent/Iris.csv: dataset used for gradient descent exploration
- images/simple_linear_regression_plot.png: extracted chart used in the README
The first notebook introduces the concept of prediction error and shows how Mean Squared Error can be used to evaluate model output.
This is a good starting point because it helps explain:
- what model error means
- why squared error is useful
- how regression quality can be measured numerically
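As a rough illustration of the idea above, Mean Squared Error is the average of the squared differences between the true values and the predictions. The sketch below is not taken from the notebook; the helper name `mean_squared_error` and the sample values are assumptions made up for this example.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up values for illustration only (not from the repository's data)
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

# Errors are 1, 0, -2, so the squared errors are 1, 0, 4
mse = mean_squared_error(y_true, y_pred)
print(mse)
```

Squaring keeps positive and negative errors from cancelling out and penalizes large misses more heavily than small ones.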
The second notebook uses a height-weight dataset to demonstrate a basic linear regression workflow using scikit-learn.
It includes:
- loading the dataset
- splitting data into training and testing portions
- fitting a linear regression model
- making predictions
- printing coefficients and intercept
- plotting a regression line
- evaluating the model using Mean Squared Error
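The steps listed above could be sketched roughly as follows. This is a minimal example under stated assumptions, not the notebook's actual code: it generates a synthetic stand-in for Height_Weight.csv (35 rows with `Height` and `Weight` columns, values made up) so that it runs without the file, and the variable names, the 80/20 split, and `random_state=0` are all illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Height_Weight.csv (made-up values, not the real data)
rng = np.random.default_rng(0)
height = rng.uniform(1.4, 1.9, size=35)
weight = 60.0 * height - 40.0 + rng.normal(0.0, 2.0, size=35)
df = pd.DataFrame({"Height": height, "Weight": weight})

X = df[["Height"]]  # 2-D feature matrix, as scikit-learn expects
y = df["Weight"]    # 1-D target

# Hold out part of the data for testing (split ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the model on the training portion and predict on the test portion
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("coefficient:", model.coef_[0])
print("intercept:", model.intercept_)
print("test MSE:", mean_squared_error(y_test, y_pred))
```

To try this on the real data, the synthetic DataFrame could be swapped for `pd.read_csv(...)` pointed at the CSV in the repository.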
The third notebook uses the Iris dataset to introduce gradient descent concepts through regression-style fitting and visual interpretation.
It includes:
- loading the dataset
- selecting numeric variables
- plotting the relationship between features
- fitting a linear model
- discussing gradient-descent-related optimization ideas
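To make the optimization idea concrete, here is a small sketch of batch gradient descent fitting a line by minimizing Mean Squared Error. It is not the notebook's code: the function name `gradient_descent`, the learning rate, the iteration count, and the noise-free toy data are all assumptions chosen for illustration.

```python
import numpy as np


def gradient_descent(x, y, lr=0.1, n_iters=2000):
    """Fit y = w * x + b by batch gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(n_iters):
        error = (w * x + b) - y                # prediction minus target
        grad_w = (2.0 / n) * np.dot(error, x)  # d(MSE)/dw
        grad_b = (2.0 / n) * np.sum(error)     # d(MSE)/db
        w -= lr * grad_w                       # step against the gradient
        b -= lr * grad_b
    return w, b


# Made-up, noise-free data generated from y = 2x + 1
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0

w, b = gradient_descent(x, y)
print(w, b)
```

Each iteration nudges the slope and intercept in the direction that reduces the error, so on this toy data the estimates move toward the slope and intercept used to generate it. A learning rate that is too large can make the updates diverge instead.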
From the project files:
- file: Height_Weight.csv
- rows: 35
- columns: Height, Weight
From the project files:
- file: Iris.csv
- rows: 150
- columns include the classic Iris flower measurements SepalLengthCm, SepalWidthCm, PetalLengthCm, and PetalWidthCm, plus the Species label
This chart comes directly from the notebook and shows the regression line fitted against the height-weight data.
It helps communicate the central idea of linear regression clearly: finding the best-fit line that explains the relationship between an input feature and an output variable.
One of the core model training steps in the repo looks like this:
from sklearn import linear_model
regr = linear_model.LinearRegression()
regr.fit(hw_X_train, hw_Y_train)
hw_Y_pred = regr.predict(hw_X_test)
This small example captures the main regression workflow:
- create a model
- train it on feature and target data
- generate predictions
This repository is useful because it demonstrates:
- machine learning fundamentals in a beginner-friendly way
- progression from regression error metrics to actual models
- use of scikit-learn for supervised learning
- practical dataset handling with pandas
- visual interpretation of regression results
It works well as a portfolio project because it shows both concept teaching and hands-on implementation.
- Clone the repository.
- Open the project folder.
- Install the required libraries:
pip install numpy pandas matplotlib scikit-learn jupyter
- Launch Jupyter Notebook:
jupyter notebook
- Open any of the notebooks and run the cells.
This repository could be improved further by:
- exporting more charts from the notebooks and embedding them in the README
- adding a short summary of results under each notebook
- including a “Regression Concepts At a Glance” section
- adding polynomial regression or multivariate regression examples
- adding model evaluation metrics beyond MSE
Divya Thakur
- GitHub: DivyaThakur24
- LinkedIn: divya-thakurr
- Portfolio: divyathakur24.github.io/DivyaThakurPortfolio
