DivyaThakur24/Regression

Regression with Python

This repository is a regression learning series in Python that walks through the fundamentals of:

  • mean squared error
  • simple linear regression
  • gradient descent

It serves as both a code companion to the accompanying YouTube series and a practical mini-course for understanding core regression concepts step by step.

Watch the series here: Regression Video Playlist

Project Overview

The goal of this repository is to build intuition around regression by moving from basic error measurement to full linear models and optimization techniques.

Instead of jumping directly into advanced machine learning, the repo breaks regression into smaller pieces:

  1. understanding prediction error through Mean Squared Error
  2. fitting a simple linear regression model
  3. understanding how gradient descent supports learning model parameters

This structure makes the project especially useful for beginners learning both the math and the implementation side of regression.

Repository Structure

Topics Covered

1. Mean Squared Error

The first notebook introduces the concept of prediction error and shows how Mean Squared Error can be used to evaluate model output.

This is a good starting point because it helps explain:

  • what model error means
  • why squared error is useful
  • how regression quality can be measured numerically
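
As a rough illustration of the points above, Mean Squared Error averages the squared differences between actual and predicted values. The sketch below is not taken from the notebook: the function name `mean_squared_error` and the sample numbers are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    squared_errors = [
        (actual - predicted) ** 2 for actual, predicted in zip(y_true, y_pred)
    ]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only (not from the repository's datasets).
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]

# Errors are 1, 0, and -2, so the result is (1 + 0 + 4) / 3.
mse = mean_squared_error(actual, predicted)
print(mse)
```

Squaring keeps positive and negative errors from canceling each other out and penalizes large misses more heavily than small ones.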

2. Simple Linear Regression

The second notebook uses a height-weight dataset to demonstrate a basic linear regression workflow using scikit-learn.

It includes:

  • loading the dataset
  • splitting data into training and testing portions
  • fitting a linear regression model
  • making predictions
  • printing coefficients and intercept
  • plotting a regression line
  • evaluating the model using Mean Squared Error
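
The steps listed above can be sketched roughly as follows. This is not the notebook's code: it uses synthetic data in place of the height-weight file, the names `height`, `weight`, and `model` are assumptions, and the plotting step is left out to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the height-weight data
# (assumption: a single feature and a single target).
rng = np.random.default_rng(seed=0)
height = rng.uniform(150, 190, size=100).reshape(-1, 1)  # shape (100, 1)
weight = 0.9 * height.ravel() - 90 + rng.normal(0, 2, size=100)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    height, weight, test_size=0.2, random_state=0
)

# Fit the model on the training rows, then predict the held-out rows.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Coefficient:", model.coef_[0])
print("Intercept:", model.intercept_)
print("Test MSE:", mean_squared_error(y_test, y_pred))
```

Because the synthetic target is generated with a slope of 0.9, the fitted coefficient should land close to that value, which makes it easy to check that the workflow behaves as expected.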

3. Gradient Descent

The third notebook uses the Iris dataset to introduce gradient descent concepts through regression-style fitting and visual interpretation.

It includes:

  • loading the dataset
  • selecting numeric variables
  • plotting the relationship between features
  • fitting a linear model
  • discussing gradient-descent-related optimization ideas
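
To make the optimization idea concrete, here is a minimal sketch of batch gradient descent fitting a line `y = w * x + b` by minimizing Mean Squared Error. It is a generic illustration rather than the notebook's implementation: the function name `gradient_descent`, the learning rate, the step count, and the sample points are all assumptions.

```python
def gradient_descent(xs, ys, learning_rate=0.05, n_steps=2000):
    """Fit y = w * x + b by minimizing MSE with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Prediction error of the current line on every point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Move each parameter a small step against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Illustrative points lying exactly on y = 2x + 1 (not from Iris.csv).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print(w, b)  # approaches w = 2, b = 1
```

The learning rate controls the step size: too large a value makes the updates diverge, while too small a value makes convergence slow.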

Datasets Used

Height and Weight Dataset

From the project files:

Iris Dataset

From the project files:

  • file: Iris.csv
  • rows: 150
  • includes classic Iris flower measurements such as: SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm, and Species

Visual Insight

Simple Linear Regression Plot

This chart comes directly from the notebook and shows the regression line fitted against the height-weight data.

It helps communicate the central idea of linear regression clearly: finding the best-fit line that explains the relationship between an input feature and an output variable.

[Figure: simple linear regression plot]

Code Example

One of the core model training steps in the repo looks like this:

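from sklearn import linear_model

# hw_X_train, hw_Y_train, and hw_X_test come from the train/test split earlier in the notebook
regr = linear_model.LinearRegression()   # ordinary least squares linear model
regr.fit(hw_X_train, hw_Y_train)         # learn the coefficient and intercept from the training data
hw_Y_pred = regr.predict(hw_X_test)      # predict targets for the held-out test features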

This small example captures the main regression workflow:

  • create a model
  • train it on feature and target data
  • generate predictions

Why This Project Matters

This repository is useful because it demonstrates:

  • machine learning fundamentals in a beginner-friendly way
  • progression from regression error metrics to actual models
  • use of scikit-learn for supervised learning
  • practical dataset handling with Pandas
  • visual interpretation of regression results

It works well as a portfolio project because it shows both concept teaching and hands-on implementation.

How to Run

  1. Clone the repository.
  2. Open the project folder.
  3. Install the required libraries:
pip install numpy pandas matplotlib scikit-learn jupyter
  4. Launch Jupyter Notebook:
jupyter notebook
  5. Open any of the notebooks and run the cells.

Future Improvements

This repository could be improved further by:

  • exporting more charts from the notebooks and embedding them in the README
  • adding a short summary of results under each notebook
  • including a “Regression Concepts At a Glance” section
  • adding polynomial regression or multivariate regression examples
  • adding model evaluation metrics beyond MSE

Author

Divya Thakur
