Bekamgenene/Elevvo-Internship-Program

🌟 Elevvo Internship Program: Machine Learning Track

Welcome to my Elevvo Internship Program repository!
This repository documents my journey, tasks, and completed projects during the Elevvo internship focused on Machine Learning and AI-driven problem-solving.

Each notebook demonstrates a different real-world use case, from predictive modeling and clustering to computer vision and recommendation systems.


🧭 Overview

The Elevvo Internship Program allowed me to apply data science and ML concepts to various datasets.
It strengthened my understanding of the end-to-end ML workflow: data preprocessing, model training, hyperparameter tuning, and evaluation.

Participants were expected to complete:

  • ✅ 4+ tasks for a 1-month internship

This repository contains all completed core tasks.


🧠 Learning Objectives

  • Apply supervised and unsupervised learning techniques on diverse datasets.
  • Explore data preprocessing, feature engineering, and model evaluation.
  • Build real-world machine learning projects using Python and Scikit-learn.
  • Understand and compare performance of various models and metrics.

🧰 Tools & Libraries

  • Python
  • NumPy, Pandas
  • Matplotlib, Seaborn
  • Scikit-learn
  • TensorFlow / Keras
  • OpenCV
  • XGBoost / LightGBM
  • Google Colab

🧩 Projects and Tasks

🧮 1. Student Score Prediction

Goal: Predict students' exam scores based on study hours and related academic factors.
Dataset: Student Performance Dataset – Kaggle

🧠 Process:

  • Loaded and cleaned student performance data
  • Performed exploratory data analysis (EDA) using Matplotlib and Seaborn
  • Trained a Linear Regression model to predict final exam scores
  • Evaluated with metrics like R², MAE, and RMSE
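
The process above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the data is synthetic, and the names `hours_studied` and `exam_score` are hypothetical stand-ins for the Kaggle columns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Kaggle data (assumption): the score rises
# linearly with study hours, plus Gaussian noise.
rng = np.random.default_rng(42)
hours_studied = rng.uniform(0, 10, size=200)
exam_score = 40 + 5 * hours_studied + rng.normal(0, 3, size=200)

# scikit-learn expects a 2D feature matrix, hence the reshape.
X = hours_studied.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(
    X, exam_score, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# The three regression metrics named in the process list.
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"R2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

RMSE is never smaller than MAE, so a large gap between the two suggests that a few predictions are far off.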

📊 Insights:

  • Study hours and participation levels had a strong positive correlation with scores
  • The Linear Regression model achieved an R² score above 0.85, showing strong predictive power
  • Bonus experiments with Polynomial Regression improved results slightly

🧰 Techniques:

Regression | EDA | Feature Engineering | Model Evaluation


🧭 2. Customer Segmentation

Goal: Group customers into clusters based on annual income and spending behavior.
Dataset: Mall Customer Dataset – Kaggle

🧠 Process:

  • Scaled features using StandardScaler
  • Applied K-Means Clustering and determined optimal cluster number using the Elbow Method
  • Visualized customer groups in 2D space (Income vs. Spending Score)
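
A minimal sketch of the scaling and Elbow Method steps, assuming synthetic data in place of the Mall Customer file. The five blob centers and the variable names are illustrative assumptions, not values from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic (income, spending score) points around five made-up centers.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [90, 15], [90, 85]])
X = np.vstack([c + rng.normal(0, 4, size=(40, 2)) for c in centers])

# Standardize so both features contribute equally to Euclidean distance.
X_scaled = StandardScaler().fit_transform(X)

# Elbow Method: record inertia (within-cluster sum of squares) for each k
# and look for the k after which the curve flattens.
inertias = {}
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(f"k={k:2d}  inertia={value:.2f}")

# Fit the final model at the chosen k and get one label per customer.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X_scaled)
```

Plotting `inertias` against k and coloring a scatter plot of `X` by `labels` would give the elbow curve and the 2D cluster view.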

📊 Insights:

  • Identified 5 distinct clusters representing different spending behaviors (e.g., high-income low-spending vs. low-income high-spending)
  • Helped visualize how customers differ across spending habits, which is useful for marketing strategies
  • Bonus: Tested DBSCAN clustering for better separation

🧰 Techniques:

Unsupervised Learning | K-Means | DBSCAN | Data Visualization


💳 3. Loan Approval Prediction

Goal: Predict whether a bank loan application will be approved based on applicant information.
Dataset: Loan Approval Prediction Dataset – Kaggle

🧠 Process:

  • Handled missing values and categorical features using Label Encoding and One-Hot Encoding
  • Split the dataset into training/testing subsets
  • Trained and compared Logistic Regression, Decision Tree, and Random Forest classifiers
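
The model comparison can be sketched as below. This is an illustration under stated assumptions: `make_classification` generates an already-numeric, mildly imbalanced dataset in place of the loan data, so the missing-value handling and encoding steps are omitted, and the hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary target standing in for "approved / not approved".
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.3, 0.7], random_state=0,
)

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Fit each classifier on the same split and record its test accuracy.
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: accuracy={scores[name]:.3f}")
```

With imbalanced classes, accuracy alone can be misleading, so it is worth reporting precision, recall, and F1 alongside it.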

📊 Insights:

  • Random Forest achieved the highest accuracy (~95%), outperforming Logistic Regression
  • Gender, ApplicantIncome, and Credit_History were key factors influencing predictions
  • Used SMOTE to handle class imbalance and improve recall

🧰 Techniques:

Binary Classification | Data Encoding | Imbalanced Learning | Evaluation Metrics


🎬 4. Movie Recommendation System

Goal: Build a movie recommender using collaborative filtering techniques.
Dataset: MovieLens 100K Dataset – Kaggle

🧠 Process:

  • Created a user-item rating matrix
  • Computed similarity scores using cosine similarity between users
  • Recommended top-rated unseen movies based on similar users' preferences
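
A minimal sketch of user-based collaborative filtering along these lines. The 4x5 rating matrix is toy data, and the helper names `cosine_similarity_matrix` and `recommend` are hypothetical. The similarity-weighted average used for scoring is one common choice and may differ from what the notebook does.

```python
import numpy as np

# Toy user-item matrix: rows are users, columns are movies, 0 means unrated.
ratings = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 5, 0, 1],
    [1, 0, 2, 5, 4],
    [0, 1, 1, 4, 5],
], dtype=float)

def cosine_similarity_matrix(m):
    """Pairwise cosine similarity between the rows of m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = m / norms
    return unit @ unit.T

def recommend(ratings, user, top_n=2):
    """Return indices of unseen movies, best predicted score first."""
    sim = cosine_similarity_matrix(ratings)[user].copy()
    sim[user] = 0.0  # a user should not be their own neighbour
    rated = ratings > 0
    # Similarity-weighted average over the neighbours who rated each movie.
    weights = sim @ rated
    scores = np.divide(sim @ ratings, weights,
                       out=np.zeros_like(weights), where=weights > 0)
    scores[ratings[user] > 0] = -np.inf  # hide movies already seen
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:top_n] if np.isfinite(scores[i])]

recs = recommend(ratings, user=0)
print("Recommended movie indices for user 0:", recs)
```

On the real MovieLens data the matrix is large and sparse, so a sparse representation or a top-k neighbour cutoff would likely be needed.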

📊 Insights:

  • Successfully recommended personalized movie lists using user-based collaborative filtering
  • Experimented with item-based filtering and SVD matrix factorization for improved performance
  • Evaluated recommendations using Precision@K
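
For reference, Precision@K is the fraction of the top K recommended items that are actually relevant to the user. A small sketch follows. The function name and the toy item IDs are illustrative, and dividing by K even when fewer than K items are recommended is one common convention.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in `relevant`."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Toy example: two of the top three recommendations are relevant.
recommended_items = ["A", "B", "C", "D"]
relevant_items = {"A", "C", "E"}
p_at_3 = precision_at_k(recommended_items, relevant_items, k=3)
print(f"Precision@3 = {p_at_3:.3f}")
```

In practice the score is usually averaged over all test users, with "relevant" defined by a rating threshold on held-out ratings.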

🧰 Techniques:

Recommendation System | Collaborative Filtering | Cosine Similarity | Matrix Factorization


🚦 5. Traffic Sign Recognition

Goal: Classify German traffic signs using Convolutional Neural Networks (CNN).
Dataset: GTSRB – German Traffic Sign Recognition Benchmark

🧠 Process:

  • Preprocessed images (resizing, normalization)
  • Built a custom CNN using Keras
  • Trained the model on 40+ sign categories
  • Evaluated performance using accuracy and confusion matrix
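
A small Keras CNN of the kind described might look like the sketch below. The architecture, the 32x32 input size, and the helper name `build_cnn` are assumptions for illustration, not the notebook's actual model. GTSRB has 43 classes, which sets the size of the output layer.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(input_shape=(32, 32, 3), num_classes=43):
    """Build and compile a small CNN for traffic sign classification."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),          # normalize pixels to [0, 1]
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                  # regularization against overfitting
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Sparse loss expects integer class labels (0..42), not one-hot vectors.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_cnn()
model.summary()
```

Training would then be a call such as `model.fit(x_train, y_train, validation_split=0.1, epochs=...)` on images already resized to the chosen input shape.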

📊 Insights:

  • The CNN achieved an accuracy of 98% on the test set
  • Using data augmentation improved generalization

🧰 Techniques:

Deep Learning | CNN | Image Preprocessing | Transfer Learning



📊 Evaluation Metrics Used

Category | Metrics
Regression | R², MAE, RMSE
Classification | Accuracy, Precision, Recall, F1-score
Clustering | Silhouette Score, Inertia
Recommendation | Precision@K
Deep Learning | Accuracy, Loss Curve, Confusion Matrix

🧪 How to Run Locally

  1. Clone the repository:
    git clone https://github.com/Bekamgenene/Elevvo-Internship-Program.git
    cd Elevvo-Internship-Program

About

This repository contains all the work, tasks, and learning materials completed during my Elevvo Internship Track. The repository includes:

  • 📄 Internship Offer Letter
  • ✅ Completed Internship Tasks

The goal of this repository is to document my internship journey, track progress, and showcase the skills and projects I developed during the program.
