Achraf921/Hackathon-CodeML-2024

Hackathon CodeML 2024

As newcomers to machine learning, we used this project as our introduction to supervised learning, where models are trained on labeled datasets to make predictions.

Purpose

The primary goal of this project is to develop a predictive model that estimates customer choices based on historical flight booking data. Understanding customer preferences is crucial for airlines and travel agencies, as it can inform marketing strategies, improve customer satisfaction, and optimize service offerings. This project aims to provide a data-driven approach to predicting customer behavior, thereby enabling businesses to make informed decisions.

Overview

This repository contains a Python implementation of a predictive model that utilizes historical flight data to predict customer choices (e.g., preferred flights, types of services). The model is designed to handle various data types, preprocess them effectively, and produce reliable predictions that can be used for strategic decision-making.
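The flow described above (fit on historical records, predict choices, write them out) might look roughly like the following. This is a minimal sketch, not the repository's actual code: the toy data, the column names (`price`, `duration_hours`, `num_stops`, `choice`), and the output file name `predictions.csv` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the historical booking records.
rng = np.random.default_rng(0)
n = 200
bookings = pd.DataFrame({
    "price": rng.uniform(50, 500, n),
    "duration_hours": rng.uniform(1, 12, n),
    "num_stops": rng.integers(0, 3, n),
})
# Hypothetical label: 1 if the customer chose this option, else 0.
bookings["choice"] = (bookings["price"] < 250).astype(int)

X = bookings.drop(columns="choice")
y = bookings["choice"]

# Hold out a quarter of the rows to check the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)

# Write the predicted choices to a CSV file, keyed by the original row index.
predictions = pd.DataFrame(
    {"predicted_choice": model.predict(X_test)}, index=X_test.index
)
predictions.to_csv("predictions.csv", index_label="row_id")
```

Fixing `random_state` keeps the split and the forest reproducible between runs, which helps when comparing preprocessing variants.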

Technologies Used

The following tools and libraries were chosen for this project based on their suitability for the task and their strengths:

  • Pandas:

    Primarily for data manipulation and analysis. Pandas provides powerful data structures and functions for efficiently handling structured data. It simplifies data preprocessing tasks, such as cleaning, filtering, and transforming datasets.
  • NumPy:

    For numerical operations and data manipulation. NumPy is essential for handling arrays and performing mathematical operations. It is used in conjunction with Pandas to enhance the performance of data manipulations.
  • Scikit-learn:

    For machine learning tasks, including model training and evaluation. Scikit-learn is a robust library that offers a range of algorithms for classification, regression, and clustering behind a consistent interface, making it a natural choice for implementing the Random Forest model.
  • Random Forest Classifier:

    To build a predictive model for customer choices. The Random Forest algorithm is an ensemble learning method that is well suited to classification tasks. Because it averages the votes of many decision trees, it is less prone to overfitting than a single tree and performs well on a wide range of datasets, making it a suitable choice for our prediction problem.

Features

  • Data preprocessing that includes:
      • Date parsing and feature extraction.
      • Handling missing values.
      • Encoding categorical variables.
  • A robust machine learning model that predicts customer choices based on historical data.
  • Easy-to-use CSV output for predicted choices.
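The three preprocessing steps named under Features (date parsing, missing values, categorical encoding) can be sketched with Pandas as follows. The column names, the median and `"unknown"` fill strategies, and the one-hot encoding are assumptions chosen for illustration; the repository may handle these steps differently.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records; the column names are illustrative only.
raw = pd.DataFrame({
    "departure_date": ["2024-03-01", "2024-03-15", None, "2024-04-02"],
    "cabin_class": ["economy", "business", "economy", None],
    "price": [120.0, np.nan, 310.0, 95.0],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Date parsing and feature extraction: unparseable or missing dates
    # become NaT, then month and day of week become numeric features.
    dates = pd.to_datetime(out.pop("departure_date"), errors="coerce")
    out["dep_month"] = dates.dt.month
    out["dep_dayofweek"] = dates.dt.dayofweek

    # Missing values: fill numeric columns with their median (assumed
    # strategy) and the categorical column with an explicit placeholder.
    for col in ["price", "dep_month", "dep_dayofweek"]:
        out[col] = out[col].fillna(out[col].median())
    out["cabin_class"] = out["cabin_class"].fillna("unknown")

    # Categorical encoding: one-hot encode so the model sees only numbers.
    return pd.get_dummies(out, columns=["cabin_class"], dtype=int)


features = preprocess(raw)
print(features)
```

One-hot encoding is used here because it imposes no artificial order on the categories. For columns with many distinct values, an ordinal or label encoding would keep the feature matrix smaller.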

Conclusion

This project aims to leverage machine learning techniques to provide insights into customer behavior in the airline industry. By utilizing effective tools like Pandas, NumPy, and Scikit-learn, the project ensures a smooth workflow from data preprocessing to model prediction. The goal is to create a reliable predictive model that can assist businesses in understanding and anticipating customer needs.


By Fares Laadjel, Mohammed Amine Dakli, Achraf Bayi and Wiame Kotbi
