Project-3

Citibike Classification Project

Citibike is a bike share system in NYC that uses docking stations throughout the greater NYC area, in every borough except Staten Island. This project attempts to develop a classification model that detects the correct number of customers in relation to the number of subscribers. Subscribers are members of the system who live in or near the service area and make up the majority of trips within Citibike's network. Customers use the system on a one-off or 3-day pass. They typically don't live in the area, meaning they are visiting or testing the system out before becoming subscribers. The project tested four different models, each bringing different results and insights into the data.

Logistic Regression
Naive Bayes
Random Forest
XGBoost

All of the models predict the smaller class, customers, who number roughly 1:3 in relation to subscribers. Predicting the smaller class allows the metrics to adequately reflect the imbalance in the data. After analyzing possible scores for evaluation, F-beta with beta at 0.25 offered the best balance between recall and precision. A beta below 1 weights precision more heavily than recall, which fits the models' goal: to reduce the number of subscribers inaccurately classified as customers, rather than customers labeled as subscribers.
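To make the metric concrete, here is a minimal sketch of how an F-beta score with beta at 0.25 behaves when customers are treated as the positive class. The function names and the toy labels are illustrative assumptions, not code or data from the notebooks.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive class (assumed here: 1 = customer)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def fbeta(precision, recall, beta=0.25):
    """F-beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels (hypothetical): 2 customers among 8 trips.
y_true = [1, 0, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]
p, r = precision_recall(y_true, y_pred)
score = fbeta(p, r, beta=0.25)

# With beta < 1, a high-precision model outscores a high-recall one.
precise_model = fbeta(0.9, 0.5, beta=0.25)
high_recall_model = fbeta(0.5, 0.9, beta=0.25)
```

With beta at 0.25, a subscriber mislabeled as a customer (a false positive) costs more than a missed customer, which matches the goal stated above.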

Files

Database-Build
- Python Notebook: Walkthrough of the files from Citibike's database that were uploaded to a local SQL database. SQL was used to filter the data into one dataset for cleaning and preprocessing.
Logistic Regression
- Python Notebook: LogisticRegressionCV used. Ruled out as a potential model for the project.
Naive Bayes
- Python Notebook:
Random Forest
- Python Notebook:
XGBoost
- Python Notebook: Warning errors and a delayed installation prevented the model from being used further. There was insufficient time to tune hyperparameters to get the model to perform optimally.
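
The Database-Build step described above (load trip files into a local SQL database, then filter with SQL into one dataset) could look roughly like the sketch below. It uses an in-memory SQLite database for illustration. The table name, column names, and sample rows are assumptions and may not match the notebook's actual schema or database engine.

```python
import sqlite3


def load_trips(conn, rows):
    """Create a hypothetical trips table and insert raw trip rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trips ("
        "tripduration INTEGER, start_station TEXT, usertype TEXT)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
    conn.commit()


def build_dataset(conn):
    """Filter to usable rows and encode the target: 1 = customer, 0 = subscriber."""
    query = """
        SELECT tripduration,
               start_station,
               CASE WHEN usertype = 'Customer' THEN 1 ELSE 0 END AS is_customer
        FROM trips
        WHERE tripduration IS NOT NULL
          AND usertype IN ('Customer', 'Subscriber')
        ORDER BY rowid
    """
    return conn.execute(query).fetchall()


# Sample rows are made up for the example; two of them are dropped by the filter.
conn = sqlite3.connect(":memory:")
load_trips(conn, [
    (600, "A", "Subscriber"),
    (1800, "B", "Customer"),
    (None, "C", "Subscriber"),   # missing duration
    (900, "A", None),            # missing user type
])
dataset = build_dataset(conn)
```

Doing the filtering in SQL keeps the preprocessing notebook input small, since only rows with a known user type and duration reach the modeling step.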

