HarlH/README.md

Hi, I'm Ngoc Bao Chan Le 👋

I'm a University of British Columbia graduate with a Combined Major in Computer Science and Statistics. I am a data-driven, analytical thinker who enjoys applying statistical learning and programming to real-world problems in data science and AI. My focus is on data analysis, machine learning, and predictive modeling.


Skills & Technologies

| Category | Key Skills |
|---|---|
| Programming Languages | Python, R, SQL, Java, C/C++ |
| Data & ML Libraries | scikit-learn, ggplot2, pandas, NumPy, matplotlib, seaborn, dash, plotly |
| Frameworks and Tools | Power BI (DAX, Power Query), Tableau, MySQL, PostgreSQL, Git, Linux |
| ETL & Cloud | dbt, GCP, Databricks |

Featured Projects & Open-Source

My work centers on developing robust models and generating clear insights from complex datasets.

  • Developed regression models in R to predict real estate prices from 414 property sales in Taipei and New Taipei City.
  • Achieved an adjusted R² of 0.737, explaining 74% of the variance, by performing feature engineering with transformations (log price, quadratic age, $\sqrt{\text{MRT distance}}$).
  • Key Insight: Identified MRT proximity and convenience store access as key price drivers.
  • Processed the HTRU2 dataset ($\sim 17\text{k}$ observations) to classify pulsar stars using Random Forest and SVM models in Python.
  • Optimized performance through cross-validation, hyperparameter tuning, and feature engineering.
  • Result: Achieved ROC-AUC > 0.95, with kurtosis and skewness identified as key discriminative features.
  • Analyzed theft crime trends before and after COVID-19 using R.
  • Conducted hypothesis testing and constructed confidence intervals to assess changes in crime proportions.
  • Built clear and impactful data visualizations to communicate temporal trends and policy implications.
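
The hypothesis testing and confidence intervals mentioned for the crime-proportion analysis can be illustrated with a two-proportion z-test. This is a minimal sketch in Python using only the standard library. The project itself was done in R and this README does not state which test was used, so the choice of test, the function name `two_proportion_ztest`, and the counts in the usage note are all assumptions made for illustration.

```python
import math


def two_proportion_ztest(x1, n1, x2, n2, z_crit=1.96):
    """Two-sided z-test for a change in proportion (hypothetical helper).

    x1/n1: event count and sample size in the first period.
    x2/n2: event count and sample size in the second period.
    Returns (difference, z statistic, p-value, confidence interval).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1

    # Under H0 (no change) both periods share one pooled proportion.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled

    # Two-sided p-value from the standard normal CDF, written with erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

    # The interval uses the unpooled standard error; z_crit=1.96 gives ~95%.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return diff, z, p_value, ci
```

With made-up counts, `two_proportion_ztest(200, 1000, 260, 1000)` compares a 20% share in the first period against a 26% share in the second. A confidence interval that excludes zero points to a change in the proportion at roughly the 5% level.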

Education

  • University of British Columbia (UBC)
    • Bachelor of Science: Combined Major in Computer Science and Statistics (Expected May 2025)
    • Award: Outstanding International Student Award (OIS)

Get In Touch

| Contact Channel | Details |
|---|---|
| Email | lengocbaochan@gmail.com |
| LinkedIn | Ngoc Bao Chan Le |

Pinned

  1. churn_telecom (Public)

    Web application built with Streamlit for analyzing and predicting customer churn at a telecom company.

    Jupyter Notebook

  2. data_warehouse_delivery (Public)

    Data Warehouse Project using the "Delivery Center: Food & Goods orders in Brazil" dataset.

    Python

  3. Real-Estate-Valuation (Public)

    Statistical analysis of residential housing prices in Taipei and New Taipei City. Utilizing R for regression modeling, data transformations (Log, Square Root), and backward elimination to identify …

    HTML