boshi19920920-tech/README.md

Hi, I'm Boshi Zhao 👋

I am a statistician and data scientist with expertise in statistical modeling, machine learning, Bayesian methods, and health data analytics. I recently completed my PhD in Mathematical Sciences, with a focus on Statistics, at Northern Illinois University, where my research centered on multivariate longitudinal data and Bayesian functional factor models.

My work combines rigorous statistical methods, reproducible data science workflows, and clear communication to solve real-world problems across healthcare, public health, business analytics, and applied research.


🔍 Research Interests

  • Bayesian modeling and inference
  • Functional data analysis
  • Factor models and dimension reduction
  • Longitudinal and multilevel models
  • Biostatistics and health disparities
  • Machine learning methods

🛠 Skills

Programming and Data Tools

  • Python
  • R
  • SQL
  • SAS
  • SPSS
  • Git and GitHub
  • Jupyter Notebook
  • R Markdown
  • LaTeX

Data Analysis and Visualization

  • Data cleaning
  • Data wrangling
  • Exploratory data analysis
  • Feature engineering
  • Reproducible reporting
  • Statistical visualization
  • Tableau
  • Power BI
  • matplotlib
  • ggplot2

Statistical Modeling

  • Linear regression
  • Logistic regression
  • Poisson regression
  • Negative binomial regression
  • Mixed models
  • Survival analysis
  • Longitudinal data analysis
  • Multivariate analysis
  • Bayesian modeling
  • Principal component analysis
  • Factor analysis

Machine Learning

  • Train/test split
  • Cross-validation
  • Classification model evaluation
  • Logistic regression
  • Random forest
  • XGBoost
  • AUC
  • Accuracy
  • Precision
  • Recall
  • Calibration
  • Feature importance
  • SHAP interpretation

Applied Areas

  • Health data analytics
  • Public health research
  • Social determinants of health
  • Biostatistics
  • Statistical consulting
  • Business analytics
  • Predictive modeling

📄 Documents


📚 Selected Projects

End-to-End Machine Learning Pipeline

A reproducible supervised machine learning project for diabetes prediction using Python. This project demonstrates the full data science workflow, including data cleaning, exploratory data analysis, feature engineering, train/test split, preprocessing, logistic regression, random forest, XGBoost, cross-validation, model evaluation, calibration, feature importance, SHAP interpretation, and visualization.

Repository: end-to-end-machine-learning-pipeline
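
As a small illustration of the model evaluation step listed in the project description, the sketch below computes accuracy, precision, recall, and AUC from predicted probabilities using only the Python standard library. It is not taken from the repository: the function names (`confusion_counts`, `roc_auc`, `evaluate`), the 0.5 threshold, and the labels and scores are all made up for this example.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_prob, threshold=0.5):
    """Collect common binary classification metrics in one dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "auc": roc_auc(y_true, y_prob),
    }


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.8, 0.3]

metrics = evaluate(y_true, y_prob)
print(metrics)
```

In practice a library such as scikit-learn would usually supply these metrics; the hand-written version is shown only to make the definitions explicit.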

Bayesian Functional Time Series

Bayesian modeling for extracting latent signals from high-dimensional time series data. This project connects Bayesian factor models, functional data analysis, smoothing, and the interpretation of temporal latent structure.

Repository: bayesian-functional-time-series

Social Determinants of Health Analytics

Statistical modeling using large-scale health data to study pain outcomes, perceived healthcare discrimination, prescription drug use patterns, and health disparities. Methods include data cleaning, regression modeling, negative binomial models, subgroup analysis, reproducible reporting, and communication of findings to both technical and nontechnical audiences.

Consulting and Applied Statistics

Applied statistical consulting experience across medical, educational, transportation, and public health projects. Responsibilities include client communication, data management, statistical modeling, visualization, report writing, and translating technical results for nontechnical audiences.

📊 GitHub Stats

GitHub stats Top languages


🌐 Find Me Online


📬 Contact

Email: boshi19920920@gmail.com

Feel free to reach out if you would like to collaborate or connect.


⭐ Thank you for visiting my GitHub profile. More updates coming soon.

Pinned

  1. Boshi-Zhao.github.io (Public)

    My personal academic website, including research, projects, and CV.

    HTML

  2. boshi19920920-tech (Public)

  3. bayesian-functional-time-series (Public)

    Bayesian modeling for extracting latent signals from high-dimensional time series data.

    Jupyter Notebook

  4. end-to-end-machine-learning-pipeline (Public)

    A reproducible machine learning project covering data cleaning, feature engineering, model training, cross-validation, evaluation, calibration, interpretation, and visualization.

    Jupyter Notebook