ChiQuynhDo/README.md

👋 Hi, I'm Chi Do

---

🚀 Data • AI • Cloud • Automation – Building Reliable Systems That Create Impact

I am a data scientist and engineer with a Master of Data Science from the University of Virginia (UVA MSDS ’25).

I build practical, reliable, and scalable data systems that combine machine learning, cloud tools, automation patterns, and modern AI techniques.

My recent work includes a year-long collaboration with the U.S. Census Bureau, developing a large-scale metadata extraction and document-classification pipeline to improve public data usability.

I bring a unique combination of:

  • Strong technical execution
  • Real project delivery experience
  • Professional discipline from 20+ years in financial services
  • High energy, curiosity, and continuous learning

I am open to roles in Data Science, Machine Learning, AI Engineering, Data Engineering, Cloud/DataOps, and GovTech.


📌 Current and Recent Projects

1️⃣ U.S. Census Bureau – Metadata & Citation Tool (Capstone Project)

Tech: Python · Web Crawling · Metadata Extraction · Regex · NLP · LLMs · Automation

  • Designed a recursive web crawler (10-level depth) to map and analyze 3,532 Census Bureau pages.
  • Extracted metadata from PDF, CSV, and XLS/XLSX files using parsing and classification pipelines.
  • Built a hybrid rule-based + LLM classification workflow.
  • Delivered structured datasets and documentation to Census research leads.
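The depth-limited recursive crawl described above can be illustrated with a minimal sketch. This is not the capstone code: the `crawl` function, the `LinkParser` helper, the injected `fetch` callable, and the `example.test` pages are all hypothetical names chosen for illustration, and the in-memory `FAKE_SITE` stands in for real HTTP requests so the example runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(url, fetch, max_depth=10, _depth=0, _seen=None):
    """Depth-first crawl that stays on the start host.

    `fetch` is any callable returning HTML text (or None) for a URL.
    Returns a dict mapping each visited URL to the depth at which it
    was first reached (not necessarily its shortest distance).
    """
    seen = _seen if _seen is not None else {}
    url = urldefrag(url)[0]  # drop "#fragment" so pages are not revisited
    if url in seen or _depth > max_depth:
        return seen
    seen[url] = _depth

    html = fetch(url)
    if html is None:
        return seen

    parser = LinkParser()
    parser.feed(html)
    host = urlparse(url).netloc
    for href in parser.links:
        target = urldefrag(urljoin(url, href))[0]
        if urlparse(target).netloc == host:  # skip external links
            crawl(target, fetch, max_depth, _depth + 1, seen)
    return seen


# Assumed toy site used in place of live requests.
FAKE_SITE = {
    "https://example.test/": '<a href="/a">A</a> <a href="/b#top">B</a>',
    "https://example.test/a": '<a href="/b">B</a> <a href="https://other.test/x">X</a>',
    "https://example.test/b": '<a href="/">home</a> <a href="/c">C</a>',
    "https://example.test/c": "",
}

pages = crawl("https://example.test/", FAKE_SITE.get, max_depth=2)
all_pages = crawl("https://example.test/", FAKE_SITE.get)  # default depth of 10
```

Passing `fetch` in as a parameter keeps the traversal logic separate from networking, so a real HTTP client (with rate limiting and error handling) could be swapped in without changing the crawl itself.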

Repository: (link to be added)


2️⃣ ICU Mortality Prediction – LSTM Deep Learning Model

Tech: Python · TensorFlow · Keras · Time-Series Modeling · Explainability

  • Modeled patient sequences using LSTM for mortality-risk prediction after cardiac surgery (MIMIC-IV).
  • Reached ~98.6% accuracy with the LSTM, outperforming the baseline RNN and GRU models.
  • In progress: adding SHAP explainability.
  • In progress: rebuilding the pipeline for cleaner engineering and GitHub readiness.

Repository: (link to be added)


3️⃣ Bayesian Modeling & Statistical Inference (Stan/MCMC)

Tech: Stan · CmdStanPy · MCMC · Variational Inference

  • Implemented stochastic volatility and flight price Bayesian models.
  • Conducted posterior diagnostics, trace plots, and correlation analysis.
  • Compared MCMC vs. Variational Inference behaviors.

Repository: (link to be added)


4️⃣ Deep Learning Codeathon Series (UVA DS6050)

Tech: CNN · Text Classification · Generative Models (DCGAN)

  • Built image classification models (CNN).
  • Implemented sentiment analysis on IMDB reviews (Bidirectional LSTM, ~86% accuracy).
  • Developed generative models using DCGAN.
  • Practiced modular deep learning design across the data, modeling, and evaluation stages.

Repository: (links to be added)


🛠 Skills & Tools

💻 Programming & Tools

Python · R · SQL · Bash · Git/GitHub · Linux · Jupyter · VS Code · Makefile · GitHub Actions

🤖 Machine Learning & AI

Neural Networks · CNN · LSTM · GRU · GAN · scikit-learn · TensorFlow · Keras
NLP pipelines · spaCy · Regex · Prompt Engineering · LLM-assisted classification
Generative AI · Explainability (SHAP)

📊 Statistics & Modeling

Bayesian inference (Stan, CmdStanPy)
MCMC · Variational Inference
Regression · Probability · Hypothesis Testing · Time Series

🧱 Data Engineering & Automation

Web scraping · BeautifulSoup · Requests
Large-scale crawling (DFS/BFS hybrid)
Metadata extraction (PDF/CSV/Excel)
Cron jobs · Automation patterns · Prototype AWS S3 workflows


🎯 What I’m Focusing on Now

  • Strengthening MLOps and cloud deployment fundamentals.
  • Enhancing the ICU LSTM model with SHAP explainability.
  • Expanding the Census metadata system into a more automated pipeline.
  • Building clean, recruiter-ready repositories for my portfolio.

📬 Contact

