  • University of Technology Sydney
  • Sydney
  • LinkedIn in/tuan-m-nguyen

tuannm3812/README.md

Hi, I'm Tuan Nguyen

I'm a Machine Learning Engineer and Data Professional based in Sydney. I build practical ML and data systems that turn messy data into workflows people can inspect, rerun, and use.

Right now my public work is focused on three things:

  • Applied AI products: text-to-SQL agents, retrieval workflows, Gemini/Ollama prototypes, and product-shaped AI apps.
  • Deployable ML systems: computer vision, audio ML, forecasting, calibrated classifiers, FastAPI services, and Streamlit/React interfaces.
  • Analytics and data engineering: Airflow, dbt, Snowflake, Databricks, PySpark, SQL modelling, and portfolio-ready lakehouse projects.

Selected Work

| Project | Why it matters |
| --- | --- |
| Enterprise Text-to-SQL Agent | Turns natural-language questions into safer local SQL using hybrid schema RAG, Gemini/Ollama generation, SQLGlot validation, SQLite execution, and Streamlit delivery. |
| AI Meal Planner | Combines FastAPI, Streamlit, calorie prediction, local meal retrieval, nutrition checks, feedback capture, and CI-tested backend contracts into a practical planning workflow. |
| FoodLens | Calibrated Food-101 recognition with ResNet50, confidence routing, multi-food crop detection, and a FastAPI + React prototype. |
| Airbnb ELT Warehouse | End-to-end Sydney Airbnb and Census analytics warehouse using Airflow, dbt, PostgreSQL, medallion modelling, and SCD Type 2 snapshots. |
| Bioacoustic Species Classification | BirdCLEF+ audio ML workspace with EfficientNet-B0, Perch v2 probes, reusable artifacts, and CPU-safe inference packaging. |
| NFL Player Contact Detection | Sports ML workflow using tracking features, helmet-derived video probes, temporal smoothing, type-specific models, and LightGBM blending. |
| NYC Taxi Databricks | Databricks lakehouse workflow with PySpark, Spark SQL, Delta Lake curation, trip feature engineering, ridge regression, and segment diagnostics. |
| Solana Price Forecasting | Live forecasting app with Kraken OHLCV ingestion, technical indicators, residual modelling, FastAPI option, tests, and Streamlit delivery. |
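
As a rough illustration of the "validate before you execute" idea behind the Text-to-SQL entry, here is a minimal sketch using only Python's standard library. It is not the project's code: it swaps in SQLite's authorizer hook in place of SQLGlot validation, and the `orders` table, `make_guarded_connection`, and `run_generated_sql` are hypothetical names chosen for the example.

```python
import sqlite3

# Actions a generated query is allowed to perform: plain reads only.
ALLOWED_ACTIONS = {
    sqlite3.SQLITE_SELECT,
    sqlite3.SQLITE_READ,
    sqlite3.SQLITE_FUNCTION,
}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Deny every action that is not part of a read-only query."""
    return sqlite3.SQLITE_OK if action in ALLOWED_ACTIONS else sqlite3.SQLITE_DENY


def make_guarded_connection():
    """Build a small demo database, then lock it down to read-only access."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical demo table; set up before the authorizer is installed.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (25.5,)])
    conn.commit()
    conn.set_authorizer(read_only_authorizer)
    return conn


def run_generated_sql(conn, sql):
    """Return (True, rows) on success or (False, error message) on rejection."""
    try:
        return True, conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        return False, str(exc)


if __name__ == "__main__":
    conn = make_guarded_connection()
    print(run_generated_sql(conn, "SELECT COUNT(*), SUM(amount) FROM orders"))
    print(run_generated_sql(conn, "DROP TABLE orders"))
```

The guard sits at the database layer, so a destructive statement is rejected even if an earlier text-level check misses it. A parser-based validator can additionally explain why a query was refused.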

How I Work

I like the part of ML and data work where the model becomes a usable system:

  • clear inputs and outputs
  • reproducible pipelines
  • visible assumptions and failure modes
  • diagnostics that explain what changed
  • interfaces another person can actually use

That usually means taking a project beyond a notebook: tightening data contracts, adding validation, packaging artifacts, writing the README, and making the result easy to review.
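
To make "tightening data contracts" and "adding validation" concrete, here is a minimal sketch of a row-level contract check. Everything in it is an assumption for illustration: `ColumnRule`, `validate_rows`, and the column names are hypothetical and not taken from any of the projects above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRule:
    """One column in a hypothetical data contract."""
    name: str
    dtype: type
    nullable: bool = False


def validate_rows(rows, rules):
    """Return a list of readable contract violations; an empty list means pass."""
    problems = []
    for i, row in enumerate(rows):
        for rule in rules:
            if rule.name not in row:
                problems.append(f"row {i}: missing column '{rule.name}'")
                continue
            value = row[rule.name]
            if value is None:
                if not rule.nullable:
                    problems.append(f"row {i}: '{rule.name}' is null")
                continue
            if not isinstance(value, rule.dtype):
                problems.append(
                    f"row {i}: '{rule.name}' expected {rule.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return problems


if __name__ == "__main__":
    rules = [
        ColumnRule("listing_id", int),
        ColumnRule("price", float),
        ColumnRule("suburb", str, nullable=True),
    ]
    rows = [
        {"listing_id": 1, "price": 120.0, "suburb": "Glebe"},
        {"listing_id": "2", "price": None, "suburb": None},
    ]
    for problem in validate_rows(rows, rules):
        print(problem)
```

Returning the violations as a list, instead of raising on the first one, keeps the failure modes visible in a single pass, which tends to make a pipeline run easier to review.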

Find Me

I'm open to conversations about machine learning, data engineering, MLOps, applied AI, Kaggle workflows, and turning rough technical work into something clear enough to trust.

Pinned

  1. airbnb-ELT-warehouse Public

    End-to-end ELT warehouse for Sydney Airbnb and Census analytics using Airflow, dbt, PostgreSQL, and a Bronze-Silver-Gold architecture.

    Python 1

  2. solana-price-prediction Public

    Live Solana next-day high prediction dashboard using Kraken OHLCV data, an anchored residual ML model, and Streamlit.

    Python 1

  3. kaggle-birdclef-2026 Public

    Kaggle BirdCLEF+ 2026 workspace with curated EDA, EfficientNet-B0 baseline, and Google Perch v2 probe notebooks plus reusable training utilities.

    Jupyter Notebook

  4. aipa-text-to-sql-agent Public

    Forked from huyducv/aipa-text-to-sql-agent

    Text-to-SQL Enterprise Agent for a university assignment

    Python 1

  5. foodlens-calibrated-food-recognition Public

    Calibrated Food-101 recognition system with PyTorch ResNet50, confidence-based decision routing, multi-food crop detection, and a FastAPI + React FoodLens prototype for image, video, and URL analysis.

    Jupyter Notebook

  6. kaggle-s6e6-predicting-stellar-class Public

    Advanced ensembling pipeline for Kaggle S6E6 (Predicting Stellar Class). Features 6-model probability stacking, Nelder-Mead threshold calibration, and public consensus hybrid blending.

    Jupyter Notebook
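
The "confidence-based decision routing" mentioned in the FoodLens entry can be pictured with a small sketch like the one below. It is an assumption about the general pattern, not the repository's implementation: `route_prediction`, the three route names, and the threshold values are all hypothetical.

```python
def route_prediction(label, confidence, accept_at=0.80, review_at=0.50):
    """Map a calibrated confidence score to a decision route.

    The thresholds are illustrative defaults, not values from FoodLens.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= accept_at:
        return ("accept", label)   # confident enough to show directly
    if confidence >= review_at:
        return ("review", label)   # plausible, but ask the user to confirm
    return ("reject", None)        # too uncertain to surface a label


if __name__ == "__main__":
    for score in (0.93, 0.61, 0.20):
        print(score, route_prediction("ramen", score))
```

Routing like this only makes sense when the scores are calibrated, so that a threshold of 0.80 corresponds to roughly 80% observed accuracy.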