clementina-tom/README.md

Hi, I'm Clementina Tom 👋

ML Engineer building production AI systems: RAG pipelines, MLOps, and predictive modeling.


What I Do

I design and ship end-to-end AI systems, from data pipelines and model training through to deployment and monitoring. My focus is on building things that actually work in production, not just in notebooks.

  • LLM & RAG Systems: retrieval-augmented generation, semantic search, LangChain, LlamaIndex, FAISS, vector databases
  • MLOps & Deployment: model serving, pipeline automation, cloud deployment (AWS/GCP), Docker, experiment tracking
  • Predictive Modeling: time-series forecasting, anomaly detection, classification and regression at scale
  • Data Engineering: automated ETL pipelines, data transformation, business intelligence workflows

Tech Stack

Languages: Python, SQL
ML/AI: Scikit-learn, XGBoost, LightGBM, CatBoost, TensorFlow, PyTorch, HuggingFace Transformers
LLM Tooling: LangChain, LlamaIndex, FAISS, ChromaDB, OpenAI API
MLOps: Docker, MLflow, FastAPI, GitHub Actions
Cloud: AWS (S3, EC2, Lambda), GCP
Data: Pandas, NumPy, Spark (basics), PostgreSQL


Featured Projects

🤖 RAG Job Matcher

AI-powered job matching system using retrieval-augmented generation. Ingests job descriptions, embeds them into a vector store, and matches candidates based on semantic similarity.
LangChain FAISS OpenAI FastAPI
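
The ingest, embed, and match flow described above can be sketched in a few lines. This is only an illustration, not the project's code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory dictionary stands in for a FAISS index, and the job data is made up.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match(candidate, jobs, top_k=2):
    """Rank job descriptions by similarity to the candidate profile."""
    query = embed(candidate)
    scored = [(cosine(query, embed(desc)), title) for title, desc in jobs.items()]
    return sorted(scored, reverse=True)[:top_k]

# Hypothetical job descriptions, for illustration only.
JOBS = {
    "ML Engineer": "python machine learning model deployment docker",
    "Data Analyst": "sql dashboards reporting excel",
    "Backend Developer": "java spring microservices api",
}

results = match("python docker machine learning pipelines", JOBS)
for score, title in results:
    print(f"{title}: {score:.2f}")
```

In a real system the count vectors would be replaced by dense embeddings and the linear scan by an approximate nearest-neighbor index, but the ranking step keeps the same shape.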

📈 Trade & Market AI Forecasting

End-to-end time-series forecasting system for financial market data. Includes walk-forward validation, SARIMA, Prophet, and ensemble modeling.
Prophet LightGBM Time-series Python
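
Walk-forward validation evaluates a forecaster only on data that comes after its training window, then rolls the window forward. A minimal sketch, assuming an expanding training window and a naive last-value forecast in place of SARIMA or Prophet; the function name, parameters, and series values are hypothetical.

```python
def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) with an expanding training window."""
    start = initial_train
    while start + horizon <= n_samples:
        yield list(range(0, start)), list(range(start, start + horizon))
        start += horizon

# Made-up series, for illustration only.
series = [10, 12, 13, 15, 14, 16, 18, 17, 19, 21]

errors = []
for train_idx, test_idx in walk_forward_splits(len(series), initial_train=4, horizon=2):
    # Naive forecast: repeat the last value seen in the training window.
    forecast = series[train_idx[-1]]
    errors.extend(abs(series[i] - forecast) for i in test_idx)

mae = sum(errors) / len(errors)
print(f"walk-forward MAE: {mae:.3f}")
```

Because every test fold lies strictly after its training data, the resulting error is not inflated by look-ahead leakage the way a shuffled cross-validation score can be.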

🛒 Farm to Feed: Shopping Basket Recommendation (Zindi)

Production-grade ML pipeline for surplus produce recommendation. 5-seed hybrid ensemble (LightGBM + CatBoost) achieving a public AUC of 0.945.
LightGBM CatBoost Ensemble MLOps
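
One common way to build a multi-seed hybrid ensemble is to train each model family once per seed and average all the predictions. The sketch below assumes that simple averaging scheme, which this README does not confirm. `toy_model` is a hypothetical stand-in for a trained LightGBM or CatBoost model, and the bias values are arbitrary.

```python
import random

def toy_model(seed, bias):
    """Hypothetical stand-in for a trained model: a seed-dependent noisy scorer."""
    rng = random.Random(seed)
    noise = rng.uniform(-0.05, 0.05)

    def predict(x):
        # Clip to [0, 1] so the output reads as a probability.
        return min(1.0, max(0.0, x + bias + noise))

    return predict

def seed_ensemble(x, seeds, biases=(0.02, -0.02)):
    """Average predictions over every (seed, model family) pair."""
    preds = [toy_model(seed, bias)(x) for seed in seeds for bias in biases]
    return sum(preds) / len(preds)

SEEDS = [0, 1, 2, 3, 4]  # five seeds, two model families: ten predictions
score = seed_ensemble(0.6, SEEDS)
print(f"ensemble score: {score:.3f}")
```

Averaging across seeds reduces the variance that comes from random initialization and sampling, while mixing model families reduces the chance that one model's systematic error dominates.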

βš™οΈ Automated ETL Pipeline Suite

Modular data workflow system for extract → transform → load operations. Designed for business intelligence and data sync use cases.
Python Automation Data Engineering
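
The extract → transform → load structure can be illustrated with three small functions. This is a minimal sketch using only the standard library, not the suite's actual code: the CSV contents, column names, and `orders` table are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw input, for illustration only.
RAW_CSV = """order_id,amount,currency
1,19.99,usd
2,,usd
3,5.00,eur
"""

def extract(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Drop rows with a missing amount, cast types, normalize currency codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append((int(row["order_id"]), float(row["amount"]), row["currency"].upper()))
    return cleaned

def load(rows, conn):
    """Upsert cleaned rows so that re-running the pipeline is idempotent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, amount, currency FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping each stage as a separate function makes it possible to test and swap stages independently, for example replacing `extract` with an API client without touching the transform logic.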


Currently

  • 🔭 Building production-grade AI systems for real-world deployment
  • 🌍 Competing on Zindi in applied ML challenges
  • 💼 Available for roles in ML Engineering, AI Engineering, and Data Science

Let's Connect

LinkedIn Twitter/X Hashnode Medium Kaggle

"I build AI systems that ship."

Popular repositories

  1. Feed-to-farm-competition Public

    A modular, production-ready Machine Learning pipeline that predicts future shopping baskets for surplus fresh produce. Built with a 5-seed Hybrid Ensemble (LGBM + CatBoost) to optimize both AUC and…

    Jupyter Notebook 1

  2. Personalized-Learning-Recommendation-System Public

    A Constraint-Aware Personalized Learning Recommendation System (PLRS) built with Self Attentive Knowledge Tracing (SAKT), Attentive Knowledge Tracing (AKT) and Spaced repetition to enforce pedagogi…

    Python 1

  3. clementina-tom Public

  4. Supermarket-sales-predictor- Public

    Supermarket sales predictor

    Jupyter Notebook

  5. My-Agribora-challenge-source-code Public

    This repository contains all of my coded works submitted for the Agribora challenge.

    Jupyter Notebook

  6. Automated-ETL-Pipeline-Suite.- Public

    Automated data workflows (extract → transform → load) for business intelligence or data sync.

    Python