Data Scientist building credit risk models, AML detection systems, and ML pipelines for fintech.
- Feature engineering pipelines and credit risk scoring systems
- Large-scale data transformations with PySpark and SQL
- End-to-end ML solutions from raw data to production deployment

| Project | What it does | Key tech |
|---|---|---|
| credit-risk-scorecard-engine | WoE encoding, PDO credit scoring (Gini 0.636), PSI monitoring, GDPR explainability | Python, optbinning, XGBoost, SQL, Docker |
| pyspark-aml-transaction-analysis | PySpark feature pipelines for AML transaction analysis, XGBoost modeling with SHAP explanations, MLflow, Azure ML SDK v2, Airflow 2.9 | PySpark, XGBoost, SHAP, MLflow, Azure ML, Airflow |
| marketing-ai-automation-platform | Statistical A/B testing: z-test, Welch's t-test, and a sample size calculator (Fleiss 1981) | Python, Streamlit, Pandas, Plotly |
| customer-support-intelligence-platform | Random Forest SLA classifier, live Streamlit app | scikit-learn, Streamlit |
| ai-ops-workflow-automation-platform | LangGraph agent, pgvector RAG, Terraform GCP, Prometheus/Grafana monitoring | LangGraph, pgvector, Terraform, Docker, Prometheus |