| 🎓 | MSc Computer Science · TU Darmstadt · LLMs, CV, PGMs, Quantum Computing, Scalable Data Management |
| 📍 | Darmstadt, DE 🇩🇪 — by way of Chennai, IN |
| 🌙 | Architecture decisions happen on late-night walks. The whiteboard comes later. |
| 🎮 | Call of Duty Mobile between training runs |
| 📖 | Reads documentation the way other people read fiction. Unironically. |
| 🤝 | Open to: healthcare-AI research collabs, applied ML roles, LLM systems engineering |
| 🧭 | Currently circling: evaluation frameworks, responsible AI design, multi-agent orchestration, fine-tuning pipelines |
| ⚡ | "If it has no eval loop, it's just a demo." |
|
Intrinsic multi-author style change detection — no reference texts, no author profiles, pure internal comparison, framed as pairwise binary classification.
Dual-stream ensemble: 163-dim stylometric difference vectors → SVM, plus a Siamese Transformer over frozen all-mpnet-base-v2, fused by a logistic-regression (LR) meta-learner trained only on the validation split, so the fusion stage never sees the data the base models were fit on. Three difficulty tiers (Easy/Medium/Hard by topic diversity) double as a built-in ablation axis; the Hard tier shows where pure stylometry hits its ceiling once topic is controlled, consistent with published PAN results.
Ensemble Macro F1 = 0.606 · AUC-PR = 0.404 · PAN @ CLEF 2025
→ repo |
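To make the pairwise framing concrete, here is a minimal sketch of how adjacent paragraphs could be turned into stylometric difference vectors. It is an illustration under stated assumptions, not the repo's code: the five features, the `FUNCTION_WORDS` subset, and all function names are hypothetical stand-ins for the 163-dim feature set.

```python
import re
from typing import List

# Illustrative subset only; a real stylometric feature set would be much larger.
FUNCTION_WORDS = {"the", "of", "and", "to", "a", "in", "that", "is", "it", "for"}


def style_features(text: str) -> List[float]:
    """Compute a handful of topic-agnostic style features for one paragraph."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n_words,                    # mean word length
        len(words) / max(len(sentences), 1),                     # words per sentence
        len(set(words)) / n_words,                               # type-token ratio
        sum(w in FUNCTION_WORDS for w in words) / n_words,       # function-word rate
        sum(text.count(c) for c in ",;:") / max(len(text), 1),   # punctuation per char
    ]


def difference_vector(a: str, b: str) -> List[float]:
    """Absolute per-feature difference between two paragraphs."""
    return [abs(x - y) for x, y in zip(style_features(a), style_features(b))]


def adjacent_pairs(paragraphs: List[str]) -> List[List[float]]:
    """One difference vector per consecutive paragraph pair (n paragraphs -> n-1 pairs)."""
    return [difference_vector(p, q) for p, q in zip(paragraphs, paragraphs[1:])]
```

Each row returned by `adjacent_pairs` is one binary classification instance (style change or no change), which is the shape a downstream classifier such as an SVM would consume.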
Agentic resume intelligence — bias detection runs in a parallel LangGraph branch, decoupled from scoring by design, not as an afterthought.
RAG Fusion with Reciprocal Rank Fusion (RRF) over Qdrant, parallel DAG orchestration, and QLoRA fine-tuning with automatic device detection (CUDA/MPS/CPU). Retrieval and judgment are separate concerns, wired that way on purpose. RRF · parallel DAG · QLoRA
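For readers unfamiliar with Reciprocal Rank Fusion, a minimal sketch follows. It fuses several ranked lists of document IDs by summing `1 / (k + rank)` per document. The function name and the plain-list inputs are assumptions for illustration; the Qdrant retrieval step is left out, and `k = 60` is the commonly used default rather than a value taken from this project.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs; higher fused score ranks first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks near the top.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Usage: three query variants each return their own ranking.
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
])
```

Because only ranks are used, the lists can come from retrievers whose raw scores are not comparable.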
|
|
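The parallel bias branch in the resume pipeline above can be pictured with plain `asyncio` as a stand-in for LangGraph. This is a sketch of the decoupling idea only: `score_resume`, `detect_bias`, the keyword lists, and the substring matching are hypothetical placeholders, not the project's nodes or logic.

```python
import asyncio
from typing import Any, Dict


async def score_resume(resume: str) -> Dict[str, Any]:
    """Placeholder scoring branch: counts a few hypothetical skill keywords."""
    await asyncio.sleep(0)  # stands in for a model or retrieval call
    text = resume.lower()
    return {"score": sum(kw in text for kw in ("python", "sql", "ml"))}


async def detect_bias(resume: str) -> Dict[str, Any]:
    """Placeholder bias branch: flags hypothetical sensitive terms (crude substring match)."""
    await asyncio.sleep(0)
    text = resume.lower()
    return {"bias_flags": [t for t in ("age", "gender", "nationality") if t in text]}


async def run_pipeline(resume: str) -> Dict[str, Any]:
    # Both branches receive the same input and run concurrently;
    # neither reads the other's output, so bias detection cannot leak into the score.
    scoring, bias = await asyncio.gather(score_resume(resume), detect_bias(resume))
    return {**scoring, **bias}


result = asyncio.run(run_pipeline("Python and SQL developer, age 30"))
```

The point of the layout is that the two results are merged only at the end, so either branch can be changed or evaluated on its own.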
Non-diagnostic by design — structured JSON triage with graceful Markdown fallback when parsing fails, because clinical systems can't crash silently.
Clinician-supervised multimodal assistant: image + text intake, severity triage, and localization across 9 languages, streamed over WebSockets. 9 languages · graceful-degradation parser
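The graceful-degradation parser can be sketched as below. This is an assumption-laden illustration: the `REQUIRED_KEYS` schema, the return shape, and the wording of the fallback are hypothetical, chosen only to show the "structured JSON first, Markdown otherwise" pattern.

```python
import json
from typing import Any, Dict

# Hypothetical minimal schema; the real triage payload may differ.
REQUIRED_KEYS = {"severity", "summary"}


def parse_triage(raw: str) -> Dict[str, Any]:
    """Return structured triage when the model output is valid JSON, else a Markdown fallback."""
    try:
        data = json.loads(raw)
        if isinstance(data, dict) and REQUIRED_KEYS.issubset(data):
            return {"format": "json", "triage": data}
    except json.JSONDecodeError:
        pass  # fall through to the Markdown path instead of raising
    # Fallback: surface the raw text, clearly marked as unstructured.
    return {
        "format": "markdown",
        "triage": None,
        "markdown": "**Unstructured model output (needs clinician review):**\n\n" + raw.strip(),
    }
```

The caller always gets a dict with a `format` field, so a parsing failure shows up as a visible state in the UI rather than as an exception or an empty response.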
|
Built as a mini eval framework — same input, two engines, one /benchmark endpoint reporting per-method accuracy and per-sample agreement.
LLM-based and lexicon-based sentiment side by side, so disagreement is a first-class signal instead of noise you average away. /benchmark endpoint · per-sample agreement
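A minimal sketch of what a `/benchmark` response could compute is shown here. The function name, the label strings, and the response fields are assumptions for illustration; the web layer and the two sentiment engines themselves are omitted.

```python
from typing import Any, Dict, List


def benchmark(gold: List[str], llm: List[str], lexicon: List[str]) -> Dict[str, Any]:
    """Per-method accuracy plus per-sample agreement between the two engines."""
    if not (len(gold) == len(llm) == len(lexicon)):
        raise ValueError("gold, llm, and lexicon must have the same length")
    n = len(gold)
    samples = [
        {"index": i, "gold": g, "llm": a, "lexicon": b, "agree": a == b}
        for i, (g, a, b) in enumerate(zip(gold, llm, lexicon))
    ]
    return {
        "accuracy": {
            "llm": sum(s["llm"] == s["gold"] for s in samples) / n if n else 0.0,
            "lexicon": sum(s["lexicon"] == s["gold"] for s in samples) / n if n else 0.0,
        },
        # Agreement is engine-vs-engine and ignores the gold label.
        "agreement_rate": sum(s["agree"] for s in samples) / n if n else 0.0,
        "samples": samples,
    }


report = benchmark(
    gold=["pos", "neg", "pos", "neg"],
    llm=["pos", "neg", "neg", "neg"],
    lexicon=["pos", "pos", "neg", "neg"],
)
```

Keeping the per-sample rows in the response is what lets disagreement be inspected case by case instead of being folded into a single number.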
|
Treats CV as a streaming data system — per-frame JSON records with DETECTED / PARTIAL / NO_LANES tier classification and summary metrics on exit.
Classical pipeline (Canny + Hough + ROI masking) instrumented like a production service, not a notebook demo. per-frame JSON telemetry
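The per-frame telemetry idea can be sketched without any vision code. The tier rule used here (both lane lines found, one found, none found) is an assumption, as are the record fields and function names; the Canny and Hough steps are represented only by two booleans.

```python
import json
from collections import Counter
from typing import Any, Dict, List


def classify_frame(left_found: bool, right_found: bool) -> str:
    """Map detection results to a tier (assumed rule: two lines, one line, none)."""
    if left_found and right_found:
        return "DETECTED"
    if left_found or right_found:
        return "PARTIAL"
    return "NO_LANES"


def frame_record(frame_idx: int, left_found: bool, right_found: bool) -> str:
    """Serialize one frame's result as a single JSON line."""
    return json.dumps({
        "frame": frame_idx,
        "left": left_found,
        "right": right_found,
        "tier": classify_frame(left_found, right_found),
    })


def summarize(records: List[str]) -> Dict[str, Any]:
    """Summary metrics to emit on exit, computed from the JSON records alone."""
    tiers = Counter(json.loads(r)["tier"] for r in records)
    total = len(records)
    return {
        "frames": total,
        "tiers": dict(tiers),
        "detection_rate": tiers["DETECTED"] / total if total else 0.0,
    }


records = [
    frame_record(0, True, True),
    frame_record(1, True, False),
    frame_record(2, False, False),
    frame_record(3, True, True),
]
summary = summarize(records)
```

Since the summary is derived from the emitted records rather than from in-memory state, the same numbers can be recomputed later from a saved log.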
|
Beyond the Siren — emergency-response analysis work; the depth indicator, not the headline.
Maxwell's Rule in AR — physics visualization in augmented reality, because some intuitions need to be walked around, not read about. research · AR / physics
|
More projects
- MLJAK2-Biotech — ML pipeline for JAK2 mutation analysis in biotech workflows.
- Company Culture Analysis — NLP over employee-review corpora to surface culture signals beyond star ratings.
⚡ Current Focus
LLM & Agents
RAG & Data
ML · CV · NLP
Backend & Web
Languages & Infra
|
|
|
If you're building something where evaluation rigor matters — healthcare AI, LLM systems, agentic pipelines — my inbox is open. I'd rather see your failure cases than your demo video.
Currently benchmarking on PAN @ CLEF 2025 — if you're working on authorship analysis, forensic NLP, or style-based evaluation, I want to hear from you.
