2027 New Grad · MS Applied Machine Learning, University of Maryland College Park (CMNS Science Academy)
LLM inference systems · Mechanistic interpretability · Production ML infrastructure · Agentic AI
AI Data & Analytics Intern, HARMAN International (Samsung) · Dec 2024 - Jun 2025: Multi-agent TypeScript testing system with 93% code coverage, versus 70-80% for GitHub Copilot. Found 5 Redis reliability gaps before production. Designed 10+ unit test cases validating memory, tool-use, and prompt components.
Research Assistant, Woxsen University · Aug 2022 - Dec 2024: Led a 4-person team on Battery Management System ML (Random Forest, 90-95% accuracy). Co-authored 5 peer-reviewed publications. Translated ML outputs into technical documentation for non-specialist faculty.
Junior Data Analyst Intern, SeriGreen Technologies · Feb 2024 - Jul 2024: Transformed Karnataka cocoon market datasets (10K-100K records) using SQL and Python. Built the SeriGreen Farm Management Web Application (MERN stack). Presented findings directly to the founders.
Research Intern, AppsTek Corp · Feb 2023 - Jul 2023: Built a multimodal sentiment classifier combining video frames, audio, and transcripts via deep learning. 90%+ accuracy across 3 labels. Demoed to the AI team.
Can you predict that a reasoning model will fail before it visibly fails? A linear probe on DeepSeek-R1-Distill-Qwen-7B hidden states detects failure at 150 tokens with AUC 0.612, versus 0.445 for the behavioral baseline (p=0.001). The signal emerges at 100 tokens, when surface features are anti-informative. Project: early_detection
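
The probing idea above can be illustrated with a minimal sketch. This is not the project's code: the "hidden states" are synthetic vectors, and `train_linear_probe`, `roc_auc`, the hidden size, and the signal strength are all assumptions chosen for illustration. It fits a logistic-regression probe on activation vectors and scores it with AUC on held-out examples.

```python
import numpy as np

def train_linear_probe(X, y, lr=0.1, epochs=300, l2=1e-3):
    """Fit logistic regression (a linear probe) with batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted failure probability
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

def roc_auc(scores, labels):
    """AUC: chance that a random positive outscores a random negative (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg)))

# Synthetic stand-in for hidden states captured early in generation (assumption).
rng = np.random.default_rng(0)
n, d = 800, 64
y = rng.integers(0, 2, size=n)                   # 1 = the run eventually fails
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)           # hypothetical "failure" direction
X = rng.normal(size=(n, d)) + 1.5 * y[:, None] * direction

# Train on the first 600 examples, evaluate on the remaining 200.
w, b = train_linear_probe(X[:600], y[:600].astype(float))
probe_auc = roc_auc(X[600:] @ w + b, y[600:])
print(f"held-out probe AUC: {probe_auc:.3f}")
```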
Do vision-language models fail differently across image domains? Yes, and the pattern is stark. LLaVA-1.5-7B scores 0% on chart OCR while InternVL2-8B reaches 71.1% on identical probes (945 probes, chi-square p<0.0001). LLaVA also shows an 88% yes-bias on adversarial existence questions. Project: vlm-hallucination
Do more thinking tokens help reasoning models? Mostly no. On GSM8K and MATH-500, accuracy plateaus at 256 tokens. On AIME, the split is bimodal: 57% of problems converge at ~4,100 tokens (96.5% accuracy), while 43% never converge even at 10,000 tokens (11.5% accuracy). Project: token-efficiency-math-reasoning
| Project | What it does | Numbers |
|---|---|---|
| inference-server | Continuous batching + paged KV-cache for GPT-2 from scratch (no vLLM). 3 backends benchmarked. Static batching underperforms naive serial under mixed-length traffic. | 2.91 req/s, 0 failures, SSE streaming |
| feature-store | Kafka ingestion with 15% reordering, 5% dupes, 10% late arrivals. 3-node Redis Cluster, hash-tag sharding, schema registry. Three-way consistency validation. | 0 mismatches / 800 checks, p95 4.8ms, 9,300 req/s |
| adaptive_agent | LangGraph 6-node state graph routing to Haiku 4.5 or Sonnet 4 via OpenRouter. Input guard (regex + LLM injection detection). Output guard (hallucination, completeness, format). | 98% routing accuracy, 28.2% cost reduction |
| recsys | SASRec on MovieLens-1M deployed on AWS EC2. Full-ranking evaluation exposes sampled-metric inflation (Hit@10 of 6.23% under full ranking vs 70.68% sampled). 93% popularity bias documented. | HR@10 78.49%, NDCG@10 58.11%, 8,366 req/s |
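
The sampled-metric inflation noted for recsys can be reproduced in a small simulation. The sketch below is illustrative only: scores are drawn from synthetic distributions rather than a trained SASRec model, and the catalog size, the 100-negative sampling protocol, and the `hit_at_k` helper are assumptions, not details taken from the project.

```python
import numpy as np

def hit_at_k(target_score, candidate_scores, k=10):
    """Return 1 if the target ranks in the top k among the candidates, else 0."""
    rank = 1 + int(np.sum(candidate_scores > target_score))
    return int(rank <= k)

rng = np.random.default_rng(0)
n_users, n_items, n_sampled = 500, 3000, 100     # illustrative sizes (assumption)

full_hits, sampled_hits = [], []
for _ in range(n_users):
    negatives = rng.normal(size=n_items - 1)     # scores of every other item
    target = rng.normal(loc=1.5)                 # held-out item scores higher on average
    # Full ranking: the target competes with the entire catalog.
    full_hits.append(hit_at_k(target, negatives))
    # Sampled protocol: the target competes with only a random subset of negatives.
    subset = rng.choice(negatives, size=n_sampled, replace=False)
    sampled_hits.append(hit_at_k(target, subset))

full_hr = float(np.mean(full_hits))
sampled_hr = float(np.mean(sampled_hits))
print(f"Hit@10 full ranking: {full_hr:.3f} | sampled ({n_sampled} negatives): {sampled_hr:.3f}")
```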

| Project | What it found | Numbers |
|---|---|---|
| early_detection | Activation probing on DeepSeek-R1-Distill-Qwen-7B predicts reasoning failure before behavioral signals exist. 200 AIME problems. | AUC 0.612 vs 0.445 at 150 tokens (p=0.001) |
| vlm-hallucination | LLaVA-1.5-7B vs InternVL2-8B across 4 domains. 6-category failure taxonomy. Complete capability absence, not gradual degradation. | 945 probes, chi-square p<0.0001 |
| llm-post-training-pipeline | SFT, reward model, DPO on LLaMA-3.2-1B. Diagnosed TRL bug causing negative KL divergence across 8 failed PPO runs. | +9pp factual (p=0.030), -16.7pp format (p=0.0003) |
| knowledge-agent | Belief graph from documents with cross-entity contradiction detection. MCP server exposing 5 tools. 2 embedding calls per document regardless of size. | 936 entities, 32 conflicts, 0 false positives |
| factuality-verification | Compared 3 fact-checking methods on 14,525 atomic facts. Calibration matters more than model choice: lowering the NLI threshold from 0.50 to 0.10 improves F1 by 0.076. | F1 0.727, Precision 0.919 |

| Project | What it demonstrated | Numbers |
|---|---|---|
| cuda-attention-kernel | Naive vs tiled attention kernels on A100. Diagnosed why tiling underperforms theory: 40MB L2 cache masks benefit below seq_len=2048. Connects to Flash Attention design rationale. | 515 GFLOP/s tiled, ~145x over CPU |
| cpp-simd-quant | ARM NEON SIMD on Apple Silicon. Shows SIMD helps attention (11.1x) but not Black-Scholes (1.03x) because 89% of runtime is transcendental functions. Roofline analysis. | 31.88 GFLOP/s attention, 103.8M options/sec |
| sparse-factor-modeling | 9 LASSO solvers from scratch. Walk-forward backtest, no look-ahead bias. Novel finding: FISTA degrades at high sparsity. KKT-based factor ranking. | Sharpe 5.061, Spearman rho=0.906 |

| Project | What it does | Numbers |
|---|---|---|
| code-memory-agent | Coding agent with persistent SQLite memory. SHA-256 staleness detection as non-bypassable gate. Indexes file purposes, symbols, cross-file dependencies. | 42.9% fewer file reads, 19 decision-reuse events |
| mindmirror | Real-time interview coach analyzing eye contact, facial expressions, speech, vocal patterns every 2 seconds. MediaPipe + faster-whisper + LangGraph. | ~1.2s full pipeline cycle, 6 behavioral states |
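
The SHA-256 staleness gate described for code-memory-agent can be sketched as follows. This is a hedged illustration, not the project's implementation: the `file_memory` table, its columns, and the `remember` / `recall` helpers are hypothetical names. The idea shown is that a stored note is returned only when the file's current hash matches the hash recorded alongside the note.

```python
import hashlib
import os
import sqlite3
import tempfile

def sha256_of(path):
    """Hex SHA-256 digest of a file's current bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def remember(conn, path, purpose):
    """Store a note about a file together with the hash it was derived from."""
    conn.execute(
        "INSERT OR REPLACE INTO file_memory (path, sha256, purpose) VALUES (?, ?, ?)",
        (path, sha256_of(path), purpose),
    )

def recall(conn, path):
    """Return the stored purpose only if the file is unchanged; otherwise None."""
    row = conn.execute(
        "SELECT sha256, purpose FROM file_memory WHERE path = ?", (path,)
    ).fetchone()
    if row is None or row[0] != sha256_of(path):
        return None  # missing or stale: the caller must re-read the file
    return row[1]

# Hypothetical schema, kept in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_memory (path TEXT PRIMARY KEY, sha256 TEXT, purpose TEXT)")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "utils.py")
    with open(path, "w") as f:
        f.write("def add(a, b):\n    return a + b\n")
    remember(conn, path, "arithmetic helpers")
    fresh = recall(conn, path)   # file unchanged, so the memory is served
    with open(path, "a") as f:
        f.write("\n# edited\n")
    stale = recall(conn, path)   # hash differs, so the memory is withheld

print(fresh, stale)
```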

| Publication | Publisher or venue | Year |
|---|---|---|
| Stem Cell Reviews and Reports | Springer Nature | 2025 |
| Digital Forensics and Cybersecurity | Wiley-Scrivener | 2024 |
| Economic Perspectives | IGI Global | 2024 |
| YOLOv8 Traffic Sign Detection (80.64% acc) | IJSRA | 2024 |
| BERT Sentiment Analysis (F1 0.88) | J. Trends in CS | 2024 |

| Certification or award | Date or issuer |
|---|---|
| Udacity Agentic AI Nanodegree | Jan 2026 |
| Oracle OCI 2025 Certified AI Foundations Associate | Dec 2025 |
| National Hackathon Best Demonstration Award | Oriental Institute of Science & Technology, Bhopal, 2023 (Team Leader) |
| Dean's List + Best Student for Research Inclination | Woxsen University |

36 repositories · 5 publications · Python, C++, CUDA, TypeScript