AI Researcher — Knowledge Distillation · Mechanistic Interpretability · World Models
AI Researcher · Erdős AI Lab · 2026–Present
B.Tech Artificial Intelligence & Robotics · Dayananda Sagar University · 2023–2027
AI Researcher at Erdős AI Lab, working on the boundary between how language models compress information and how that information is mechanistically structured inside the network.
I treat knowledge distillation, sparse autoencoders, pruning, and quantization as analytical probes rather than just engineering optimizations — using compression to surface what models actually represent, and using interpretability to figure out what we can or cannot afford to compress.
Current threads: Knowledge distillation · Sparse autoencoders (SAEs) · World models
LLM architecture & compression · Protein structure prediction
Medical imaging · Agentic RAG · Indic NLP
First-author · Under review at COLM 2026 · Erdős AI Lab
A theoretical and empirical study on the dictionary width at which a sparse autoencoder's reconstruction loss bottoms out, given a target sparsity. Toy-model validation followed by three real-LM trials on Pythia-410M (24 layers × 6 token checkpoints each).
Checkpoints on Hugging Face:
- colm-run-exp-2-t1 (L1 = 3e-4 fixed, dense regime)
- colm-run-exp-2-t2 (L1 = 8e-5 fixed, paper-exact)
- colm-run-trial-2 (L1 = 5e-4 adaptive, target L0 ≈ 150)
PyTorch Pythia SAE Mechanistic Interpretability
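For readers unfamiliar with the L1 and L0 quantities named in the checkpoint list above, here is a minimal pure-Python sketch of one sparse-autoencoder forward pass and its loss. It assumes a ReLU encoder, a linear decoder, and a loss of mean squared reconstruction error plus an L1 penalty on the features; the paper's exact architecture and training code (PyTorch) are not shown in this profile, so the function names and the toy identity weights are illustrative only.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Return (features, reconstruction) for one activation vector x."""
    # Encoder: ReLU(W_enc x + b_enc); len(W_enc) is the dictionary width.
    f = [max(0.0, p + b) for p, b in zip(matvec(W_enc, x), b_enc)]
    # Decoder: linear map from features back to activation space.
    x_hat = [r + b for r, b in zip(matvec(W_dec, f), b_dec)]
    return f, x_hat


def sae_loss(x, f, x_hat, l1_coeff):
    """Assumed objective: MSE reconstruction + l1_coeff * L1(features)."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    l1 = sum(abs(v) for v in f)          # sparsity penalty term
    l0 = sum(1 for v in f if v > 0.0)    # number of active features
    return {"loss": mse + l1_coeff * l1, "mse": mse, "l1": l1, "l0": l0}


# Toy check with identity weights (hypothetical values, not trained weights).
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, -2.0]
f, x_hat = sae_forward(x, identity, [0.0, 0.0], identity, [0.0, 0.0])
stats = sae_loss(x, f, x_hat, l1_coeff=3e-4)
```

In a width sweep, one would train at several dictionary widths with the sparsity target held fixed and compare the resulting `mse` values; the sketch covers only the per-example bookkeeping.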
Active · Erdős AI Lab
Probing transformer internals with sparse autoencoders. Extending to attention-circuit analysis on IOI and induction heads, feature-attribution probes on reasoning benchmarks, and cross-model generalisation (Pythia-410M → 1.4B → 2.8B).
Adjacent thread: a framework for better generalisation on low-sample medical imaging without generative deepfake augmentation.
PyTorch SAEs Probing Activation Patching
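Activation patching, listed in the tags above, can be illustrated on a toy network: run a clean and a corrupted input, copy one hidden activation from the clean run into the corrupted run, and measure how much of the output gap is recovered. The two-unit network below is entirely hypothetical and stands in for a transformer component; it is a sketch of the general technique, not the lab's tooling.

```python
def hidden(x):
    """Toy first layer: two ReLU units over a 2-d input."""
    return [max(0.0, x[0] + x[1]), max(0.0, x[0] - x[1])]


def readout(h):
    """Toy linear readout over the hidden units."""
    return 2.0 * h[0] - h[1]


def run(x, patch=None):
    """Forward pass; patch=(index, value) overwrites one hidden unit."""
    h = hidden(x)
    if patch is not None:
        index, value = patch
        h[index] = value
    return readout(h)


def patching_effect(clean_x, corrupted_x, index):
    """Fraction of the clean-vs-corrupted output gap recovered by patching
    hidden unit `index` with its clean value. Assumes the two outputs differ."""
    clean_h = hidden(clean_x)
    clean_out = run(clean_x)
    corrupted_out = run(corrupted_x)
    patched_out = run(corrupted_x, patch=(index, clean_h[index]))
    return (patched_out - corrupted_out) / (clean_out - corrupted_out)


effects = [patching_effect([1.0, 1.0], [1.0, -1.0], i) for i in range(2)]
```

Here unit 0 recovers two thirds of the gap and unit 1 one third; because the readout is linear, the per-unit effects sum to one, which does not hold in general for real networks.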
Published in journal · Erdős AI Lab
"A Systematic Deep Learning Framework for PCOS Detection Using Deduplicated Ultrasound Images: Comparative Analysis of CNN and Vision Transformer Models." A novel three-stage deduplication pipeline (MD5 + perceptual hashing + cross-class removal) cleaned the PCOS-XAI dataset from 11,784 to 3,490 images (70.4% removed). Systematic benchmark of 18 architectures (13 CNNs + 5 ViTs) under identical conditions for 200 epochs.
Top result: EfficientFormer-L1 and MobileViT-Small (hybrid CNN-Transformer) tied at 99.81% test accuracy with AUC up to 1.0. Pure ViT-Base and Swin Transformer Base failed to converge on this dataset size.
Compute: NVIDIA A100 (80 GB VRAM) · 64 GB system RAM · Intel Xeon 42-core CPU.
PyTorch Vision Transformers CNNs Medical Imaging
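The three-stage deduplication described above (MD5, perceptual hashing, cross-class removal) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the published pipeline: images are small grayscale grids with values 0 to 255, the perceptual hash is a simple average hash, stage 1 removes exact duplicates within a class, and stage 3 drops every image that has a near-duplicate under a different label. The function names and the `max_distance` threshold are hypothetical.

```python
import hashlib


def md5_digest(pixels):
    """MD5 over the raw pixel bytes (a real pipeline would hash file bytes)."""
    return hashlib.md5(bytes(v for row in pixels for v in row)).hexdigest()


def average_hash(pixels):
    """Average hash: one bit per pixel, set when the pixel exceeds the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if v > mean else 0 for v in flat)


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def deduplicate(samples, max_distance=0):
    """samples: list of (name, label, pixels). Returns surviving names."""
    # Stage 1: drop exact duplicates within a class, keyed on (label, MD5).
    seen, stage1 = set(), []
    for name, label, pixels in samples:
        key = (label, md5_digest(pixels))
        if key not in seen:
            seen.add(key)
            stage1.append((name, label, average_hash(pixels)))
    # Stage 2: drop near-duplicates within a class by Hamming distance.
    stage2 = []
    for name, label, h in stage1:
        if not any(l == label and hamming(h, kh) <= max_distance
                   for _, l, kh in stage2):
            stage2.append((name, label, h))
    # Stage 3: drop both members of any near-duplicate pair across classes.
    conflicted = set()
    for i, (n1, l1, h1) in enumerate(stage2):
        for n2, l2, h2 in stage2[i + 1:]:
            if l1 != l2 and hamming(h1, h2) <= max_distance:
                conflicted.update((n1, n2))
    return [n for n, _, _ in stage2 if n not in conflicted]


def removed_percent(before, after):
    return round(100.0 * (before - after) / before, 1)


toy = [
    ("a", "pcos", [[0, 255], [0, 255]]),
    ("a_copy", "pcos", [[0, 255], [0, 255]]),    # exact duplicate
    ("a_near", "pcos", [[10, 250], [5, 240]]),   # near duplicate of a
    ("b", "normal", [[255, 0], [255, 0]]),
    ("c", "pcos", [[255, 0], [250, 10]]),        # looks like b, other label
    ("d", "normal", [[0, 0], [255, 255]]),
]
kept = deduplicate(toy)
```

On the toy data only `a` and `d` survive. `removed_percent(11784, 3490)` reproduces the 70.4% figure quoted above.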
Active · Erdős AI Lab
Structure-prediction studies on small proteins — pLDDT-style confidence calibration, folding-trajectory dynamics (Q, Cα RMSD, R_g, Q–RMSD landscape), and head-to-head comparisons between transformer folding stacks and classical MD baselines.
PyTorch ESMFold Computational Biology
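The trajectory metrics named above (Q, Cα RMSD, R_g) have compact definitions. The sketch below assumes Cα-only coordinates in Å with equal masses, a hard 8 Å cutoff with a minimum sequence separation of 3 for native contacts, and an RMSD computed without optimal superposition; a production analysis would typically align the structures first (for example with the Kabsch algorithm). The cutoff values and function names are illustrative assumptions, not the lab's settings.

```python
import math


def radius_of_gyration(coords):
    """R_g: root-mean-square distance of the atoms from their centroid."""
    n = len(coords)
    center = tuple(sum(p[i] for p in coords) / n for i in range(3))
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in coords) / n)


def rmsd(coords_a, coords_b):
    """RMSD between matched atoms, with no superposition step."""
    sq = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))


def native_contacts(native, cutoff=8.0, min_separation=3):
    """Residue pairs at least min_separation apart and closer than cutoff."""
    n = len(native)
    return [(i, j) for i in range(n) for j in range(i + min_separation, n)
            if math.dist(native[i], native[j]) < cutoff]


def fraction_native_contacts(frame, native, cutoff=8.0, min_separation=3):
    """Q: share of native contacts that are also formed in `frame`."""
    contacts = native_contacts(native, cutoff, min_separation)
    if not contacts:
        return 0.0
    formed = sum(1 for i, j in contacts
                 if math.dist(frame[i], frame[j]) < cutoff)
    return formed / len(contacts)


# Toy four-residue structures (hypothetical coordinates).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
unfolded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
shifted = [(x, y, z + 2.0) for x, y, z in native]
```

Plotting Q against RMSD for each trajectory frame gives the Q–RMSD landscape mentioned above.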
Active · 2025–Present
Compression and deployment experiments across 0.5B–7B parameter models. Teacher–student distillation for Hindi and Kannada low-resource instruction datasets. Deployed a Gemma 3 1B model on NVIDIA Jetson Nano for real-time on-device inference. Now exploring diffusion-based language models for Indic text generation.
QLoRA LoRA Quantization NVIDIA Jetson Indic NLP
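Teacher–student distillation, mentioned above, is commonly trained with a temperature-softened KL term between teacher and student output distributions. The sketch below shows that standard formulation for a single token position in pure Python; whether these experiments use this exact loss, temperature, or any mixing with a hard-label term is not stated here, so treat the defaults as assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    so gradient magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, teacher)
mismatched = distillation_loss([0.0, 0.0, 0.0], teacher)
```

The loss is zero when the student reproduces the teacher's logits and positive otherwise; in a full training loop it would be averaged over positions and batches.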
Industry research · Moog India Technology Centre
Agentic retrieval systems for multi-step reasoning over large structured aerospace engineering corpora. Query-aware routing, citation-grounded retrieval, structured reasoning pipelines. Improved document retrieval accuracy from ~70% to 90%+. Includes an MCP (Model Context Protocol) server for tool integration.
LangChain LangGraph n8n RAG MCP Vector Databases
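How the retrieval accuracy quoted above was measured is not specified. One common choice is hit rate at k: the share of queries for which at least one relevant document appears in the top k results. The sketch below shows that metric on hypothetical toy data; the query names, document IDs, and `k` are placeholders, not project data.

```python
def hit_rate_at_k(ranked_results, relevant, k=5):
    """Share of queries with at least one relevant document in the top k.

    ranked_results: dict mapping query -> list of doc IDs, best first.
    relevant: dict mapping query -> set of relevant doc IDs.
    """
    hits = 0
    for query, ranked in ranked_results.items():
        if any(doc in relevant[query] for doc in ranked[:k]):
            hits += 1
    return hits / len(ranked_results)


# Hypothetical toy evaluation set.
ranked_results = {
    "q1": ["d3", "d1", "d9"],
    "q2": ["d4", "d7", "d2"],
    "q3": ["d8", "d5", "d6"],
    "q4": ["d1", "d2", "d3"],
}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d0"}, "q4": {"d1"}}
score = hit_rate_at_k(ranked_results, relevant, k=3)
```

On this toy set three of four queries are hits, so `score` is 0.75.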
Active · Sep 2025–Present
Vision-based navigation pipelines for all-terrain UAVs — real-time obstacle detection, monocular depth estimation, and sensor fusion for autonomous flight in unstructured environments.
OpenCV ROS2 Depth Estimation Sensor Fusion
Active · Jun 2025–Present
Perception-driven servo control and actuation systems for humanoid prosthetic arm prototypes, integrating real-time visual feedback for adaptive grasping.
ROS2 Servo Control Computer Vision Hardware-in-the-Loop
Research & Modeling
LLM & Agent Systems
Compression & Interpretability
Robotics & Edge
Tools & Infra
Languages
- First-author paper under review at COLM 2026 — Knowledge Distillation: A Minimum-Width Theorem (Erdős AI Lab)
- Published journal paper — A Systematic Deep Learning Framework for PCOS Detection Using Deduplicated Ultrasound Images
- India AI Impact Summit 2026 — Represented Dayananda Sagar University; presented on LLM architectures, medical AI, and autonomous drones
- Exceptional Volunteering & Community Service Award — IEEE RAS & CIS (2025)
- Kaggle Machine Learning Certification (2025)
- RapidMiner Certified Data Science Professional (2024)
AI Researcher, Erdős AI Lab — Founding research lab focused on knowledge distillation, mechanistic interpretability, and world models. Student-founded, incubated at IIT Bombay.
Co-Founder, RoboVerse Club — Built a 100+ member robotics & AI community at DSU; organized 30+ technical workshops on LLMs, robotics, and edge AI.
Tech Lead, E-Cell DSU — Leading technology initiatives for the university startup ecosystem.
Executive Committee Member, IEEE RAS & IEEE CIS — Organized 5+ technical events and student research programs.
Always open to research collaborations, interesting problems, and good conversations about AI.