oladri-renuka/README.md

Renuka Oladri

2027 New Grad · MS Applied Machine Learning, University of Maryland, College Park (CMNS Science Academy)
LLM inference systems · Mechanistic interpretability · Production ML infrastructure · Agentic AI


Experience

AI Data & Analytics Intern, HARMAN International (Samsung) · Dec 2024 - Jun 2025
Multi-agent TypeScript testing system: 93% code coverage vs GitHub Copilot's 70-80%. Found 5 Redis reliability gaps before production. Designed 10+ unit test cases validating memory, tool-use, and prompt components.

Research Assistant, Woxsen University · Aug 2022 - Dec 2024
Led 4-person team on Battery Management System ML (Random Forest, 90-95% accuracy). Co-authored 5 peer-reviewed publications. Translated ML outputs into technical documentation for non-specialist faculty.

Junior Data Analyst Intern, SeriGreen Technologies · Feb 2024 - Jul 2024
Transformed Karnataka cocoon market datasets (10K-100K records) using SQL and Python. Built the SeriGreen Farm Management Web Application (MERN stack). Findings presented directly to founders.

Research Intern, AppsTek Corp · Feb 2023 - Jul 2023
Built multimodal sentiment classification combining video frames, audio, and transcripts via deep learning. 90%+ accuracy across 3 labels. Demoed to AI team.


Research Highlights

Can you predict that a reasoning model will fail before it visibly fails? A linear probe on DeepSeek-R1-Distill-Qwen-7B hidden states detects failure at 150 tokens with AUC 0.612 vs 0.445 for the behavioral baseline (p=0.001). The signal emerges at 100 tokens, when surface features are anti-informative. Repo: early_detection

Do vision-language models fail differently across image domains? Yes, and the pattern is stark. LLaVA scores 0% on chart OCR while InternVL2 hits 71.1% on identical probes. 945 probes, chi-square p<0.0001. LLaVA also shows an 88% yes-bias on adversarial existence questions. Repo: vlm-hallucination

Do more thinking tokens help reasoning models? Mostly no. On GSM8K and MATH-500, accuracy plateaus at 256 tokens. On AIME the split is bimodal: 57% of problems converge at ~4,100 tokens (96.5% accuracy), while 43% never converge even at 10,000 tokens (11.5% accuracy). Repo: token-efficiency-math-reasoning
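The AUC values quoted for the early-detection probe can be read as the probability that a randomly chosen failing trace receives a higher score than a randomly chosen passing one, so 0.5 is chance and a value below 0.5 (such as the 0.445 baseline) ranks traces in the wrong direction. The sketch below computes AUC that way; the function name and the toy scores are illustrative assumptions, not code or data from the repository.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels use 1 for the positive class (here: "trace fails") and 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores (hypothetical): a perfect ranker, then an inverted one.
print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
print(pairwise_auc([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0]))
```

The quadratic loop is fine for a few hundred problems; a rank-based formulation would be the usual choice at larger scale.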


Systems and Infrastructure

| Project | What it does | Numbers |
| --- | --- | --- |
| inference-server | Continuous batching + paged KV-cache for GPT-2 from scratch (no vLLM). 3 backends benchmarked. Static batching underperforms naive serial under mixed-length traffic. | 2.91 req/s, 0 failures, SSE streaming |
| feature-store | Kafka ingestion with 15% reordering, 5% dupes, 10% late arrivals. 3-node Redis Cluster, hash-tag sharding, schema registry. Three-way consistency validation. | 0 mismatches / 800 checks, p95 4.8ms, 9,300 req/s |
| adaptive_agent | LangGraph 6-node state graph routing to Haiku 4.5 or Sonnet 4 via OpenRouter. Input guard (regex + LLM injection detection). Output guard (hallucination, completeness, format). | 98% routing accuracy, 28.2% cost reduction |
| recsys | SASRec on MovieLens-1M deployed on AWS EC2. Full-ranking eval exposes sampled metric inflation (6.23% vs 70.68% Hit@10). 93% popularity bias documented. | HR@10 78.49%, NDCG@10 58.11%, 8,366 req/s |
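For readers unfamiliar with the hash-tag sharding mentioned for feature-store: Redis Cluster maps each key to one of 16384 slots using CRC16 (XMODEM variant), and when a key contains a non-empty `{...}` section only that section is hashed, so related keys land on the same node. The following is a minimal sketch of that rule, not code from the repository; the key names are hypothetical.

```python
import binascii

NUM_SLOTS = 16384  # fixed slot count in Redis Cluster


def key_slot(key: str) -> int:
    """Return the cluster slot for a key, honoring the {hash tag} rule."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx implements CRC-CCITT (poly 0x1021); init 0 gives XMODEM.
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


# Hypothetical feature keys for one entity share a slot via the tag.
print(key_slot("{user:42}:clicks_7d") == key_slot("{user:42}:views_7d"))
```

Keeping all features of one entity under a shared tag is what makes multi-key reads for that entity possible without crossing slots.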

Research and Evaluation

| Project | What it found | Numbers |
| --- | --- | --- |
| early_detection | Activation probing on DeepSeek-R1-Distill-Qwen-7B predicts reasoning failure before behavioral signals exist. 200 AIME problems. | AUC 0.612 vs 0.445 at 150 tokens (p=0.001) |
| vlm-hallucination | LLaVA-1.5-7B vs InternVL2-8B across 4 domains. 6-category failure taxonomy. Complete capability absence, not gradual degradation. | 945 probes, chi-square p<0.0001 |
| llm-post-training-pipeline | SFT, reward model, DPO on LLaMA-3.2-1B. Diagnosed TRL bug causing negative KL divergence across 8 failed PPO runs. | +9pp factual (p=0.030), -16.7pp format (p=0.0003) |
| knowledge-agent | Belief graph from documents with cross-entity contradiction detection. MCP server exposing 5 tools. 2 embedding calls per document regardless of size. | 936 entities, 32 conflicts, 0 false positives |
| factuality-verification | Compared 3 fact-checking methods on 14,525 atomic facts. Calibration matters more than model choice. NLI threshold 0.50 to 0.10 improves F1 by +0.076. | F1 0.727, Precision 0.919 |
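The factuality-verification row reports that moving the NLI decision threshold changed F1 more than swapping models did. A threshold sweep of that kind can be sketched as below; the function names and the toy scores are assumptions for illustration and do not reproduce the project's data or results.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 when a fact counts as 'supported' if its score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest F1."""
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


# Toy entailment scores (hypothetical): many true facts score low,
# so a 0.50 cutoff misses them while a lower cutoff recovers them.
scores = [0.95, 0.60, 0.30, 0.15, 0.05, 0.40]
labels = [1, 1, 1, 1, 0, 0]
print(best_threshold(scores, labels, [0.10, 0.30, 0.50]))
```

In practice the threshold would be chosen on a held-out split so the reported F1 is not tuned on the evaluation facts.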

Low-Level Performance

| Project | What it demonstrated | Numbers |
| --- | --- | --- |
| cuda-attention-kernel | Naive vs tiled attention kernels on A100. Diagnosed why tiling underperforms theory: 40MB L2 cache masks benefit below seq_len=2048. Connects to Flash Attention design rationale. | 515 GFLOP/s tiled, ~145x over CPU |
| cpp-simd-quant | ARM NEON SIMD on Apple Silicon. Shows SIMD helps attention (11.1x) but not Black-Scholes (1.03x) because 89% of runtime is transcendental functions. Roofline analysis. | 31.88 GFLOP/s attention, 103.8M options/sec |
| sparse-factor-modeling | 9 LASSO solvers from scratch. Walk-forward backtest, no look-ahead bias. Novel finding: FISTA degrades at high sparsity. KKT-based factor ranking. | Sharpe 5.061, Spearman rho=0.906 |
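The roofline analysis mentioned for cpp-simd-quant rests on one bound: attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity (FLOPs per byte moved). A minimal sketch follows; the hardware figures in it are made-up round numbers, not measurements of an A100 or of Apple Silicon.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline bound: capped by compute or by memory traffic, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gb_s):
    """Arithmetic intensity (FLOPs/byte) above which a kernel is compute-bound."""
    return peak_gflops / bandwidth_gb_s


# Hypothetical machine: 100 GFLOP/s peak, 50 GB/s memory bandwidth.
PEAK, BANDWIDTH = 100.0, 50.0
for intensity in (0.5, 2.0, 8.0):
    bound = attainable_gflops(PEAK, BANDWIDTH, intensity)
    regime = "memory-bound" if intensity < ridge_point(PEAK, BANDWIDTH) else "compute-bound"
    print(f"{intensity} FLOPs/byte -> {bound} GFLOP/s ({regime})")
```

A kernel sitting left of the ridge point gains from reducing memory traffic (tiling, caching), while one to the right gains only from faster arithmetic.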

Agent Systems

| Project | What it does | Numbers |
| --- | --- | --- |
| code-memory-agent | Coding agent with persistent SQLite memory. SHA-256 staleness detection as non-bypassable gate. Indexes file purposes, symbols, cross-file dependencies. | 42.9% fewer file reads, 19 decision-reuse events |
| mindmirror | Real-time interview coach analyzing eye contact, facial expressions, speech, vocal patterns every 2 seconds. MediaPipe + faster-whisper + LangGraph. | ~1.2s full pipeline cycle, 6 behavioral states |
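The SHA-256 staleness gate described for code-memory-agent can be illustrated as follows: a remembered summary is stored together with the hash of the file it describes, and every read recomputes the hash before returning anything. This is a sketch under assumptions; the class and method names are hypothetical, and an in-memory dict stands in for the project's SQLite store.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file's current bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class FileMemory:
    """Remembered facts about files, served only while the file is unchanged."""

    def __init__(self):
        self._entries = {}  # path string -> (digest at index time, summary)

    def remember(self, path: Path, summary: str) -> None:
        self._entries[str(path)] = (file_digest(path), summary)

    def recall(self, path: Path):
        entry = self._entries.get(str(path))
        if entry is None:
            return None
        digest, summary = entry
        # The gate: every read re-hashes, so a stale entry is never returned.
        if not path.exists() or file_digest(path) != digest:
            del self._entries[str(path)]
            return None
        return summary


def demo():
    """Index a file, read it back, edit the file, then read again."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "utils.py"
        source.write_text("def add(a, b):\n    return a + b\n")
        memory = FileMemory()
        memory.remember(source, "arithmetic helpers")
        fresh = memory.recall(source)
        source.write_text("def add(a, b, c=0):\n    return a + b + c\n")
        stale = memory.recall(source)
        return fresh, stale


print(demo())
```

Placing the check inside `recall` rather than leaving it to callers is what makes the gate hard to bypass: there is no code path that returns a summary without re-hashing.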

Skills


Publications

| Paper | Venue | Year |
| --- | --- | --- |
| Stem Cell Reviews and Reports | Springer Nature | 2025 |
| Digital Forensics and Cybersecurity | Wiley-Scrivener | 2024 |
| Economic Perspectives | IGI Global | 2024 |
| YOLOv8 Traffic Sign Detection (80.64% acc) | IJSRA | 2024 |
| BERT Sentiment Analysis (F1 0.88) | J. Trends in CS | 2024 |

Certifications & Awards

Udacity Agentic AI Nanodegree · Jan 2026
Oracle OCI 2025 Certified AI Foundations Associate · Dec 2025
National Hackathon Best Demonstration Award (Team Leader) · Oriental Institute of Science & Technology, Bhopal · 2023
Dean's List + Best Student for Research Inclination · Woxsen University

36 repositories · 5 publications · Python, C++, CUDA, TypeScript

Pinned

  1. Fine-grained-Factual-Consistency-Evaluation-for-LLM (Public)

    Multi-stage factual accuracy pipeline: Mistral-7B generates, T5-Flan-Large decomposes into atomic facts, RoBERTa-Large-MNLI verifies via NLI. 400 ASQA samples. Entailment 28.7%, contradiction 11.2%…

    Jupyter Notebook

  2. Autonomous-Game-Playing-Agent-using-Deep-Reinforcement-Learning (Public)

    Board game agent learning strategy through PPO self-play over 5,000 episodes. 3-layer network (256->128->64), action masking for valid moves. Achieves 78% win rate vs random opponent, up from 56% a…

    Jupyter Notebook

  3. Multimodal-AI-Image-Caption-Audio-Generation-System (Public)

    Image-to-audio pipeline: BLIP generates captions, custom neural network synthesizes audio from visual features. 8,475 triplets. BLEU 0.3399, METEOR 0.4878, audio SNR 26.82 dB. Mel-spectrogram proc…

    Jupyter Notebook

  4. arXiv-Research-Paper-Recommendation-System (Public)

    Research paper discovery across 12,760+ papers using Universal Sentence Encoder embeddings, knowledge graphs, and Graph Convolutional Networks. Precision@5: 0.80, nDCG@5: 0.84, sub-2s response time…

    Jupyter Notebook

  5. Brain-tumor-detection-ConvNeXt (Public)

    Brain tumor classification from MRI scans into 42 categories (14 tumor types x 3 imaging modalities) using ConvNeXt Tiny with ImageNet transfer learning. 99.64% validation accuracy vs 92.86% Effici…

    Jupyter Notebook