meetbhanushali2k1 meet302001

Hi, I'm Meet Bhanushali

DevOps / SRE Engineer building toward AI Infrastructure. Based in New York City, NY.

I run production infra for a social trading platform — 30+ EC2 nodes, 10+ microservices, EKS, Prometheus/Grafana/Loki, Vault, APISIX. Lately I've been going deep on GPU serving, LLM inference, and Kubernetes at scale because that's where the interesting problems are.

current_role:    DevOps/MLOps @ Traderware Inc.
focus:           LLM inference infra · GPU workload scheduling · EKS at scale
learning:        CKA · KV-cache-aware autoscaling · vLLM internals
open_to:         AI Infrastructure roles (Anthropic, CoreWeave, Together AI, Baseten, Replicate)

Stack I Run in Production

Orchestration & Compute: EKS · Kubernetes · Helm · Karpenter · EC2 g5.xlarge (A10G) · GPU Operator

LLM & GPU Infra: vLLM · Mistral 7B · Flash Attention v2 · CUDA Graphs · DCGM Exporter · NVIDIA NCA-AIIO

Observability: Prometheus · Grafana · Loki · Vector.dev · Wazuh HIDS · cAdvisor · Node Exporter

Data & Storage: Qdrant · QuestDB · Redis · S3 · EBS

Security & Platform: HashiCorp Vault · Keycloak · APISIX · cert-manager · Semgrep · Trivy

CI/CD & IaC: GitHub Actions · Terraform · ArgoCD · Self-hosted runners · OIDC federation

What I'm Building

KV-Cache-Aware Pod Autoscaler for LLMs

Standard HPA scales on CPU and memory, which says little about LLM inference, where KV-cache pressure is the real bottleneck. I'm building a custom autoscaler that scales vLLM replicas on cache hit rate, GPU memory utilization, and queued requests. The goal: cut cold-start cost while protecting tail latency.

vLLM · Custom Metrics API · Prometheus Adapter · EKS
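To make the idea concrete, here is a minimal sketch of one possible scaling decision. It is not the project's controller. It borrows the HPA-style rule `desired = ceil(current * metric / target)`, takes the worst ratio across signals, and holds instead of shrinking when the cache hit rate is high, on the assumption that removing a replica would throw away warm cache. The `Signals` fields, the targets, the tolerance, and the hold threshold are all hypothetical values chosen for illustration; they are not vLLM metric names or tuned numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class Signals:
    """Per-replica averages, assumed to be scraped from Prometheus (hypothetical names)."""
    kv_cache_usage: float   # fraction of KV-cache in use, 0.0 to 1.0
    cache_hit_rate: float   # fraction of cache lookups that hit, 0.0 to 1.0
    queued_requests: float  # requests waiting per replica


def desired_replicas(
    current: int,
    s: Signals,
    *,
    target_cache_usage: float = 0.8,  # assumed target, not a recommendation
    target_queue: float = 4.0,        # assumed target, not a recommendation
    hold_hit_rate: float = 0.7,       # above this, do not scale down (assumption)
    tolerance: float = 0.1,           # ignore small deviations, as HPA does
    min_replicas: int = 1,
    max_replicas: int = 8,
) -> int:
    # The most pressured signal drives the decision.
    ratio = max(
        s.kv_cache_usage / target_cache_usage,
        s.queued_requests / target_queue,
    )

    # Close enough to target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current

    # Scaling down would evict warm cache; hold if the cache is earning its keep.
    if ratio < 1.0 and s.cache_hit_rate >= hold_hit_rate:
        return current

    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    hot = Signals(kv_cache_usage=0.96, cache_hit_rate=0.5, queued_requests=2.0)
    print(desired_replicas(2, hot))  # cache pressure above target, so this scales up
```

In a real deployment the same decision would more likely be expressed through the Custom Metrics API and Prometheus Adapter listed above than as a standalone function; the function form just keeps the logic easy to read and test.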

GPU Workload Manager

Priority-based job queueing + cost-aware scheduling for mixed inference/training workloads on a shared GPU fleet. Spot-aware fallback. Per-team quota enforcement.

Kubernetes · Karpenter · DCGM · Custom Scheduler
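As an illustration of the queueing and quota part only, here is a small sketch using a heap. It is an assumption about how such a queue could work, not the project's scheduler: the class name `GpuQueue`, the job fields, and the "lower number means more urgent" convention are all hypothetical, and cost-aware placement and spot fallback are left out entirely.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    priority: int                      # lower value = more urgent (assumed convention)
    seq: int                           # tie-breaker: first submitted wins
    name: str = field(compare=False)
    team: str = field(compare=False)
    gpus: int = field(compare=False)


class GpuQueue:
    """Hypothetical priority queue with per-team GPU quotas."""

    def __init__(self, free_gpus: int, team_quota: dict[str, int]):
        self.free_gpus = free_gpus
        self.team_quota = dict(team_quota)
        self.in_use = {team: 0 for team in team_quota}
        self._heap: list[Job] = []
        self._seq = itertools.count()

    def submit(self, name: str, team: str, gpus: int, priority: int) -> None:
        heapq.heappush(self._heap, Job(priority, next(self._seq), name, team, gpus))

    def pending(self) -> list[str]:
        return [job.name for job in sorted(self._heap)]

    def schedule(self) -> list[str]:
        """Admit jobs in priority order; jobs that do not fit stay queued."""
        admitted, skipped = [], []
        while self._heap:
            job = heapq.heappop(self._heap)
            used = self.in_use.get(job.team, 0)
            within_quota = used + job.gpus <= self.team_quota.get(job.team, 0)
            if within_quota and job.gpus <= self.free_gpus:
                self.free_gpus -= job.gpus
                self.in_use[job.team] = used + job.gpus
                admitted.append(job.name)
            else:
                skipped.append(job)
        for job in skipped:  # put blocked jobs back for the next pass
            heapq.heappush(self._heap, job)
        return admitted


if __name__ == "__main__":
    q = GpuQueue(free_gpus=4, team_quota={"search": 2, "research": 4})
    q.submit("train-a", team="research", gpus=2, priority=5)
    q.submit("serve-a", team="search", gpus=1, priority=0)
    q.submit("serve-b", team="search", gpus=2, priority=1)
    print(q.schedule(), q.pending())
```

Note the design choice: a blocked high-priority job does not stop lower-priority jobs behind it from being admitted. That backfill keeps GPUs busy but can starve large jobs, so a real scheduler would likely need reservations or aging on top.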

LLM Inference & Eval Platform on EKS

End-to-end: vLLM serving Mistral 7B → RAGAS evaluation → MLflow tracking → ArgoCD GitOps. Models ship the same way services do.

vLLM · RAGAS · MLflow · ArgoCD · EKS

Repos go live as I finish each milestone. Pinned below as they ship.

Certifications

GitHub

Always down to talk GPU schedulers, inference serving, or why your Prometheus is OOMing again.
