Most ML models never leave the server. I specialize in getting them to the edge — and in building agent systems that keep working while you sleep.
⚡ Edge AI & Inference Optimization
ONNX export, FP32-to-INT8 quantization, latency benchmarking, and CPU-only deployment on constrained hardware. Grounded in production edge work with Sony IMX500.
🤖 Agentic AI Systems
Multi-agent orchestration with CrewAI, human-in-the-loop pipelines, persistent memory, and structured output extraction. Built for reliability and production readiness — not just demos.
End-to-end edge deployment: EfficientNetV2-S trained in PyTorch, exported to ONNX, quantized to INT8.
233 MB → 22 MB · 2× faster inference · 98% accuracy on CPU-only hardware.
Full MLOps stack: DVC for data versioning, W&B for experiment tracking, Flask deployed on Render.
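For readers new to INT8 quantization, here is a minimal pure-Python sketch of the underlying arithmetic. It is an illustration only, not the project's code: the function names, the sample weights, and the choice of symmetric per-tensor scaling are all assumptions, and it does not attempt to reproduce the size or accuracy figures above.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): floats -> ints in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed 8-bit range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float values."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
weights = [0.5, -1.27, 0.0, 0.9, -0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per FP32 value versus 1 byte per INT8 value.
fp32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = q.itemsize * len(q)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(list(q), fp32_bytes, int8_bytes, max_error)
```

Each weight drops from 4 bytes to 1, at the cost of a rounding error of at most half a quantization step per value.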
5-agent research pipeline (planner → crawler → extractor → synthesizer → reporter) built with CrewAI.
Human-in-the-loop steering after each round, persistent memory across runs, and citation integrity enforced at every stage — every claim traceable to a source URL.
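One way to picture "citation integrity enforced at every stage" is a check that runs after each stage and drops any claim without a usable source URL. The sketch below is a simplified stand-in written in plain Python, not the CrewAI implementation: `Claim`, `has_valid_source`, `enforce_citations`, `run_stage`, the toy `synthesizer` stage, and the example.com URL are all hypothetical names used for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Claim:
    """Hypothetical record: one statement plus the URL it was taken from."""
    text: str
    source_url: str


def has_valid_source(claim):
    """A claim counts as sourced only if its URL has an http(s) scheme and a host."""
    parsed = urlparse(claim.source_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def enforce_citations(claims):
    """Split claims into (kept, rejected) by whether each carries a usable source URL."""
    kept = [c for c in claims if has_valid_source(c)]
    rejected = [c for c in claims if not has_valid_source(c)]
    return kept, rejected


def run_stage(stage, claims):
    """Run one stage, then re-check its output so unsourced claims cannot pass downstream."""
    return enforce_citations(stage(claims))


def synthesizer(claims):
    """Toy stage that (incorrectly) appends a sentence with no source."""
    return claims + [Claim("Unsourced summary sentence.", "")]


claims = [
    Claim("INT8 weights use one byte each.", "https://example.com/quantization"),
    Claim("A claim with a malformed reference.", "not-a-url"),
]
kept, rejected = enforce_citations(claims)
kept_after, rejected_after = run_stage(synthesizer, kept)
print(len(kept), len(rejected), len(kept_after), len(rejected_after))
```

Running the check after every stage, rather than once at the end, means a stage that introduces an unsourced sentence is caught at the point where it happened.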
Open to freelance and consulting in edge AI, computer vision, and agentic pipelines — architecture reviews, scoped projects, or technical advisory.
