Building production LLM systems: multi-agent orchestration · RAG pipelines · LLMOps on AWS Bedrock
Multi-Agent Incident Response (Bedrock + LangGraph): Supervisor-agent architecture with 4 specialized sub-agents (log parser · k8s events · metrics correlator · runbook retriever). Reduced mean time to diagnose from ~18 min to under 4 min across 50 test scenarios, with 87% correct root cause on first pass. AgentCore Gateway serves as the MCP tool server, with IAM auth + VPC isolation. Full OpenTelemetry instrumentation: token usage, tool call latency, and cost-per-invocation exported to Grafana + LangSmith.
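The supervisor pattern above can be illustrated with a minimal sketch. It assumes each sub-agent is an async callable and that the supervisor simply fans the incident out to all four concurrently; the actual routing logic, the LangGraph graph definition, and the Bedrock calls are not shown here. `run_sub_agent`, `Finding`, and the incident string are hypothetical names for illustration only.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str
    summary: str
    latency_s: float  # wall-clock time spent inside the sub-agent


# Names mirror the four sub-agents listed above.
SUB_AGENTS = ["log_parser", "k8s_events", "metrics_correlator", "runbook_retriever"]


async def run_sub_agent(name: str, incident: str) -> Finding:
    """Hypothetical stand-in: a real sub-agent would call a model and its tools."""
    start = time.perf_counter()
    await asyncio.sleep(0)  # placeholder for model/tool I/O
    summary = f"{name}: no anomalies found for '{incident}'"
    return Finding(name, summary, time.perf_counter() - start)


async def supervise(incident: str) -> list[Finding]:
    """Fan the incident out to every sub-agent concurrently and collect findings."""
    tasks = [run_sub_agent(name, incident) for name in SUB_AGENTS]
    return await asyncio.gather(*tasks)  # results keep the order of SUB_AGENTS


findings = asyncio.run(supervise("checkout-api 5xx spike"))
for f in findings:
    print(f.agent, f"{f.latency_s * 1000:.2f} ms")
```

Recording per-agent latency on each `Finding` is one simple way to get a tool-call latency figure that a tracing layer could later export.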
Enterprise RAG Copilot (Bedrock + OpenSearch + RAGAS): Hybrid search over 200+ technical runbooks: BM25 + dense vector fusion via RRF, section-aware semantic chunking (400-token chunks, 30-token overlap), and cross-encoder reranking. Achieved 0.91 context precision · 0.88 answer relevancy on RAGAS. Automated evaluation pipeline: Lambda-triggered on every KB update, with a regression gate that blocks deployment on a metric drop of more than 5%. p95 response under 2.8 seconds end-to-end.
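For readers unfamiliar with RRF, the fusion step can be sketched as follows. This is a generic reciprocal rank fusion, not the project's code: the constant `k = 60` is the commonly used default and an assumption here, and the document IDs are made up for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from each retriever.
bm25_hits = ["rb-12", "rb-07", "rb-31"]
dense_hits = ["rb-07", "rb-44", "rb-12"]

fused = rrf_fuse([bm25_hits, dense_hits])
print(fused)
```

Because RRF uses only ranks, it avoids having to normalize BM25 scores against vector similarity scores, which live on different scales.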
GenAI: Amazon Bedrock · AgentCore · LangGraph · LangChain · MCP · Prompt Engineering · Bedrock Guardrails · Structured Outputs · AI Agents
Retrieval: OpenSearch · pgvector · Hybrid Search (BM25 + Dense) · RRF · Cross-Encoder Reranking · Semantic Chunking · Embedding Pipelines
Evaluation: RAGAS · DeepEval · LangSmith · Automated Deployment Gating
Observability: OpenTelemetry · Grafana · CloudWatch · LangSmith
AWS: Bedrock · AgentCore · Lambda · S3 · EC2 · ECS · API Gateway · IAM · VPC · Terraform
Dev: Python · asyncio · FastAPI · Pydantic · Docker · CI/CD · REST APIs · Bash · Git
| Project | What it is | Stack |
|---|---|---|
| argus-sre-agent | Multi-agent SRE platform — supervisor + 4 sub-agents for autonomous root cause analysis | Bedrock · LangGraph · OTel · Pydantic |
| india-equity-rag | Financial research assistant over Indian corporate annual reports (Reliance · HDFC · TCS) | Bedrock · OpenSearch · RAGAS · FastAPI |
| rag-copilot-eval | Production RAG evaluation pipeline — RAGAS + DeepEval + automated regression gating | Python · RAGAS · DeepEval · Lambda |
3 years AWS infrastructure (CloudAge Global Services) → Self-directed GenAI transition (2022–2025, PG cert AI/ML, Great Learning 2023) → GenAI Engineer (AllOps Technologies, Oct 2025–present)
Cloud foundation: provisioned and managed AWS infrastructure for 8+ enterprise clients, covering 100 GB+ daily workloads, a 99.7% uptime SLA, Terraform automation (70% reduction in manual deployment effort), and SOC 2 Type II audit preparation.
Building production AI systems in Pune. Open to remote-first roles and relocation.
piyush.chau.ai@outlook.com · LinkedIn
Contribution philosophy
I push to public repos when the implementation has something worth showing — a specific architectural decision, a working evaluation pipeline, a retrieval approach with documented metrics. Not volume for the sake of activity graphs.
