AI-Native Observability Architect · Full-Stack Engineer (Cloud to On-Prem) · OSS Builder · Extreme Executor
I design and build systems that observe, explain, and fix themselves, from kernel-level instrumentation all the way up to the AI agents reasoning over the signals.
I work at the intersection of:
- AI-native systems: agents, RAG, autonomous reasoning
- Software & Systems Architecture: clean architecture, distributed systems at scale
- Observability: the full LGTM stack, deep OpenTelemetry, zero-code instrumentation
- Kubernetes & Cloud: cloud-native and bare-metal on-prem
- Autonomous Root Cause Analysis: incidents that explain themselves
I'm a real full-stack engineer (backend, infra, AI, and the glue in between), and I lead engineering at Elven Works. I build production-grade systems with obsessive attention to reliability, performance, and clarity.
- Architect and lead AI-powered observability platforms end to end
- Build zero-code instrumentation layers for Lambda, Node.js, .NET, Python, and React Native
- Design autonomous root cause analysis systems backed by knowledge graphs
- Engineer distributed microservices at scale (Go + Python, clean architecture)
- Create OSS tooling around OpenTelemetry, logs, traces, and metrics
- Run Kubernetes from managed cloud (EKS/AKS) down to on-prem (RKE2, bare metal)
- Turn chaos into structured, actionable signals
- AI for Observability & Autonomous Root Cause Analysis
- OpenTelemetry deep instrumentation (eBPF, tail sampling, cardinality control)
- Kubernetes: cloud-native and on-prem
- LGTM Stack (Loki, Grafana, Tempo, Mimir) + OpenObserve
- AI-native backend systems (multi-agent orchestration, Temporal workflows)
- High-performance, zero-code Lambda instrumentation
- Vector search & embeddings (Qdrant, RAG, MiniLM)
- Knowledge graphs for systems reasoning (Neo4j)
- Infrastructure as Code & platform engineering
Languages
Go · Python · TypeScript · Node.js · Lua
AI / ML
LLMs · RAG · Vector DBs · Qdrant · vLLM · Temporal · Multi-Agent Systems
Observability
OpenTelemetry · Grafana · Loki · Tempo · Mimir · OpenObserve · Beyla / eBPF · Zabbix · k6
Cloud & Infra
AWS · EKS · AKS · Kubernetes · RKE2 · On-Prem / Bare Metal · Terraform · Kong · Keycloak · NATS · Cloudflare
Data
Neo4j · MongoDB · Redis · Kafka
- Elven Observability: AI-powered, LGTM-based observability SaaS (Datadog-quality, OSS core)
- Sentinel: AIOps platform (microservices fleet, AI agents, RAG, Temporal workflows, time-series ML)
- Autonomous RCA Engine: hypothesis-driven incident investigation over Neo4j evidence graphs
- Elven Connect: secure reverse-tunnel datasource connectivity (gRPC / mTLS)
- Kyrvex: secrets management platform
- Zero-Code Telemetry: multi-language log/trace collectors (Go, .NET, JS, Python, React Native)
- Kubernetes Copilot: AI-assisted cluster interaction
- Extreme focus: I go deep, full-stack, top to bottom.
- I don't fear complexity; I instrument it.
- I build what I wish existed, then ship it.
- I solve the problems most people avoid.
- I lead by building alongside the team, not just above it.
To make systems explain themselves. To build AI-native infrastructure that reduces operational anxiety. To turn observability into intelligence.
- AI-native observability platform with autonomous SRE assistants
- Observability + Security convergence
- Zero-code telemetry for everything
- Knowledge-graph-driven reasoning over live systems
I'm married, I have two kids, two cats, and so many plants that I basically run Kubernetes at home too, just with less documentation and more crying.
"Complexity is inevitable. Confusion is optional."