pillaiharish/README.md

Harishkumar Pillai

Backend/security engineer building AI infrastructure, LLM serving labs, local LLM tooling, RAG systems, and agent workflows.

I build practical systems around local and production-style AI workflows: serving experiments, verification tools, retrieval pipelines, agent orchestration, and backend/security automation. My work sits at the intersection of infrastructure, reliability, security, observability, and developer tooling.

Featured projects

  • llm-serving-performance-lab: vLLM serving and benchmarking lab for GPU inference, with TTFT/ITL tracking, concurrency and length sweeps, raw evidence, plots, reports, and clear caveats around what the numbers do and do not prove.
  • chakra-vault: Safe local LLM model verification and download tooling focused on SHA-256 checks, safe writes, provider-neutral planning, restore testing, and recovery workflows.
  • opencode-ollama-steroids: Receipt-first OpenCode + Ollama multi-agent workflow with builder/reviewer agents, validation gates, redaction tooling, local session receipts, and a human commit boundary.
  • harish-llm-wiki: Static LLM learning wiki pipeline for transcripts, articles, notes, citation-aware chunks, search, retrieval experiments, and knowledge graph views.
  • LLM-AI-Agents-Learning-Journey: Learning and research repo covering LLMs, AI agents, tokenization, AI security, optimization, and hands-on code.
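For readers unfamiliar with the TTFT/ITL metrics mentioned for llm-serving-performance-lab, here is a minimal sketch of how they can be derived from token arrival timestamps. It is not code from that repository; the function name `ttft_and_itl` and the sample timestamps are hypothetical and exist only for illustration.

```python
def ttft_and_itl(request_start: float, token_times: list[float]) -> tuple[float, list[float]]:
    """Return (time to first token, inter-token latencies), all in seconds.

    request_start: when the request was sent (hypothetical clock reading).
    token_times:   arrival time of each streamed token, in order.
    """
    if not token_times:
        raise ValueError("no tokens were received")
    # TTFT: delay between sending the request and receiving the first token.
    ttft = token_times[0] - request_start
    # ITL: gap between each pair of consecutive tokens.
    itl = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return ttft, itl


# Illustrative timestamps only, not measured data.
ttft, itl = ttft_and_itl(0.0, [0.25, 0.5, 0.75, 1.25])
print(f"TTFT={ttft:.2f}s, ITL={itl}")
```

Keeping the per-token gaps as a list, rather than only a mean, makes it possible to report percentiles later, which matters when a few slow tokens dominate perceived latency.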

Current focus

  • vLLM benchmarking and serving-performance interpretation
  • RTX 5070 Ti local inference experiments
  • LLM model backup, verification, and restore tooling
  • OpenCode/Ollama agent workflows with review and validation gates
  • RAG and knowledge graph systems for personal knowledge workflows
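As a rough illustration of the verification and safe-write ideas in the list above, the sketch below hashes a file in chunks and writes data only after its SHA-256 matches an expected value, using a temporary file plus an atomic rename. It assumes nothing about how chakra-vault is actually implemented; `sha256_of` and `safe_write` are hypothetical helper names.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_write(dest: Path, data: bytes, expected_sha256: str) -> None:
    """Write data to dest only if its hash matches, via temp file and atomic rename."""
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("SHA-256 mismatch; refusing to write")
    # Create the temp file in the destination directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=dest.parent)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Atomic swap: readers see either the old file or the complete new one.
        os.replace(tmp, dest)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Checking the hash before touching the destination means a corrupted download never overwrites a known-good copy.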

Technical stack

  • Languages: Go, Python, Shell, SQL
  • AI/LLM: vLLM, Ollama, llama.cpp, RAG, local LLMs
  • Platform: Docker, Kubernetes, GitHub Actions, Jenkins
  • Data and observability: Prometheus, Grafana, ClickHouse, MySQL, MongoDB
  • Engineering: DNS/security, backend services, CI/CD

GitHub snapshot

[Images: GitHub profile details, GitHub stats, top repository languages]

Writing

I keep a small writing trail on Medium, mostly around AI infrastructure, local LLM workflows, Go, Kubernetes, CI/CD, DNS/security, and practical engineering notes.

Connect

Pinned

  1. llm-serving-performance-lab (Public)

    Production-style LLM serving performance lab: vLLM, GPU inference, TTFT/ITL benchmarks, observability, RAG/agent evaluation.

    Python

  2. chakra-vault (Public)

    Never redownload a 600GB model because your cache broke. Chakra Vault verifies, cold-stores, restore-tests, and monitors your local LLM collection.

    Python

  3. opencode-ollama-steroids (Public)

    OpenCode + Ollama multi-agent workflow with builder/reviewer agents, headless runs, skills, validation gates, local session evidence, and redaction tools.

    Python

  4. harish-llm-wiki (Public)

    LLM wiki for tracking old and new concepts.

    Python

  5. LLM-AI-Agents-Learning-Journey (Public)

    This repository is a continuously evolving collection of blogs, hands-on code, and research on Large Language Models (LLMs), AI Agents, tokenization techniques, AI security risks, and optimization …

    Python · 5 stars · 1 fork