kubeflow · abhijeet-dhumal · Apr 9, 2026 · May 12, 2026
diff --git a/.claude/settings.json b/.claude/settings.json
@@ -0,0 +1,3 @@
+{
+  "enabledMcpjsonServers": ["kubeflow"]
+}
diff --git a/.cursor/mcp.json b/.cursor/mcp.json
@@ -0,0 +1,3 @@
+{
+  "mcpServers": {}
+}
diff --git a/.cursor/rules/kubeflow.mdc b/.cursor/rules/kubeflow.mdc
@@ -0,0 +1,13 @@
+---
+description: Kubeflow MCP server — training tools and workflows
+globs:
+  - "**/*"
+---
+
+# Kubeflow MCP
+
+This repo implements the **Kubeflow MCP server** (`kubeflow-mcp serve`). Prefer MCP tools for cluster checks, training job lifecycle, and previews before mutating (`confirmed=false` then user confirmation).
+
+- CLI: `uv run kubeflow-mcp serve` (stdio default). See `README.md` for HTTP/SSE and auth.
+- Interactive local agent: `uv run kubeflow-mcp agent --provider ollama` (install `uv sync --extra agents-ollama`; use `--extra agents` for all backends; default `--mode full`; `--mode progressive` for meta-tools).
+- Design: `docs/design/agent-provider-architecture.md`.
diff --git a/.mcp.json b/.mcp.json
@@ -1,9 +1,8 @@
 {
   "mcpServers": {
-    "kubeflow-mcp-dev": {
+    "kubeflow": {
       "command": "uv",
-      "args": ["run", "kubeflow-mcp", "serve"],
-      "env": {}
+      "args": ["run", "kubeflow-mcp", "serve"]
     }
   }
 }
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,7 @@
+# Kubeflow MCP (this repository)
+
+- **MCP server**: `uv run kubeflow-mcp serve` — exposes Kubeflow training tools over MCP (stdio by default).
+- **Local agent** (optional LLM): `uv sync --extra agents-ollama` then `uv run kubeflow-mcp agent --provider ollama --model qwen3:8b` (default `--mode full`; add `--mode progressive` to match smaller tool schemas like `serve --mode progressive`). Use `--extra agents` for Ollama + LiteLLM together.
+- **Docs**: `README.md`, `ROADMAP.md`, `docs/design/agent-provider-architecture.md`.
+
+Use MCP tools instead of guessing kubectl; respect preview-before-submit (`confirmed` flags on mutating tools).
diff --git a/README.md b/README.md
@@ -28,6 +28,21 @@ cd mcp-server
 pip install .
 ```
 
+### Optional: interactive `kubeflow-mcp agent` (local LLM)
+
+Extras are split so you do not need LiteLLM for the Ollama agent:
+
+| Extra | Installs |
+|-------|----------|
+| `agents-ollama` | LlamaIndex + Ollama + Rich (default for `--provider ollama`) |
+| `agents-litellm` | LiteLLM + Rich (`--provider litellm`) |
+| `agents` | Both (convenience) |
+
+```bash
+uv sync --extra agents-ollama
+# or: pip install 'kubeflow-mcp[agents-ollama]'
+```
+
 ### Run the server
 
 ```bash
@@ -157,10 +172,10 @@ Without auth configured, the server logs a warning that the HTTP endpoint is ope
 
 ```bash
 kubeflow-mcp agent \
-  --backend ollama \              # ollama (default; more backends planned)
-  --model qwen3:8b \              # model name for the backend
-  --mode full \                   # full | progressive | semantic
-  --thinking                      # enable thinking output (supported models)
+  --provider ollama \             # ollama | litellm (entry-point registry)
+  --model qwen3:8b \              # model name (provider default if omitted)
+  --mode full \                   # full (default) | progressive | semantic — same as serve --mode
+  --thinking                      # enable thinking output (ollama; supported models)
 ```
 
 </details>

diff --git a/docs/design/agent-provider-architecture.md b/docs/design/agent-provider-architecture.md
@@ -0,0 +1,83 @@
+# Agent provider architecture
+
+## Summary
+
+This document proposes a **pluggable agent provider** model for the `kubeflow-mcp agent` CLI. Providers register via Python `importlib.metadata` entry points (`kubeflow_mcp.providers`), implement a small `AgentProvider` protocol, and ship optional dependencies behind extras: **`agents-ollama`** (LlamaIndex + Ollama + Rich), **`agents-litellm`** (LiteLLM + Rich), or **`agents`** for both.
+
+## Two execution planes
+
+1. **MCP plane (vision-primary):** `kubeflow-mcp serve` → [`create_server`](../../kubeflow_mcp/core/server.py) → FastMCP tools with `_audit_wrap`, persona, policy, optional progressive/semantic meta-tools from [`core/dynamic_tools`](../../kubeflow_mcp/core/dynamic_tools.py).
+2. **CLI agent plane (dev / convenience):** `kubeflow-mcp agent` → provider → LlamaIndex + same trainer callables; progressive/semantic use the **same** `core.dynamic_tools` implementation after `agents/dynamic_tools` calls `init_dynamic_tools`. Instructions and short tool descriptions align with `serve` via [`build_agent_instruction_text`](../../kubeflow_mcp/core/server.py) / [`get_merged_client_tool_descriptions`](../../kubeflow_mcp/core/server.py) (default trainer + health, persona `readonly`).
+
+**`--mode`:** `serve` and `agent` both accept `full` | `progressive` | `semantic` with the **same** meaning; default is **`full`** for both (use `progressive` or `semantic` when you need smaller tool schemas). **Note:** `serve -m` is tool mode; `agent -m` is **`--model`**, not tool mode—use `agent --mode …` for tool mode.
+
+**Registry note:** `init_dynamic_tools` mutates global state in `core.dynamic_tools`, so avoid loading the server and in-process agent meta-tools in the same process unless you expect the registry to be re-initialized.
+
+## Motivation
+
+- Agent code previously lived under `src/kubeflow_mcp/agents/`, **outside** the wheel package, so imports only worked in editable installs.
+- The CLI hard-coded `--backend ollama`, forcing code changes for every new backend.
+- Community contributors need a clear pattern (protocol + entry point + example script).
+
+## Goals
+
+- Move agents into `kubeflow_mcp/agents/` as part of the published package.
+- Dynamic discovery: `kubeflow-mcp agent --provider <name>`.
+- Documented protocol and reference implementations: Ollama (full tools), LiteLLM (minimal REPL).
+
+## Non-goals
+
+- Changing MCP server transport or core tool registration.
+- Mandating a single LLM stack (providers remain optional extras: `agents-ollama`, `agents-litellm`, or `agents`).
+
+## Proposal
+
+### Protocol
+
+```text
+kubeflow_mcp/agents/base.py → AgentProvider
+  name: str
+  default_model: str
+  requires: list[str]   # pip package names for error messages
+  run(self, model: str, mode: str, **kwargs) -> None
+```
+
+### Entry points
+
+Registered in `pyproject.toml`:
+
+```toml
+[project.entry-points."kubeflow_mcp.providers"]
+ollama = "kubeflow_mcp.agents.ollama:OllamaProvider"
+litellm = "kubeflow_mcp.agents.litellm_provider:LiteLLMProvider"
+```
+
+### CLI
+
+```bash
+kubeflow-mcp agent --provider ollama --model qwen3:8b --mode full
+kubeflow-mcp agent --provider ollama --model qwen3:8b --mode progressive
+kubeflow-mcp agent --provider litellm --model gpt-4o-mini
+```
+
+Optional: `--url` sets the Ollama base URL; `--thinking` enables thinking output on models that support it (Ollama only).
+
+### Providers (status)
+
+| Provider  | Status  | Notes                                    |
+|-----------|---------|------------------------------------------|
+| `ollama`  | Shipped | LlamaIndex + Kubeflow tools, all modes   |
+| `litellm` | Minimal | LiteLLM chat loop; tool parity TBD       |
+
+## Implementation plan
+
+1. Consolidate `src/kubeflow_mcp/agents/` into `kubeflow_mcp/agents/`.
+2. Add `base.py`, entry points, CLI refactor.
+3. Split Ollama REPL helpers to satisfy complexity limits.
+4. Add `examples/agents/` and optional `examples/deployment/litellm-gateway/` notes.
+
+## References
+
+- [Model Context Protocol](https://modelcontextprotocol.io/)
+- [Python importlib.metadata entry points](https://docs.python.org/3/library/importlib.metadata.html#entry-points)
+- [LlamaIndex FunctionAgent](https://docs.llamaindex.ai/)
diff --git a/examples/agents/README.md b/examples/agents/README.md
@@ -0,0 +1,28 @@
+# Example agent runners
+
+Use the main CLI (recommended):
+
+```bash
+uv sync --extra agents-ollama
+uv run kubeflow-mcp agent --provider ollama --model qwen3:8b
+# Smaller tool schema (same as serve --mode progressive):
+uv run kubeflow-mcp agent --provider ollama --model qwen3:8b --mode progressive
+```
+
+LiteLLM provider (separate extra):
+
+```bash
+uv sync --extra agents-litellm
+uv run kubeflow-mcp agent --provider litellm --model gpt-4o-mini
+```
+
+All agent backends (Ollama + LiteLLM): `uv sync --extra agents`.
+
+Alternatively, run the thin wrappers in this directory (same behavior; no explicit `PYTHONPATH` is needed when the package is installed):
+
+```bash
+uv run python examples/agents/ollama/run.py --model qwen3:8b
+uv run python examples/agents/ollama/run.py --model qwen3:8b --mode progressive
+```
+
+See each subfolder `README.md` for provider-specific notes.
diff --git a/examples/agents/litellm/README.md b/examples/agents/litellm/README.md
@@ -0,0 +1,9 @@
+# LiteLLM example
+
+Install `uv sync --extra agents-litellm` (or `agents`), set `OPENAI_API_KEY` (or other provider env vars), then:
+
+```bash
+uv run python examples/agents/litellm/run.py --model gpt-4o-mini
+```
+
+This is a minimal chat loop. For full Kubeflow tool calling, use `--provider ollama` today.
diff --git a/examples/agents/litellm/run.py b/examples/agents/litellm/run.py
@@ -0,0 +1,22 @@
+#!/usr/bin/env python3
+# Copyright 2026 The Kubeflow Authors.
+#
+# SPDX-License-Identifier: Apache-2.0
+"""Thin wrapper around :class:`kubeflow_mcp.agents.litellm_provider.LiteLLMProvider`."""
+
+from __future__ import annotations
+
+import argparse
+
+from kubeflow_mcp.agents.litellm_provider import LiteLLMProvider
+
+
+def main() -> None:
+    p = argparse.ArgumentParser(description="Run the LiteLLM chat loop")
+    p.add_argument("--model", default=LiteLLMProvider.default_model)
+    args = p.parse_args()
+    LiteLLMProvider().run(model=args.model, mode="full")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/agents/ollama/README.md b/examples/agents/ollama/README.md
@@ -0,0 +1,7 @@
+# Ollama example
+
+Requires `uv sync --extra agents-ollama` (or `agents` for all backends) and a running `ollama serve`.
+
+```bash
+uv run python examples/agents/ollama/run.py --model qwen3:8b --mode full
+```
diff --git a/examples/agents/ollama/run.py b/examples/agents/ollama/run.py
@@ -0,0 +1,34 @@
+#!/usr/bin/env python3
+# Copyright 2026 The Kubeflow Authors.
+#
+# SPDX-License-Identifier: Apache-2.0
+"""Thin wrapper around :class:`kubeflow_mcp.agents.ollama.OllamaProvider`."""
+
+from __future__ import annotations
+
+import argparse
+
+from kubeflow_mcp.agents.ollama import DEFAULT_MODEL, DEFAULT_URL, OllamaProvider
+
+
+def main() -> None:
+    p = argparse.ArgumentParser(description="Run the Ollama Kubeflow agent")
+    p.add_argument("--model", default=DEFAULT_MODEL)
+    p.add_argument("--url", default=DEFAULT_URL)
+    p.add_argument(
+        "--mode",
+        default="full",
+        choices=["full", "progressive", "semantic", "static", "mcp"],
+    )
+    p.add_argument("--thinking", action="store_true")
+    args = p.parse_args()
+    OllamaProvider().run(
+        model=args.model,
+        mode=args.mode,
+        url=args.url,
+        thinking=args.thinking,
+    )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/deployment/litellm-gateway/README.md b/examples/deployment/litellm-gateway/README.md
@@ -0,0 +1,5 @@
+# LiteLLM gateway (reference)
+
+Deploy [LiteLLM Proxy](https://docs.litellm.ai/docs/proxy/deploy) in front of your model vendors when you want unified routing, rate limits, and audit logs before agents or IDE clients call upstream LLMs.
+
+This folder is reserved for future Helm/Kustomize or compose snippets aligned with Kubeflow MCP deployment guides. No manifests are committed yet.
diff --git a/kubeflow_mcp/agents/__init__.py b/kubeflow_mcp/agents/__init__.py
@@ -0,0 +1,45 @@
+# Copyright 2026 The Kubeflow Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Pluggable CLI agents for Kubeflow MCP.
+
+Heavy providers (Ollama, LiteLLM) are loaded lazily via :func:`__getattr__`
+so ``import kubeflow_mcp.agents`` does not require optional dependencies.
+"""
+
+from kubeflow_mcp.agents.base import AgentProvider
+
+__all__ = [
+    "AgentProvider",
+    "LiteLLMProvider",
+    "OllamaAgent",
+    "OllamaProvider",
+]
+
+
+def __getattr__(name: str):
+    if name == "OllamaProvider":
+        from kubeflow_mcp.agents.ollama import OllamaProvider as _OllamaProvider
+
+        return _OllamaProvider
+    if name == "OllamaAgent":
+        from kubeflow_mcp.agents.ollama import OllamaAgent as _OllamaAgent
+
+        return _OllamaAgent
+    if name == "LiteLLMProvider":
+        from kubeflow_mcp.agents.litellm_provider import LiteLLMProvider as _LiteLLMProvider
+
+        return _LiteLLMProvider
+    msg = f"module {__name__!r} has no attribute {name!r}"
+    raise AttributeError(msg)
diff --git a/kubeflow_mcp/agents/base.py b/kubeflow_mcp/agents/base.py
@@ -0,0 +1,30 @@
+# Copyright 2026 The Kubeflow Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Agent provider protocol for pluggable LLM backends."""
+
+from typing import Any, Protocol, runtime_checkable
+
+
+@runtime_checkable
+class AgentProvider(Protocol):
+    """Contract for `kubeflow_mcp.providers` entry-point implementations."""
+
+    name: str
+    default_model: str
+    requires: list[str]
+
+    def run(self, model: str, mode: str, **kwargs: Any) -> None:
+        """Start the interactive agent (blocking)."""
+        ...
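To illustrate the contract defined in `base.py`, here is a minimal sketch of a class that satisfies `AgentProvider` structurally, without inheriting from it. The protocol is copied locally so the example runs standalone. `EchoProvider` and its `prompts` and `out` keyword arguments are hypothetical and exist only to keep the demo non-interactive.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class AgentProvider(Protocol):
    """Local copy of the protocol from the diff, so the sketch runs standalone."""

    name: str
    default_model: str
    requires: list[str]

    def run(self, model: str, mode: str, **kwargs: Any) -> None: ...


class EchoProvider:
    """Hypothetical provider: echoes prompts instead of calling an LLM."""

    name = "echo"
    default_model = "none"
    requires: list[str] = []

    def run(self, model: str, mode: str, **kwargs: Any) -> None:
        # `prompts` and `out` are made-up kwargs for this demo only.
        out = kwargs.get("out", print)
        out(f"[{self.name}] model={model} mode={mode}")
        for prompt in kwargs.get("prompts", ()):
            out(f"echo: {prompt}")


if __name__ == "__main__":
    provider = EchoProvider()
    # Structural check: passes because all protocol members are present.
    print(isinstance(provider, AgentProvider))
    provider.run(model=provider.default_model, mode="full", prompts=["list jobs"])
```

Note that `isinstance` against a `runtime_checkable` protocol only verifies that the members exist. It does not validate the signature of `run`, so a loader relying on this check would still want tests for each provider.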