3 changes: 3 additions & 0 deletions .claude/settings.json
@@ -0,0 +1,3 @@
{
  "enabledMcpjsonServers": ["kubeflow"]
}
3 changes: 3 additions & 0 deletions .cursor/mcp.json
@@ -0,0 +1,3 @@
{
  "mcpServers": {}
}
13 changes: 13 additions & 0 deletions .cursor/rules/kubeflow.mdc
@@ -0,0 +1,13 @@
---
description: Kubeflow MCP server — training tools and workflows
globs:
- "**/*"
---

# Kubeflow MCP

This repo implements the **Kubeflow MCP server** (`kubeflow-mcp serve`). Prefer MCP tools for cluster checks, training job lifecycle, and previews before mutating (`confirmed=false` then user confirmation).

- CLI: `uv run kubeflow-mcp serve` (stdio default). See `README.md` for HTTP/SSE and auth.
- Interactive local agent: `uv run kubeflow-mcp agent --provider ollama` (install `uv sync --extra agents-ollama`; use `--extra agents` for all backends; default `--mode full`; `--mode progressive` for meta-tools).
- Design: `docs/design/agent-provider-architecture.md`.
5 changes: 2 additions & 3 deletions .mcp.json
@@ -1,9 +1,8 @@
 {
   "mcpServers": {
-    "kubeflow-mcp-dev": {
+    "kubeflow": {
       "command": "uv",
-      "args": ["run", "kubeflow-mcp", "serve"],
-      "env": {}
+      "args": ["run", "kubeflow-mcp", "serve"]
     }
   }
 }
7 changes: 7 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,7 @@
# Kubeflow MCP (this repository)

- **MCP server**: `uv run kubeflow-mcp serve` — exposes Kubeflow training tools over MCP (stdio by default).
- **Local agent** (optional LLM): `uv sync --extra agents-ollama` then `uv run kubeflow-mcp agent --provider ollama --model qwen3:8b` (default `--mode full`; add `--mode progressive` to match smaller tool schemas like `serve --mode progressive`). Use `--extra agents` for Ollama + LiteLLM together.
- **Docs**: `README.md`, `ROADMAP.md`, `docs/design/agent-provider-architecture.md`.

Use MCP tools instead of guessing kubectl; respect preview-before-submit (`confirmed` flags on mutating tools).
23 changes: 19 additions & 4 deletions README.md
@@ -28,6 +28,21 @@ cd mcp-server
pip install .
```

### Optional: interactive `kubeflow-mcp agent` (local LLM)

Extras are split so you do not need LiteLLM for the Ollama agent:

| Extra | Installs |
|-------|----------|
| `agents-ollama` | LlamaIndex + Ollama + Rich (default for `--provider ollama`) |
| `agents-litellm` | LiteLLM + Rich (`--provider litellm`) |
| `agents` | Both (convenience) |

```bash
uv sync --extra agents-ollama
# or: pip install 'kubeflow-mcp[agents-ollama]'
```

### Run the server

```bash
@@ -157,10 +172,10 @@ Without auth configured, the server logs a warning that the HTTP endpoint is ope

```bash
 kubeflow-mcp agent \
-  --backend ollama \           # ollama (default; more backends planned)
-  --model qwen3:8b \           # model name for the backend
-  --mode full \                # full | progressive | semantic
-  --thinking                   # enable thinking output (supported models)
+  --provider ollama \          # ollama | litellm (entry-point registry)
+  --model qwen3:8b \           # model name (provider default if omitted)
+  --mode full \                # full (default) | progressive | semantic; same as serve --mode
+  --thinking                   # enable thinking output (ollama; supported models)
```

</details>
83 changes: 83 additions & 0 deletions docs/design/agent-provider-architecture.md
@@ -0,0 +1,83 @@
# Agent provider architecture

## Summary

This document proposes a **pluggable agent provider** model for the `kubeflow-mcp agent` CLI. Providers register via Python `importlib.metadata` entry points (`kubeflow_mcp.providers`), implement a small `AgentProvider` protocol, and ship optional dependencies behind extras: **`agents-ollama`** (LlamaIndex + Ollama + Rich), **`agents-litellm`** (LiteLLM + Rich), or **`agents`** for both.

## Two execution planes

1. **MCP plane (vision-primary):** `kubeflow-mcp serve` → [`create_server`](../../kubeflow_mcp/core/server.py) → FastMCP tools with `_audit_wrap`, persona, policy, optional progressive/semantic meta-tools from [`core/dynamic_tools`](../../kubeflow_mcp/core/dynamic_tools.py).
2. **CLI agent plane (dev / convenience):** `kubeflow-mcp agent` → provider → LlamaIndex + same trainer callables; progressive/semantic use the **same** `core.dynamic_tools` implementation after `agents/dynamic_tools` calls `init_dynamic_tools`. Instructions and short tool descriptions align with `serve` via [`build_agent_instruction_text`](../../kubeflow_mcp/core/server.py) / [`get_merged_client_tool_descriptions`](../../kubeflow_mcp/core/server.py) (default trainer + health, persona `readonly`).

**`--mode`:** `serve` and `agent` both accept `full` | `progressive` | `semantic` with the **same** meaning; the default is **`full`** for both (use `progressive` or `semantic` when you need smaller tool schemas). **Note:** the short flag `-m` differs between the two commands: `serve -m` sets the tool mode, whereas `agent -m` is short for **`--model`**. To set the tool mode for the agent, use `agent --mode …`.
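
The `-m` difference can be illustrated with a small `argparse` sketch. This is only an assumption about how such a CLI could be wired; the actual `kubeflow-mcp` parser may use a different library and carry more options. Only the flag names and mode values come from this document.

```python
import argparse

# Tool modes shared by `serve` and `agent`, as listed above.
MODES = ["full", "progressive", "semantic"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flag semantics described above."""
    parser = argparse.ArgumentParser(prog="kubeflow-mcp")
    sub = parser.add_subparsers(dest="command", required=True)

    serve = sub.add_parser("serve")
    # For `serve`, -m is the short form of --mode (tool mode).
    serve.add_argument("-m", "--mode", choices=MODES, default="full")

    agent = sub.add_parser("agent")
    # For `agent`, -m is the short form of --model; tool mode needs --mode.
    agent.add_argument("-m", "--model", default=None)
    agent.add_argument("--mode", choices=MODES, default="full")
    return parser


parser = build_parser()
serve_args = parser.parse_args(["serve", "-m", "progressive"])
agent_args = parser.parse_args(["agent", "-m", "qwen3:8b", "--mode", "progressive"])
print(serve_args.mode)                    # tool mode chosen via -m
print(agent_args.model, agent_args.mode)  # model chosen via -m, tool mode via --mode
```

Keeping `--mode` without a short alias on `agent` avoids silently reinterpreting `-m` for users who switch between the two subcommands.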

**Registry note:** `init_dynamic_tools` mutates global state in `core.dynamic_tools`. Avoid loading the server and the in-process agent meta-tools in the same process unless you expect the registry to be re-initialized.

## Motivation

- Agent code previously lived under `src/kubeflow_mcp/agents/`, **outside** the wheel package, so imports only worked in editable installs.
- The CLI hard-coded `--backend ollama`, forcing code changes for every new backend.
- Community contributors need a clear pattern (protocol + entry point + example script).

## Goals

- Move agents into `kubeflow_mcp/agents/` as part of the published package.
- Dynamic discovery: `kubeflow-mcp agent --provider <name>`.
- Documented protocol and reference implementations: Ollama (full tools), LiteLLM (minimal REPL).

## Non-goals

- Changing MCP server transport or core tool registration.
- Mandating a single LLM stack (providers remain optional extras: `agents-ollama`, `agents-litellm`, or `agents`).

## Proposal

### Protocol

```text
kubeflow_mcp/agents/base.py  →  AgentProvider
  name: str
  default_model: str
  requires: list[str]   # pip package names for error messages
  run(self, model: str, mode: str, **kwargs) -> None
```
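
As a rough illustration of the protocol above, a third-party provider might look like the sketch below. `EchoProvider` and its behavior are hypothetical, and the `AgentProvider` protocol is redefined locally (mirroring the fields listed above) so the snippet runs without the package installed.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class AgentProvider(Protocol):
    """Local copy of the protocol fields listed above, for illustration only."""

    name: str
    default_model: str
    requires: list[str]

    def run(self, model: str, mode: str, **kwargs: Any) -> None: ...


class EchoProvider:
    """Hypothetical provider: records each call instead of starting a REPL."""

    name = "echo"
    default_model = "echo-1"
    requires: list[str] = []  # no optional pip packages needed

    def __init__(self) -> None:
        self.calls: list[tuple[str, str, dict[str, Any]]] = []

    def run(self, model: str, mode: str, **kwargs: Any) -> None:
        # A real provider would block here in an interactive loop.
        self.calls.append((model, mode, kwargs))


provider = EchoProvider()
# runtime_checkable only verifies that the members exist, not their types.
assert isinstance(provider, AgentProvider)
provider.run(model=provider.default_model, mode="full", thinking=False)
```

Because the protocol is structural, `EchoProvider` does not need to inherit from `AgentProvider`; it only has to expose the same members.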

### Entry points

Registered in `pyproject.toml`:

```toml
[project.entry-points."kubeflow_mcp.providers"]
ollama = "kubeflow_mcp.agents.ollama:OllamaProvider"
litellm = "kubeflow_mcp.agents.litellm_provider:LiteLLMProvider"
```

### CLI

```bash
kubeflow-mcp agent --provider ollama --model qwen3:8b --mode full
kubeflow-mcp agent --provider ollama --model qwen3:8b --mode progressive
kubeflow-mcp agent --provider litellm --model gpt-4o-mini
```

Optional: `--url` sets the Ollama base URL; `--thinking` enables thinking output on models that support it.

### Providers (status)

| Provider | Status | Notes |
|-----------|----------|--------------------------------------------|
| `ollama` | Shipped | LlamaIndex + Kubeflow tools, full modes |
| `litellm` | Minimal | LiteLLM chat loop; tool parity TBD |

## Implementation plan

1. Consolidate `src/kubeflow_mcp/agents/` into `kubeflow_mcp/agents/`.
2. Add `base.py`, entry points, CLI refactor.
3. Split Ollama REPL helpers to satisfy complexity limits.
4. Add `examples/agents/` and optional `examples/deployment/litellm-gateway/` notes.

## References

- [Model Context Protocol](https://modelcontextprotocol.io/)
- [Python importlib.metadata entry points](https://docs.python.org/3/library/importlib.metadata.html#entry-points)
- [LlamaIndex FunctionAgent](https://docs.llamaindex.ai/)
28 changes: 28 additions & 0 deletions examples/agents/README.md
@@ -0,0 +1,28 @@
# Example agent runners

Use the main CLI (recommended):

```bash
uv sync --extra agents-ollama
uv run kubeflow-mcp agent --provider ollama --model qwen3:8b
# Smaller tool schema (same as serve --mode progressive):
uv run kubeflow-mcp agent --provider ollama --model qwen3:8b --mode progressive
```

LiteLLM provider (separate extra):

```bash
uv sync --extra agents-litellm
uv run kubeflow-mcp agent --provider litellm --model gpt-4o-mini
```

All agent backends (Ollama + LiteLLM): `uv sync --extra agents`.

Or run the thin wrappers in this directory (same behavior; no explicit `PYTHONPATH` is needed when the package is installed):

```bash
uv run python examples/agents/ollama/run.py --model qwen3:8b
uv run python examples/agents/ollama/run.py --model qwen3:8b --mode progressive
```

See each subfolder `README.md` for provider-specific notes.
9 changes: 9 additions & 0 deletions examples/agents/litellm/README.md
@@ -0,0 +1,9 @@
# LiteLLM example

Install the extra with `uv sync --extra agents-litellm` (or `agents`), set `OPENAI_API_KEY` (or the environment variables for another provider), then:

```bash
uv run python examples/agents/litellm/run.py --model gpt-4o-mini
```

This is a minimal chat loop. For full Kubeflow tool calling, use `--provider ollama` today.
22 changes: 22 additions & 0 deletions examples/agents/litellm/run.py
@@ -0,0 +1,22 @@
#!/usr/bin/env python3
# Copyright 2026 The Kubeflow Authors.
#
# SPDX-License-Identifier: Apache-2.0
"""Thin wrapper around :class:`kubeflow_mcp.agents.litellm_provider.LiteLLMProvider`."""

from __future__ import annotations

import argparse

from kubeflow_mcp.agents.litellm_provider import LiteLLMProvider


def main() -> None:
    p = argparse.ArgumentParser(description="Run the LiteLLM chat loop")
    p.add_argument("--model", default=LiteLLMProvider.default_model)
    args = p.parse_args()
    LiteLLMProvider().run(model=args.model, mode="full")


if __name__ == "__main__":
    main()
7 changes: 7 additions & 0 deletions examples/agents/ollama/README.md
@@ -0,0 +1,7 @@
# Ollama example

Requires `uv sync --extra agents-ollama` (or `agents` for all backends) and a running `ollama serve`.

```bash
uv run python examples/agents/ollama/run.py --model qwen3:8b --mode full
```
34 changes: 34 additions & 0 deletions examples/agents/ollama/run.py
@@ -0,0 +1,34 @@
#!/usr/bin/env python3
# Copyright 2026 The Kubeflow Authors.
#
# SPDX-License-Identifier: Apache-2.0
"""Thin wrapper around :class:`kubeflow_mcp.agents.ollama.OllamaProvider`."""

from __future__ import annotations

import argparse

from kubeflow_mcp.agents.ollama import DEFAULT_MODEL, DEFAULT_URL, OllamaProvider


def main() -> None:
    p = argparse.ArgumentParser(description="Run the Ollama Kubeflow agent")
    p.add_argument("--model", default=DEFAULT_MODEL)
    p.add_argument("--url", default=DEFAULT_URL)
    p.add_argument(
        "--mode",
        default="full",
        choices=["full", "progressive", "semantic", "static", "mcp"],
    )
    p.add_argument("--thinking", action="store_true")
    args = p.parse_args()
    OllamaProvider().run(
        model=args.model,
        mode=args.mode,
        url=args.url,
        thinking=args.thinking,
    )


if __name__ == "__main__":
    main()
5 changes: 5 additions & 0 deletions examples/deployment/litellm-gateway/README.md
@@ -0,0 +1,5 @@
# LiteLLM gateway (reference)

Deploy [LiteLLM Proxy](https://docs.litellm.ai/docs/proxy/deploy) in front of your model vendors when you want unified routing, rate limits, and audit logs before agents or IDE clients call upstream LLMs.

This folder is reserved for future Helm/Kustomize or compose snippets aligned with Kubeflow MCP deployment guides. No manifests are committed yet.
45 changes: 45 additions & 0 deletions kubeflow_mcp/agents/__init__.py
@@ -0,0 +1,45 @@
# Copyright 2026 The Kubeflow Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Pluggable CLI agents for Kubeflow MCP.

Heavy providers (Ollama, LiteLLM) are loaded lazily via :func:`__getattr__`
so ``import kubeflow_mcp.agents`` does not require optional dependencies.
"""

from kubeflow_mcp.agents.base import AgentProvider

__all__ = [
    "AgentProvider",
    "LiteLLMProvider",
    "OllamaAgent",
    "OllamaProvider",
]


def __getattr__(name: str):
    if name == "OllamaProvider":
        from kubeflow_mcp.agents.ollama import OllamaProvider as _OllamaProvider

        return _OllamaProvider
    if name == "OllamaAgent":
        from kubeflow_mcp.agents.ollama import OllamaAgent as _OllamaAgent

        return _OllamaAgent
    if name == "LiteLLMProvider":
        from kubeflow_mcp.agents.litellm_provider import LiteLLMProvider as _LiteLLMProvider

        return _LiteLLMProvider
    msg = f"module {__name__!r} has no attribute {name!r}"
    raise AttributeError(msg)
30 changes: 30 additions & 0 deletions kubeflow_mcp/agents/base.py
@@ -0,0 +1,30 @@
# Copyright 2026 The Kubeflow Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Agent provider protocol for pluggable LLM backends."""

from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class AgentProvider(Protocol):
    """Contract for `kubeflow_mcp.providers` entry-point implementations."""

    name: str
    default_model: str
    requires: list[str]

    def run(self, model: str, mode: str, **kwargs: Any) -> None:
        """Start the interactive agent (blocking)."""
        ...