A local-first MCP bioinformatics research agent for papers, datasets, and Scanpy visualisations.
Bio-Agent is a full-stack AI portfolio project that demonstrates agentic tool use in a realistic biotech workflow. A researcher can ask natural-language questions about curated single-cell RNA-seq papers, inspect linked datasets, run exploratory bioinformatics analyses, and see every MCP tool call and generated artifact as the agent works.
Bio-Agent is designed around a practical researcher workflow:
- Find relevant papers from a curated local single-cell literature set.
- Inspect linked datasets with organism, modality, cell count, marker genes, and provenance.
- Run exploratory analysis such as UMAPs, QC summaries, marker expression, and composition plots.
- Watch the agent work through live streamed reasoning summaries, MCP tool calls, tool results, and artifact creation events.
- Save context by adding useful papers, datasets, and generated artifacts to the selected context for the active chat.
The current MVP is intentionally local and curated. It does not yet offer live PubMed, GEO, or CELLxGENE search; those are the natural next connectors.
Try this flow locally:
- Find single-cell papers about PBMC immune profiling.
- Which dataset should I inspect first and why?
- Run a UMAP colored by cell type for the selected PBMC dataset.
- Show marker expression for CD3D, MS4A1, and LST1.
The UI is built to make the agent behavior visible, not hidden behind a chat bubble.
Screenshots: live tool trace, artifact and response inspector, generated analysis artifact, and selected context.
flowchart LR
UI["React + TypeScript + Vite UI"] -->|SSE chat stream| API["FastAPI agent service"]
API --> Agent["Pydantic AI agent"]
Agent -->|OpenAI-compatible API| LLM["Ollama local LLM<br/>qwen3:8b by default"]
Agent -->|MCP client<br/>Streamable HTTP| MCP["FastMCP bio server"]
MCP --> Retrieval["LanceDB + curated manifests<br/>papers and datasets"]
MCP --> Scanpy["Scanpy / AnnData-style EDA"]
Scanpy --> Artifacts["Cached plots and summaries"]
API --> Sessions["Chat history + selected context"]
Artifacts --> UI
Sessions --> UI
The agent service creates an MCP client with Pydantic AI's MCPServerStreamableHTTP toolset. At
runtime, Pydantic AI discovers the FastMCP tools, converts their Python type hints and docstrings
into model-visible tool schemas, and sends those schemas to the local OpenAI-compatible model. The
model chooses when to call tools; the app streams tool-start, tool-result, artifact, and final-message
events back to the UI.
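
To make the schema-discovery step concrete, here is a minimal, standard-library-only sketch of how a function's type hints and docstring can be turned into a tool schema. It is not Pydantic AI's or FastMCP's actual implementation, which handles far more types and validation. The `tool_schema` helper, the `search_papers` tool, and the schema layout are all assumptions made for illustration.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Illustrative subset of Python-to-JSON-Schema type names (assumption, not exhaustive).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func: Callable[..., Any]) -> dict[str, Any]:
    """Build a simplified tool schema from a function's signature and docstring."""
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def search_papers(query: str, limit: int = 5) -> list[dict]:
    """Search the curated paper set (hypothetical tool, for illustration only)."""
    return []


schema = tool_schema(search_papers)
```

Under these assumptions the model would see `query` as a required string and `limit` as an optional integer, which is the kind of information it needs to decide how to call a tool.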
- Strict agent-first runtime: no deterministic fallback in production chat.
- MCP tool discovery: paper search, paper metadata, dataset search, dataset profiles, and Scanpy EDA are exposed as structured FastMCP tools.
- Visible agent orchestration: `/api/chat/stream` emits `reasoning_summary`, `tool_started`, `tool_completed`, `artifact_created`, `message_delta`, `final`, and `error` events.
- Local LLM by default: Ollama serves `qwen3:8b` through an OpenAI-compatible endpoint, with the architecture kept swappable for other compatible runtimes.
- Bioinformatics artifacts: Scanpy-style analyses generate cached local plots that render inline in chat and in the response inspector.
- Portfolio-grade UI: React/TypeScript app with previous chats, response-level inspection, selected context, stoppable requests, and live tool-call timelines.
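
The streamed events in the list above can be pictured as Server-Sent Events frames. The sketch below encodes and parses such frames using only the standard library. The event names come from the list above; the payload fields (`tool`, `text`) and the helper names `encode_sse` and `parse_sse` are hypothetical and may differ from what the service actually sends.

```python
import json

# Event names emitted by /api/chat/stream, as listed above.
EVENT_TYPES = {
    "reasoning_summary",
    "tool_started",
    "tool_completed",
    "artifact_created",
    "message_delta",
    "final",
    "error",
}


def encode_sse(event: str, data: dict) -> str:
    """Serialize one event as an SSE frame: an event line, a data line, a blank line."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event}")
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def parse_sse(stream: str) -> list[tuple[str, dict]]:
    """Split a buffered SSE stream into (event name, JSON payload) pairs."""
    events: list[tuple[str, dict]] = []
    for frame in stream.split("\n\n"):
        if not frame.strip():
            continue
        name, payload = "message", []
        for line in frame.splitlines():
            field, _, value = line.partition(":")
            # SSE allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                name = value
            elif field == "data":
                payload.append(value)
        events.append((name, json.loads("\n".join(payload))))
    return events


# Hypothetical payloads, for illustration only.
stream = encode_sse("tool_started", {"tool": "search_papers"}) + encode_sse(
    "final", {"text": "done"}
)
events = parse_sse(stream)
```

A real client would parse incrementally as chunks arrive rather than from one buffered string; the frame layout is the same either way.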
cp .env.example .env
uv sync --all-packages --dev
npm install
ollama pull qwen3:8b

Run each service in a separate terminal:

uv run python -m bio_mcp.server
uv run uvicorn bio_agent.main:app --reload --port 8000
npm run dev:web

Open the app at:
http://127.0.0.1:5173
Default local service URLs:
- Web app: http://127.0.0.1:5173
- Agent API: http://127.0.0.1:8000
- MCP server: http://127.0.0.1:8001/mcp
- Ollama OpenAI-compatible API: http://localhost:11434/v1
The selected model must support tool calling reliably. If Ollama or the MCP server is unavailable, the UI shows a visible agent error instead of falling back to scripted answers.
.env.example contains the default local setup:
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=ollama
MODEL_NAME=qwen3:8b
MCP_URL=http://127.0.0.1:8001/mcp
CORS_ORIGIN=http://localhost:5173
BIO_AGENT_DATA_DIR=./data

Because the agent uses an OpenAI-compatible model configuration, Ollama can be replaced later with another compatible local runtime without changing the app architecture.
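
As a rough sketch of how these variables might be read at startup, the snippet below loads them from the environment and falls back to the defaults from `.env.example`. The `Settings` class and `load_settings` function are hypothetical names, not the project's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults copied from .env.example above.
DEFAULTS = {
    "OPENAI_BASE_URL": "http://localhost:11434/v1",
    "OPENAI_API_KEY": "ollama",
    "MODEL_NAME": "qwen3:8b",
    "MCP_URL": "http://127.0.0.1:8001/mcp",
    "CORS_ORIGIN": "http://localhost:5173",
    "BIO_AGENT_DATA_DIR": "./data",
}


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container for the agent service."""

    openai_base_url: str
    openai_api_key: str
    model_name: str
    mcp_url: str
    cors_origin: str
    data_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from env, using the .env.example value when it is unset."""

    def get(key: str) -> str:
        return env.get(key, DEFAULTS[key])

    return Settings(
        openai_base_url=get("OPENAI_BASE_URL"),
        openai_api_key=get("OPENAI_API_KEY"),
        model_name=get("MODEL_NAME"),
        mcp_url=get("MCP_URL"),
        cors_origin=get("CORS_ORIGIN"),
        data_dir=get("BIO_AGENT_DATA_DIR"),
    )


# Swapping runtimes only changes configuration; the values here are placeholders.
custom = load_settings(
    {"OPENAI_BASE_URL": "http://localhost:9000/v1", "MODEL_NAME": "other-model"}
)
```

Only the base URL and model name change in the override; every other setting keeps its default, which is the sense in which the runtime stays swappable.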
uv run pytest
uv run ruff check .
npm run build:web

These checks cover the shared bioinformatics logic, MCP tool behavior, agent API/session behavior, stream event handling, and the TypeScript production build.
- Add live PubMed, GEO, and CELLxGENE connectors behind MCP tools.
- Feed selected context back into the agent prompt so saved papers/datasets/artifacts guide later analysis.
- Expand Scanpy workflows with differential expression, pathway-level summaries, and richer QC.
- Compare local model runtimes and tool-calling reliability across Ollama, vLLM, and llama.cpp.
- Add a deployment-ready demo mode for reviewers who do not want to run a local LLM.




