
M-J-Murray/bio-agent


Bio-Agent

A local-first MCP bioinformatics research agent for papers, datasets, and Scanpy visualisations.

Bio-Agent is a full-stack AI portfolio project that demonstrates agentic tool use in a realistic biotech workflow. A researcher can ask natural-language questions about curated single-cell RNA-seq papers, inspect linked datasets, run exploratory bioinformatics analyses, and see every MCP tool call and generated artifact as the agent works.

Stack: React, TypeScript, FastAPI, Pydantic AI, MCP, Ollama, LanceDB, Scanpy

Bio-Agent demo preview

What It Does

Bio-Agent is designed around a practical researcher workflow:

  • Find relevant papers from a curated local single-cell literature set.
  • Inspect linked datasets with organism, modality, cell count, marker genes, and provenance.
  • Run exploratory analysis such as UMAPs, QC summaries, marker expression, and composition plots.
  • Watch the agent work through live streamed reasoning summaries, MCP tool calls, tool results, and artifact creation events.
  • Save context by adding useful papers, datasets, and generated artifacts to the selected context for the active chat.

The current MVP is intentionally local and curated. It does not yet search PubMed, GEO, or CELLxGENE live; those are the natural next connectors.

Demo Walkthrough

Try this flow locally:

  1. Find single-cell papers about PBMC immune profiling.
  2. Which dataset should I inspect first and why?
  3. Run a UMAP colored by cell type for the selected PBMC dataset.
  4. Show marker expression for CD3D, MS4A1, and LST1.

The UI is built to make the agent behavior visible, not hidden behind a chat bubble.

Screenshots:

  • Live tool trace: tool calls and paper search.
  • Artifact and response inspector: response inspector with artifact.
  • Generated analysis artifact: inline artifact rendering.
  • Selected context: selected context sidebar.

Architecture

flowchart LR
    UI["React + TypeScript + Vite UI"] -->|SSE chat stream| API["FastAPI agent service"]
    API --> Agent["Pydantic AI agent"]
    Agent -->|OpenAI-compatible API| LLM["Ollama local LLM<br/>qwen3:8b by default"]
    Agent -->|MCP client<br/>Streamable HTTP| MCP["FastMCP bio server"]
    MCP --> Retrieval["LanceDB + curated manifests<br/>papers and datasets"]
    MCP --> Scanpy["Scanpy / AnnData-style EDA"]
    Scanpy --> Artifacts["Cached plots and summaries"]
    API --> Sessions["Chat history + selected context"]
    Artifacts --> UI
    Sessions --> UI

The agent service creates an MCP client with Pydantic AI's MCPServerStreamableHTTP toolset. At runtime, Pydantic AI discovers the FastMCP tools, whose model-visible schemas are derived from their Python type hints and docstrings, and sends those schemas to the local OpenAI-compatible model. The model chooses when to call tools; the app streams tool-start, tool-result, artifact, and final-message events back to the UI.
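To make the schema step concrete, here is a minimal standard-library sketch of how type hints and a docstring can become a tool schema. It is not the FastMCP or Pydantic AI implementation; `tool_schema`, `JSON_TYPES`, and the `search_papers` tool with its arguments are hypothetical names chosen for illustration.

```python
import inspect
from typing import get_type_hints

# Map a few Python annotations to JSON Schema type names (deliberately incomplete).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func):
    """Build a minimal JSON-Schema-style description of a function's parameters."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


# Hypothetical tool; the real bio server's tool names and arguments may differ.
def search_papers(query: str, limit: int = 5) -> list:
    """Search the curated paper set and return the best matches."""
    return []


schema = tool_schema(search_papers)
```

The resulting dictionary is the kind of structure a model receives when deciding whether and how to call a tool: a name, a description taken from the docstring, and typed parameters with the required ones marked.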

Technical Highlights

  • Strict agent-first runtime: no deterministic fallback in production chat.
  • MCP tool discovery: paper search, paper metadata, dataset search, dataset profiles, and Scanpy EDA are exposed as structured FastMCP tools.
  • Visible agent orchestration: /api/chat/stream emits reasoning_summary, tool_started, tool_completed, artifact_created, message_delta, final, and error events.
  • Local LLM by default: Ollama serves qwen3:8b through an OpenAI-compatible endpoint, with the architecture kept swappable for other compatible runtimes.
  • Bioinformatics artifacts: Scanpy-style analyses generate cached local plots that render inline in chat and in the response inspector.
  • Portfolio-grade UI: React/TypeScript app with previous chats, response-level inspection, selected context, stoppable requests, and live tool-call timelines.
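The event stream can be pictured with a small sketch of Server-Sent Events framing. The event names come from the list above; the payload fields (`tool`, `text`, `done`) and the helpers `format_sse` and `parse_sse` are assumptions for illustration, not the project's actual wire format or code.

```python
import json

# Event names as listed for /api/chat/stream; payload shapes are assumptions.
EVENT_TYPES = {
    "reasoning_summary", "tool_started", "tool_completed",
    "artifact_created", "message_delta", "final", "error",
}


def format_sse(event, payload):
    """Serialise one event in Server-Sent Events wire format."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event}")
    # A blank line terminates each SSE frame.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def parse_sse(stream):
    """Split a fully buffered SSE stream into (event, payload) pairs."""
    events = []
    for frame in stream.split("\n\n"):
        event, data_lines = "message", []
        for line in frame.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:  # skip empty trailing frames
            events.append((event, json.loads("\n".join(data_lines))))
    return events


stream = (
    format_sse("tool_started", {"tool": "search_papers"})  # hypothetical payload
    + format_sse("message_delta", {"text": "Found 3 papers."})
    + format_sse("final", {"done": True})
)
events = parse_sse(stream)
```

A real client would parse frames incrementally as bytes arrive rather than from a complete buffer, but the framing is the same: one `event:` line, one or more `data:` lines, then a blank line.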

Running Locally

Prerequisites

  • Python 3.12+
  • uv
  • Node.js 20+
  • Ollama for local model serving

Install

cp .env.example .env
uv sync --all-packages --dev
npm install
ollama pull qwen3:8b

Start the services

Run each service in a separate terminal:

uv run python -m bio_mcp.server
uv run uvicorn bio_agent.main:app --reload --port 8000
npm run dev:web

Open the app at:

http://127.0.0.1:5173

Default local service URLs:

  • Web app: http://127.0.0.1:5173
  • Agent API: http://127.0.0.1:8000
  • MCP server: http://127.0.0.1:8001/mcp
  • Ollama OpenAI-compatible API: http://localhost:11434/v1

The selected model must support tool calling reliably. If Ollama or the MCP server is unavailable, the UI shows a visible agent error instead of falling back to scripted answers.
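Before opening the UI, it can help to confirm that each backend port is accepting connections. The following is a rough preflight sketch using only the standard library and the default URLs listed above; the helper names are hypothetical, and an open TCP port only shows that something is listening, not that the service is healthy or that the model supports tool calling.

```python
import socket
from urllib.parse import urlparse

# Default local URLs from the list above.
SERVICES = {
    "agent_api": "http://127.0.0.1:8000",
    "mcp_server": "http://127.0.0.1:8001/mcp",
    "ollama": "http://localhost:11434/v1",
}


def host_port(url):
    """Extract (host, port), falling back to the scheme's default port."""
    parsed = urlparse(url)
    default = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default


def is_listening(url, timeout=0.5):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_port(url), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(services=SERVICES):
    """Map each service name to whether something is listening on its port."""
    return {name: is_listening(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, up in preflight().items():
        print(f"{name}: {'up' if up else 'DOWN'}")
```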

Configuration

.env.example contains the default local setup:

OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=ollama
MODEL_NAME=qwen3:8b
MCP_URL=http://127.0.0.1:8001/mcp
CORS_ORIGIN=http://localhost:5173
BIO_AGENT_DATA_DIR=./data

Because the agent uses an OpenAI-compatible model configuration, Ollama can be replaced later with another compatible local runtime without changing the app architecture.
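As an illustration of why the runtime is swappable, the sketch below loads the same six variables with the defaults from .env.example and overrides only the model endpoint. The `Settings` class and `load_settings` helper are hypothetical, as are the alternative URL and model name; the project may load its configuration differently, and this sketch reads already-exported environment variables rather than parsing the .env file itself.

```python
import os
from dataclasses import dataclass

# Defaults mirror .env.example.
DEFAULTS = {
    "OPENAI_BASE_URL": "http://localhost:11434/v1",
    "OPENAI_API_KEY": "ollama",
    "MODEL_NAME": "qwen3:8b",
    "MCP_URL": "http://127.0.0.1:8001/mcp",
    "CORS_ORIGIN": "http://localhost:5173",
    "BIO_AGENT_DATA_DIR": "./data",
}


@dataclass(frozen=True)
class Settings:
    openai_base_url: str
    openai_api_key: str
    model_name: str
    mcp_url: str
    cors_origin: str
    data_dir: str


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    values = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return Settings(
        openai_base_url=values["OPENAI_BASE_URL"],
        openai_api_key=values["OPENAI_API_KEY"],
        model_name=values["MODEL_NAME"],
        mcp_url=values["MCP_URL"],
        cors_origin=values["CORS_ORIGIN"],
        data_dir=values["BIO_AGENT_DATA_DIR"],
    )


default_settings = load_settings({})

# Hypothetical swap to another OpenAI-compatible runtime: only two values change.
swapped = load_settings({
    "OPENAI_BASE_URL": "http://localhost:8080/v1",  # assumed alternative endpoint
    "MODEL_NAME": "example-model",  # placeholder model name
})
```

Everything outside the model endpoint and name, including the MCP URL and data directory, stays the same after the swap.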

Testing

uv run pytest
uv run ruff check .
npm run build:web

These checks cover the shared bioinformatics logic, MCP tool behavior, agent API/session behavior, stream event handling, and the TypeScript production build.

Roadmap

  • Add live PubMed, GEO, and CELLxGENE connectors behind MCP tools.
  • Feed selected context back into the agent prompt so saved papers/datasets/artifacts guide later analysis.
  • Expand Scanpy workflows with differential expression, pathway-level summaries, and richer QC.
  • Compare local model runtimes and tool-calling reliability across Ollama, vLLM, and llama.cpp.
  • Add a deployment-ready demo mode for reviewers who do not want to run a local LLM.

About

A bioinformatics research agent that lets scientists ask about papers, inspect datasets, and generate exploratory insights through a React/TypeScript web app.
