CoMSES AgentSpace

A proof-of-concept agentic RAG application over the CoMSES Computational Model Library, built on Temporal.io.

A 5-minute setup and demo video is available on YouTube: https://www.youtube.com/watch?v=sfjV-Id7-vg

What this is

A POC that lets researchers ask natural-language questions across computational model data — metadata, documentation, and source code (see sample_data/) — and get answers with paragraph-level citations back to the source material.

The agent itself is a Temporal workflow (AgentWorkflow) whose tools can be either Temporal activities (fast, mostly side-effect-free) or Temporal child workflows (multi-step, durable, with their own progress events).
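The split between the two tool kinds can be pictured with a plain-asyncio stand-in. This is only an illustrative sketch: it does not use the Temporal SDK, and every name in it (ToolKind, Tool, lookup_metadata, deep_search, call_tool) is hypothetical rather than taken from the repository. It shows one tool that returns immediately and one that runs several steps while recording progress events.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable


class ToolKind(Enum):
    ACTIVITY = "activity"              # fast, mostly side-effect-free
    CHILD_WORKFLOW = "child_workflow"  # multi-step, emits progress events


@dataclass(frozen=True)
class Tool:
    name: str
    kind: ToolKind
    run: Callable[[str, list[str]], Awaitable[str]]


async def lookup_metadata(query: str, events: list[str]) -> str:
    # Activity-style tool (hypothetical): one short call, no progress events.
    return f"metadata for {query!r}"


async def deep_search(query: str, events: list[str]) -> str:
    # Child-workflow-style tool (hypothetical): several steps, each reported.
    for step in ("decompose", "search", "cite"):
        events.append(f"deep_search:{step}")
        await asyncio.sleep(0)  # yield control between steps
    return f"cited answer for {query!r}"


TOOLS = {
    tool.name: tool
    for tool in (
        Tool("lookup_metadata", ToolKind.ACTIVITY, lookup_metadata),
        Tool("deep_search", ToolKind.CHILD_WORKFLOW, deep_search),
    )
}


async def call_tool(name: str, query: str) -> tuple[str, list[str]]:
    """Run one registered tool; return its result and its progress events."""
    events: list[str] = []
    result = await TOOLS[name].run(query, events)
    return result, events


def run_tool(name: str, query: str) -> tuple[str, list[str]]:
    """Synchronous convenience wrapper around call_tool."""
    return asyncio.run(call_tool(name, query))
```

In the real system the durability, retries, and event delivery would come from Temporal itself; the sketch only mirrors the shape of the dispatch.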

What this is not

Not production-ready. No auth hardening, no rate-limiting at the public edge, etc.
Not a search box or a chatbot wrapper around a single model — it decomposes queries, resolves relevant models (with optional human-in-the-loop), and runs hybrid (dense + sparse) vector search before generating cited answers.

Architecture & Intent

Per-module intent.md files document the why behind each major decision:

intent.md — system-level rationale: agentic RAG over CoMSES, layered code structure, Temporal, worker split, event sourcing, LiteLLM proxy
src/modules/agent/intent.md — the conversation runtime: AgentWorkflow, three tool types, transactional outbox, context propagation
src/modules/ingestion/intent.md — write side: marker-pdf, synthetic Q&A enrichment, hybrid embeddings (dense + BM42 sparse), tree-sitter for code
src/modules/retrieval/intent.md — read side: intent analysis, query decomposition, model relevance + HITL, hybrid RRF search, source attribution, almost real-time progress
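For readers unfamiliar with the "hybrid RRF search" named in the retrieval intent, reciprocal rank fusion merges a dense ranking and a sparse ranking by summing 1 / (k + rank) for each document. The sketch below shows only the formula; the function name, the sample IDs, and the k = 60 constant (a commonly used default) are assumptions, and the project may well delegate fusion to its vector store instead.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["p3", "p1", "p7"]   # hypothetical dense-vector ranking
sparse_hits = ["p1", "p9", "p3"]  # hypothetical sparse (BM42) ranking
fused = rrf_fuse([dense_hits, sparse_hits])
```

A paragraph that ranks well in both lists (p1 here) rises above one that tops only a single list, which is the behaviour hybrid search relies on.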

Prerequisites

Software

Linux, WSL2, or macOS (macOS has not been tested)
Docker + Docker Compose (for the infrastructure stack)
./setup.sh installs missing CLI dependencies automatically (Docker is the exception: install it yourself if absent)

Hardware

16 GB RAM minimum; 24 GB+ recommended (PyTorch + marker-pdf + embedding models share host memory)
~10 GB disk for ML model weights and Docker images
GPU: NVIDIA GPU with CUDA for faster PDF parsing and embeddings.

Verified on a Windows 11 laptop running WSL2 with 32 GB RAM and 8 GB VRAM (NVIDIA RTX 2000 Ada Generation).

LLM access

An API key from at least one provider — OpenAI, Anthropic, OpenRouter, Groq, Google — or a local Ollama instance reachable at OLLAMA_HOST. setup.sh probes the keys you supply and auto-picks the first live profile.
Embeddings (dense + sparse BM42) run locally via FastEmbed by default, so no separate API is needed. A GPU is highly recommended for computing embeddings.

Setup

./setup.sh

The script bootstraps everything in phases: toolchain install, .env generation with auto-generated secrets, Docker stack startup (Postgres, Qdrant, Redis, MinIO, Temporal, LiteLLM), database migrations, model warming, and sample-data ingestion. Along the way it prompts for an LLM API key, for the LLM and embedding configuration, and for worker startup. When it finishes you'll have a UI at http://localhost:5173 and a sample dataset to query.

Run ./setup.sh --help for individual phase verbs (re-run a phase, recreate, etc.).

Setup phases

Each phase is idempotent (sentinel-gated) and resumable — a re-run picks up at the first incomplete or invalidated phase.
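Sentinel gating, in general terms, means each phase leaves a marker once it succeeds and is skipped while that marker exists. The following is a minimal Python sketch of the idea, not the logic of setup.sh itself: the function names, the `.done` file suffix, and the state directory are all assumptions, and here an invalidated phase is re-run on its own, which may differ from how the script treats later phases.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_phases(
    phases: list[tuple[str, Callable[[], None]]], state_dir: Path
) -> list[str]:
    """Run phases in order, skipping any whose sentinel file already exists."""
    state_dir.mkdir(parents=True, exist_ok=True)
    executed: list[str] = []
    for name, action in phases:
        sentinel = state_dir / f"{name}.done"  # hypothetical marker name
        if sentinel.exists():
            continue  # completed on an earlier run
        action()  # if this raises, no sentinel is written and a re-run resumes here
        sentinel.touch()
        executed.append(name)
    return executed


def invalidate(name: str, state_dir: Path) -> None:
    """Remove a sentinel so the phase runs again on the next pass."""
    (state_dir / f"{name}.done").unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    demo = [("toolchain", lambda: None), ("uv_sync", lambda: None)]
    first = run_phases(demo, Path(tmp))   # runs both phases
    second = run_phases(demo, Path(tmp))  # runs nothing: both sentinels exist
```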

#	Phase	What it does
1	`toolchain`	Detects required CLIs (`uv`, `node`, `pnpm`, `zellij`, `shellcheck`, `jq`) and installs anything missing via the official installers. Docker is required by later phases but not auto-installed here — install it yourself if absent.
2	`uv_sync`	Runs `uv sync --group pdf` (and `--group gpu` when an NVIDIA GPU is detected). First run downloads ~2 GB (PyTorch + marker-pdf), plus ~600 MB of cuDNN/cuBLAS wheels on GPU hosts.
3	`hardware_preflight`	Warn-only RAM / swap / CPU / GPU posture check. Suggests `.env` overrides for low-memory hosts (e.g. `INGEST_WORKER_MAX_CONCURRENT_ACTIVITIES=2`); never hard-fails.
4	`env_bootstrap`	Creates `.env` from `.env.example` (or appends new keys to an existing one), generates per-deployment secrets (`LITELLM_MASTER_KEY`, `MINIO_ROOT_PASSWORD`, `QDRANT_API_KEY`, DB passwords, UI passwords).
5	`app_hostnames`	Prompts for the public host the browser will use (default `localhost`; FQDN/IP for remote VMs - see Deploying CoMSES AgentSpace on a remote VM). Coherently writes `CORS_ALLOWED_ORIGINS`, `MINIO_EXTERNAL_ENDPOINT`, `VITE_API_BASE_URL`, `VITE_WS_BASE_URL`, `VITE_HOST`, and `VITE_ALLOWED_HOSTS`. RFC-1123-validates the input.
6	`env_triage`	Detects and refuses to start when a sibling Temporal stack is already running on the same ports (7233 / 8080 / 9090 / 8085 / 16686).
7	`provider_keys`	Probes every supported LLM provider (OpenAI, Anthropic, Groq, OpenRouter, xAI, Google, GPUStack) and prompts for a chat-provider key when none are alive.
8	`embedding_backend`	Picks the dense embedding backend by writing `EMBEDDING_DENSE_BACKEND` to `.env` — `fastembed` (in-process default), `ollama-container` (Docker), or `cloud-<provider>` (any LiteLLM-supported embed provider). Derives `EMBEDDING_DENSE_PROVIDER` / `EMBEDDING_DENSE_MODEL` / `OLLAMA_API_BASE` from the choice. Sparse (BM42) is always local.
9	`litellm_config_seed`	Picks a profile (e.g. `cloud-openrouter`, `local`) and seeds `config/litellm/litellm_config.yaml` from `config/litellm/profiles/<profile>.yaml`. Preserves user edits by detecting a seed marker; re-seeds (with backup) when the profile or embedding backend changes, or with `--reseed`.
10	`litellm_config_review`	Prints a banner of the seeded role → model mappings and pauses for inspection so you can edit the YAML before launch. Skipped under `--auto-confirm` or non-interactive mode.
11	`marker_prewarm`	Pre-downloads marker-pdf layout / OCR / text-recognition models (~1.5 GB) into `~/.cache/huggingface/` so the first PDF ingest doesn't stall.
12	`fastembed_prewarm`	Pre-downloads dense + sparse (BM42) embedding models locally. Dense is skipped when `EMBEDDING_DENSE_BACKEND` is `ollama-container` or `cloud-*`; sparse is always local.
13	`docker_up`	Brings up the Temporal stack (`docker-compose.temporal.yml`) then the infra stack (`compose.yml`), then health-checks 8 services in order: temporal-postgresql → temporal → comses-rag-db → litellm-db → redis → minio → qdrant → litellm-proxy.
14	`ollama_prewarm`	Only runs when `EMBEDDING_DENSE_BACKEND=ollama-container`. Waits for the `ollama-pull-llama` init container to finish and ensures `nomic-embed-text` is pulled inside the `ollama` container. No-op for `fastembed` and `cloud-*` backends.
15	`litellm_key`	Calls `POST /key/generate` against the running LiteLLM proxy to mint a virtual API key and writes it to `LITELLM_PROXY_API_KEY` in `.env`.
16	`litellm_routing_probe`	Per-role smoke calls (`smart` / `default` / `fast` / `long` / `embed`) against the proxy. Hard-fails if no chat role responds 2xx or if embed returns no vector.
17	`migrations`	Runs `make db-check` then `make db-upgrade` to bring the `comses-rag-db` schema to the latest Alembic head.
18	`hosts_file`	Validates that the Docker DNS names workers connect to (`minio`, `redis`, `qdrant`, `ollama`, `litellm-proxy`, `litellm-db`, `comses-rag-db`) resolve from the host. If any are missing, offers `[a]uto sudo` / `[m]anual` / `[s]kip` to append `127.0.0.1 …` to `/etc/hosts`.
19	`workers`	Prompts you to start the 10-pane Zellij worker layout in a second terminal (`make w`) and polls each worker's metrics port (10090–10099) until ready.
20	`sample_data`	Stages and ingests two bundled CoMSES codebases through the full pipeline (marker-pdf → fastembed → Qdrant + Postgres + MinIO).
21	`dashboard`	Prints the final dashboard: service URLs + credentials, Temporal CLI hint, Zellij attach command, sample-data summary, and a "Try it" pointer at the configured host.
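The app_hostnames phase in the table above accepts an FQDN or an IP address and validates it against RFC 1123. A rough Python sketch of such a check follows; the function name and the exact rules are assumptions and may be stricter or looser than what setup.sh applies.

```python
import ipaddress
import re

# One RFC 1123 label: 1-63 letters, digits, or hyphens,
# neither starting nor ending with a hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_host(value: str) -> bool:
    """Return True for an IP address or an RFC 1123 style hostname."""
    try:
        ipaddress.ip_address(value)  # accepts IPv4 and IPv6 literals
        return True
    except ValueError:
        pass
    if not value or len(value) > 253:
        return False
    # Every dot-separated label must match in full; empty labels fail.
    return all(_LABEL.fullmatch(label) for label in value.split("."))
```

For example, `is_valid_host("localhost")` and `is_valid_host("192.168.1.10")` both pass, while a name containing an underscore or a leading hyphen is rejected.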

After setup completes

Service	URL	Credentials
Chat UI	http://localhost:5173	API key `dev-key-1` (from `API_KEY_MAPPING` in `.env`)
FastAPI	http://localhost:8000	—
Temporal UI	http://localhost:8080	—
Grafana	http://localhost:8085	`admin` / `$GRAFANA_ADMIN_PASSWORD`
LiteLLM UI	http://localhost:4000/ui	`admin` / `$LITELLM_PROXY_UI_PASSWORD`
Jaeger	http://localhost:16686	—
Prometheus	http://localhost:9090	—
Qdrant dashboard	http://localhost:6333/dashboard	`$QDRANT_API_KEY`
MinIO Console	http://localhost:9001	`minio_admin` / `$MINIO_ROOT_PASSWORD`
pgAdmin	http://localhost:8888	`$PGADMIN_DEFAULT_EMAIL` / `$PGADMIN_DEFAULT_PASSWORD`
Databasus	http://localhost:4005	—

$VAR references are auto-generated values written into .env by the env_bootstrap phase; setup.sh also prints them once on completion. Look them up in .env, not here.
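If you would rather look a credential up programmatically than open .env by hand, a small parser is enough. This is a minimal sketch that assumes plain KEY=VALUE lines; it does not handle `export` prefixes, inline comments, or multi-line values, and the function names are hypothetical.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def read_env(path: Path = Path(".env")) -> dict[str, str]:
    """Read and parse the .env file at the given path."""
    return parse_env(path.read_text(encoding="utf-8"))
```

Run from the repository root, `read_env()["GRAFANA_ADMIN_PASSWORD"]` would then return the generated Grafana password.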

Temporal CLI

docker exec -it temporal-admin-tools temporal workflow list

Workers (Zellij)

zellij attach comses-workers

Sample data

Two real models from the CoMSES Model Library are ingested the first time setup.sh runs:

761c91b8-897b-4e59-8b5f-83715d6c9471 - MicroAnts 2.5
dd847e79-bb37-43e1-ae3a-27de57573376 - Ants Digging Networks

Try it

Open http://localhost:5173, log in with API key dev-key-1, and ask a multi-part question — e.g. "What ant-foraging models are in the library, and how do they differ?"

Remote VM deployment (Jetstream2, EC2, etc.)

See deployment/README.md for the full recipe — SSH-tunnel mode (recommended for solo dev) and HTTPS-via-Caddy mode (for sharing a public demo URL).

Develop

Contributor setup (one-time)

The repo enforces quality gates via qlty (ruff + mypy + deptry + a pre-commit / pre-push hook combo). qlty is a standalone Rust CLI, not a Python package — setup.sh does not install it because it isn't a runtime dependency. Install it yourself, then wire the hooks:

curl -fsSL https://qlty.sh | sh                          # one-time install → ~/.qlty/bin/
echo 'export PATH="$HOME/.qlty/bin:$PATH"' >> ~/.bashrc  # or ~/.zshrc — persist on PATH
make install-git-hooks                                   # symlink .git/hooks/{pre-commit,pre-push}

make install-git-hooks symlinks .git/hooks/pre-commit → .qlty/hooks/pre-commit.sh and .git/hooks/pre-push → .qlty/hooks/pre-push.sh. The repo ships custom, version-controlled hooks (biome + .env.example host-path guard + uv run mypy + tsc --noEmit alongside qlty check), so make install-git-hooks deliberately does not call qlty githooks install — that subcommand regenerates .qlty/hooks/*.sh from defaults and would wipe the custom logic.

Day-to-day commands

make d                   # start infrastructure (Postgres, Qdrant, Redis, MinIO, Temporal, LiteLLM)
make w                   # start all 10 Temporal workers (Zellij layout) + the chat app (backend + frontend)
make k                   # stop infra
make kw                  # kill all workers + chat app
make test                # unit tests (fast, mocked)
make test-integration    # integration tests (PMR containers)
make check               # ruff + mypy + deptry + qlty

Module-specific develop notes live in the per-module READMEs: backend/, frontend/, shared/, shared/worker_base/.

Contributing

Contributions are welcome.

Thanks

Temporal — the durable workflow engine that is the execution backbone of the ingestion workflows, agent runtime, every retrieval tool and the event-streaming outbox
marker-pdf — layout-aware PDF parsing for academic model documentation
Zellij — terminal multiplexer that hosts the 10-pane worker layout via make w

License

This project is released under the MIT License.

⚠️ Caveat — GPL-3.0 dependency. The PDF ingestion pipeline depends on marker-pdf (and its sub-dependency surya-ocr), both of which are licensed under GPL-3.0-or-later. While this project's own source code is MIT-licensed, anyone distributing or running the combined application with marker-pdf linked in is bound by GPL-3.0 obligations for that combined work.
