AI-powered native plant recommendations. Tell jardin where your garden is and what you care about (pollinators, soil health, food, biodiversity), and it returns a ranked list of native species scored by six specialist AI agents running in parallel.
Requirements: Docker Desktop (or Docker with the Compose plugin) and an Anthropic API key.
git clone <repo-url> && cd jardin
cp .env.example .env
# Open .env and set ANTHROPIC_API_KEY=sk-ant-...
docker compose up --build

| Service | URL |
|---|---|
| App | http://localhost:3000 |
| API | http://localhost:8000 |
| API docs | http://localhost:8000/docs |
| Health | http://localhost:8000/health |
The first build downloads the BAAI/bge-base-en-v1.5 embedding model (~400 MB). Subsequent starts use the cached image.
The app needs species data to make recommendations. Run this after docker compose up:
# Load 200 French native species and AI-score them
docker compose exec backend python -m scripts.load_species --country FR --limit 200 --score
# Add more countries
docker compose exec backend python -m scripts.load_species \
  --country GB --country DE --country IE --limit 300 --score

The `--score` flag calls Claude to assign ecological scores. Without it, placeholder scores (0.5) are stored; the recommendation pipeline still works, but the rankings are less meaningful.
docker compose down # stop containers, keep database volume
docker compose down -v # stop and delete the database volume (full reset)

Browser → Next.js (port 3000) → FastAPI (port 8000) → Postgres + pgvector
                                         │
                             POST /api/recommendations
                                         │
                                  Background task
                                         │
                         ┌─────── fetch_candidates ───────┐
                         │     top-50 species from DB     │
                         └───────────────┬────────────────┘
                                         │ fan-out (parallel)
      ┌─────────────┬─────────────┬──────┴──────┬─────────────┬─────────────┐
      ▼             ▼             ▼             ▼             ▼             ▼
 Pollinators     Insects         Soil      Environment  Food/Utility       Size
    agent         agent         agent         agent         agent         agent
      └─────────────┴─────────────┴──────┬──────┴─────────────┴─────────────┘
                                         │ fan-in
                                  merge_and_rank
                                (top-20, weighted)
                                         │
                               SSE stream → browser
Results stream live to the browser as each agent finishes via Server-Sent Events. Each species detail page has a streaming chat interface backed by the same Claude model.
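The fan-out/fan-in shape in the diagram can be illustrated with plain `asyncio`. This is a minimal sketch, not the LangGraph graph in backend/agents/graph.py: `score_dimension` and `fan_out` are hypothetical names, the species list is illustrative input, and the stub returns a fixed score where the real agents call Claude.

```python
import asyncio

DIMENSIONS = ["pollinators", "insects", "soil", "environment", "food_utility", "size"]


async def score_dimension(dimension: str, species: list[str]) -> dict[str, float]:
    """Hypothetical stand-in for one scoring agent (the real agents call Claude)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {name: 0.5 for name in species}  # fixed placeholder score per species


async def fan_out(species: list[str]) -> dict[str, dict[str, float]]:
    """Run all six agents concurrently, then collect results by dimension (fan-in)."""
    results = await asyncio.gather(*(score_dimension(d, species) for d in DIMENSIONS))
    return dict(zip(DIMENSIONS, results))


scores = asyncio.run(fan_out(["Achillea millefolium", "Trifolium pratense"]))
```

`asyncio.gather` returns results in the order the coroutines were passed, which is why zipping against `DIMENSIONS` is safe here.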
| Dimension | What the agent evaluates |
|---|---|
| Pollinators | Value to bees, butterflies, moths, hoverflies |
| Insects | Host plant for caterpillars, beetles, other invertebrates |
| Soil | Nitrogen fixation, mycorrhizal networks, organic matter |
| Environment | Carbon sequestration, erosion control, microclimate |
| Food / utility | Edible parts, medicinal uses, practical harvests |
| Size | Suitability for compact spaces and containers (high = compact, low = large tree) |
Each priority can be weighted from 0 to 1 by the user; the final score is a weighted average.
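As a rough illustration of the weighted average, the sketch below divides the weighted sum of dimension scores by the sum of the weights. The function name `weighted_score`, the normalisation, and the handling of all-zero weights are assumptions; the actual logic lives in `merge_and_rank` and may differ.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores; dimensions without a weight count as 0."""
    total_weight = sum(weights.get(dim, 0.0) for dim in scores)
    if total_weight == 0:
        return 0.0  # assumption: all-zero weights yield a score of 0
    weighted_sum = sum(score * weights.get(dim, 0.0) for dim, score in scores.items())
    return weighted_sum / total_weight


scores = {"pollinators": 0.9, "soil": 0.4, "size": 0.2}
weights = {"pollinators": 1.0, "soil": 0.5, "size": 0.0}
final = weighted_score(scores, weights)  # (0.9*1.0 + 0.4*0.5 + 0.2*0.0) / 1.5
```

With these illustrative numbers, a zero weight on size removes that dimension from the ranking entirely, while the soil score counts half as much as the pollinator score.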
| Path | What it does |
|---|---|
| `backend/agents/graph.py` | LangGraph pipeline: 8 nodes, parallel fan-out/fan-in |
| `backend/agents/prompts.py` | Versioned system prompts for all 6 scoring agents |
| `backend/core/logging.py` | Structlog JSON logging (human-readable in dev, JSON in prod) |
| `backend/core/tracing.py` | Per-run traces written to `backend/traces/YYYYMMDD/` |
| `backend/core/resilience.py` | `@with_retry`: exponential backoff on Anthropic 429s |
| `backend/app/routers/recommendations.py` | POST + SSE stream endpoints |
| `backend/app/routers/chat.py` | Streaming species chat endpoint |
| `backend/app/services/recommendation_cache.py` | Content-hash caching; avoids re-running the full agent graph for identical requests |
| `backend/services/gbif_loader.py` | GBIF fetch + sentence-transformers embeddings |
| `backend/services/species_scorer.py` | Per-species Claude scoring (used by `--score`) |
| `frontend/components/ScoreBar.tsx` | Shared score bar + colour logic used across cards and detail pages |
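The retry behaviour listed for backend/core/resilience.py can be pictured as a decorator that doubles its wait after each rate-limit failure. This is a sketch under stated assumptions, not the project's implementation: `RateLimitError` is a hypothetical stand-in for whatever the API client raises on HTTP 429, and the parameters `max_attempts`, `base_delay`, and `sleep` are illustrative.

```python
import functools
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error an API client raises on HTTP 429."""


def with_retry(max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Retry the wrapped function on RateLimitError, doubling the delay each time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except RateLimitError:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the error
                    sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ...
        return wrapper
    return decorator
```

Passing `sleep` as a parameter keeps the decorator testable without real waiting; production code would often also add random jitter so that parallel agents do not retry in lockstep.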
Copy .env.example to .env and fill in the values before running.
| Variable | Required | Default | Description |
|---|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | (none) | Anthropic API key |
| `DATABASE_URL` | No | `postgresql://postgres:postgres@db:5432/garden_planner` | Postgres connection string |
| `GBIF_API_BASE` | No | `https://api.gbif.org/v1` | GBIF API base URL |
| `NEXT_PUBLIC_API_URL` | No | `http://localhost:8000` | API URL seen by the browser |
| `LOG_LEVEL` | No | `INFO` | `DEBUG` / `INFO` / `WARNING` |
| `LOG_FORMAT` | No | auto | `json` forces JSON logs; omit for auto-detect |
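For readers wiring these variables into their own scripts, the sketch below shows one way to read them with the documented defaults. The `Settings` class and `load_settings` function are hypothetical and are not taken from the backend; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    database_url: str
    gbif_api_base: str
    log_level: str


def load_settings(env=os.environ) -> Settings:
    """Build Settings from an environment mapping, applying the documented defaults."""
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        database_url=env.get(
            "DATABASE_URL", "postgresql://postgres:postgres@db:5432/garden_planner"
        ),
        gbif_api_base=env.get("GBIF_API_BASE", "https://api.gbif.org/v1"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Failing fast on a missing API key surfaces a misconfigured `.env` at start-up rather than on the first agent call.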
cd backend
source .venv/bin/activate
python -m pytest

Tests cover agents/graph.py (scoring, ranking, JSON parsing), services/gbif_loader.py (kingdom filtering, name extraction), app/services/recommendation_cache.py (content-hash determinism), and the FastAPI endpoints via TestClient.
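Content-hash determinism means that two requests with the same content produce the same cache key, regardless of key order. A minimal sketch of the idea follows, assuming a JSON-serialisable request payload; the `request_hash` name, the payload fields, and the choice of SHA-256 are illustrative and may not match recommendation_cache.py.

```python
import hashlib
import json


def request_hash(payload: dict) -> str:
    """SHA-256 of the payload serialised with sorted keys, so key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order (illustrative payloads).
a = request_hash({"lat": 48.1, "lon": -1.7, "weights": {"soil": 0.5, "pollinators": 1.0}})
b = request_hash({"weights": {"pollinators": 1.0, "soil": 0.5}, "lon": -1.7, "lat": 48.1})
```

Here `a` and `b` are equal because `sort_keys=True` also sorts nested dictionaries before hashing.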
cd frontend
npm test

Tests cover ScoreBar colour thresholds and the API client helpers (makeStreamUrl, createRecommendation, geocode).
The eval suite tests the full agent pipeline against 11 real-world garden scenarios across Europe and uses Claude as an automated evaluator.
# Using the shell wrapper (handles venv activation and env export)
./evals/run_evals.sh
# Single test case with verbose output
./evals/run_evals.sh --test-id tc_001 --verbose
# Rate-limit-safe run: stagger agents 10 s apart, 20 candidates, 60 s cooldown between cases
./evals/run_evals.sh --candidates 20 --stagger 10 --cooldown 60

Or call the Python script directly (note: use `set -a` to export env vars):
cd backend && source .venv/bin/activate
set -a && source ../.env && set +a
python ../evals/run_evals.py --candidates 20 --stagger 10 --cooldown 60

| Flag | Default | Description |
|---|---|---|
| `--test-id ID` | all | Run only a single test case (e.g. tc_001) |
| `--verbose` | off | Enable debug logging and show evaluator prompt details |
| `--candidates N` | 20 | Species per agent call. Use 20 for rate-limit safety; 50 for production-realistic runs |
| `--stagger SECS` | 10 | Delay between agent calls. Spreads the 6 parallel agents over SECS × 5 seconds to stay under the Anthropic output-token rate limit |
| `--cooldown SECS` | 15 | Wait between test cases |
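The arithmetic behind `--stagger` is simple: with six agents there are five gaps, so the last agent starts SECS × 5 seconds after the first. A small sketch, assuming each agent is delayed by its index times the stagger (the helper name `stagger_offsets` is hypothetical):

```python
def stagger_offsets(n_agents: int, stagger_secs: float) -> list[float]:
    """Start delay for each agent: agent i waits i * stagger_secs before its call."""
    return [i * stagger_secs for i in range(n_agents)]


offsets = stagger_offsets(6, 10)
spread = offsets[-1] - offsets[0]  # SECS × 5 for six agents
```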
The suite exits with code 1 if the pass rate falls below 70%. Reports are written to evals/reports/eval_{timestamp}.json.
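The exit-code rule can be sketched as follows. Only the 70% threshold comes from the text; the function name, the list-of-booleans input, and the treatment of an empty run are assumptions.

```python
PASS_THRESHOLD = 0.70  # from the text: exit code 1 below a 70% pass rate


def exit_code(results: list[bool], threshold: float = PASS_THRESHOLD) -> int:
    """Return 0 if the share of passing cases meets the threshold, else 1."""
    if not results:
        return 1  # assumption: an empty run counts as a failure
    pass_rate = sum(results) / len(results)
    return 0 if pass_rate >= threshold else 1
```

With 11 test cases, 8 passes (about 73%) would clear the threshold and 7 passes (about 64%) would not; a wrapper script would typically finish with `sys.exit(exit_code(results))`.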
| ID | Location | Priority focus |
|---|---|---|
| tc_001 | Brittany, France | High pollinators |
| tc_002 | Yorkshire Dales, UK | High insect hosts |
| tc_003 | Bavaria, Germany | Soil health |
| tc_004 | Amsterdam, Netherlands | Environmental resilience |
| tc_005 | Connemara, Ireland | Food & medicinal utility |
| tc_006 | Seville, Spain | All dimensions balanced |
| tc_007 | Kiruna, Sweden | Subarctic conditions |
| tc_008 | Scottish Highlands | Extreme pollinator focus |
| tc_009 | Strasbourg, France | Soil only (near-zero other) |
| tc_010 | Bastogne, Belgium | Pollinators + food utility |
| tc_011 | Paris, France | Compact / container garden (high size priority) |
jardin/
├── backend/
│   ├── agents/
│   │   ├── graph.py            # LangGraph multi-agent pipeline (6 scoring agents)
│   │   ├── prompts.py          # Versioned system prompts (v1.0.0)
│   │   └── runner.py           # run_garden_planner(): programmatic wrapper
│   ├── app/
│   │   ├── main.py             # FastAPI app + /health endpoint
│   │   ├── models/             # SQLAlchemy models (species, garden_request, recommendation)
│   │   ├── db/                 # Database session / connection setup
│   │   ├── services/
│   │   │   └── recommendation_cache.py   # Content-hash cache logic
│   │   └── routers/
│   │       ├── recommendations.py
│   │       └── chat.py
│   ├── core/
│   │   ├── logging.py          # Structlog JSON setup
│   │   ├── tracing.py          # Node traces → backend/traces/
│   │   └── resilience.py       # @with_retry decorator
│   ├── services/
│   │   ├── gbif_loader.py      # GBIF + embeddings
│   │   └── species_scorer.py   # Per-species Claude scoring
│   ├── scripts/
│   │   └── load_species.py     # Data loading CLI
│   ├── tests/
│   │   ├── conftest.py         # Shared fixtures and path bootstrap
│   │   ├── test_graph.py       # _safe_score, _parse_json_response, merge_and_rank
│   │   ├── test_gbif_loader.py
│   │   ├── test_recommendation_cache.py
│   │   └── test_api_integration.py
│   └── alembic/                # DB migrations
├── frontend/
│   ├── app/
│   │   ├── page.tsx            # 3-step garden wizard
│   │   ├── results/[requestId]/page.tsx
│   │   └── species/[id]/page.tsx
│   ├── components/
│   │   ├── ScoreBar.tsx        # Shared score bar + colour logic
│   │   ├── SpeciesCard.tsx
│   │   └── PrioritySlider.tsx
│   └── __tests__/
│       ├── scoreBar.test.ts
│       └── api.test.ts
├── evals/
│   ├── test_cases.json         # 11 test cases
│   ├── run_evals.py            # Eval harness (direct graph invocation, no HTTP)
│   └── run_evals.sh            # Shell wrapper (handles venv + env export)
└── docker-compose.yml