🌿 jardin

AI-powered native plant recommendations. Tell jardin where your garden is and what you care about (pollinators, soil health, food, biodiversity), and it returns a ranked list of native species scored by six specialist AI agents running in parallel.


Running with Docker Compose

Requirements: Docker Desktop (or Docker + Compose plugin), an Anthropic API key.

git clone <repo-url> && cd jardin
cp .env.example .env
# Open .env and set ANTHROPIC_API_KEY=sk-ant-...
docker compose up --build

| Service | URL |
| --- | --- |
| App | http://localhost:3000 |
| API | http://localhost:8000 |
| API docs | http://localhost:8000/docs |
| Health | http://localhost:8000/health |

The first build downloads the BAAI/bge-base-en-v1.5 embedding model (~400 MB). Subsequent starts use the cached image.

Seed the species database

The app needs species data to make recommendations. Run this after docker compose up:

# Load 200 French native species and AI-score them
docker compose exec backend python -m scripts.load_species --country FR --limit 200 --score

# Add more countries
docker compose exec backend python -m scripts.load_species \
  --country GB --country DE --country IE --limit 300 --score

The --score flag calls Claude to assign ecological scores. Without it, placeholder scores (0.5) are stored; the recommendation pipeline still works, but the rankings will be less meaningful.

Stop / reset

docker compose down          # stop containers, keep database volume
docker compose down -v       # stop and delete the database volume (full reset)

Architecture

Browser  →  Next.js (port 3000)  →  FastAPI (port 8000)  →  Postgres + pgvector
                                             │
                                 POST /api/recommendations
                                             │
                                      Background task
                                             │
                              ┌────── fetch_candidates ──────┐
                              │    top-50 species from DB    │
                              └──────────────┬───────────────┘
                                             │  fan-out (parallel)
             ┌────────────┬────────────┬─────┴──────┬────────────┬────────────┐
             ▼            ▼            ▼            ▼            ▼            ▼
        Pollinators    Insects        Soil     Environment Food/Utility      Size
           agent        agent        agent        agent        agent        agent
             └────────────┴────────────┴─────┬──────┴────────────┴────────────┘
                                             │  fan-in
                                      merge_and_rank
                                    (top-20, weighted)
                                             │
                                   SSE stream → browser

Results stream to the browser over Server-Sent Events (SSE) as each agent finishes. Each species detail page has a streaming chat interface backed by the same Claude model.
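
The fan-out/fan-in shape in the diagram can be sketched with plain asyncio. This is not the project's LangGraph code: `score_with_agent`, the dimension keys, and the constant 0.5 score are placeholders chosen for illustration, and the real agents call Claude instead of returning a fixed value.

```python
import asyncio

# Hypothetical keys for the six scoring dimensions.
DIMENSIONS = ["pollinators", "insects", "soil", "environment", "food_utility", "size"]


async def score_with_agent(dimension: str, candidates: list[str]) -> dict[str, float]:
    """Stand-in for one scoring agent; returns a placeholder score per species."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {species: 0.5 for species in candidates}


async def fan_out_fan_in(candidates: list[str]) -> dict[str, dict[str, float]]:
    # Fan-out: run all six agents concurrently on the same candidate list.
    results = await asyncio.gather(
        *(score_with_agent(dim, candidates) for dim in DIMENSIONS)
    )
    # Fan-in: regroup per-dimension results into one score dict per species.
    merged: dict[str, dict[str, float]] = {species: {} for species in candidates}
    for dim, scores in zip(DIMENSIONS, results):
        for species, value in scores.items():
            merged[species][dim] = value
    return merged


scores = asyncio.run(fan_out_fan_in(["Quercus robur", "Trifolium pratense"]))
```

`asyncio.gather` returns results in the order the coroutines were passed, which is why zipping them back against `DIMENSIONS` is safe.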

Scoring dimensions

| Dimension | What the agent evaluates |
| --- | --- |
| Pollinators | Value to bees, butterflies, moths, hoverflies |
| Insects | Host plant for caterpillars, beetles, other invertebrates |
| Soil | Nitrogen fixation, mycorrhizal networks, organic matter |
| Environment | Carbon sequestration, erosion control, microclimate |
| Food / utility | Edible parts, medicinal uses, practical harvests |
| Size | Suitability for compact spaces and containers (high = compact, low = large tree) |

Each priority can be weighted 0–1 by the user; the final score is a weighted average.
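
As a rough illustration of that weighted average, here is a minimal sketch. The function names, the dimension keys, and the handling of an all-zero weight set are assumptions for this example, not the project's actual `merge_and_rank` code.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores, using the user's 0-1 priorities."""
    total_weight = sum(weights.get(dim, 0.0) for dim in scores)
    if total_weight == 0:
        return 0.0  # assumption: with no priorities set there is nothing to rank on
    weighted_sum = sum(scores[dim] * weights.get(dim, 0.0) for dim in scores)
    return weighted_sum / total_weight


def rank(species_scores: dict[str, dict[str, float]],
         weights: dict[str, float],
         top_n: int = 20) -> list[tuple[str, float]]:
    """Return the top_n species sorted by weighted score, highest first."""
    scored = [(name, weighted_score(s, weights)) for name, s in species_scores.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


weights = {"pollinators": 1.0, "soil": 0.5}
species_scores = {
    "species_a": {"pollinators": 0.9, "soil": 0.4},  # (0.9*1.0 + 0.4*0.5) / 1.5
    "species_b": {"pollinators": 0.2, "soil": 1.0},  # (0.2*1.0 + 1.0*0.5) / 1.5
}
ranking = rank(species_scores, weights)
```

Dividing by the sum of the weights keeps the result on the same 0 to 1 scale as the individual dimension scores, whatever priorities the user picks.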

Key components

| Path | What it does |
| --- | --- |
| backend/agents/graph.py | LangGraph pipeline: 8 nodes, parallel fan-out/fan-in |
| backend/agents/prompts.py | Versioned system prompts for all 6 scoring agents |
| backend/core/logging.py | Structlog logging (human-readable in dev, JSON in prod) |
| backend/core/tracing.py | Per-run traces written to backend/traces/YYYYMMDD/ |
| backend/core/resilience.py | @with_retry: exponential backoff on Anthropic 429s |
| backend/app/routers/recommendations.py | POST + SSE stream endpoints |
| backend/app/routers/chat.py | Streaming species chat endpoint |
| backend/app/services/recommendation_cache.py | Content-hash caching; avoids re-running the full agent graph for identical requests |
| backend/services/gbif_loader.py | GBIF fetch + sentence-transformers embeddings |
| backend/services/species_scorer.py | Per-species Claude scoring (used by --score) |
| frontend/components/ScoreBar.tsx | Shared score bar + colour logic used across cards and detail pages |
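
To illustrate the exponential backoff idea behind @with_retry, here is a minimal sketch. Only the decorator name comes from this README; the parameters, the jitter, and the `RateLimited` exception (a stand-in for an HTTP 429 from the API client) are assumptions, and the real implementation in backend/core/resilience.py may differ.

```python
import functools
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by the API client."""


def with_retry(max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped function on RateLimited, doubling the delay each time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except RateLimited:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the error to the caller
                    # Delays grow as base, 2*base, 4*base, ... plus up to 10% jitter.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, 0.1 * delay))
        return wrapper
    return decorator


# Usage: record the delays instead of sleeping, and fail twice before succeeding.
delays = []
calls = {"n": 0}


@with_retry(max_attempts=4, base_delay=1.0, sleep=delays.append)
def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "ok"


result = flaky_call()
```

Passing `sleep` as a parameter is a testing convenience in this sketch: it lets the retry schedule be inspected without actually waiting.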

Environment variables

Copy .env.example to .env and fill in the values before running.

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| ANTHROPIC_API_KEY | Yes | (none) | Anthropic API key |
| DATABASE_URL | No | postgresql://postgres:postgres@db:5432/garden_planner | Postgres connection string |
| GBIF_API_BASE | No | https://api.gbif.org/v1 | GBIF API base URL |
| NEXT_PUBLIC_API_URL | No | http://localhost:8000 | API URL seen by the browser |
| LOG_LEVEL | No | INFO | DEBUG / INFO / WARNING |
| LOG_FORMAT | No | auto | Set to json to force JSON logs; omit for auto-detect |
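
For readers who want to see how these variables and defaults fit together, the following sketch reads the backend-side ones from a mapping. The `Settings` class and `load_settings` function are hypothetical names for this example; the project may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    database_url: str
    gbif_api_base: str
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, applying the defaults from the table."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key:
        # ANTHROPIC_API_KEY is the only variable marked as required.
        raise RuntimeError("ANTHROPIC_API_KEY must be set")
    return Settings(
        anthropic_api_key=api_key,
        database_url=env.get(
            "DATABASE_URL",
            "postgresql://postgres:postgres@db:5432/garden_planner",
        ),
        gbif_api_base=env.get("GBIF_API_BASE", "https://api.gbif.org/v1"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


# Example with an explicit mapping instead of the real process environment.
settings = load_settings({"ANTHROPIC_API_KEY": "sk-ant-example", "LOG_LEVEL": "DEBUG"})
```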

Running tests

Backend (pytest)

cd backend
source .venv/bin/activate
python -m pytest

Tests cover agents/graph.py (scoring, ranking, JSON parsing), services/gbif_loader.py (kingdom filtering, name extraction), app/services/recommendation_cache.py (content-hash determinism), and the FastAPI endpoints via TestClient.
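
The content-hash determinism mentioned above can be illustrated with a small sketch: serialise the request in a canonical form, then hash it. The function name and the payload fields are assumptions for this example, not the actual code in recommendation_cache.py.

```python
import hashlib
import json


def request_cache_key(payload: dict) -> str:
    """Hash a request so that logically identical payloads share one key."""
    # sort_keys and fixed separators give a canonical JSON form, so the
    # order of keys in the incoming request does not change the hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order: both produce the same cache key.
key_a = request_cache_key(
    {"lat": 48.1, "lon": -1.7, "priorities": {"soil": 0.8, "size": 0.2}}
)
key_b = request_cache_key(
    {"priorities": {"size": 0.2, "soil": 0.8}, "lon": -1.7, "lat": 48.1}
)
# Changing any value produces a different key.
key_c = request_cache_key(
    {"lat": 48.1, "lon": -1.7, "priorities": {"soil": 0.9, "size": 0.2}}
)
```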

Frontend (vitest)

cd frontend
npm test

Tests cover ScoreBar colour thresholds and the API client helpers (makeStreamUrl, createRecommendation, geocode).


Running evals

The eval suite tests the full agent pipeline against 11 real-world garden scenarios across Europe and uses Claude as an automated evaluator.

# Using the shell wrapper (handles venv activation and env export)
./evals/run_evals.sh

# Single test case with verbose output
./evals/run_evals.sh --test-id tc_001 --verbose

# Rate-limit-safe run: stagger agents 10 s apart, 20 candidates, 60 s cooldown between cases
./evals/run_evals.sh --candidates 20 --stagger 10 --cooldown 60

Or call the Python script directly. Use set -a so that the variables sourced from .env are exported to the Python process:

cd backend && source .venv/bin/activate
set -a && source ../.env && set +a
python ../evals/run_evals.py --candidates 20 --stagger 10 --cooldown 60

Eval CLI flags

| Flag | Default | Description |
| --- | --- | --- |
| --test-id ID | all | Run only a single test case (e.g. tc_001) |
| --verbose | off | Enable debug logging and show evaluator prompt details |
| --candidates N | 20 | Species per agent call. Use 20 for rate-limit safety; 50 for production-realistic runs |
| --stagger SECS | 10 | Delay between agent calls. Spreads the 6 parallel agents over SECS × 5 seconds to stay under the Anthropic output-token rate limit |
| --cooldown SECS | 15 | Wait between test cases |
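
The "SECS × 5" figure for --stagger follows from giving each of the 6 agents a start offset one stagger interval after the previous one. A small sketch of that arithmetic, with `stagger_offsets` as a hypothetical helper name:

```python
def stagger_offsets(agent_count: int, stagger_secs: int) -> list[int]:
    """Start offset in seconds for each agent: 0, SECS, 2*SECS, ..."""
    return [i * stagger_secs for i in range(agent_count)]


offsets = stagger_offsets(agent_count=6, stagger_secs=10)
# The first agent starts immediately; the last one starts after 5 intervals.
spread = offsets[-1] - offsets[0]
```

With the default of 10 seconds, the six start times are 0, 10, 20, 30, 40 and 50 seconds, so the last agent starts 50 seconds (10 × 5) after the first.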

The suite exits with code 1 if the pass rate falls below 70%. Reports are written to evals/reports/eval_{timestamp}.json.
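
The pass-rate gate can be pictured as follows. This is a sketch of the rule stated above, not the harness code: `exit_code` is a hypothetical name, and treating an empty run as a failure is an assumption.

```python
PASS_THRESHOLD = 0.70


def exit_code(results: list[bool], threshold: float = PASS_THRESHOLD) -> int:
    """Return 0 if the share of passing cases meets the threshold, else 1."""
    if not results:
        return 1  # assumption: a run with no cases counts as a failure
    pass_rate = sum(results) / len(results)
    return 0 if pass_rate >= threshold else 1


# With 11 test cases, 8 passes (about 72.7%) clears the bar; 7 (about 63.6%) does not.
code_8_of_11 = exit_code([True] * 8 + [False] * 3)
code_7_of_11 = exit_code([True] * 7 + [False] * 4)
```

A wrapper script would typically pass this value to `sys.exit` so that CI can fail the job on a low pass rate.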

Test case coverage

| ID | Location | Priority focus |
| --- | --- | --- |
| tc_001 | Brittany, France | High pollinators |
| tc_002 | Yorkshire Dales, UK | High insect hosts |
| tc_003 | Bavaria, Germany | Soil health |
| tc_004 | Amsterdam, Netherlands | Environmental resilience |
| tc_005 | Connemara, Ireland | Food & medicinal utility |
| tc_006 | Seville, Spain | All dimensions balanced |
| tc_007 | Kiruna, Sweden | Subarctic conditions |
| tc_008 | Scottish Highlands | Extreme pollinator focus |
| tc_009 | Strasbourg, France | Soil only (near-zero other) |
| tc_010 | Bastogne, Belgium | Pollinators + food utility |
| tc_011 | Paris, France | Compact / container garden (high size priority) |

Project layout

jardin/
├── backend/
│   ├── agents/
│   │   ├── graph.py          # LangGraph multi-agent pipeline (6 scoring agents)
│   │   ├── prompts.py        # Versioned system prompts (v1.0.0)
│   │   └── runner.py         # run_garden_planner(): programmatic wrapper
│   ├── app/
│   │   ├── main.py           # FastAPI app + /health endpoint
│   │   ├── models/           # SQLAlchemy models (species, garden_request, recommendation)
│   │   ├── db/               # Database session / connection setup
│   │   ├── services/
│   │   │   └── recommendation_cache.py  # Content-hash cache logic
│   │   └── routers/
│   │       ├── recommendations.py
│   │       └── chat.py
│   ├── core/
│   │   ├── logging.py        # Structlog JSON setup
│   │   ├── tracing.py        # Node traces → backend/traces/
│   │   └── resilience.py     # @with_retry decorator
│   ├── services/
│   │   ├── gbif_loader.py    # GBIF + embeddings
│   │   └── species_scorer.py # Per-species Claude scoring
│   ├── scripts/
│   │   └── load_species.py   # Data loading CLI
│   ├── tests/
│   │   ├── conftest.py       # Shared fixtures and path bootstrap
│   │   ├── test_graph.py     # _safe_score, _parse_json_response, merge_and_rank
│   │   ├── test_gbif_loader.py
│   │   ├── test_recommendation_cache.py
│   │   └── test_api_integration.py
│   └── alembic/              # DB migrations
├── frontend/
│   ├── app/
│   │   ├── page.tsx          # 3-step garden wizard
│   │   ├── results/[requestId]/page.tsx
│   │   └── species/[id]/page.tsx
│   ├── components/
│   │   ├── ScoreBar.tsx      # Shared score bar + colour logic
│   │   ├── SpeciesCard.tsx
│   │   └── PrioritySlider.tsx
│   └── __tests__/
│       ├── scoreBar.test.ts
│       └── api.test.ts
├── evals/
│   ├── test_cases.json       # 11 test cases
│   ├── run_evals.py          # Eval harness (direct graph invocation, no HTTP)
│   └── run_evals.sh          # Shell wrapper (handles venv + env export)
└── docker-compose.yml
