Donna — AI Personal Assistant

Donna is a self-hosted AI assistant that manages tasks, schedules, and reminders — and pursues you until things get done. Named after Donna Paulsen from Suits: sharp, proactive, never lets anything slip.

Built solo over 10 weeks (1,000+ commits). The system runs 10 Docker containers on a homelab server, routes work between a cloud LLM (Claude) and a local LLM (Ollama on an RTX 3090), and exposes a full admin console for observability and control.

Why This Exists

I forget to capture tasks, rarely check task lists, and don't schedule time to do work. Most productivity tools are passive — they wait for you to open them. Donna is the opposite: she escalates through Discord, SMS, and phone calls until you respond. She reschedules when you miss deadlines, prepares research before meetings, and learns your preferences from corrections over time.

What's Running

The system is live in production on a homelab Linux server. Here's what's built:

Core Engine (~61,000 lines Python)

Natural language task capture — say "call the mechanic Monday" via Discord or SMS, and Donna parses it into a structured task with deadline, priority, and domain
Config-driven state machine — 22 YAML config files control task types, state transitions, model routing, escalation rules, and agent behavior. Zero hardcoded business logic
4-tier notification escalation — Discord DM → SMS → automated phone call (Twilio TTS) → human escalation. Configurable cadence and cooldowns
Hybrid LLM routing — cloud model (Claude Sonnet) for reasoning-heavy work, local model (Qwen 32B on RTX 3090) for classification and parsing. Config-driven routing with automatic fallback
Shadow mode evaluation — run a secondary model in parallel on production traffic, compare outputs with Claude-as-judge, track win/loss rates before promoting a model
Skill system with trust progression — capabilities start in sandbox, graduate to trusted after passing quality gates. Automations run on cron schedules (e.g., daily product price watches)
Agent framework — PM Agent triages and routes work, Scheduler Agent manages the calendar, Prep Agent does research before deadlines, Challenger Agent stress-tests plans
Preference learning — corrections logged and surfaced as rules ("Nick prefers morning slots for deep work"), applied to future task scheduling
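
The 4-tier escalation ladder above can be sketched as a small lookup over elapsed time. This is a minimal illustration, not the project's code: the `Tier` class, the channel names, and the wait times are all hypothetical, and the real cadence would come from the YAML config.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    channel: str
    wait: timedelta  # how long to wait for a response before escalating


# Hypothetical cadence, for illustration only.
LADDER = [
    Tier("discord_dm", timedelta(minutes=30)),
    Tier("sms", timedelta(hours=1)),
    Tier("phone_call", timedelta(hours=2)),
    Tier("human", timedelta(0)),  # final tier: nothing left to escalate to
]


def current_tier(elapsed: timedelta, ladder=LADDER) -> Tier:
    """Return the tier that should be active `elapsed` after the first notification."""
    deadline = timedelta(0)
    for tier in ladder[:-1]:
        deadline += tier.wait
        if elapsed < deadline:
            return tier
    return ladder[-1]
```

With these assumed values, a task that has been ignored for 45 minutes sits on the SMS tier, and anything past 3.5 hours lands on human escalation.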

Admin Console (~18,000 lines TypeScript/React)

A 16-page operations dashboard with a dark gold/black theme:

Page	What It Does
Dashboard	Budget tracking (daily/monthly), cost breakdown by model and task type, LLM gateway health, parse accuracy, task throughput
Chat	Conversational interface to Donna with session management and quick-chat overlay
Tasks	Filterable task table with status, domain, priority, agent assignment, CSV export
Calendar	Weekly view merging Google Calendar events with Donna-scheduled tasks
Logs	Structured event viewer with 30+ event types organized by category, date range filters, Loki integration, trace correlation
Agents	Agent activity feed — dispatches, completions, failures, interrogation history
Configs	Live YAML editor for all 22 config files with validation and diff preview
Prompts	Browse, edit, and audit all 23 externalized LLM prompt templates with variable inspection
Shadow	Side-by-side model comparison dashboard — quality over time, cost savings, spot checks
Skill System	Capability registry with trust states, automation scheduler, run history, GPU status
Claude Inspector	Every Claude API call logged — cost per call, latency, token counts, prompt/response inspection
Escalations	Track notification escalation chains in progress and their outcomes
Preferences	Correction log and extracted preference rules with confidence scores
LLM Gateway	Queue depth, active requests, circuit breaker state, provider health
Vault	Encrypted credential store with commit history

Infrastructure

10 Docker containers managed via multi-file Compose (core, API, UI, orchestrator, Ollama, monitoring stack, browser sidecar, reverse proxy)
Observability stack — Grafana + Loki + Promtail for centralized logging; structured logs on every LLM call with cost, latency, and token tracking
51 Alembic migrations — fully versioned schema evolution from day one
SQLite on NVMe (WAL mode) as the primary store, with writes replicated asynchronously to Supabase Postgres
Caddy reverse proxy with automatic HTTPS
415 test files covering unit and integration tests
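
As a rough sketch of the storage pattern listed above, the snippet below opens a SQLite file in WAL mode and queues each committed write for later replication. The `tasks` table, the `write_task` helper, and the in-process queue are assumptions made for the example; the real schema is managed by Alembic, and the worker that would drain the queue into Postgres is not shown.

```python
import queue
import sqlite3
from pathlib import Path

# Hypothetical hand-off point: a background worker would drain this queue
# and apply each change to the Postgres replica.
replication_queue: queue.Queue = queue.Queue()


def open_primary(db_path: Path) -> sqlite3.Connection:
    """Open the primary SQLite store with write-ahead logging enabled."""
    conn = sqlite3.connect(db_path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode != "wal":
        raise RuntimeError(f"could not enable WAL, journal_mode is {mode!r}")
    # Illustrative schema only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn


def write_task(conn: sqlite3.Connection, title: str) -> None:
    """Commit locally first, then enqueue the change for the replica."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    replication_queue.put({"table": "tasks", "title": title})
```

The local commit happens before the enqueue, so a slow or unreachable replica never blocks a write to the primary store.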

Architecture

                    ┌─────────────────────────────────────────────┐
                    │              Admin Console (React)           │
                    │         localhost:8400 — 16 pages            │
                    └────────────────────┬────────────────────────┘
                                         │
                    ┌────────────────────▼────────────────────────┐
                    │            FastAPI REST Backend              │
                    │        localhost:8200 — auth, CRUD           │
                    └────────────────────┬────────────────────────┘
                                         │
       ┌─────────────────────────────────▼─────────────────────────────────┐
       │                        Orchestrator                                │
       │   Task routing · State machine · Agent dispatch · Scheduling       │
       │                     localhost:8100                                  │
       └──┬──────────┬──────────┬──────────┬──────────┬──────────┬─────────┘
          │          │          │          │          │          │
     ┌────▼───┐ ┌───▼────┐ ┌──▼───┐ ┌───▼────┐ ┌──▼───┐ ┌───▼────────┐
     │ Claude │ │ Ollama │ │ Dis- │ │ Twilio │ │Gmail │ │  Google    │
     │  API   │ │ RTX    │ │ cord │ │SMS/TTS │ │ API  │ │ Calendar   │
     │(Sonnet)│ │ 3090   │ │ Bot  │ │        │ │      │ │            │
     └────────┘ └────────┘ └──────┘ └────────┘ └──────┘ └────────────┘

Every LLM call goes through a model abstraction layer (complete(prompt, schema, model_alias)) that handles routing, fallback, structured output validation, and cost tracking. Models never call tools directly — the orchestrator validates and executes all tool proposals.
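
To make the abstraction layer concrete, here is a minimal sketch of what a `complete(prompt, schema, model_alias)` entry point could look like. Only the function name and its three parameters come from the description above. The `Gateway` and `Route` classes, the flat per-call cost, and the simplified schema format (field name to Python type, standing in for JSON Schema) are assumptions for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

Provider = Callable[[str], str]  # prompt -> raw JSON text (hypothetical interface)


@dataclass
class Route:
    primary: str
    fallback: Optional[str] = None


def _validate(data: dict, schema: dict) -> None:
    """Check that every schema field is present with the expected type."""
    for key, expected in schema.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")


@dataclass
class Gateway:
    providers: dict
    routes: dict
    cost_per_call: dict
    spent: float = 0.0

    def complete(self, prompt: str, schema: dict, model_alias: str) -> dict:
        route = self.routes[model_alias]
        last_error: Optional[Exception] = None
        for name in (route.primary, route.fallback):
            if name is None:
                continue
            try:
                raw = self.providers[name](prompt)
                self.spent += self.cost_per_call.get(name, 0.0)  # cost tracking
                data = json.loads(raw)
                _validate(data, schema)  # structured output validation
                return data
            except (ConnectionError, ValueError) as exc:
                last_error = exc  # fall through to the next provider
        raise RuntimeError(f"all providers failed for alias {model_alias!r}") from last_error
```

Callers only ever name an alias such as `"parser"`. Which concrete model answers, and what happens when it is down, stays inside the gateway.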

Tech Stack

Layer	Technology
Backend	Python 3.12, asyncio, FastAPI, structlog
Frontend	TypeScript, React, Vite, Recharts
Cloud LLM	Claude API (Sonnet)
Local LLM	Ollama + Qwen 32B (RTX 3090, Q6_K quantization)
Database	SQLite (WAL mode, NVMe) + Supabase Postgres replica
Migrations	Alembic + SQLAlchemy
Messaging	Discord.py, Twilio (SMS + Voice), Gmail API
Observability	Grafana, Loki, Promtail, structured logging
Deployment	Docker Compose (multi-file), Caddy
Testing	pytest (415 test files)
Docs	MkDocs + mkdocstrings (auto-generated API reference)

Project Stats

Metric	Value
Python source	61,000 lines across 305 modules
TypeScript/React	18,000 lines across 161 files
Test code	64,000 lines across 415 files
Commits	1,000+ over ~10 weeks
DB migrations	51 Alembic versions
Config files	22 YAML configs
Prompt templates	23 externalized prompts
JSON schemas	28 structured output schemas
Docker services	10 containers
Admin console pages	16

Key Design Decisions

Config over code. Model routing, task types, state transitions, escalation rules, and prompt templates are all externalized to YAML/JSON. The application reads config at startup; changing behavior means editing a config file, not deploying code.
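
A small sketch of the idea: the state machine below knows nothing about tasks beyond what its config declares. The state names and the `TaskStateMachine` class are hypothetical, and the config is inlined as JSON so the example runs on the standard library alone; in the project it would live in one of the YAML files.

```python
import json

# Hypothetical transition table, inlined for the example.
CONFIG_TEXT = """
{
  "initial": "captured",
  "transitions": {
    "captured": ["scheduled", "cancelled"],
    "scheduled": ["in_progress", "missed", "cancelled"],
    "missed": ["scheduled", "cancelled"],
    "in_progress": ["done", "scheduled"],
    "done": [],
    "cancelled": []
  }
}
"""


class TaskStateMachine:
    def __init__(self, config: dict):
        self.transitions = {s: set(t) for s, t in config["transitions"].items()}
        self.state = config["initial"]
        # Fail fast on configs that reference undeclared states.
        known = set(self.transitions)
        targets = {t for ts in self.transitions.values() for t in ts}
        unknown = (targets | {self.state}) - known
        if unknown:
            raise ValueError(f"config references undeclared states: {sorted(unknown)}")

    def advance(self, new_state: str) -> None:
        """Move to `new_state` only if the config allows it from the current state."""
        if new_state not in self.transitions[self.state]:
            raise ValueError(f"illegal transition {self.state!r} -> {new_state!r}")
        self.state = new_state
```

Adding a new state or allowing a new transition is then a config edit; the class itself does not change.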

Safety-first agent autonomy. Agents start with minimal permissions. Email is draft-only. Code goes to feature branches only. Trust is earned through a sandbox → trusted progression with quality gates.
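
The sandbox → trusted progression could look something like the sketch below. The gate thresholds (`MIN_RUNS`, `MIN_PASS_RATE`), the `Skill` fields, and the rule that sandboxed skills may only perform side-effect-free actions are assumptions chosen for the example, not the project's actual quality gates.

```python
from dataclasses import dataclass
from enum import Enum


class Trust(Enum):
    SANDBOX = "sandbox"
    TRUSTED = "trusted"


# Hypothetical quality gates.
MIN_RUNS = 20
MIN_PASS_RATE = 0.95


@dataclass
class Skill:
    name: str
    trust: Trust = Trust.SANDBOX
    runs: int = 0
    passes: int = 0

    def record_run(self, passed: bool) -> None:
        self.runs += 1
        self.passes += int(passed)


def evaluate_promotion(skill: Skill) -> Trust:
    """Promote a sandboxed skill once it has enough runs at a high enough pass rate."""
    if (
        skill.trust is Trust.SANDBOX
        and skill.runs >= MIN_RUNS
        and skill.passes / skill.runs >= MIN_PASS_RATE
    ):
        skill.trust = Trust.TRUSTED
    return skill.trust


def may_execute(skill: Skill, has_side_effects: bool) -> bool:
    """Sandboxed skills may only run actions without side effects."""
    return skill.trust is Trust.TRUSTED or not has_side_effects
```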

Structured logging on every model call. Every LLM invocation logs task type, model, latency, tokens, cost, and output. The Claude Inspector page makes this queryable. Budget tracking is real-time with daily and monthly thresholds.
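
One way to picture this is a single JSON event per model call, with the running spend attached. The project uses structlog; the sketch below sticks to the standard library so it runs anywhere, and the field names, the `LLMCallRecord` class, and the `BudgetTracker` class are illustrative assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LLMCallRecord:
    task_type: str
    model: str
    latency_ms: int
    tokens_in: int
    tokens_out: int
    cost_usd: float


class BudgetTracker:
    def __init__(self, daily_limit_usd: float):
        self.daily_limit_usd = daily_limit_usd
        self.spent_today_usd = 0.0

    def record(self, call: LLMCallRecord) -> str:
        """Add the call's cost to today's spend and return one JSON log line."""
        self.spent_today_usd += call.cost_usd
        event = {
            **asdict(call),
            "event": "llm_call",
            "spent_today_usd": round(self.spent_today_usd, 6),
            "over_budget": self.spent_today_usd > self.daily_limit_usd,
        }
        return json.dumps(event, sort_keys=True)
```

Because every line carries the same keys, a log store such as Loki can filter by `model` or `task_type` and sum `cost_usd` without any parsing beyond JSON.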

Hybrid local/cloud routing. Classification and parsing run on the local RTX 3090 to minimize API costs. Reasoning-heavy tasks (agent work, decomposition, scheduling) route to Claude. Shadow mode lets you validate a model swap on live traffic before committing.
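
Shadow mode can be sketched as a thin wrapper that always answers with the primary model while tallying a judge's verdict on the candidate. The `ShadowEvaluator` class and the verdict labels are hypothetical, the models and judge are plain callables, and the shadow call runs sequentially here for simplicity rather than in parallel.

```python
from collections import Counter
from typing import Callable

Model = Callable[[str], str]
# (prompt, primary_output, shadow_output) -> "primary" | "shadow" | "tie"
Judge = Callable[[str, str, str], str]


class ShadowEvaluator:
    def __init__(self, primary: Model, shadow: Model, judge: Judge):
        self.primary = primary
        self.shadow = shadow
        self.judge = judge
        self.tally: Counter = Counter()

    def handle(self, prompt: str) -> str:
        """Serve the primary model's output; score the shadow model on the side."""
        primary_out = self.primary(prompt)
        try:
            shadow_out = self.shadow(prompt)
            self.tally[self.judge(prompt, primary_out, shadow_out)] += 1
        except Exception:
            # A failing shadow model must never affect the user-facing answer.
            self.tally["shadow_error"] += 1
        return primary_out

    def shadow_win_rate(self) -> float:
        """Share of decided comparisons (ties excluded) won by the shadow model."""
        decided = self.tally["primary"] + self.tally["shadow"]
        return self.tally["shadow"] / decided if decided else 0.0
```

A promotion rule could then be as simple as requiring a minimum number of decided comparisons and a win rate above some threshold before swapping the routing config.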

Running It

# Clone and bootstrap
git clone <repo-url> donna && cd donna
./scripts/bootstrap.sh

# Or step by step
./scripts/bootstrap.sh --no-setup
source .venv/bin/activate
donna setup --phase 1

# Dev mode
donna run --dev

# Docker (production)
./scripts/donna-up.sh --with-monitoring

Documentation

Full documentation site: nfeuer.github.io/donna

Built from docs/ via MkDocs with auto-generated API reference. Deployed on every push to main.

