A local-first personal AI coaching system β menu bar app that runs structured morning briefings and evening check-ins, integrating live calendar data, goal tracking, versioned session history, and on-device speech recognition.
Built to explore AI engineering patterns: multi-provider LLM abstraction, structured output extraction, context assembly, and local inference. Every major component is validated in Jupyter notebooks before landing in the backend.
Morning briefing β At session start, the backend pre-computes calendar issues (conflicts, overload, goal gaps) and injects them as structured data into the system prompt. The coach opens with an issue-aware greeting and a focused question β not a generic "how are you?". Sessions end with a dual-LLM post-processing step: one call extracts structured JSON (goal updates, open threads, summary), a second writes a first-person journal entry.
Evening check-in β Reviews the day against active goals and unresolved follow-up threads. Surfaces wins, blockers, and action items. Updates goal progress in the database. Journal entries are written to disk as markdown.
Goal tracking β Goals have horizons (weekly, monthly, custom), progress states, and immutable version history. Text edits deactivate the old row and insert a new one with version+1 and parent_id β full audit trail without a separate history table.
Follow-up threads β Unresolved action items live as first-class database rows, optionally linked to goals. Surfaced to the LLM on every future session. Resolved explicitly, never auto-archived.
Calendar integration β Reads from Google Calendar (parallel per-calendar fetches) or macOS Calendar.app via AppleScript bridge. Computes free blocks (merging overlapping events, 15min minimum gap, 8amβ10pm window). Can create, move, and delete events.
On-device STT β Voice input via Qwen3-ASR (0.6B, MLX-optimized for Apple Silicon). No cloud dependency, no API key. Loaded once on first use, kept in process memory. WAV encoded client-side from raw PCM.
ββββββββββββββββββββββββββββββββββββββββββββββ
β Electron (menu bar tray app) β
β ββ main.js β window + tray lifecycle
β ββ renderer/ (React 19 + Vite 6) β
β ββ SessionOverlay.jsx β audio capture + chat UI
β ββ Goals.jsx β goal CRUD + threads
β ββ App.jsx β tab router β
ββββββββββββββββββββββ¬ββββββββββββββββββββββββ
β HTTP (localhost:8000)
ββββββββββββββββββββββΌββββββββββββββββββββββββ
β FastAPI + Uvicorn (Python 3.12) β
β ββ /session/* session state machine β
β ββ /goals/* goal CRUD + threads β
β ββ /calendar/* schedule + actions β
β ββ /transcribe STT endpoint β
ββββββββββββ¬βββββββββββββββββββ¬βββββββββββββββ
β β
ββββββββΌβββββββ βββββββββΌβββββββββββββββββ
β SQLite 3 β β LLM Provider Layer β
β WAL mode β β ββ GeminiProvider β
β 3 tables β β ββ OpenAICompatible β
βββββββββββββββ β (OpenAI, Groq, β
β Ollama, LMStudio) β
ββββββββββββββββββββββββββ
ββββββββββββββββββββββββββ
β Calendar Layer β
β ββ Google Calendar API β
β ββ AppleScript bridge β
ββββββββββββββββββββββββββ
ββββββββββββββββββββββββββ
β STT: Qwen3-ASR (MLX) β
ββββββββββββββββββββββββββ
All coach logic calls a single abstract LLMProvider interface (backend/llm/base.py). Concrete providers:
GeminiProviderβ google-genai SDK, runs sync in thread pool executor (SDK not async-native), role translation (assistantβmodel)OpenAICompatibleProviderβ async via AsyncOpenAI client, covers OpenAI, OpenRouter, Groq, Ollama, LMStudio viabase_url
Provider is factory-selected from config.json at request time β hot-swappable with no restart. Each provider uses tenacity for 3-attempt exponential backoff on transient failures.
Adding a new provider: subclass LLMProvider, implement chat() and provider_name(), register one line in factory.py.
Session end triggers two sequential LLM calls with separate system prompts:
- Summary call β strict JSON extraction:
{wins, blockers, goal_updates, open_threads, one_line_summary}. Separate prompt avoids the fragility of mixing structured extraction with prose generation. - Narrative call β first-person journal entry written in the user's voice.
The JSON result drives database writes (goal progress, thread creation). The narrative goes to disk. This separation keeps each call's task well-scoped and the outputs independently useful.
Before the first LLM call, build_morning_context() assembles:
- Active goals + unresolved threads (formatted as goal β nested threads)
- Today's schedule with free blocks (time-sorted, interleaved)
- Pre-computed issues:
- Conflicts β overlapping calendar events
- Overload β >5 events or <60min total free time
- Goal gaps β active goal with no corresponding calendar block
- Last 7 session summaries (oldest β newest, goal updates summarized)
Issues are structured data injected into the prompt β the LLM sees type: conflict, description: "1:1 with PM overlaps design review (2:00β3:00)", not a free-text blob. The opening message is generated from detected issues, so it's always specific.
Goal text edits use immutable versioning:
goals table: id | text | version | parent_id | active
update "improve sleep" β
row 1: active=False, version=1, parent_id=NULL
row 2: active=True, version=2, parent_id=1, text="sleep 8hrs by 11pm"
Progress changes mutate the active row in place (progress is not historical data). Full text history is preserved without a separate changelog table.
compute_free_blocks() in backend/calendar/parser.py:
- Filters to timed events only (skips all-day)
- Merges overlapping events (handles double-booked slots)
- Finds gaps in configurable window (default 8amβ10pm)
- Filters gaps shorter than 15 minutes
- Returns free blocks as
[{start, end, duration_min}]
Passed to morning coach to suggest high-leverage work blocks.
Client-side (React):
getUserMedia()βAudioContext+ScriptProcessorat system sample rateonaudioprocessappends float32 PCM chunks (4096 samples each)- On stop: merge chunks β quantize float32 β int16 (with clipping) β build RIFF WAV header β base64-encode
- POST to
/transcribe
Server-side:
- Decode base64 β write temp file
- Run
transcribe_file()in thread pool (1 worker, avoids semaphore leak) - Return
{text, language, latency_sec, audio_duration_sec}
STT model (Qwen3-ASR 0.6B via mlx-qwen3-asr) is a lazy-loaded singleton β cold start ~2s on first call, negligible thereafter.
Active session state (history, system prompt, context) lives in a dict in the FastAPI process β not the database. This keeps multi-turn state simple and avoids DB round-trips per message. On session end, the full transcript + summary is persisted. The pattern assumes one active session at a time, which matches the UX.
- FastAPI async handlers for I/O-bound paths
- Sync SDKs (google-genai, httplib2) run in
asyncio.run_in_executorthread pool - Google Calendar fetches:
ThreadPoolExecutor(max_workers=min(calendar_count, 8))β one thread per calendar - STT: dedicated
ThreadPoolExecutor(max_workers=1)β serialised to prevent model semaphore contention - AppleScript bridge: global
threading.Lock()β osascript subprocess is not thread-safe
| Layer | Technology |
|---|---|
| Desktop shell | Electron 35, Node 22 |
| Frontend | React 19, Vite 6 |
| Backend | FastAPI 0.135, Uvicorn 0.42, Python 3.12 |
| ORM / DB | SQLAlchemy 2.0, SQLite 3 (WAL mode) |
| Validation | Pydantic 2.12 |
| LLM (default) | Google Gemini via google-genai SDK |
| LLM (alt) | Any OpenAI-compatible endpoint |
| STT | mlx-qwen3-asr 0.3.2 (Qwen3 0.6B, MLX) |
| Calendar | Google Calendar API v3, AppleScript bridge |
| Retry logic | tenacity (exponential backoff) |
| Audio processing | numpy, soundfile |
lifepilot/
βββ backend/
β βββ main.py # FastAPI app, CORS, lifespan
β βββ config.py # config.json loader with hot-swap
β βββ llm/
β β βββ base.py # Abstract LLMProvider interface
β β βββ factory.py # Provider factory (config-driven)
β β βββ gemini.py # Google Gemini provider
β β βββ openai_compat.py # OpenAI-compatible provider
β βββ db/
β β βββ models.py # SQLAlchemy models: Goal, Session, Thread
β β βββ database.py # Engine, WAL mode, session factory
β β βββ goals.py # Goal CRUD + versioning + context builder
β β βββ sessions.py # Session persistence + history fetch
β β βββ threads.py # Thread CRUD + surface unresolved
β βββ coach/
β β βββ context.py # Context assemblers (morning + evening)
β β βββ morning.py # Morning session state machine + issue detection
β β βββ evening.py # Evening session state machine
β β βββ prompts.py # System prompts + injection templates
β βββ calendar/
β β βββ provider.py # Calendar backend dispatcher
β β βββ google_cal.py # Google Calendar API integration
β β βββ applescript.py # macOS Calendar.app bridge
β β βββ parser.py # Event parsing + free block computation
β β βββ actions.py # Calendar action schema + executor
β β βββ auth.py # One-time Google OAuth flow
β βββ routers/
β β βββ sessions.py # /session/* endpoints
β β βββ goals.py # /goals/* endpoints
β β βββ calendar.py # /calendar/* endpoints
β β βββ audio.py # /transcribe endpoint
β βββ stt/
β βββ qwen3_asr.py # Qwen3-ASR wrapper (lazy-load singleton)
β βββ schemas.py # STTResult schema
βββ electron/
β βββ main.js # Electron main process, tray, window
β βββ renderer/ # React + Vite frontend
β βββ src/
β βββ App.jsx
β βββ SessionOverlay.jsx
β βββ Goals.jsx
βββ notebooks/
β βββ 01_llm_abstraction.ipynb # Provider abstraction + structured output validation
β βββ 02_stt_qwen3_asr.ipynb # STT latency + accuracy experiments
β βββ 03_database_goals.ipynb # Goal versioning + query validation
βββ config.json # Runtime config (gitignored)
- macOS (Apple Silicon recommended for local STT)
- Python 3.12+
- Node 22+
- Conda or venv
conda create -n lifepilot python=3.12
conda activate lifepilot
pip install -r requirements.txtCopy the template and fill in your API key:
cp config.example.json config.json # or create manuallyconfig.json:
{
"provider": "gemini",
"api_key": "YOUR_GEMINI_API_KEY",
"model": "gemini-2.0-flash",
"morning_time": "09:00",
"evening_time": "21:00",
"journal_dir": "/path/to/journal",
"db_path": "/path/to/lifepilot.db",
"calendar_backend": "google"
}To use a different provider, set provider to openai, groq, ollama, or lmstudio and update api_key/base_url/model accordingly. No code changes required.
a. Create OAuth credentials
- Go to Google Cloud Console β create a project
- Enable the Google Calendar API
- Create credentials β OAuth 2.0 Client ID β Desktop app
- Download JSON β save as
google_client_secret.jsonin the project root
b. Run the one-time auth flow
python -m backend.calendar.authThis opens a browser window, asks for calendar access, and saves google_credentials.json to the project root. The backend auto-refreshes the token on expiry.
c. Set calendar backend in config
"calendar_backend": "google"For macOS Calendar.app without Google, set "calendar_backend": "applescript" β no credentials needed.
uvicorn backend.main:app --reload --port 8000cd electron
npm install
npm run dev # starts renderer dev server + ElectronOr build for production:
npm run build # Vite bundle
npm run electron # run built appThe notebooks/ directory contains the experimental validation work done before each component landed in the backend:
| Notebook | What it validates |
|---|---|
01_llm_abstraction.ipynb |
Provider abstraction, multi-turn context, retry logic, structured JSON extraction, end-to-end session simulation |
02_stt_qwen3_asr.ipynb |
Qwen3-ASR model loading, transcription latency, accuracy on voice input |
03_database_goals.ipynb |
Goal versioning (create/update/version chain), progress mutation, goal context builder for LLM |
Run with:
jupyter notebook notebooks/| Key | Default | Description |
|---|---|---|
provider |
gemini |
LLM backend: gemini, openai, groq, ollama, lmstudio, openrouter |
api_key |
"" |
API key for the selected provider |
model |
gemini-2.0-flash |
Model identifier |
base_url |
null |
Base URL for OpenAI-compatible providers (Ollama: http://localhost:11434/v1) |
morning_time |
09:00 |
Morning briefing time (display only, sessions triggered manually) |
evening_time |
21:00 |
Evening check-in time |
journal_dir |
./journal |
Directory for markdown journal entries |
db_path |
./lifepilot.db |
SQLite database path |
calendar_backend |
google |
google or applescript |
Config is re-read on every request. Swap providers or models at runtime with no restart.