Ambient audio recorder daemon for Linux workstations. Captures microphone and system audio via PipeWire, transcribes with a local whisper server, attributes speakers via Chrome DevTools Protocol, segments conversations, and produces structured summaries via a local LLM.
                                   ┌────────────────┐
parec (mic) ──┐                    │ whisper-server │
              ├─→ RMS gate ─→ WAV ─│ (Silero VAD +  │─→ LLM cleanup ────┐
parec (sys) ──┘   (silence         │  ASR decode)   │                   │
                  detection)       └────────────────┘                   │
                                                                        ▼
Chrome CDP ──→ SpeakerTimeline ←── Transcription Worker ──→ DailyTranscript
(Meet/Teams)   (who spoke when)         │                   (append-only)
                                        └──→ MeetingState ──────→ IncrementalSegmenter
                                             (tab changes)                 │
                                                                           ▼
                                                                   LLM Summarization ──→ Segment Files

Goroutines: capture loop (main), transcription worker, speaker collector (CDP polling). Only the transcription worker writes to the transcript file.
- Linux with PipeWire (or PulseAudio)
- parec (from pipewire-utils or pulseaudio-utils)
- whisper-server with OpenAI-compatible /v1/audio/transcriptions endpoint
- LLM server with OpenAI-compatible /v1/chat/completions endpoint
- Chrome/Chromium with --remote-debugging-port (for speaker + meeting detection)

go build -o ~/.local/bin/recorder .

Config is loaded from $XDG_CONFIG_HOME/recorder/config.json (default:
~/.config/recorder/config.json). If the file does not exist, built-in
defaults are used.
See config.example.json for all available fields.
| Section | Field | Default | Description |
|---|---|---|---|
| whisper | url | http://localhost:8178/v1/... | Whisper server transcription endpoint |
| whisper | timeoutS | 60 | HTTP timeout for transcription requests |
| llm | url | http://localhost:8179/v1/... | LLM server chat completions endpoint |
| llm | model | default | Model name sent in API requests |
| llm | timeoutS | 180 | HTTP timeout for LLM requests |
| transcript | outputDir | ~/.local/share/recorder/transcripts | Directory for daily transcript files |
| segments | outputDir | ~/.local/share/recorder/segments | Directory for segment summary files |
| dedup | threshold | 0.6 | Token overlap threshold for mic/sys dedup |
| signals | silenceThresholdS | 180 | Silence duration before segment boundary |
| signals | cdpPorts | [] | Chrome DevTools Protocol ports to poll |
| promptVars | (see below) | built-in defaults | Template variables for LLM system prompts |
| prompts | cleanup | "" (embedded) | Optional path to cleanup prompt template |
| prompts | summarize | "" (embedded) | Optional path to summarize prompt template |
| prompts | combine | "" (embedded) | Optional path to combine prompt template |

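
To make the dedup threshold concrete, here is a rough sketch of a token overlap check between a mic line and a sys line. The exact metric the recorder uses is not documented here, so this assumes Jaccard overlap on lowercased word sets; tokenSet and tokenOverlap are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenSet lowercases the text, splits on whitespace, and trims common
// punctuation from each token.
func tokenSet(s string) map[string]bool {
	set := make(map[string]bool)
	for _, tok := range strings.Fields(strings.ToLower(s)) {
		tok = strings.Trim(tok, ".,!?;:\"'")
		if tok != "" {
			set[tok] = true
		}
	}
	return set
}

// tokenOverlap returns |A ∩ B| / |A ∪ B| for the two token sets
// (Jaccard overlap, an assumed metric).
func tokenOverlap(a, b string) float64 {
	setA, setB := tokenSet(a), tokenSet(b)
	if len(setA) == 0 || len(setB) == 0 {
		return 0
	}
	shared := 0
	for tok := range setA {
		if setB[tok] {
			shared++
		}
	}
	union := len(setA) + len(setB) - shared
	return float64(shared) / float64(union)
}

func main() {
	const threshold = 0.6 // default dedup.threshold

	mic := "Let's migrate the API"
	sys := "let's migrate the api today"
	score := tokenOverlap(mic, sys)
	fmt.Printf("overlap=%.2f duplicate=%v\n", score, score >= threshold)
}
```

With the default of 0.6, the two lines above share four of five distinct tokens and would be treated as the same utterance captured on both channels.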
Three system prompts drive the pipeline: cleanup (transcript post-processing),
summarize (segment summaries), and combine (map-reduce merge for long
segments). Defaults are embedded in the binary
(internal/config/prompts/) and rendered with Go
text/template at startup.
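
As a rough illustration of the rendering step, the sketch below executes a Go text/template against a small set of variables. The template text, the PromptVars struct, and the join helper are all hypothetical; the embedded templates in internal/config/prompts/ may be structured quite differently.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// PromptVars holds two of the documented promptVars fields (hypothetical struct).
type PromptVars struct {
	Languages     []string
	TitleMaxWords int
}

// promptText is an invented template, used only for this example.
const promptText = `Transcripts are in {{join .Languages " and "}}. ` +
	`Titles use at most {{.TitleMaxWords}} words.`

// render parses the template text and executes it with the given variables.
func render(text string, vars PromptVars) (string, error) {
	funcs := template.FuncMap{"join": strings.Join}
	tmpl, err := template.New("prompt").Funcs(funcs).Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(promptText, PromptVars{
		Languages:     []string{"Swedish", "English"},
		TitleMaxWords: 8,
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```

Rendering once at startup means a template error surfaces immediately instead of during the first LLM call.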
promptVars — personalize prompts without forking the full template:
| Field | Description |
|---|---|
| languages | Languages spoken (e.g. ["Swedish", "English"]) |
| fillerWords | Filler words to strip during cleanup |
| owner.role | Role framing for summarize intro (e.g. "software engineer") |
| owner.summaryFor | Summary destination (e.g. "a human inbox") |
| includeInSummary | Bullet list of what to capture in summaries |
| titleMaxWords | Max words in segment titles (default 8) |
| skipMaxGreetLines | Skip threshold for pure greeting segments (default 3) |
| titleStopWords | Stop words to avoid in titles |
| summaryLabels | Suggested bold labels in summaries (Decided:, Insight:, …) |

prompts — optional file paths to override the embedded templates. Paths
are templates too (vars still apply). If a configured file is missing, the
embedded template is seeded to that path on first load.
Not configurable: mic/sys channel definitions, JSON output schema, and
other recorder-specific semantics baked into the templates.
Example:
{
  "promptVars": {
    "languages": ["English"],
    "owner": { "role": "product manager", "summaryFor": "weekly notes" }
  },
  "prompts": {
    "summarize": "~/.config/recorder/prompts/summarize.md"
  }
}

recorder run # start the daemon
recorder note # interactive note (stdin)
recorder note "meeting started late" # note via CLI argument
recorder segment <transcript> # show segments (dry-run)
recorder segment <transcript> --write # write segment files + transcript markers
recorder segment <transcript> --boundaries # show boundaries only (no LLM)
recorder prompts # print resolved system prompts (debug)
recorder prompts cleanup # print one prompt
recorder prompts summarize combine # print a subset

Inspect final rendered prompts after config changes:

recorder prompts summarize

Streaming, append-only markdown files in transcript.outputDir:
<output_dir>/YYYY-MM-DD-recorder.md
Each file has YAML frontmatter and timestamped event lines:
---
date: 2026-05-23
type: recorder-transcript
---
[15:04:32] 🔊 **sys** [Alice Smith] Let's migrate the API
[15:04:35] 🎤 **mic** We should start with the schema
[15:04:47] 🪟 **mtg** joined: Meet - API Planning
[15:05:01] 👥 **ppl** Alice Smith, Bob Johnson
[15:20:15] 💤 **idl** 15 min
[15:20:16] ✂️ **seg** | 1504 api-migration

Event tags: sys (system audio), mic (microphone), mtg (meeting
change), ppl (participants), idl (silence), nfo (user note), pin
(boundary hint), seg (segment boundary), rec (start/stop).
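
For tooling that reads these files, an event line can be split into its parts with a regular expression. The pattern below is inferred from the sample lines above and is an assumption, not a format specification; Event and parseEvent are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// Event is one parsed transcript line (hypothetical struct).
type Event struct {
	Time    string // HH:MM:SS
	Tag     string // sys, mic, mtg, ppl, idl, nfo, pin, seg, rec
	Speaker string // empty when the line has no [Speaker] prefix
	Text    string
}

// eventRe matches "[HH:MM:SS] <emoji> **tag** [Speaker] text",
// where the "[Speaker] " part is optional.
var eventRe = regexp.MustCompile(
	`^\[(\d{2}:\d{2}:\d{2})\] \S+ \*\*(\w{3})\*\* (?:\[([^\]]+)\] )?(.*)$`)

// parseEvent returns the parsed event and false if the line does not match.
func parseEvent(line string) (Event, bool) {
	m := eventRe.FindStringSubmatch(line)
	if m == nil {
		return Event{}, false
	}
	return Event{Time: m[1], Tag: m[2], Speaker: m[3], Text: m[4]}, true
}

func main() {
	lines := []string{
		"[15:04:32] 🔊 **sys** [Alice Smith] Let's migrate the API",
		"[15:04:35] 🎤 **mic** We should start with the schema",
		"---",
	}
	for _, line := range lines {
		if ev, ok := parseEvent(line); ok {
			fmt.Printf("%s %s %q %q\n", ev.Time, ev.Tag, ev.Speaker, ev.Text)
		}
	}
}
```

Frontmatter and blank lines simply fail to match, so a reader can skip them without special handling.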
Atomic markdown files in segments.outputDir:
<output_dir>/YYYY-MM-DD-HHMM-<slug>.md
Each file has YAML frontmatter with metadata and contains the LLM-generated summary followed by the full segment transcript:
---
title: "API Migration & Query Optimization"
date: 2026-05-23
time: "15:04–15:45"
duration: 41m
type: segment
source: "[[raw/transcripts/2026-05-23-recorder.md]]"
participants: ["Alice Smith", "Bob Johnson"]
---

Speaker attribution and meeting detection require Chrome launched with remote
debugging. Configure the ports in config.json under signals.cdpPorts:

google-chrome --remote-debugging-port=9222

The recorder polls all configured CDP ports, auto-detects meeting tabs (Google Meet, Microsoft Teams), identifies active speakers by observing CSS class changes on participant tiles, and detects meeting changes when tabs appear/disappear or titles change.
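
The tab discovery step can be sketched against Chrome's HTTP endpoint /json/list, which returns the open targets for a debugging port. The host names used to recognize meeting tabs are assumptions for illustration, and listTargets and isMeetingTab are hypothetical names; speaker detection over the WebSocket protocol is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// Target is the subset of fields used from each /json/list entry.
type Target struct {
	ID    string `json:"id"`
	Type  string `json:"type"`
	Title string `json:"title"`
	URL   string `json:"url"`
}

// listTargets fetches the open targets from one remote debugging port.
func listTargets(client *http.Client, port int) ([]Target, error) {
	resp, err := client.Get(fmt.Sprintf("http://127.0.0.1:%d/json/list", port))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var targets []Target
	if err := json.NewDecoder(resp.Body).Decode(&targets); err != nil {
		return nil, err
	}
	return targets, nil
}

// isMeetingTab reports whether a target is a page on a known meeting host.
// The host list is an assumption, not the recorder's actual matching rule.
func isMeetingTab(t Target) bool {
	if t.Type != "page" {
		return false
	}
	u, err := url.Parse(t.URL)
	if err != nil {
		return false
	}
	switch u.Hostname() {
	case "meet.google.com", "teams.microsoft.com":
		return true
	}
	return false
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	targets, err := listTargets(client, 9222)
	if err != nil {
		fmt.Println("CDP not reachable:", err)
		return
	}
	for _, t := range targets {
		if isMeetingTab(t) {
			fmt.Printf("meeting tab: %s (%s)\n", t.Title, t.URL)
		}
	}
}
```

Polling this list on an interval and diffing the results is one way to notice tabs appearing, disappearing, or changing title.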