recorder

Ambient audio recorder daemon for Linux workstations. Captures microphone and system audio via PipeWire, transcribes with a local whisper server, attributes speakers via Chrome DevTools Protocol, segments conversations, and produces structured summaries via a local LLM.

Architecture

                                    ┌────────────────┐
 parec (mic) ──┐                    │ whisper-server │
               ├─→ RMS gate ─→ WAV ─│ (Silero VAD +  │─→ LLM cleanup ────┐
 parec (sys) ──┘   (silence         │  ASR decode)   │                   │
                    detection)      └────────────────┘                   │
                                                                         ▼
 Chrome CDP ──→ SpeakerTimeline ←── Transcription Worker ──→ DailyTranscript
 (Meet/Teams)   (who spoke when)            │                (append-only)
           └──→ MeetingState ──────→ IncrementalSegmenter
                (tab changes)               │
                                            ▼
                                    LLM Summarization ──→ Segment Files

Goroutines: capture loop (main), transcription worker, speaker collector (CDP polling). Only the transcription worker writes to the transcript file.

Requirements

Linux with PipeWire (or PulseAudio)
parec (from pipewire-utils or pulseaudio-utils)
whisper-server with OpenAI-compatible /v1/audio/transcriptions endpoint
LLM server with OpenAI-compatible /v1/chat/completions endpoint
Chrome/Chromium with --remote-debugging-port (for speaker + meeting detection)
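
To check that a whisper-server is compatible, a request along these lines may help. It is a sketch that assumes the usual OpenAI-style contract (a multipart `file` field in, a JSON object with a `text` field out); the `transcribe` and `fakeWhisper` names are hypothetical, and the stand-in server exists only so the example runs without a real backend.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
	"time"
)

// transcribe posts WAV bytes as multipart form data and returns the "text" field.
func transcribe(client *http.Client, url string, wav []byte) (string, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	part, err := mw.CreateFormFile("file", "chunk.wav")
	if err != nil {
		return "", err
	}
	if _, err := part.Write(wav); err != nil {
		return "", err
	}
	if err := mw.WriteField("response_format", "json"); err != nil {
		return "", err
	}
	if err := mw.Close(); err != nil {
		return "", err
	}

	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", mw.FormDataContentType())
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		return "", fmt.Errorf("transcription failed: %s: %s", resp.Status, msg)
	}
	var out struct {
		Text string `json:"text"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Text, nil
}

// fakeWhisper is a stand-in server that reports how many bytes it received.
func fakeWhisper() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		f, _, err := r.FormFile("file")
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		defer f.Close()
		data, _ := io.ReadAll(f)
		json.NewEncoder(w).Encode(map[string]string{"text": fmt.Sprintf("%d bytes received", len(data))})
	}))
}

func main() {
	srv := fakeWhisper()
	defer srv.Close()
	client := &http.Client{Timeout: 60 * time.Second}
	text, err := transcribe(client, srv.URL+"/v1/audio/transcriptions", []byte("not a real WAV"))
	fmt.Println(text, err)
}
```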

Installation

go build -o ~/.local/bin/recorder .

Configuration

Config is loaded from $XDG_CONFIG_HOME/recorder/config.json (default: ~/.config/recorder/config.json). If the file does not exist, built-in defaults are used.

See config.example.json for all available fields.

| Section | Field | Default | Description |
| --- | --- | --- | --- |
| `whisper` | `url` | `http://localhost:8178/v1/...` | Whisper server transcription endpoint |
| `whisper` | `timeoutS` | `60` | HTTP timeout for transcription requests, in seconds |
| `llm` | `url` | `http://localhost:8179/v1/...` | LLM server chat completions endpoint |
| `llm` | `model` | `default` | Model name sent in API requests |
| `llm` | `timeoutS` | `180` | HTTP timeout for LLM requests, in seconds |
| `transcript` | `outputDir` | `~/.local/share/recorder/transcripts` | Directory for daily transcript files |
| `segments` | `outputDir` | `~/.local/share/recorder/segments` | Directory for segment summary files |
| `dedup` | `threshold` | `0.6` | Token overlap threshold for mic/sys deduplication |
| `signals` | `silenceThresholdS` | `180` | Silence duration, in seconds, that triggers a segment boundary |
| `signals` | `cdpPorts` | `[]` | Chrome DevTools Protocol ports to poll |
| `promptVars` | (see below) | built-in defaults | Template variables for LLM system prompts |
| `prompts` | `cleanup` | `""` (embedded) | Optional path to cleanup prompt template |
| `prompts` | `summarize` | `""` (embedded) | Optional path to summarize prompt template |
| `prompts` | `combine` | `""` (embedded) | Optional path to combine prompt template |
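
To give a feel for what `dedup.threshold` controls, here is a minimal sketch of a token overlap check. The README does not say which overlap measure the recorder uses; this example assumes the overlap coefficient (shared tokens divided by the size of the smaller token set), and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokens lowercases s and returns its set of letter/digit runs.
func tokens(s string) map[string]struct{} {
	set := make(map[string]struct{})
	fields := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, f := range fields {
		set[f] = struct{}{}
	}
	return set
}

// overlap returns |A ∩ B| / min(|A|, |B|), or 0 if either side is empty.
// The choice of measure is an assumption, not the recorder's documented one.
func overlap(a, b string) float64 {
	ta, tb := tokens(a), tokens(b)
	if len(ta) == 0 || len(tb) == 0 {
		return 0
	}
	if len(ta) > len(tb) {
		ta, tb = tb, ta // iterate over the smaller set
	}
	shared := 0
	for t := range ta {
		if _, ok := tb[t]; ok {
			shared++
		}
	}
	return float64(shared) / float64(len(ta))
}

// isDuplicate reports whether the mic text likely echoes the sys text.
func isDuplicate(mic, sys string, threshold float64) bool {
	return overlap(mic, sys) >= threshold
}

func main() {
	mic := "we should migrate the API"
	sys := "We should migrate the api today"
	fmt.Println(isDuplicate(mic, sys, 0.6)) // prints: true
}
```

A higher threshold keeps more near-duplicates; a lower one drops mic lines more aggressively when speakers leak into the microphone.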

LLM prompts

Three system prompts drive the pipeline: cleanup (transcript post-processing), summarize (segment summaries), and combine (map-reduce merge for long segments). Defaults are embedded in the binary (internal/config/prompts/) and rendered with Go text/template at startup.

promptVars — personalize prompts without forking the full template:

| Field | Description |
| --- | --- |
| `languages` | Languages spoken (e.g. `["Swedish", "English"]`) |
| `fillerWords` | Filler words to strip during cleanup |
| `owner.role` | Role framing for summarize intro (e.g. `"software engineer"`) |
| `owner.summaryFor` | Summary destination (e.g. `"a human inbox"`) |
| `includeInSummary` | Bullet list of what to capture in summaries |
| `titleMaxWords` | Max words in segment titles (default `8`) |
| `skipMaxGreetLines` | Skip threshold for pure greeting segments (default `3`) |
| `titleStopWords` | Stop words to avoid in titles |
| `summaryLabels` | Suggested bold labels in summaries (`Decided:`, `Insight:`, …) |

prompts — optional file paths that override the embedded templates. Override files are rendered as templates too, so promptVars still apply. If a configured file is missing, the embedded template is written to that path on first load.

Not configurable: mic/sys channel definitions, JSON output schema, and other recorder-specific semantics baked into the templates.

Example:

{
  "promptVars": {
    "languages": ["English"],
    "owner": { "role": "product manager", "summaryFor": "weekly notes" }
  },
  "prompts": {
    "summarize": "~/.config/recorder/prompts/summarize.md"
  }
}

Usage

recorder run                                # start the daemon
recorder note                               # interactive note (stdin)
recorder note "meeting started late"        # note via CLI argument
recorder segment <transcript>               # show segments (dry-run)
recorder segment <transcript> --write       # write segment files + transcript markers
recorder segment <transcript> --boundaries  # show boundaries only (no LLM)
recorder prompts                            # print resolved system prompts (debug)
recorder prompts cleanup                    # print one prompt
recorder prompts summarize combine          # print a subset

Inspect final rendered prompts after config changes:

recorder prompts summarize

Output

Daily Transcripts

Streaming, append-only markdown files in transcript.outputDir:

<output_dir>/YYYY-MM-DD-recorder.md

Each file has YAML frontmatter and timestamped event lines:

---
date: 2026-05-23
type: recorder-transcript
---

[15:04:32] 🔊 **sys** [Alice Smith] Let's migrate the API
[15:04:35] 🎤 **mic** We should start with the schema
[15:04:47] 🪟 **mtg** joined: Meet - API Planning
[15:05:01] 👥 **ppl** Alice Smith, Bob Johnson
[15:20:15] 💤 **idl** 15 min
[15:20:16] ✂️ **seg** | 1504 api-migration

Event tags: sys (system audio), mic (microphone), mtg (meeting change), ppl (participants), idl (silence), nfo (user note), pin (boundary hint), seg (segment boundary), rec (start/stop).
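
For tooling that consumes these files, the event lines can be parsed with a small regular expression. The sketch below is inferred from the sample lines above, not from the recorder source; the `Event` type and `parseLine` function are hypothetical, and it assumes a `[Name]` speaker prefix appears only on sys lines.

```go
package main

import (
	"fmt"
	"regexp"
)

// Event is a hypothetical parsed form of one transcript line.
type Event struct {
	Time    string // HH:MM:SS as written in the file
	Tag     string // sys, mic, mtg, ppl, idl, nfo, pin, seg, rec
	Speaker string // set only when a sys line carries a [Name] prefix
	Text    string
}

var (
	// [time] <emoji> **tag** rest-of-line
	lineRE    = regexp.MustCompile(`^\[(\d{2}:\d{2}:\d{2})\] \S+ \*\*([a-z]{3})\*\* (.*)$`)
	speakerRE = regexp.MustCompile(`^\[([^\]]+)\] (.*)$`)
)

// parseLine returns ok=false for frontmatter, blank lines, and anything else
// that is not a timestamped event.
func parseLine(line string) (Event, bool) {
	m := lineRE.FindStringSubmatch(line)
	if m == nil {
		return Event{}, false
	}
	ev := Event{Time: m[1], Tag: m[2], Text: m[3]}
	if ev.Tag == "sys" {
		if s := speakerRE.FindStringSubmatch(ev.Text); s != nil {
			ev.Speaker, ev.Text = s[1], s[2]
		}
	}
	return ev, true
}

func main() {
	ev, ok := parseLine("[15:04:32] 🔊 **sys** [Alice Smith] Let's migrate the API")
	fmt.Printf("%v %+v\n", ok, ev)
}
```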

Segment Summaries

Atomic markdown files in segments.outputDir:

<output_dir>/YYYY-MM-DD-HHMM-<slug>.md

Each file has YAML frontmatter with metadata and contains the LLM-generated summary followed by the full segment transcript:

---
title: "API Migration & Query Optimization"
date: 2026-05-23
time: "15:04–15:45"
duration: 41m
type: segment
source: "[[raw/transcripts/2026-05-23-recorder.md]]"
participants: ["Alice Smith", "Bob Johnson"]
---

Chrome DevTools Protocol

Speaker attribution and meeting detection require Chrome to be launched with remote debugging enabled. Start Chrome with a debugging port, then list that port in config.json under signals.cdpPorts:

google-chrome --remote-debugging-port=9222

The recorder polls all configured CDP ports, auto-detects meeting tabs (Google Meet, Microsoft Teams), identifies active speakers by observing CSS class changes on participant tiles, and detects meeting changes when tabs appear/disappear or titles change.
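
The tab-detection step can be approximated by asking Chrome's debugging HTTP endpoint for its target list and filtering by host. This is a sketch, not the recorder's implementation: the host list, the `meetingTabs` function, and the `fakeChrome` stand-in server are assumptions, and speaker detection (which needs a WebSocket session to the tab) is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// target holds the fields of interest from Chrome's /json/list response.
type target struct {
	ID    string `json:"id"`
	Type  string `json:"type"`
	Title string `json:"title"`
	URL   string `json:"url"`
}

// meetingHosts is an assumed list; real deployments may use other hostnames.
var meetingHosts = map[string]bool{
	"meet.google.com":     true,
	"teams.microsoft.com": true,
}

// meetingTabs lists page targets on one debugging port whose host looks like a meeting.
func meetingTabs(client *http.Client, base string) ([]target, error) {
	resp, err := client.Get(base + "/json/list")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var all []target
	if err := json.NewDecoder(resp.Body).Decode(&all); err != nil {
		return nil, err
	}
	var tabs []target
	for _, t := range all {
		if t.Type != "page" {
			continue
		}
		u, err := url.Parse(t.URL)
		if err != nil {
			continue
		}
		if meetingHosts[u.Hostname()] {
			tabs = append(tabs, t)
		}
	}
	return tabs, nil
}

// fakeChrome serves a canned target list so the sketch runs without a browser.
func fakeChrome() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/json/list", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode([]target{
			{ID: "1", Type: "page", Title: "Meet - API Planning", URL: "https://meet.google.com/abc-defg-hij"},
			{ID: "2", Type: "page", Title: "Inbox", URL: "https://mail.google.com/"},
			{ID: "3", Type: "service_worker", Title: "sw", URL: "https://meet.google.com/sw.js"},
		})
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := fakeChrome()
	defer srv.Close()
	tabs, err := meetingTabs(srv.Client(), srv.URL) // with real Chrome: http://localhost:9222
	fmt.Println(tabs, err)
}
```

Polling this list on an interval and comparing successive results is one way to notice tabs appearing, disappearing, or changing title.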
