athal7 · stephengolub · Jun 26, 2026
diff --git a/dot_config/opencode/commands/kb-enrich.md b/dot_config/opencode/commands/kb-enrich.md
@@ -7,44 +7,51 @@ Run the daily knowledge base enrichment. By default enrich every date since the
 
 $ARGUMENTS
 
-## Sources
+## Step 0 — Resolve configuration
 
-Check activity across all available sources:
+**KB_ROOT** — the knowledge base root directory. Resolve in this order:
+1. The env var `KB_ROOT` if set and non-empty
+2. Default: `~/.local/share/kb`
 
-- **opencode** coding sessions
-- **slack** chat messages and threads
-- **zoom** meeting transcripts — captions live at `~/Documents/Zoom/YYYY-MM-DD HH.MM.SS <Title>/meeting_saved_closed_caption.txt`. For each transcript whose dir-date falls in the enrich window, distill it with the local on-device model first (see Extract step) instead of reading the full raw caption text.
-- **linear** issues and comments
-- **gh** code reviews, PRs, and issues
-- **openspec durable store (AUTHORITATIVE for `/implement` work)** — each worktree's `openspec/` carries two narrow symlinks into a durable per-repo store at `~/.local/share/kb/openspec/<repo-slug>/` (`openspec/specs` → store `specs/`, `openspec/changes/archive` → store `changes/archive/`). At Ship, `openspec archive` moves a completed change through the `changes/archive` symlink into the store, so its artifacts persist regardless of when work shipped. For each date being enriched, read `~/.local/share/kb/openspec/*/changes/archive/<date>-*/design.md` for decisions, the "why", and rejected alternatives, and read each store's durable `specs/` for the standing requirements. These structured artifacts are the source of truth for the reasoning behind completed `/implement` work — use them instead of reconstructing it from full (token-expensive, lossy) session transcripts.
+All KB reads and writes in this run use `$KB_ROOT`. Every collector file also receives `$KB_ROOT` as context.
 
-### Session exclusion — the core token-saving dedup
+## Step 1 — Resolve the date range
 
-The openspec store is authoritative for `/implement` work, so the sessions that PRODUCED an archived change must be EXCLUDED from transcript reads. Build the exclusion set and skip those sessions:
+Enrich the gap since the last run, not just a single fixed day. The most recent `$KB_ROOT/journal/YYYY-MM-DD.md` is the last-run marker: enrich each date from (last journal date + 1) through today, inclusive. This makes a Monday run sweep the trailing weekend and lets a skipped run self-heal on the next run. If no prior journal exists, default to today. An explicit date or range in `$ARGUMENTS` overrides this.
 
-1. **Collect excluded worktrees.** For each date being enriched, read every `~/.local/share/kb/openspec/*/changes/archive/<date>-*/kb-meta.yaml` and collect its `worktree:` value (the absolute repo/worktree root, stamped at archive). That set is the exclusion list.
-2. **Skip those sessions.** When scanning **opencode** sessions, a session is identified by its `directory` column in the `session` table of `~/.local/share/opencode/opencode.db`. SKIP any session whose `directory` is in the exclusion set — for those, narrate from the change's `design.md`/specs, not the transcript. Only sessions NOT covered by an archived change get a transcript read.
-3. **Filter at query time.** Pass the collected worktrees as the `NOT IN (...)` list and bound by the date window (`time_updated` is epoch-ms):
+## Step 2 — Load collectors
 
-   ```sql
-   SELECT id, directory, title, time_updated
-   FROM session
-   WHERE time_updated BETWEEN :start_ms AND :end_ms
-     AND directory NOT IN ('/abs/worktree/a', '/abs/worktree/b');
-   -- returned sessions are the ONLY ones that need a transcript read;
-   -- excluded directories are covered by the durable change artifacts instead.
-   ```
+Collectors live in `~/.config/opencode/kb-collectors/`. Each file is a self-contained markdown recipe that describes one data source. Read all `*.md` files in that directory and sort them by the `priority` field in their YAML frontmatter (lower number = runs first). Read each file's body now so you can apply its query recipe and extraction rules during collection.
 
-**Benign failure modes** (neither loses correctness): a missed match (stale/absent `kb-meta.yaml`) just wastes one transcript read; an over-match (a session in an excluded worktree that wasn't really part of the change) just relies on the better, distilled artifact instead of the transcript.
+Some collectors perform a **runtime enabled check** at the start of their body (e.g. verifying a token or config value exists before proceeding). Honor those checks: if a collector's body says to skip, log the reason and move on.
 
-## Enrichment Steps
+> **To add a new data source:** drop a new `.md` file into `~/.config/opencode/kb-collectors/`. No changes to this orchestrator are needed. To disable a source, remove its collector file.
 
-1. **Extract** people facts, project updates, and decisions from each source.
-   - **Zoom transcripts:** for each in-window transcript, run `~/.config/opencode/bin/kb-distill <caption-file> "<title>" <date>` and use the returned JSON facts (participants, topics, decisions, action_items, open_questions, summary) in place of the raw caption text when extracting people facts, decisions, and action items. The raw transcript is sent only to the on-device local model (privacy positive); the authoritative do-not-store privacy filter below still applies at the WRITE step. **If `kb-distill` exits non-zero, read the raw transcript yourself instead and note the fallback in the journal.**
-2. **Journal** — write one cross-project rollup journal file per enriched date, each with diff stats. By construction each is THIN: feed it only from the NON-excluded sessions (those not covered by an archived change) plus git diff-stats. For `/implement` work, do NOT re-narrate the openspec change — reference the durable store artifacts (`design.md`/specs already in the kb via the symlink). The journal's role is the cross-project rollup + non-`/implement` activity, not a reconstruction of openspec work. Keep it; just don't duplicate the store.
-3. **Profiles** — merge new facts into knowledge-base people and project profiles
-4. **Decisions** — add any decisions to the decisions log. Pull key design decisions and rejected alternatives from the durable store's `~/.local/share/kb/openspec/*/changes/archive/<date>-*/design.md` (READ, don't copy — the artifacts are already in the kb). The decisions log is a distilled record anchored to its product/project, not a dump of the design files.
-5. **Action items** — extract action items from the enriched window's activity. Cross-reference within the same activity data — if the activity shows you already took the action (replied to the thread, reviewed the PR, closed the issue), skip the reminder. Only create reminders for items that were not resolved within the enriched window.
+## Step 3 — Session exclusion (cross-collector dedup)
+
+Before running the `opencode` collector, build the session exclusion set from the `openspec` collector output (priority 0, runs first):
+
+For each date being enriched, read `$KB_ROOT/openspec/*/changes/archive/<date>-*/kb-meta.yaml` and collect every `worktree:` value. This set is passed into the `opencode` collector as the `NOT IN (...)` list. Sessions in excluded worktrees are already covered by the durable OpenSpec change artifacts (`design.md`/specs); they do not need a transcript read.
+
+**Benign failure modes:** a missed match (stale/absent `kb-meta.yaml`) just wastes one transcript read. An over-match just relies on the better, distilled artifact. Neither loses correctness.
+
+## Step 4 — Run collectors
+
+For each date in the enrichment window, run each collector in priority order. Apply the collector's own query recipe, triage rules, and extraction rules exactly as written in its body. Carry the results forward into Step 5.
+
+## Step 5 — Enrichment
+
+### Journal
+Write one cross-project rollup journal file per enriched date at `$KB_ROOT/journal/YYYY-MM-DD.md`. By construction each is THIN: feed it only from the non-excluded sessions (not covered by an archived OpenSpec change) plus git diff-stats plus any meeting summaries from collectors. For `/implement` work, do NOT re-narrate the OpenSpec change — reference the durable store artifacts (`design.md`/specs already in the KB via the symlink). The journal's role is the cross-project rollup and non-`/implement` activity, not a reconstruction of OpenSpec work.
+
+### Profiles
+Merge new facts into `$KB_ROOT/people/` and `$KB_ROOT/projects/` profiles. Load the `knowledge-base` skill for the canonical profile shape and merge rules.
+
+### Decisions
+Add any decisions to the decisions log. Pull key design decisions and rejected alternatives from `$KB_ROOT/openspec/*/changes/archive/<date>-*/design.md` (READ, don't copy — the artifacts are already in the KB). Also extract any decisions surfaced by meeting or chat collectors. The decisions log is a distilled record anchored to its product/project, not a dump of design files or transcripts.
+
+### Action items
+Extract action items from the enriched window's activity. Cross-reference within the same activity data — if the activity shows the action was already taken (replied, reviewed, closed, appeared as done in a later meeting), skip the reminder. Only create reminders for items not resolved within the enriched window.
 
 ## Privacy
 
@@ -54,7 +61,3 @@ Do not extract or store:
 - Performance evaluations
 - Legal or attorney-client privileged content
 - Content from HR-related discussions
-
-## Date range
-
-Enrich the gap since the last run, not a hard single day. The most recent `~/.local/share/kb/journal/YYYY-MM-DD.md` is the last-run marker: enrich each date from (last journal date + 1) through today, inclusive. This makes a Monday run sweep the trailing weekend and lets a skipped run self-heal on the next run. If no prior journal exists, default to today. An explicit date or range in arguments overrides this.
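The configuration, last-run marker, and collector ordering described in Steps 0 to 2 of `kb-enrich.md` can be sketched in shell roughly as follows. This is a minimal illustration, not part of the command file: the function names are hypothetical, and treating a collector with no `priority` field as 999 (runs last) is an assumption the source does not specify.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of kb-enrich.md.

# Step 0: use $KB_ROOT when set and non-empty, else the documented default.
resolve_kb_root() {
  printf '%s\n' "${KB_ROOT:-$HOME/.local/share/kb}"
}

# Step 1: print the newest journal date (YYYY-MM-DD), or nothing if none exist.
# Globs expand in sorted order, and ISO dates sort chronologically.
last_journal_date() {
  local root=$1 f last=""
  for f in "$root"/journal/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].md; do
    [ -e "$f" ] || continue
    last=$f
  done
  [ -n "$last" ] || return 0
  last=${last##*/}
  printf '%s\n' "${last%.md}"
}

# Step 2: print collector files ordered by frontmatter priority (lowest first).
# Assumption: a file without a priority field sorts last (999).
collectors_by_priority() {
  local dir=$1 f prio
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue
    prio=$(awk '/^priority:/ { print $2; exit }' "$f")
    printf '%s\t%s\n' "${prio:-999}" "$f"
  done | sort -n -k1,1 | cut -f2-
}
```

An empty result from `last_journal_date` corresponds to the "no prior journal, default to today" case.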
diff --git a/dot_config/opencode/kb-collectors/gh.md b/dot_config/opencode/kb-collectors/gh.md
@@ -0,0 +1,43 @@
+---
+name: gh
+priority: 4
+authoritative_for: [shipped-code, reviews]
+description: GitHub PRs you authored or reviewed in the enrichment window
+# skip_bots: commit authors / PR actors to ignore
+skip_bots: [dependabot]
+---
+
+## Enabled check
+
+Read GitHub orgs from chezmoi data:
+
+```bash
+ORGS=$(chezmoi data --format json | jq -r '(.orgs // {}) | keys | join(" ")')
+```
+
+If `ORGS` is empty, search across all repos you have access to (no org filter). If non-empty, scope searches with an `org:NAME` filter per org.
+
+## How to query
+
+```bash
+# PRs you opened or updated (no --state filter, so open, closed, and merged are all returned)
+gh search prs --author "@me" --updated ">=YYYY-MM-DD" \
+  --json number,title,repository,state,updatedAt,body
+
+# PRs you were asked to review
+gh search prs --review-requested "@me" --updated ">=YYYY-MM-DD" \
+  --json number,title,repository,state,updatedAt
+```
+
+Add `org:NAME` to each query for every org in `ORGS` (run one query per org, or combine with multiple `org:` terms in a single search string).
+
+## What to extract
+
+- Merged PRs — what shipped
+- Review comments that surfaced decisions
+- Linked issues
+
+## What to skip
+
+- Draft PRs
+- Commits and PRs authored by bots listed in `skip_bots`
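The per-org scoping described in `gh.md` can be sketched as a dry-run helper that prints one search command per org, or a single unscoped command when `ORGS` is empty. The helper name `pr_search_commands` is hypothetical, and passing `org:NAME` as a positional query term follows the collector's own wording rather than anything verified here; check `gh search prs --help` before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: pr_search_commands is a hypothetical helper. It prints the
# gh invocations for a window instead of running them.

# Usage: pr_search_commands SINCE [ORG...]
# SINCE is the first date of the enrichment window (YYYY-MM-DD).
pr_search_commands() {
  local since=$1 org fields
  shift
  fields="number,title,repository,state,updatedAt,body"
  if [ "$#" -eq 0 ]; then
    # ORGS is empty: no org filter, search every repo you can access.
    printf 'gh search prs --author "@me" --updated ">=%s" --json %s\n' \
      "$since" "$fields"
    return 0
  fi
  for org in "$@"; do
    # One query per org, scoped with an org:NAME term.
    printf 'gh search prs org:%s --author "@me" --updated ">=%s" --json %s\n' \
      "$org" "$since" "$fields"
  done
}
```

Calling it as `pr_search_commands 2026-06-25 $ORGS` (unquoted on purpose, so the space-separated org list splits into arguments) yields the commands to review or pipe to a shell.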
diff --git a/dot_config/opencode/kb-collectors/linear.md b/dot_config/opencode/kb-collectors/linear.md
@@ -0,0 +1,50 @@
+---
+name: linear
+priority: 3
+authoritative_for: [tickets, completed-work]
+description: Linear issues you touched in the enrichment window
+---
+
+## Enabled check
+
+Load the `linear` skill. If no Linear API token is available (the skill cannot authenticate), skip this collector and log "linear: no auth, skipping".
+
+## How to query
+
+Use the Linear GraphQL API (endpoint `https://api.linear.app/graphql`) via the `linear` skill. Query issues assigned to or created by you that were updated within the enrichment window:
+
+```graphql
+{
+  issues(
+    filter: {
+      updatedAt: { gte: "YYYY-MM-DDT00:00:00Z" }
+      or: [
+        { assignee: { isMe: { eq: true } } }
+        { creator: { isMe: { eq: true } } }
+      ]
+    }
+    first: 50
+  ) {
+    nodes {
+      identifier
+      title
+      state { name }
+      updatedAt
+      description
+      url
+    }
+  }
+}
+```
+
+## What to extract
+
+- Newly created tickets
+- Status changes (especially to Done/Completed)
+- Decisions captured in descriptions or comments
+- Any ticket closed in the window (signals completed work not otherwise visible in git)
+
+## What to skip
+
+- Bot-generated or auto-updated tickets
+- Tickets you are only a watcher on with no direct activity
diff --git a/dot_config/opencode/kb-collectors/opencode.md b/dot_config/opencode/kb-collectors/opencode.md
@@ -0,0 +1,34 @@
+---
+name: opencode
+priority: 5
+authoritative_for: [coding-sessions]
+description: opencode coding sessions from the local session store
+---
+
+## How to query
+
+Query the session store SQLite database at `~/.local/share/opencode/opencode.db`.
+
+> **Note:** The session exclusion / dedup step is handled by the orchestrator before this collector runs. By the time this collector is called, it receives a list of directories to exclude (sessions covered by archived OpenSpec changes). Apply the `NOT IN (...)` clause below.
+
+```sql
+SELECT id, directory, title, time_updated
+FROM session
+WHERE time_updated BETWEEN :start_ms AND :end_ms
+  AND directory NOT IN ('/abs/worktree/a', '/abs/worktree/b');
+-- The excluded directories are passed in by the orchestrator.
+-- Only sessions NOT in the exclusion set get a transcript read.
+-- For excluded sessions, the orchestrator narrates from the change's design.md/specs instead.
+```
+
+`time_updated` is epoch-milliseconds.
+
+## What to extract
+
+- Work done in sessions not covered by an OpenSpec archived change
+- Project context, technical decisions made interactively, approaches tried
+
+## What to skip
+
+- Sessions whose `directory` is in the exclusion set (covered by the durable OpenSpec change artifacts — see the orchestrator's dedup step)
+- Sessions with no substantive content (e.g. very short duration, no meaningful tool calls)
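Filling in the `:start_ms`, `:end_ms`, and `NOT IN (...)` parts of the `opencode.md` query can be sketched as below. The helper names are hypothetical. Two assumptions are baked in: a "day" is taken as a UTC calendar day (the source does not say which timezone bounds the window), and years are non-negative. The date math uses integer arithmetic only, so it does not depend on GNU or BSD `date` flags.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the collector.

# Days since 1970-01-01 for a Gregorian date, using integer math only.
# Usage: days_from_civil YYYY MM DD   (assumes year >= 0)
days_from_civil() {
  local y=$1 m=${2#0} d=${3#0} era yoe doy doe
  if [ "$m" -le 2 ]; then y=$((y - 1)); fi
  era=$((y / 400))
  yoe=$((y - era * 400))
  if [ "$m" -gt 2 ]; then
    doy=$(( (153 * (m - 3) + 2) / 5 + d - 1 ))
  else
    doy=$(( (153 * (m + 9) + 2) / 5 + d - 1 ))
  fi
  doe=$((yoe * 365 + yoe / 4 - yoe / 100 + doy))
  printf '%s\n' $((era * 146097 + doe - 719468))
}

# Print "start_ms end_ms" for one UTC day given as YYYY-MM-DD.
day_bounds_ms() {
  local date=$1 y m d days start
  y=${date%%-*}
  d=${date##*-}
  m=${date#*-}
  m=${m%-*}
  days=$(days_from_civil "$y" "$m" "$d")
  start=$((days * 86400000))
  printf '%s %s\n' "$start" $((start + 86399999))
}

# Read worktree paths on stdin (one per line) and print a SQL literal list.
# Embedded single quotes are doubled, the SQL string-literal escape.
sql_in_list() {
  local line out="" q
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    q=$(printf '%s' "$line" | sed "s/'/''/g")
    out="${out:+$out, }'$q'"
  done
  printf '%s\n' "$out"
}
```

With no excluded worktrees the list is empty, which produces `NOT IN ()`. SQLite documents that an empty list is accepted there, though other SQL engines generally reject it.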
diff --git a/dot_config/opencode/kb-collectors/openspec.md b/dot_config/opencode/kb-collectors/openspec.md
@@ -0,0 +1,44 @@
+---
+name: openspec
+priority: 0
+authoritative_for: [implement-work, design-decisions, rejected-alternatives]
+description: OpenSpec durable store — authoritative source for /implement work; read BEFORE other collectors to build the session exclusion set
+---
+
+## Why priority 0
+
+This collector runs first. Its primary job is building the **session exclusion set** used by the `opencode` collector — a list of worktree paths whose sessions are already covered by an archived OpenSpec change and should not get a redundant (token-expensive) transcript read.
+
+## How to query
+
+For each date being enriched, read every `$KB_ROOT/openspec/*/changes/archive/<date>-*/kb-meta.yaml`. Collect the `worktree:` value from each. That set is the exclusion list passed to the `opencode` collector.
+
+```bash
+# Collect worktrees for a given date (set DATE=YYYY-MM-DD first)
+for meta in "$KB_ROOT"/openspec/*/changes/archive/"$DATE"-*/kb-meta.yaml; do
+  [ -e "$meta" ] && sed -n 's/^worktree:[[:space:]]*//p' "$meta"  # skip unmatched glob; keep paths with spaces
+done
+```
+
+Then read each archived change's `design.md`:
+
+```
+$KB_ROOT/openspec/*/changes/archive/<date>-*/design.md
+```
+
+READ these (do not copy them — the artifacts are already in the KB via the symlink). Extract decisions, the "why", and rejected alternatives for the decisions log.
+
+Also read the durable `specs/` for standing requirements:
+
+```
+$KB_ROOT/openspec/*/specs/
+```
+
+## What to extract
+
+- Decisions, rationale, and rejected alternatives from `design.md` files
+- The set of worktree paths (→ exclusion list for the `opencode` collector)
+
+## What to skip
+
+- Re-narrating or duplicating the full design content in the journal — reference the durable store artifacts instead
diff --git a/dot_config/opencode/kb-collectors/slack.md b/dot_config/opencode/kb-collectors/slack.md
@@ -0,0 +1,43 @@
+---
+name: slack
+priority: 2
+authoritative_for: [informal-decisions, action-items, contact-info]
+description: Slack messages — your sent messages and mentions, in the enrichment window
+---
+
+## How to query
+
+Token is in `~/.config/team-context-mcp/.env` as `SLACK_USER_TOKEN`. Set `SLACK_USER_ID` to your Slack user ID (find it in your Slack profile → "Copy member ID").
+
+```bash
+SLACK_TOKEN=$(grep SLACK_USER_TOKEN ~/.config/team-context-mcp/.env | cut -d= -f2-)
+SLACK_USER_ID="<your-slack-user-id>"  # e.g. U01ABC23DEF
+
+# Your messages in the date window
+curl -s "https://slack.com/api/search.messages?query=from:me+after:YYYY-MM-DD+before:YYYY-MM-DD&count=20&sort=timestamp" \
+  -H "Authorization: Bearer $SLACK_TOKEN"
+
+# Mentions of you in the date window
+curl -s "https://slack.com/api/search.messages?query=%3C${SLACK_USER_ID}%3E+after:YYYY-MM-DD+before:YYYY-MM-DD&count=20&sort=timestamp" \
+  -H "Authorization: Bearer $SLACK_TOKEN"
+
+# Read a thread (get replies)
+curl -s "https://slack.com/api/conversations.replies?channel=CHANNEL_ID&ts=THREAD_TS" \
+  -H "Authorization: Bearer $SLACK_TOKEN"
+```
+
+Slack is high-volume; read selectively to keep token cost manageable.
+
+## What to extract
+
+- Decisions announced or confirmed in Slack that didn't appear in a Granola meeting
+- Action items assigned to you or by you that aren't already in Linear
+- New contact info (Slack handles, email addresses) for people profiles
+- Customer or partner names that surfaced in conversation
+
+## What to skip
+
+- Routine standup threads already covered by Granola
+- Emoji reactions and short acknowledgments ("👍", "sounds good")
+- HR, compensation, or personal channels (privacy)
+- Anything already captured from a Granola meeting for the same day