From 5cf8643e922206e6ffbfbd995e56b5384db93df7 Mon Sep 17 00:00:00 2001
From: Stephen Golub
Date: Fri, 26 Jun 2026 12:25:04 -0500
Subject: [PATCH 1/2] feat(kb-enrich): pluggable collector architecture
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Refactors /kb-enrich into a two-layer design: a thin orchestrator command
plus a directory of self-contained collector files, one per data source.

Changes:
- commands/kb-enrich.md: rewritten as a 5-step orchestrator (resolve
  config, resolve date range, load collectors, run collectors,
  enrichment). Sources are no longer hard-coded; the orchestrator reads
  whatever collectors are present in kb-collectors/.
- kb-collectors/ (new): one .md file per source, each with YAML
  frontmatter (name, enabled, priority, authoritative_for) and a body
  with query recipe, triage rules, and extraction rules.
  - openspec.md (priority 0): builds session exclusion set first
  - granola.md (priority 1): meetings, decisions, people-contact
  - slack.md (priority 2): informal decisions, action items
  - linear.md (priority 3): tickets, completed work
  - gh.md (priority 4): shipped PRs, reviews
  - opencode.md (priority 5): coding sessions (with exclusion applied)
  - google-chat.md (priority 6, disabled by default): opt-in for Chat

To add a new source: drop a .md file into kb-collectors/. No changes to
the orchestrator needed. To disable a source or configure per-machine
values (org lists, workspace slugs, token paths), edit the collector's
frontmatter.
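
For illustration only: the orchestrator is a prompt-driven command, not a script, but the loading behaviour described above (read each collector's frontmatter, drop disabled ones, sort by priority) can be sketched in Python. The helper names `parse_frontmatter` and `load_collectors`, the scalar-only parser, and the fallback priority of 99 are assumptions made for this example, not part of the patch.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict:
    """Parse scalar `key: value` pairs between the leading `---` fences.

    Hypothetical minimal parser: comment lines are skipped and list
    values such as `[a, b]` are kept as raw strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        if line.lstrip().startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def load_collectors(directory: Path) -> list[dict]:
    """Return enabled collectors sorted by ascending priority."""
    collectors = []
    for path in sorted(directory.glob("*.md")):
        meta = parse_frontmatter(path.read_text(encoding="utf-8"))
        if meta.get("enabled", "true").lower() == "false":
            continue  # disabled in frontmatter
        meta["path"] = path
        # Assumption: a collector without a priority runs last.
        meta["priority"] = int(meta.get("priority", "99"))
        collectors.append(meta)
    return sorted(collectors, key=lambda c: c["priority"])
```

Calling `load_collectors(Path.home() / ".config/opencode/kb-collectors")` would return the collectors in run order, lowest priority number first.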
--- dot_config/opencode/commands/kb-enrich.md | 78 +++++++++++-------- dot_config/opencode/kb-collectors/gh.md | 40 ++++++++++ .../opencode/kb-collectors/google-chat.md | 38 +++++++++ dot_config/opencode/kb-collectors/granola.md | 31 ++++++++ dot_config/opencode/kb-collectors/linear.md | 34 ++++++++ dot_config/opencode/kb-collectors/opencode.md | 35 +++++++++ dot_config/opencode/kb-collectors/openspec.md | 45 +++++++++++ dot_config/opencode/kb-collectors/slack.md | 44 +++++++++++ 8 files changed, 312 insertions(+), 33 deletions(-) create mode 100644 dot_config/opencode/kb-collectors/gh.md create mode 100644 dot_config/opencode/kb-collectors/google-chat.md create mode 100644 dot_config/opencode/kb-collectors/granola.md create mode 100644 dot_config/opencode/kb-collectors/linear.md create mode 100644 dot_config/opencode/kb-collectors/opencode.md create mode 100644 dot_config/opencode/kb-collectors/openspec.md create mode 100644 dot_config/opencode/kb-collectors/slack.md diff --git a/dot_config/opencode/commands/kb-enrich.md b/dot_config/opencode/commands/kb-enrich.md index 9d8aca8a..57d82730 100644 --- a/dot_config/opencode/commands/kb-enrich.md +++ b/dot_config/opencode/commands/kb-enrich.md @@ -7,44 +7,59 @@ Run the daily knowledge base enrichment. By default enrich every date since the $ARGUMENTS -## Sources +## Step 0 — Resolve configuration -Check activity across all available sources: +**KB_ROOT** — the knowledge base root directory. Resolve in this order: +1. The env var `KB_ROOT` if set and non-empty +2. Default: `~/.local/share/kb` -- **opencode** coding sessions -- **slack** chat messages and threads -- **zoom** meeting transcripts — captions live at `~/Documents/Zoom/YYYY-MM-DD HH.MM.SS /meeting_saved_closed_caption.txt`. For each transcript whose dir-date falls in the enrich window, distill it with the local on-device model first (see Extract step) instead of reading the full raw caption text. 
-- **linear** issues and comments -- **gh** code reviews, PRs, and issues -- **openspec durable store (AUTHORITATIVE for `/implement` work)** — each worktree's `openspec/` carries two narrow symlinks into a durable per-repo store at `~/.local/share/kb/openspec/<repo-slug>/` (`openspec/specs` → store `specs/`, `openspec/changes/archive` → store `changes/archive/`). At Ship, `openspec archive` moves a completed change through the `changes/archive` symlink into the store, so its artifacts persist regardless of when work shipped. For each date being enriched, read `~/.local/share/kb/openspec/*/changes/archive/<date>-*/design.md` for decisions, the "why", and rejected alternatives, and read each store's durable `specs/` for the standing requirements. These structured artifacts are the source of truth for the reasoning behind completed `/implement` work — use them instead of reconstructing it from full (token-expensive, lossy) session transcripts. +All KB reads and writes in this run use `$KB_ROOT`. Every collector file also receives `$KB_ROOT` as context. -### Session exclusion — the core token-saving dedup +**Re-run guard** — if today's journal file already exists at `$KB_ROOT/journal/YYYY-MM-DD.md` and `--force` was not passed in arguments, skip that date silently (log "already enriched: YYYY-MM-DD"). This prevents duplicate work from concurrent runs. -The openspec store is authoritative for `/implement` work, so the sessions that PRODUCED an archived change must be EXCLUDED from transcript reads. Build the exclusion set and skip those sessions: +## Step 1 — Resolve the date range -1. **Collect excluded worktrees.** For each date being enriched, read every `~/.local/share/kb/openspec/*/changes/archive/<date>-*/kb-meta.yaml` and collect its `worktree:` value (the absolute repo/worktree root, stamped at archive). That set is the exclusion list. -2. 
**Skip those sessions.** When scanning **opencode** sessions, a session is identified by its `directory` column in the `session` table of `~/.local/share/opencode/opencode.db`. SKIP any session whose `directory` is in the exclusion set — for those, narrate from the change's `design.md`/specs, not the transcript. Only sessions NOT covered by an archived change get a transcript read. -3. **Filter at query time.** Pass the collected worktrees as the `NOT IN (...)` list and bound by the date window (`time_updated` is epoch-ms): +Enrich the gap since the last run, not a hard single day. The most recent `$KB_ROOT/journal/YYYY-MM-DD.md` is the last-run marker: enrich each date from (last journal date + 1) through today, inclusive. This makes a Monday run sweep the trailing weekend and lets a skipped run self-heal on the next run. If no prior journal exists, default to today. An explicit date or range in `$ARGUMENTS` overrides this. - ```sql - SELECT id, directory, title, time_updated - FROM session - WHERE time_updated BETWEEN :start_ms AND :end_ms - AND directory NOT IN ('/abs/worktree/a', '/abs/worktree/b'); - -- returned sessions are the ONLY ones that need a transcript read; - -- excluded directories are covered by the durable change artifacts instead. - ``` +## Step 2 — Load collectors -**Benign failure modes** (neither loses correctness): a missed match (stale/absent `kb-meta.yaml`) just wastes one transcript read; an over-match (a session in an excluded worktree that wasn't really part of the change) just relies on the better, distilled artifact instead of the transcript. +Collectors live in `~/.config/opencode/kb-collectors/`. Each file is a self-contained markdown recipe that describes one data source. Read all `*.md` files in that directory. For each file: -## Enrichment Steps +1. Parse the YAML frontmatter to get `name`, `enabled`, `priority`, `authoritative_for`. +2. Skip any collector with `enabled: false`. +3. 
Sort remaining collectors by `priority` ascending (lower number = runs first). -1. **Extract** people facts, project updates, and decisions from each source. - - **Zoom transcripts:** for each in-window transcript, run `~/.config/opencode/bin/kb-distill <caption-file> "<title>" <date>` and use the returned JSON facts (participants, topics, decisions, action_items, open_questions, summary) in place of the raw caption text when extracting people facts, decisions, and action items. The raw transcript is sent only to the on-device local model (privacy positive); the authoritative do-not-store privacy filter below still applies at the WRITE step. **If `kb-distill` exits non-zero, read the raw transcript yourself instead and note the fallback in the journal.** -2. **Journal** — write one cross-project rollup journal file per enriched date, each with diff stats. By construction each is THIN: feed it only from the NON-excluded sessions (those not covered by an archived change) plus git diff-stats. For `/implement` work, do NOT re-narrate the openspec change — reference the durable store artifacts (`design.md`/specs already in the kb via the symlink). The journal's role is the cross-project rollup + non-`/implement` activity, not a reconstruction of openspec work. Keep it; just don't duplicate the store. -3. **Profiles** — merge new facts into knowledge-base people and project profiles -4. **Decisions** — add any decisions to the decisions log. Pull key design decisions and rejected alternatives from the durable store's `~/.local/share/kb/openspec/*/changes/archive/<date>-*/design.md` (READ, don't copy — the artifacts are already in the kb). The decisions log is a distilled record anchored to its product/project, not a dump of the design files. -5. **Action items** — extract action items from the enriched window's activity. 
Cross-reference within the same activity data — if the activity shows you already took the action (replied to the thread, reviewed the PR, closed the issue), skip the reminder. Only create reminders for items that were not resolved within the enriched window. +The full set of collector instructions — how to query the source, triage rules, what to extract, and what to skip — is in each collector's body. Read the body now so you can apply it during collection. + +Some collectors perform a **runtime enabled check** at the start of their body (e.g. verifying a token file exists before proceeding). Honor those checks: if a collector's body says to skip, log the reason and move on. + +> **To add a new data source:** drop a new `.md` file into `~/.config/opencode/kb-collectors/`. No changes to this orchestrator needed. To disable a source temporarily, set `enabled: false` in its frontmatter. To configure per-machine values (token paths, org lists, workspaces), edit the relevant frontmatter field directly in the collector file. + +## Step 3 — Session exclusion (cross-collector dedup) + +Before running the `opencode` collector, build the session exclusion set from the `openspec` collector output (priority 0, runs first): + +For each date being enriched, read `$KB_ROOT/openspec/*/changes/archive/<date>-*/kb-meta.yaml` and collect every `worktree:` value. This set is passed into the `opencode` collector as the `NOT IN (...)` list. Sessions in excluded worktrees are already covered by the durable OpenSpec change artifacts (`design.md`/specs); they do not need a transcript read. + +**Benign failure modes:** a missed match (stale/absent `kb-meta.yaml`) just wastes one transcript read. An over-match just relies on the better, distilled artifact. Neither loses correctness. + +## Step 4 — Run collectors + +For each date in the enrichment window, run each enabled collector in priority order. 
Apply the collector's own query recipe, triage rules, and extraction rules exactly as written in its body. Carry the results forward into Step 5. + +## Step 5 — Enrichment + +### Journal +Write one cross-project rollup journal file per enriched date at `$KB_ROOT/journal/YYYY-MM-DD.md`. By construction each is THIN: feed it only from the non-excluded sessions (not covered by an archived OpenSpec change) plus git diff-stats plus Granola meeting summaries. For `/implement` work, do NOT re-narrate the OpenSpec change — reference the durable store artifacts (`design.md`/specs already in the KB via the symlink). The journal's role is the cross-project rollup and non-`/implement` activity, not a reconstruction of OpenSpec work. + +### Profiles +Merge new facts into `$KB_ROOT/people/` and `$KB_ROOT/projects/` profiles. Load the `knowledge-base` skill for the canonical profile shape and merge rules. Granola is especially good for contact info (emails appear in participant lists) and role/team data — update `email:` frontmatter on person profiles whenever a new address is seen. + +### Decisions +Add any decisions to the decisions log. Pull key design decisions and rejected alternatives from `$KB_ROOT/openspec/*/changes/archive/<date>-*/design.md` (READ, don't copy — the artifacts are already in the KB). Also extract decisions surfaced in Granola meeting notes. The decisions log is a distilled record anchored to its product/project, not a dump of design files or transcripts. + +### Action items +Extract action items from the enriched window's activity. Cross-reference within the same activity data — if the activity shows the action was already taken (replied, reviewed, closed, appeared as done in a later meeting), skip the reminder. Only create reminders for items not resolved within the enriched window. 
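
As a reading aid, the gap-filling rule from Step 1 (enrich from the day after the latest journal through today) can be sketched in Python. This is an illustrative sketch, not part of the command; `resolve_dates` is a hypothetical helper, and it assumes journal files are named `YYYY-MM-DD.md` directly under `$KB_ROOT/journal/`.

```python
from datetime import date, timedelta
from pathlib import Path


def resolve_dates(journal_dir: Path, today: date) -> list[date]:
    """Dates to enrich: (last journal date + 1) through today, inclusive."""
    journal_dates = []
    for path in journal_dir.glob("*.md"):
        try:
            journal_dates.append(date.fromisoformat(path.stem))
        except ValueError:
            continue  # ignore files not named YYYY-MM-DD.md
    if not journal_dates:
        return [today]  # no prior journal: default to today
    start = max(journal_dates) + timedelta(days=1)
    # Empty when the latest journal is already dated today.
    return [start + timedelta(days=i) for i in range((today - start).days + 1)]
```

With the latest journal dated the preceding Friday, a Monday run yields Saturday, Sunday, and Monday, which is the weekend sweep the step describes. An explicit date or range in `$ARGUMENTS` would bypass this helper entirely.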
## Privacy @@ -54,7 +69,4 @@ Do not extract or store: - Performance evaluations - Legal or attorney-client privileged content - Content from HR-related discussions - -## Date range - -Enrich the gap since the last run, not a hard single day. The most recent `~/.local/share/kb/journal/YYYY-MM-DD.md` is the last-run marker: enrich each date from (last journal date + 1) through today, inclusive. This makes a Monday run sweep the trailing weekend and lets a skipped run self-heal on the next run. If no prior journal exists, default to today. An explicit date or range in arguments overrides this. +- Content from meetings titled or tagged as confidential (e.g. "Confidential: ...") diff --git a/dot_config/opencode/kb-collectors/gh.md b/dot_config/opencode/kb-collectors/gh.md new file mode 100644 index 00000000..2f50e92d --- /dev/null +++ b/dot_config/opencode/kb-collectors/gh.md @@ -0,0 +1,40 @@ +--- +name: gh +enabled: true +priority: 4 +authoritative_for: [shipped-code, reviews] +description: GitHub PRs you authored or reviewed in the enrichment window +# orgs: GitHub orgs to scope searches to. When set, each org is added as +# `org:NAME` to the search query. Leave empty for no org filter (searches +# across all repos you have access to). +orgs: [] +# skip_bots: commit authors / PR actors to ignore +skip_bots: [dependabot] +--- + +## How to query + +Scope to the orgs listed in `orgs` frontmatter. Build the org filter by joining each as `org:NAME`: + +```bash +# PRs you opened or updated (add org filters if orgs list is non-empty) +gh search prs --author "@me" --state all --updated ">=YYYY-MM-DD" \ + --json number,title,repository,state,updatedAt,body + +# PRs you were asked to review +gh search prs --review-requested "@me" --updated ">=YYYY-MM-DD" \ + --json number,title,repository,state,updatedAt +``` + +If the `orgs` list in frontmatter is empty, omit the org filter and search across all repos. 
+ +## What to extract + +- Merged PRs — what shipped +- Review comments that surfaced decisions +- Linked issues + +## What to skip + +- Draft PRs +- Commits and PRs authored by bots listed in `skip_bots` diff --git a/dot_config/opencode/kb-collectors/google-chat.md b/dot_config/opencode/kb-collectors/google-chat.md new file mode 100644 index 00000000..6da7f0b7 --- /dev/null +++ b/dot_config/opencode/kb-collectors/google-chat.md @@ -0,0 +1,38 @@ +--- +name: google-chat +enabled: false +priority: 6 +authoritative_for: [volunteer-work, informal-decisions] +description: Google Chat messages from spaces you are active in (volunteer org, etc.) +# token_path: absolute path to the file containing the Google Chat bearer token. +# When empty or the file does not exist, this collector is skipped automatically. +token_path: "" +--- + +## Enabled check + +Before running: if `token_path` is empty or the file at that path does not exist, skip this collector entirely and log "google-chat: no token, skipping". Set `enabled: true` and `token_path` in the frontmatter on machines where Google Chat is available. + +## How to query + +Read the bearer token from the file at `token_path`. Use the Google Chat REST API `spaces.messages.list` endpoint with a `createTime` filter for the date window: + +``` +GET https://chat.googleapis.com/v1/spaces/{space}/messages + ?filter=createTime>"YYYY-MM-DDT00:00:00Z" AND createTime<"YYYY-MM-DDT23:59:59Z" +Authorization: Bearer <token from token_path> +``` + +Focus on spaces related to volunteer work. 
+ +## Triage + +- **What to extract:** + - Decisions and action items from volunteer organization spaces + - Event planning, logistics, or coordination you were part of + - New contacts (names, roles) from the organization + +- **What to skip:** + - Casual social messages with no action or decision content + - Announcements you didn't participate in + - Anything already captured from a Granola meeting for the same day diff --git a/dot_config/opencode/kb-collectors/granola.md b/dot_config/opencode/kb-collectors/granola.md new file mode 100644 index 00000000..0d4c5da9 --- /dev/null +++ b/dot_config/opencode/kb-collectors/granola.md @@ -0,0 +1,31 @@ +--- +name: granola +enabled: true +priority: 1 +authoritative_for: [meetings, decisions, people-contact] +description: Meeting notes, decisions, and people facts from Granola (MCP) +--- + +## How to query + +Use `list_meetings` with `time_range: custom` and `custom_start`/`custom_end` set to the enrichment window. Then call `get_meetings` on any non-trivial meetings. Use `get_meeting_transcript` for verbatim detail when needed. Granola is the authoritative source for meeting content — it covers all history with no date limit. + +## Triage + +Not all meetings need deep reads. 
Apply this triage to keep token cost low: + +- **Always read:** 1:1s, sig syncs, cycle planning, retrospectives, ad-hoc technical sessions, any meeting with a descriptive title suggesting a decision was made +- **Skim summary only:** standups (read notes, skip transcript), demo days (extract shipped items), all-hands / org-wide meetings (extract only items directly relevant to your work) +- **Skip entirely:** "New note" untitled entries with no participants other than you, HR/benefits/onboarding sessions (privacy), personal/non-work meetings + +## What to extract + +- People facts: email addresses from participant lists, role/team data, Slack handles if mentioned +- Decisions announced or confirmed in the meeting +- Action items assigned to you or by you +- Project status updates + +## What to skip + +- Meetings already marked skip in the triage above +- Content already captured from another source (prefer Granola as authoritative, skip the duplicate in Slack) diff --git a/dot_config/opencode/kb-collectors/linear.md b/dot_config/opencode/kb-collectors/linear.md new file mode 100644 index 00000000..d006f862 --- /dev/null +++ b/dot_config/opencode/kb-collectors/linear.md @@ -0,0 +1,34 @@ +--- +name: linear +enabled: true +priority: 3 +authoritative_for: [tickets, completed-work] +description: Linear issues you touched in the enrichment window +# workspace: the Linear workspace slug passed to the `linear` CLI. +# Leave empty to use whatever workspace the CLI is already authenticated to. +workspace: "" +--- + +## How to query + +If `workspace` is set in frontmatter, pass it with `--workspace`: + +```bash +linear issue mine --updated-after YYYY-MM-DD --all-states --no-pager +# (the CLI uses the default workspace; --workspace flag is not supported by the +# current linear CLI — workspace selection is via `linear auth` at setup time) +``` + +The workspace this collector targets is `{{ frontmatter.workspace }}` (set in frontmatter; update there if you switch orgs). 
+ +## What to extract + +- Newly created tickets +- Status changes +- Decisions captured in descriptions or comments +- Any ticket closed in the window (signals completed work not otherwise visible in git) + +## What to skip + +- Bot-generated or auto-updated tickets +- Tickets you are only a watcher on with no direct activity diff --git a/dot_config/opencode/kb-collectors/opencode.md b/dot_config/opencode/kb-collectors/opencode.md new file mode 100644 index 00000000..924d700e --- /dev/null +++ b/dot_config/opencode/kb-collectors/opencode.md @@ -0,0 +1,35 @@ +--- +name: opencode +enabled: true +priority: 5 +authoritative_for: [coding-sessions] +description: opencode coding sessions from the local session store +--- + +## How to query + +Query the session store SQLite database at `~/.local/share/opencode/opencode.db`. + +> **Note:** The session exclusion / dedup step is handled by the orchestrator before this collector runs. By the time this collector is called, it receives a list of directories to exclude (sessions covered by archived OpenSpec changes). Apply the `NOT IN (...)` clause below. + +```sql +SELECT id, directory, title, time_updated +FROM session +WHERE time_updated BETWEEN :start_ms AND :end_ms + AND directory NOT IN ('/abs/worktree/a', '/abs/worktree/b'); +-- The excluded directories are passed in by the orchestrator. +-- Only sessions NOT in the exclusion set get a transcript read. +-- For excluded sessions, the orchestrator narrates from the change's design.md/specs instead. +``` + +`time_updated` is epoch-milliseconds. + +## What to extract + +- Work done in sessions not covered by an OpenSpec archived change +- Project context, technical decisions made interactively, approaches tried + +## What to skip + +- Sessions whose `directory` is in the exclusion set (covered by the durable OpenSpec change artifacts — see the orchestrator's dedup step) +- Sessions with no substantive content (e.g. 
very short duration, no meaningful tool calls) diff --git a/dot_config/opencode/kb-collectors/openspec.md b/dot_config/opencode/kb-collectors/openspec.md new file mode 100644 index 00000000..5938f13e --- /dev/null +++ b/dot_config/opencode/kb-collectors/openspec.md @@ -0,0 +1,45 @@ +--- +name: openspec +enabled: true +priority: 0 +authoritative_for: [implement-work, design-decisions, rejected-alternatives] +description: OpenSpec durable store — authoritative source for /implement work; read BEFORE other collectors to build the session exclusion set +--- + +## Why priority 0 + +This collector runs first. Its primary job is building the **session exclusion set** used by the `opencode` collector — a list of worktree paths whose sessions are already covered by an archived OpenSpec change and should not get a redundant (token-expensive) transcript read. + +## How to query + +For each date being enriched, read every `$KB_ROOT/openspec/*/changes/archive/<date>-*/kb-meta.yaml`. Collect the `worktree:` value from each. That set is the exclusion list passed to the `opencode` collector. + +```bash +# Collect worktrees for a given date +for meta in $KB_ROOT/openspec/*/changes/archive/<date>-*/kb-meta.yaml; do + grep '^worktree:' "$meta" | awk '{print $2}' +done +``` + +Then read each archived change's `design.md`: + +``` +$KB_ROOT/openspec/*/changes/archive/<date>-*/design.md +``` + +READ these (do not copy them — the artifacts are already in the KB via the symlink). Extract decisions, the "why", and rejected alternatives for the decisions log. 
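
The exclusion list gathered above feeds the `opencode` collector's `NOT IN (...)` query. A minimal Python sketch of that hand-off follows, assuming `kb-meta.yaml` stores `worktree:` as a single top-level scalar and that the `session` table has the four columns named in the collector's SQL. Both helper names are hypothetical, and bound parameters are used here in place of the literal paths shown in the SQL example.

```python
import sqlite3
from pathlib import Path


def collect_worktrees(kb_root: Path, day: str) -> set[str]:
    """Gather `worktree:` values from kb-meta.yaml files archived on `day`."""
    worktrees = set()
    pattern = f"openspec/*/changes/archive/{day}-*/kb-meta.yaml"
    for meta in kb_root.glob(pattern):
        for line in meta.read_text(encoding="utf-8").splitlines():
            if line.startswith("worktree:"):
                worktrees.add(line.split(":", 1)[1].strip().strip("'\""))
    return worktrees


def sessions_to_read(conn, start_ms, end_ms, excluded_dirs):
    """Return sessions in the window whose directory is not excluded."""
    sql = (
        "SELECT id, directory, title, time_updated FROM session "
        "WHERE time_updated BETWEEN ? AND ?"
    )
    params = [start_ms, end_ms]
    if excluded_dirs:
        # One placeholder per path avoids quoting problems in worktree names.
        placeholders = ", ".join("?" for _ in excluded_dirs)
        sql += f" AND directory NOT IN ({placeholders})"
        params.extend(sorted(excluded_dirs))
    return conn.execute(sql, params).fetchall()
```

When no change was archived on a given date the exclusion set is empty, so the sketch drops the `NOT IN` clause and every session in the window gets a transcript read.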
+ +Also read the durable `specs/` for standing requirements: + +``` +$KB_ROOT/openspec/*/specs/ +``` + +## What to extract + +- Decisions, rationale, and rejected alternatives from `design.md` files +- The set of worktree paths (→ exclusion list for the `opencode` collector) + +## What to skip + +- Re-narrating or duplicating the full design content in the journal — reference the durable store artifacts instead diff --git a/dot_config/opencode/kb-collectors/slack.md b/dot_config/opencode/kb-collectors/slack.md new file mode 100644 index 00000000..6f055ca7 --- /dev/null +++ b/dot_config/opencode/kb-collectors/slack.md @@ -0,0 +1,44 @@ +--- +name: slack +enabled: true +priority: 2 +authoritative_for: [informal-decisions, action-items, contact-info] +description: Slack messages — your sent messages and mentions, in the enrichment window +--- + +## How to query + +Token is in `~/.config/team-context-mcp/.env` as `SLACK_USER_TOKEN`. Set `SLACK_USER_ID` to your Slack user ID (find it in your Slack profile → "Copy member ID"). + +```bash +SLACK_TOKEN=$(grep SLACK_USER_TOKEN ~/.config/team-context-mcp/.env | cut -d= -f2) +SLACK_USER_ID="<your-slack-user-id>" # e.g. U01ABC23DEF + +# Your messages in the date window +curl -s "https://slack.com/api/search.messages?query=from:me+after:YYYY-MM-DD+before:YYYY-MM-DD&count=20&sort=timestamp" \ + -H "Authorization: Bearer $SLACK_TOKEN" + +# Mentions of you in the date window +curl -s "https://slack.com/api/search.messages?query=%3C${SLACK_USER_ID}%3E+after:YYYY-MM-DD+before:YYYY-MM-DD&count=20&sort=timestamp" \ + -H "Authorization: Bearer $SLACK_TOKEN" + +# Read a thread (get replies) +curl -s "https://slack.com/api/conversations.replies?channel=CHANNEL_ID&ts=THREAD_TS" \ + -H "Authorization: Bearer $SLACK_TOKEN" +``` + +Slack is high-volume; read selectively to keep token cost manageable. 
+ +## What to extract + +- Decisions announced or confirmed in Slack that didn't appear in a Granola meeting +- Action items assigned to you or by you that aren't already in Linear +- New contact info (Slack handles, email addresses) for people profiles +- Customer or partner names that surfaced in conversation + +## What to skip + +- Routine standup threads already covered by Granola +- Emoji reactions and short acknowledgments ("👍", "sounds good") +- HR, compensation, or personal channels (privacy) +- Anything already captured from a Granola meeting for the same day From ce2dc6f2abfe37dee4ebd7411926a51dbbb8ab86 Mon Sep 17 00:00:00 2001 From: Stephen Golub <stephen@golub.io> Date: Fri, 26 Jun 2026 14:17:59 -0500 Subject: [PATCH 2/2] fix(kb-enrich): address PR feedback - Remove re-run guard (redundant with Step 1 date-range logic) - Drop 'enabled' frontmatter field; presence/absence of file is the toggle - gh.md: read orgs from chezmoi data instead of hardcoded frontmatter - linear.md: switch from linear CLI to GraphQL API via 'linear' skill - Remove google-chat.md and granola.md (user-specific collectors) --- dot_config/opencode/commands/kb-enrich.md | 23 ++++------ dot_config/opencode/kb-collectors/gh.md | 21 ++++++---- .../opencode/kb-collectors/google-chat.md | 38 ----------------- dot_config/opencode/kb-collectors/granola.md | 31 -------------- dot_config/opencode/kb-collectors/linear.md | 42 +++++++++++++------ dot_config/opencode/kb-collectors/opencode.md | 1 - dot_config/opencode/kb-collectors/openspec.md | 1 - dot_config/opencode/kb-collectors/slack.md | 1 - 8 files changed, 48 insertions(+), 110 deletions(-) delete mode 100644 dot_config/opencode/kb-collectors/google-chat.md delete mode 100644 dot_config/opencode/kb-collectors/granola.md diff --git a/dot_config/opencode/commands/kb-enrich.md b/dot_config/opencode/commands/kb-enrich.md index 57d82730..afb6b98c 100644 --- a/dot_config/opencode/commands/kb-enrich.md +++ 
b/dot_config/opencode/commands/kb-enrich.md @@ -15,25 +15,17 @@ $ARGUMENTS All KB reads and writes in this run use `$KB_ROOT`. Every collector file also receives `$KB_ROOT` as context. -**Re-run guard** — if today's journal file already exists at `$KB_ROOT/journal/YYYY-MM-DD.md` and `--force` was not passed in arguments, skip that date silently (log "already enriched: YYYY-MM-DD"). This prevents duplicate work from concurrent runs. - ## Step 1 — Resolve the date range Enrich the gap since the last run, not a hard single day. The most recent `$KB_ROOT/journal/YYYY-MM-DD.md` is the last-run marker: enrich each date from (last journal date + 1) through today, inclusive. This makes a Monday run sweep the trailing weekend and lets a skipped run self-heal on the next run. If no prior journal exists, default to today. An explicit date or range in `$ARGUMENTS` overrides this. ## Step 2 — Load collectors -Collectors live in `~/.config/opencode/kb-collectors/`. Each file is a self-contained markdown recipe that describes one data source. Read all `*.md` files in that directory. For each file: - -1. Parse the YAML frontmatter to get `name`, `enabled`, `priority`, `authoritative_for`. -2. Skip any collector with `enabled: false`. -3. Sort remaining collectors by `priority` ascending (lower number = runs first). - -The full set of collector instructions — how to query the source, triage rules, what to extract, and what to skip — is in each collector's body. Read the body now so you can apply it during collection. +Collectors live in `~/.config/opencode/kb-collectors/`. Each file is a self-contained markdown recipe that describes one data source. Read all `*.md` files in that directory and sort them by the `priority` field in their YAML frontmatter (lower number = runs first). Read each file's body now so you can apply its query recipe and extraction rules during collection. -Some collectors perform a **runtime enabled check** at the start of their body (e.g. 
verifying a token file exists before proceeding). Honor those checks: if a collector's body says to skip, log the reason and move on. +Some collectors perform a **runtime enabled check** at the start of their body (e.g. verifying a token or config value exists before proceeding). Honor those checks: if a collector's body says to skip, log the reason and move on. -> **To add a new data source:** drop a new `.md` file into `~/.config/opencode/kb-collectors/`. No changes to this orchestrator needed. To disable a source temporarily, set `enabled: false` in its frontmatter. To configure per-machine values (token paths, org lists, workspaces), edit the relevant frontmatter field directly in the collector file. +> **To add a new data source:** drop a new `.md` file into `~/.config/opencode/kb-collectors/`. No changes to this orchestrator needed. To disable a source, remove or don't add its collector file. ## Step 3 — Session exclusion (cross-collector dedup) @@ -45,18 +37,18 @@ For each date being enriched, read `$KB_ROOT/openspec/*/changes/archive/<date>-* ## Step 4 — Run collectors -For each date in the enrichment window, run each enabled collector in priority order. Apply the collector's own query recipe, triage rules, and extraction rules exactly as written in its body. Carry the results forward into Step 5. +For each date in the enrichment window, run each collector in priority order. Apply the collector's own query recipe, triage rules, and extraction rules exactly as written in its body. Carry the results forward into Step 5. ## Step 5 — Enrichment ### Journal -Write one cross-project rollup journal file per enriched date at `$KB_ROOT/journal/YYYY-MM-DD.md`. By construction each is THIN: feed it only from the non-excluded sessions (not covered by an archived OpenSpec change) plus git diff-stats plus Granola meeting summaries. 
For `/implement` work, do NOT re-narrate the OpenSpec change — reference the durable store artifacts (`design.md`/specs already in the KB via the symlink). The journal's role is the cross-project rollup and non-`/implement` activity, not a reconstruction of OpenSpec work. +Write one cross-project rollup journal file per enriched date at `$KB_ROOT/journal/YYYY-MM-DD.md`. By construction each is THIN: feed it only from the non-excluded sessions (not covered by an archived OpenSpec change) plus git diff-stats plus any meeting summaries from collectors. For `/implement` work, do NOT re-narrate the OpenSpec change — reference the durable store artifacts (`design.md`/specs already in the KB via the symlink). The journal's role is the cross-project rollup and non-`/implement` activity, not a reconstruction of OpenSpec work. ### Profiles -Merge new facts into `$KB_ROOT/people/` and `$KB_ROOT/projects/` profiles. Load the `knowledge-base` skill for the canonical profile shape and merge rules. Granola is especially good for contact info (emails appear in participant lists) and role/team data — update `email:` frontmatter on person profiles whenever a new address is seen. +Merge new facts into `$KB_ROOT/people/` and `$KB_ROOT/projects/` profiles. Load the `knowledge-base` skill for the canonical profile shape and merge rules. ### Decisions -Add any decisions to the decisions log. Pull key design decisions and rejected alternatives from `$KB_ROOT/openspec/*/changes/archive/<date>-*/design.md` (READ, don't copy — the artifacts are already in the KB). Also extract decisions surfaced in Granola meeting notes. The decisions log is a distilled record anchored to its product/project, not a dump of design files or transcripts. +Add any decisions to the decisions log. Pull key design decisions and rejected alternatives from `$KB_ROOT/openspec/*/changes/archive/<date>-*/design.md` (READ, don't copy — the artifacts are already in the KB). 
Also extract any decisions surfaced by meeting or chat collectors. The decisions log is a distilled record anchored to its product/project, not a dump of design files or transcripts.
 ### Action items
 Extract action items from the enriched window's activity. Cross-reference within the same activity data — if the activity shows the action was already taken (replied, reviewed, closed, appeared as done in a later meeting), skip the reminder. Only create reminders for items not resolved within the enriched window.
@@ -69,4 +61,3 @@ Do not extract or store:
 - Performance evaluations
 - Legal or attorney-client privileged content
 - Content from HR-related discussions
-- Content from meetings titled or tagged as confidential (e.g. "Confidential: ...")
diff --git a/dot_config/opencode/kb-collectors/gh.md b/dot_config/opencode/kb-collectors/gh.md
index 2f50e92d..32cf13f0 100644
--- a/dot_config/opencode/kb-collectors/gh.md
+++ b/dot_config/opencode/kb-collectors/gh.md
@@ -1,23 +1,26 @@
 ---
 name: gh
-enabled: true
 priority: 4
 authoritative_for: [shipped-code, reviews]
 description: GitHub PRs you authored or reviewed in the enrichment window
-# orgs: GitHub orgs to scope searches to. When set, each org is added as
-# `org:NAME` to the search query. Leave empty for no org filter (searches
-# across all repos you have access to).
-orgs: []
 # skip_bots: commit authors / PR actors to ignore
 skip_bots: [dependabot]
 ---
-## How to query
+## Enabled check
+
+Read GitHub orgs from chezmoi data:
-Scope to the orgs listed in `orgs` frontmatter. Build the org filter by joining each as `org:NAME`:
+```bash
+ORGS=$(chezmoi data --format json | jq -r '[.orgs | keys[]] | join(" ")')
+```
+
+If `ORGS` is empty, search across all repos you have access to (no org filter). If non-empty, scope searches with an `org:NAME` filter per org.
+
 ## How to query
 ```bash
-# PRs you opened or updated (add org filters if orgs list is non-empty)
+# PRs you opened or updated
 gh search prs --author "@me" --state all --updated ">=YYYY-MM-DD" \
   --json number,title,repository,state,updatedAt,body
@@ -26,7 +29,7 @@ gh search prs --review-requested "@me" --updated ">=YYYY-MM-DD" \
   --json number,title,repository,state,updatedAt
 ```
-If the `orgs` list in frontmatter is empty, omit the org filter and search across all repos.
+Add `org:NAME` to each query for every org in `ORGS` (run one query per org, or combine with multiple `org:` terms in a single search string).
 ## What to extract
diff --git a/dot_config/opencode/kb-collectors/google-chat.md b/dot_config/opencode/kb-collectors/google-chat.md
deleted file mode 100644
index 6da7f0b7..00000000
--- a/dot_config/opencode/kb-collectors/google-chat.md
+++ /dev/null
@@ -1,38 +0,0 @@
----
-name: google-chat
-enabled: false
-priority: 6
-authoritative_for: [volunteer-work, informal-decisions]
-description: Google Chat messages from spaces you are active in (volunteer org, etc.)
-# token_path: absolute path to the file containing the Google Chat bearer token.
-# When empty or the file does not exist, this collector is skipped automatically.
-token_path: ""
----
-
-## Enabled check
-
-Before running: if `token_path` is empty or the file at that path does not exist, skip this collector entirely and log "google-chat: no token, skipping". Set `enabled: true` and `token_path` in the frontmatter on machines where Google Chat is available.
-
-## How to query
-
-Read the bearer token from the file at `token_path`. Use the Google Chat REST API `spaces.messages.list` endpoint with a `createTime` filter for the date window:
-
-```
-GET https://chat.googleapis.com/v1/spaces/{space}/messages
-  ?filter=createTime>"YYYY-MM-DDT00:00:00Z" AND createTime<"YYYY-MM-DDT23:59:59Z"
-Authorization: Bearer <token from token_path>
-```
-
-Focus on spaces related to volunteer work.
-
-## Triage
-
-- **What to extract:**
-  - Decisions and action items from volunteer organization spaces
-  - Event planning, logistics, or coordination you were part of
-  - New contacts (names, roles) from the organization
-
-- **What to skip:**
-  - Casual social messages with no action or decision content
-  - Announcements you didn't participate in
-  - Anything already captured from a Granola meeting for the same day
diff --git a/dot_config/opencode/kb-collectors/granola.md b/dot_config/opencode/kb-collectors/granola.md
deleted file mode 100644
index 0d4c5da9..00000000
--- a/dot_config/opencode/kb-collectors/granola.md
+++ /dev/null
@@ -1,31 +0,0 @@
----
-name: granola
-enabled: true
-priority: 1
-authoritative_for: [meetings, decisions, people-contact]
-description: Meeting notes, decisions, and people facts from Granola (MCP)
----
-
-## How to query
-
-Use `list_meetings` with `time_range: custom` and `custom_start`/`custom_end` set to the enrichment window. Then call `get_meetings` on any non-trivial meetings. Use `get_meeting_transcript` for verbatim detail when needed. Granola is the authoritative source for meeting content — it covers all history with no date limit.
-
-## Triage
-
-Not all meetings need deep reads.
Apply this triage to keep token cost low:
-
-- **Always read:** 1:1s, sig syncs, cycle planning, retrospectives, ad-hoc technical sessions, any meeting with a descriptive title suggesting a decision was made
-- **Skim summary only:** standups (read notes, skip transcript), demo days (extract shipped items), all-hands / org-wide meetings (extract only items directly relevant to your work)
-- **Skip entirely:** "New note" untitled entries with no participants other than you, HR/benefits/onboarding sessions (privacy), personal/non-work meetings
-
-## What to extract
-
-- People facts: email addresses from participant lists, role/team data, Slack handles if mentioned
-- Decisions announced or confirmed in the meeting
-- Action items assigned to you or by you
-- Project status updates
-
-## What to skip
-
-- Meetings already marked skip in the triage above
-- Content already captured from another source (prefer Granola as authoritative, skip the duplicate in Slack)
diff --git a/dot_config/opencode/kb-collectors/linear.md b/dot_config/opencode/kb-collectors/linear.md
index d006f862..4ff0c060 100644
--- a/dot_config/opencode/kb-collectors/linear.md
+++ b/dot_config/opencode/kb-collectors/linear.md
@@ -1,30 +1,46 @@
 ---
 name: linear
-enabled: true
 priority: 3
 authoritative_for: [tickets, completed-work]
 description: Linear issues you touched in the enrichment window
-# workspace: the Linear workspace slug passed to the `linear` CLI.
-# Leave empty to use whatever workspace the CLI is already authenticated to.
-workspace: ""
 ---
-## How to query
+## Enabled check
-If `workspace` is set in frontmatter, pass it with `--workspace`:
+Load the `linear` skill. If no Linear API token is available (the skill cannot authenticate), skip this collector and log "linear: no auth, skipping".
-```bash
-linear issue mine --updated-after YYYY-MM-DD --all-states --no-pager
-# (the CLI uses the default workspace; --workspace flag is not supported by the
-# current linear CLI — workspace selection is via `linear auth` at setup time)
-```
+## How to query
-The workspace this collector targets is `{{ frontmatter.workspace }}` (set in frontmatter; update there if you switch orgs).
+Use the Linear GraphQL API (endpoint `https://api.linear.app/graphql`) via the `linear` skill. Query issues assigned to or created by you that were updated within the enrichment window:
+
+```graphql
+{
+  issues(
+    filter: {
+      updatedAt: { gte: "YYYY-MM-DDT00:00:00Z" }
+      or: [
+        { assignee: { isMe: { eq: true } } }
+        { creator: { isMe: { eq: true } } }
+      ]
+    }
+    first: 50
+  ) {
+    nodes {
+      identifier
+      title
+      state { name }
+      updatedAt
+      description
+      url
+    }
+  }
+}
+```
 ## What to extract
 - Newly created tickets
-- Status changes
+- Status changes (especially to Done/Completed)
 - Decisions captured in descriptions or comments
 - Any ticket closed in the window (signals completed work not otherwise visible in git)
diff --git a/dot_config/opencode/kb-collectors/opencode.md b/dot_config/opencode/kb-collectors/opencode.md
index 924d700e..a8cae872 100644
--- a/dot_config/opencode/kb-collectors/opencode.md
+++ b/dot_config/opencode/kb-collectors/opencode.md
@@ -1,6 +1,5 @@
 ---
 name: opencode
-enabled: true
 priority: 5
 authoritative_for: [coding-sessions]
 description: opencode coding sessions from the local session store
diff --git a/dot_config/opencode/kb-collectors/openspec.md b/dot_config/opencode/kb-collectors/openspec.md
index 5938f13e..d4ec7ba0 100644
--- a/dot_config/opencode/kb-collectors/openspec.md
+++ b/dot_config/opencode/kb-collectors/openspec.md
@@ -1,6 +1,5 @@
 ---
 name: openspec
-enabled: true
 priority: 0
 authoritative_for: [implement-work, design-decisions, rejected-alternatives]
 description: OpenSpec durable store — authoritative source for /implement work; read BEFORE
other collectors to build the session exclusion set
diff --git a/dot_config/opencode/kb-collectors/slack.md b/dot_config/opencode/kb-collectors/slack.md
index 6f055ca7..0fda1fb5 100644
--- a/dot_config/opencode/kb-collectors/slack.md
+++ b/dot_config/opencode/kb-collectors/slack.md
@@ -1,6 +1,5 @@
 ---
 name: slack
-enabled: true
 priority: 2
 authoritative_for: [informal-decisions, action-items, contact-info]
 description: Slack messages — your sent messages and mentions, in the enrichment window
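
Reviewer sketch (not part of the patch): the `gh.md` collector's "Enabled check" says to scope searches with one `org:NAME` term per org when `ORGS` is non-empty, and to apply no filter when it is empty. A minimal way to build those terms might look like the following, assuming `ORGS` is the space-separated string produced by the `jq` join. `build_org_terms` is a hypothetical helper name, and the `gh` invocation in the trailing comment is an assumed usage, not something the patch specifies.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this patch): convert a space-separated
# org list into GitHub search qualifiers, one `org:NAME` term per org.
build_org_terms() {
  local orgs="$1" terms="" org
  for org in $orgs; do            # intentional word splitting on whitespace
    terms="${terms:+$terms }org:$org"
  done
  printf '%s\n' "$terms"
}

# Assumed usage (requires an authenticated `gh`); an empty ORGS expands to
# no extra terms, which matches the "no org filter" case:
#   gh search prs $(build_org_terms "$ORGS") --author "@me" --state all \
#     --updated ">=YYYY-MM-DD" --json number,title,repository,state,updatedAt,body
```

Keeping the term-building in one function means the empty and non-empty cases share a single query path instead of two separately maintained command lines.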