YiAgent · YiWang24 · May 27, 2026 · May 27, 2026 · May 27, 2026 · May 27, 2026
diff --git a/docs/superpowers/plans/v02-execution-plan.md b/docs/superpowers/plans/v02-execution-plan.md
@@ -0,0 +1,20 @@
+# v0.2 Execution Plan
+
+## Status
+
+| Feature | Status | Notes |
+|---|---|---|
+| F1: Per-task budget | ✅ DONE | BudgetGuard already in runtime.py line 121 |
+| F2: Config approval gate | TODO | New middleware |
+| F3: Audit CLI | TODO | New CLI commands |
+| F4: LinearAdapter | TODO | New channel adapter |
+| F5: Plugin system | TODO | New directory + CI |
+| F6: Issue dedup | TODO | pgvector + embedding |
+
+## Execution Order
+
+1. F3: Audit CLI (reads existing data, no schema changes)
+2. F2: Config approval gate (new middleware, small scope)
+3. F5: Plugin system (scaffolding + 3 builtin plugins)
+4. F4: LinearAdapter (new channel, webhook + write-back)
+5. F6: Issue dedup (pgvector migration + embedding pipeline)
diff --git a/docs/superpowers/plans/v02-f1-per-task-budget.md b/docs/superpowers/plans/v02-f1-per-task-budget.md
@@ -0,0 +1,41 @@
+# F1: Per-Task Budget Fix
+
+> Sprint 1, Task T1. Fix: per-task budget only checked at preflight (input-side), not during agent loop execution.
+
+## Problem
+
+`BudgetEnforcement` middleware checks monthly/global caps before the task starts, but the per-task cap ($3.00 for fix) is never enforced during the agent loop. A runaway agent can exceed its budget.
+
+## Solution
+
+Add `BudgetStepGuard` — a middleware that checks cumulative cost before each agent step.
+
+### File: `openbot/infrastructure/agents/_budget_middleware.py`
+
+```python
+class BudgetStepGuard:
+    """Check per-task cost before each agent step.
+
+    Queries cost_meter for cumulative spend on this task_id.
+    If exceeds per_task_cap_usd → raise BudgetExceeded.
+    """
+
+    def __init__(self, task_id: str, cap_usd: float, session_factory):
+        ...
+
+    async def before_step(self, state) -> None:
+        total = await self._query_cost()
+        if total >= self.cap_usd:
+            raise BudgetExceeded(f"Hit per-task budget (${self.cap_usd:.2f})")
+```
+
+### Integration Point
+
+Wire into DeepAgents runtime middleware stack in `openbot/infrastructure/agents/runtime.py`.
+
+### Tests
+
+- `tests/unit/infrastructure/agents/test_budget_step_guard.py`
+  - Under budget → continues
+  - At/over budget → raises BudgetExceeded
+  - cost_meter query failure → degrades gracefully (log warning, continue)
diff --git a/docs/superpowers/specs/v02-overall-spec.md b/docs/superpowers/specs/v02-overall-spec.md
@@ -0,0 +1,26 @@
+# v0.2 Overall Spec
+
+> 6 features, 3 sprints, 6 weeks. PRD: `docs/prd/openbot-v02-prd.md`
+
+## Feature List
+
+| # | Feature | Sprint | Effort | Files |
+|---|---|---|---|---|
+| F1 | Per-task budget fix | 1 | 2d | `_budget_middleware.py`, tests |
+| F2 | Config approval gate | 1 | 1d | `config_approval.py`, tests |
+| F3 | Audit CLI | 1 | 2d | `openbot/cli/audit.py`, tests |
+| F4 | LinearAdapter | 2 | 5d | `linear.py`, `channel_credentials.py`, webhook route, tests |
+| F5 | Plugin system | 2 | 3d | `openbot_plugins/`, CI gate, CONTRIBUTING.md |
+| F6 | Issue dedup | 3 | 4d | `dedup.py`, pgvector migration, embedding + rerank, tests |
+
+## Execution Order
+
+F1 → F2 → F3 → F4 → F5 → F6
+
+## Shared Constraints
+
+- `make check` must pass after each feature
+- No hardcoded values — env vars via `Settings`
+- Immutable data patterns (frozen dataclasses)
+- Hexagonal architecture — domain ← application ← infrastructure
+- Egress scanning on all bot-authored output
diff --git a/openbot/application/middleware/config_approval.py b/openbot/application/middleware/config_approval.py
@@ -0,0 +1,147 @@
+"""Config approval gate — PRD v0.2 §3.2.
+
+When a PR actually modifies `.openbot/config.yaml`, the bot checks for a
+`config-approved` label. Without it, the bot comments and blocks. Detection
+is based on the PR **diff** (the real changed files), not on free-text
+mentions in the title/body — otherwise the gate is trivially bypassed by
+simply not naming the file.
+
+When the touched lines include high-risk keys (budget / safety / channels /
+cancel / rate_limit / egress caps), they're surfaced in the block comment so
+the reviewer knows why approval is required.
+
+This is a preflight-level check — it runs before the workflow starts,
+same as budget/rate-limit/cancel gates.
+"""
+
+from __future__ import annotations
+
+import logging
+
+from openbot.application.middleware.preflight import (
+    MiddlewareDecision,
+    MiddlewareResult,
+    PreflightContext,
+    WorkflowPhase,
+)
+
+_logger = logging.getLogger(__name__)
+
+# High-risk YAML keys. When a changed line in the config diff starts with one
+# of these keys, the change is flagged in the block comment. The gate blocks
+# on ANY config.yaml change regardless (fail closed); this list only enriches
+# the reason surfaced to the reviewer.
+_HIGH_RISK_KEYS = (
+    "budget",
+    "safety",
+    "channels",
+    "cancel",
+    "rate_limit",
+    "egress_action",
+    "global_hard_kill_usd",
+    "monthly_soft_cap_usd",
+    "per_task",
+)
+
+_APPROVAL_LABEL = "config-approved"
+_CONFIG_PATH = ".openbot/config.yaml"
+
+
+def _diff_touches_config(diff: str) -> bool:
+    """True if the unified diff modifies ``.openbot/config.yaml``."""
+    for line in diff.splitlines():
+        if line.startswith(("diff --git", "+++ ", "--- ")) and _CONFIG_PATH in line:
+            return True
+    return False
+
+
+def _changed_high_risk_keys(diff: str) -> list[str]:
+    """Return high-risk YAML keys appearing on added/removed lines inside the
+    ``.openbot/config.yaml`` section of the diff.
+
+    Best-effort: scans the file section between its ``diff --git`` header and
+    the next file header, inspecting ``+``/``-`` content lines (ignoring the
+    ``+++``/``---`` file markers) for a leading ``key:`` token.
+    """
+    found: set[str] = set()
+    in_config = False
+    for line in diff.splitlines():
+        if line.startswith("diff --git"):
+            in_config = _CONFIG_PATH in line
+            continue
+        if not in_config:
+            continue
+        if line.startswith(("+++", "---")):
+            continue
+        if not line.startswith(("+", "-")):
+            continue
+        stripped = line[1:].strip()
+        key = stripped.split(":", 1)[0].strip()
+        if key in _HIGH_RISK_KEYS:
+            found.add(key)
+    return sorted(found)
+
+
+class ConfigApprovalGate:
+    """Block config-change PRs that lack the approval label."""
+
+    name = "config_approval"
+
+    async def __call__(self, ctx: PreflightContext) -> MiddlewareDecision:
+        """Block PRs that modify .openbot/config.yaml without the approval label."""
+        event = ctx.event
+
+        # Only applies to PRs.
+        if not event.pr_number:
+            return MiddlewareDecision.proceed()
+
+        # Inspect the actual diff to decide whether config.yaml is touched.
+        # Fail open on fetch errors (same posture as other soft gates) — a
+        # transient API failure shouldn't wedge every PR.
+        try:
+            diff = await ctx.adapter.get_pr_diff(event, event.pr_number)
+        except Exception:
+            _logger.warning(
+                "config_approval_diff_unavailable",
+                extra={"pr_number": event.pr_number},
+            )
+            return MiddlewareDecision.proceed()
+
+        if not _diff_touches_config(diff):
+            return MiddlewareDecision.proceed()
+
+        # Approval label present → allow.
+        labels = event.raw.get("pull_request", {}).get("labels", [])
+        label_names = {lbl.get("name", "") for lbl in labels}
+        if _APPROVAL_LABEL in label_names:
+            _logger.info(
+                "config_approval_granted",
+                extra={"pr_number": event.pr_number, "labels": sorted(label_names)},
+            )
+            return MiddlewareDecision.proceed()
+
+        # Config change without approval — block.
+        high_risk = _changed_high_risk_keys(diff)
+        _logger.warning(
+            "config_approval_blocked",
+            extra={
+                "pr_number": event.pr_number,
+                "labels": sorted(label_names),
+                "high_risk_keys": high_risk,
+            },
+        )
+        risk_note = (
+            f" High-risk fields changed: {', '.join(high_risk)}."
+            if high_risk
+            else " (budget, safety, channels, cancel)."
+        )
+        return MiddlewareDecision(
+            result=MiddlewareResult.BLOCKED,
+            reason="config_approval_required",
+            comment=(
+                f"Config change requires the `{_APPROVAL_LABEL}` label. "
+                f"This PR modifies `{_CONFIG_PATH}`.{risk_note} "
+                f"Add the `{_APPROVAL_LABEL}` label to proceed."
+            ),
+            audit_phase=WorkflowPhase.SKIPPED,
+        )
diff --git a/openbot/application/use_cases/dedup.py b/openbot/application/use_cases/dedup.py
@@ -0,0 +1,143 @@
+"""Issue deduplication — PRD v0.2 §2.3.
+
+Pipeline:
+  1. Embed new issue title + body
+  2. pgvector ANN search (top-K by cosine similarity)
+  3. LLM rerank → top-3 candidates
+  4. Bot comment with candidates (never auto-close)
+
+Dependencies: pgvector extension on Postgres, embedding provider (Voyage/OpenAI).
+"""
+
+from __future__ import annotations
+
+import logging
+from dataclasses import dataclass
+from typing import TYPE_CHECKING, Any
+
+if TYPE_CHECKING:
+    from openbot.application.ports.channel_adapter import ChannelAdapterPort
+    from openbot.domain.events import UnifiedEvent
+
+_logger = logging.getLogger(__name__)
+
+
+@dataclass(frozen=True, slots=True)
+class DedupCandidate:
+    """A potential duplicate issue."""
+
+    issue_number: int
+    title: str
+    similarity: float
+
+
+@dataclass(frozen=True, slots=True)
+class DedupResult:
+    """Result of dedup analysis."""
+
+    candidates: list[DedupCandidate]
+    comment: str
+
+
+def render_dedup_comment(candidates: list[DedupCandidate]) -> str:
+    """Render a user-facing comment listing potential duplicates."""
+    if not candidates:
+        return ""
+
+    parts: list[str] = [
+        ":mag: **Potential duplicates found:**\n",
+    ]
+    for c in candidates:
+        pct = f"{c.similarity * 100:.0f}%"
+        parts.append(f"- #{c.issue_number} — {c.title} ({pct} similar)")
+
+    parts.append("")
+    parts.append(
+        "_Please check if this issue is a duplicate. "
+        "If so, close this one and continue the discussion in the existing issue._"
+    )
+    return "\n".join(parts)
+
+
+async def check_duplicates(
+    event: UnifiedEvent,
+    adapter: ChannelAdapterPort,
+    *,
+    issue_title: str,
+    issue_body: str,
+    embedding_func: Any | None = None,
+    search_func: Any | None = None,
+    rerank_func: Any | None = None,
+    min_similarity: float = 0.75,
+    top_k: int = 10,
+    rerank_top: int = 3,
+) -> DedupResult:
+    """Check for duplicate issues and return candidates.
+
+    This is the use-case entry point. It orchestrates:
+      1. Embed the issue
+      2. Search for similar issues
+      3. Rerank with LLM
+      4. Render the comment
+
+    When embedding/search/rerank functions are None (not configured),
+    returns an empty result gracefully.
+    """
+    if embedding_func is None or search_func is None:
+        _logger.info("dedup_skipped_not_configured")
+        return DedupResult(candidates=[], comment="")
+
+    try:
+        # Step 1: Embed
+        text = f"{issue_title}\n\n{issue_body}"[:4000]  # truncate for embedding
+        embedding = await embedding_func(text)
+
+        # Step 2: Search
+        similar = await search_func(
+            embedding=embedding,
+            repo=event.repo,
+            top_k=top_k,
+            min_similarity=min_similarity,
+        )
+
+        if not similar:
+            return DedupResult(candidates=[], comment="")
+
+        # Step 3: Rerank (optional). When a rerank_func is supplied, let it
+        # reorder/trim the raw ANN hits (LLM rerank, PRD v0.2 §2.3); otherwise
+        # fall back to similarity order. Either way, keep the top `rerank_top`.
+        if rerank_func is not None:
+            reranked = await rerank_func(
+                issue_title=issue_title,
+                issue_body=issue_body,
+                candidates=similar,
+            )
+        else:
+            reranked = similar
+
+        candidates = [
+            DedupCandidate(
+                issue_number=s["issue_number"],
+                title=s["title"],
+                similarity=s["similarity"],
+            )
+            for s in reranked[:rerank_top]
+        ]
+
+        # Step 4: Render comment
+        comment = render_dedup_comment(candidates)
+
+        _logger.info(
+            "dedup_completed",
+            extra={
+                "repo": event.repo,
+                "candidates": len(candidates),
+                "max_similarity": max(c.similarity for c in candidates) if candidates else 0,
+            },
+        )
+
+        return DedupResult(candidates=candidates, comment=comment)
+
+    except Exception:
+        _logger.exception("dedup_failed")
+        return DedupResult(candidates=[], comment="")