Skip to content
20 changes: 20 additions & 0 deletions docs/superpowers/plans/v02-execution-plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# v0.2 Execution Plan

## Status

| Feature | Status | Notes |
|---|---|---|
| F1: Per-task budget | ✅ DONE | BudgetGuard already in runtime.py line 121 |
| F2: Config approval gate | TODO | New middleware |
| F3: Audit CLI | TODO | New CLI commands |
| F4: LinearAdapter | TODO | New channel adapter |
| F5: Plugin system | TODO | New directory + CI |
| F6: Issue dedup | TODO | pgvector + embedding |

## Execution Order

1. F3: Audit CLI (reads existing data, no schema changes)
2. F2: Config approval gate (new middleware, small scope)
3. F5: Plugin system (scaffolding + 3 builtin plugins)
4. F4: LinearAdapter (new channel, webhook + write-back)
5. F6: Issue dedup (pgvector migration + embedding pipeline)
41 changes: 41 additions & 0 deletions docs/superpowers/plans/v02-f1-per-task-budget.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# F1: Per-Task Budget Fix

> Sprint 1, Task T1. Fix: per-task budget only checked at preflight (input-side), not during agent loop execution.

## Problem

`BudgetEnforcement` middleware checks monthly/global caps before the task starts, but the per-task cap ($3.00 for fix) is never enforced during the agent loop. A runaway agent can exceed its budget.

## Solution

Add `BudgetStepGuard` — a middleware that checks cumulative cost before each agent step.

### File: `openbot/infrastructure/agents/_budget_middleware.py`

```python
class BudgetStepGuard:
"""Check per-task cost before each agent step.

Queries cost_meter for cumulative spend on this task_id.
If exceeds per_task_cap_usd → raise BudgetExceeded.
"""

def __init__(self, task_id: str, cap_usd: float, session_factory):
...

async def before_step(self, state) -> None:
total = await self._query_cost()
if total >= self.cap_usd:
raise BudgetExceeded(f"Hit per-task budget (${self.cap_usd:.2f})")
```

### Integration Point

Wire into DeepAgents runtime middleware stack in `openbot/infrastructure/agents/runtime.py`.

### Tests

- `tests/unit/infrastructure/agents/test_budget_step_guard.py`
- Under budget → continues
- At/over budget → raises BudgetExceeded
- cost_meter query failure → degrades gracefully (log warning, continue)
26 changes: 26 additions & 0 deletions docs/superpowers/specs/v02-overall-spec.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# v0.2 Overall Spec

> 6 features, 3 sprints, 6 weeks. PRD: `docs/prd/openbot-v02-prd.md`

## Feature List

| # | Feature | Sprint | Effort | Files |
|---|---|---|---|---|
| F1 | Per-task budget fix | 1 | 2d | `_budget_middleware.py`, tests |
| F2 | Config approval gate | 1 | 1d | `config_approval.py`, tests |
| F3 | Audit CLI | 1 | 2d | `openbot/cli/audit.py`, tests |
| F4 | LinearAdapter | 2 | 5d | `linear.py`, `channel_credentials.py`, webhook route, tests |
| F5 | Plugin system | 2 | 3d | `openbot_plugins/`, CI gate, CONTRIBUTING.md |
| F6 | Issue dedup | 3 | 4d | `dedup.py`, pgvector migration, embedding + rerank, tests |

## Execution Order

F1 → F2 → F3 → F4 → F5 → F6

## Shared Constraints

- `make check` must pass after each feature
- No hardcoded values — env vars via `Settings`
- Immutable data patterns (frozen dataclasses)
- Hexagonal architecture — domain ← application ← infrastructure
- Egress scanning on all bot-authored output
147 changes: 147 additions & 0 deletions openbot/application/middleware/config_approval.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
"""Config approval gate — PRD v0.2 §3.2.

When a PR actually modifies `.openbot/config.yaml`, the bot checks for a
`config-approved` label. Without it, the bot comments and blocks. Detection
is based on the PR **diff** (the real changed files), not on free-text
mentions in the title/body — otherwise the gate is trivially bypassed by
simply not naming the file.

When the touched lines include high-risk keys (budget / safety / channels /
cancel / rate_limit / egress caps), they're surfaced in the block comment so
the reviewer knows why approval is required.

This is a preflight-level check — it runs before the workflow starts,
same as budget/rate-limit/cancel gates.
"""

from __future__ import annotations

import logging

from openbot.application.middleware.preflight import (
MiddlewareDecision,
MiddlewareResult,
PreflightContext,
WorkflowPhase,
)

_logger = logging.getLogger(__name__)

# High-risk YAML keys. When a changed line in the config diff starts with one
# of these keys, the change is flagged in the block comment. The gate blocks
# on ANY config.yaml change regardless (fail closed); this list only enriches
# the reason surfaced to the reviewer.
_HIGH_RISK_KEYS = (
"budget",
"safety",
"channels",
"cancel",
"rate_limit",
"egress_action",
"global_hard_kill_usd",
"monthly_soft_cap_usd",
"per_task",
)

_APPROVAL_LABEL = "config-approved"
_CONFIG_PATH = ".openbot/config.yaml"


def _diff_touches_config(diff: str) -> bool:
"""True if the unified diff modifies ``.openbot/config.yaml``."""
for line in diff.splitlines():
if line.startswith(("diff --git", "+++ ", "--- ")) and _CONFIG_PATH in line:
return True
return False


def _changed_high_risk_keys(diff: str) -> list[str]:
"""Return high-risk YAML keys appearing on added/removed lines inside the
``.openbot/config.yaml`` section of the diff.

Best-effort: scans the file section between its ``diff --git`` header and
the next file header, inspecting ``+``/``-`` content lines (ignoring the
``+++``/``---`` file markers) for a leading ``key:`` token.
"""
found: set[str] = set()
in_config = False
for line in diff.splitlines():
if line.startswith("diff --git"):
in_config = _CONFIG_PATH in line
continue
if not in_config:
continue
if line.startswith(("+++", "---")):
continue
if not line.startswith(("+", "-")):
continue
stripped = line[1:].strip()
key = stripped.split(":", 1)[0].strip()
if key in _HIGH_RISK_KEYS:
found.add(key)
return sorted(found)


class ConfigApprovalGate:
"""Block config-change PRs that lack the approval label."""

name = "config_approval"

async def __call__(self, ctx: PreflightContext) -> MiddlewareDecision:
"""Block PRs that modify .openbot/config.yaml without the approval label."""
event = ctx.event

# Only applies to PRs.
if not event.pr_number:
return MiddlewareDecision.proceed()

# Inspect the actual diff to decide whether config.yaml is touched.
# Fail open on fetch errors (same posture as other soft gates) — a
# transient API failure shouldn't wedge every PR.
try:
diff = await ctx.adapter.get_pr_diff(event, event.pr_number)
except Exception:
_logger.warning(
"config_approval_diff_unavailable",
extra={"pr_number": event.pr_number},
)
return MiddlewareDecision.proceed()

if not _diff_touches_config(diff):
return MiddlewareDecision.proceed()

# Approval label present → allow.
labels = event.raw.get("pull_request", {}).get("labels", [])
label_names = {lbl.get("name", "") for lbl in labels}
if _APPROVAL_LABEL in label_names:
_logger.info(
"config_approval_granted",
extra={"pr_number": event.pr_number, "labels": sorted(label_names)},
)
return MiddlewareDecision.proceed()

# Config change without approval — block.
high_risk = _changed_high_risk_keys(diff)
_logger.warning(
"config_approval_blocked",
extra={
"pr_number": event.pr_number,
"labels": sorted(label_names),
"high_risk_keys": high_risk,
},
)
risk_note = (
f" High-risk fields changed: {', '.join(high_risk)}."
if high_risk
else " (budget, safety, channels, cancel)."
)
return MiddlewareDecision(
result=MiddlewareResult.BLOCKED,
reason="config_approval_required",
comment=(
f"Config change requires the `{_APPROVAL_LABEL}` label. "
f"This PR modifies `{_CONFIG_PATH}`.{risk_note} "
f"Add the `{_APPROVAL_LABEL}` label to proceed."
),
audit_phase=WorkflowPhase.SKIPPED,
)
143 changes: 143 additions & 0 deletions openbot/application/use_cases/dedup.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@
"""Issue deduplication — PRD v0.2 §2.3.

Pipeline:
1. Embed new issue title + body
2. pgvector ANN search (top-K by cosine similarity)
3. LLM rerank → top-3 candidates
4. Bot comment with candidates (never auto-close)

Dependencies: pgvector extension on Postgres, embedding provider (Voyage/OpenAI).
"""

from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
from openbot.application.ports.channel_adapter import ChannelAdapterPort
from openbot.domain.events import UnifiedEvent

_logger = logging.getLogger(__name__)


@dataclass(frozen=True, slots=True)
class DedupCandidate:
"""A potential duplicate issue."""

issue_number: int
title: str
similarity: float


@dataclass(frozen=True, slots=True)
class DedupResult:
"""Result of dedup analysis."""

candidates: list[DedupCandidate]
comment: str


def render_dedup_comment(candidates: list[DedupCandidate]) -> str:
"""Render a user-facing comment listing potential duplicates."""
if not candidates:
return ""

parts: list[str] = [
":mag: **Potential duplicates found:**\n",
]
for c in candidates:
pct = f"{c.similarity * 100:.0f}%"
parts.append(f"- #{c.issue_number} — {c.title} ({pct} similar)")

parts.append("")
parts.append(
"_Please check if this issue is a duplicate. "
"If so, close this one and continue the discussion in the existing issue._"
)
return "\n".join(parts)


async def check_duplicates(
event: UnifiedEvent,
adapter: ChannelAdapterPort,

Check warning on line 64 in openbot/application/use_cases/dedup.py

View check run for this annotation

SonarQubeCloud / SonarCloud Code Analysis

Remove the unused function parameter "adapter".

See more on https://sonarcloud.io/project/issues?id=YiAgent_openbot&issues=AZ7s0BcN4ZKJmrimdwrc&open=AZ7s0BcN4ZKJmrimdwrc&pullRequest=103
*,
issue_title: str,
issue_body: str,
embedding_func: Any | None = None,
search_func: Any | None = None,
rerank_func: Any | None = None,
min_similarity: float = 0.75,
top_k: int = 10,
rerank_top: int = 3,
) -> DedupResult:
"""Check for duplicate issues and return candidates.

This is the use-case entry point. It orchestrates:
1. Embed the issue
2. Search for similar issues
3. Rerank with LLM
4. Render the comment

When embedding/search/rerank functions are None (not configured),
returns an empty result gracefully.
"""
if embedding_func is None or search_func is None:
_logger.info("dedup_skipped_not_configured")
return DedupResult(candidates=[], comment="")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium — The rerank_func parameter is accepted but never called — the pipeline skips directly from search results to rendering. The docstring says 'LLM rerank → top-3 candidates' but the code just takes similar[:rerank_top] without reranking.

# Step 3: Rerank (optional)
        candidates = [
            DedupCandidate(...)
            for s in similar[:rerank_top]
        ]


try:
# Step 1: Embed
text = f"{issue_title}\n\n{issue_body}"[:4000] # truncate for embedding
embedding = await embedding_func(text)

# Step 2: Search
similar = await search_func(
embedding=embedding,
repo=event.repo,
top_k=top_k,
min_similarity=min_similarity,
)

if not similar:
return DedupResult(candidates=[], comment="")

# Step 3: Rerank (optional). When a rerank_func is supplied, let it
# reorder/trim the raw ANN hits (LLM rerank, PRD v0.2 §2.3); otherwise
# fall back to similarity order. Either way, keep the top `rerank_top`.
if rerank_func is not None:
reranked = await rerank_func(
issue_title=issue_title,
issue_body=issue_body,
candidates=similar,
)
else:
reranked = similar

candidates = [
DedupCandidate(
issue_number=s["issue_number"],
title=s["title"],
similarity=s["similarity"],
)
for s in reranked[:rerank_top]
]

# Step 4: Render comment
comment = render_dedup_comment(candidates)

_logger.info(
"dedup_completed",
extra={
"repo": event.repo,
"candidates": len(candidates),
"max_similarity": max(c.similarity for c in candidates) if candidates else 0,
},
)

return DedupResult(candidates=candidates, comment=comment)

except Exception:
_logger.exception("dedup_failed")
return DedupResult(candidates=[], comment="")
Loading
Loading