-
Notifications
You must be signed in to change notification settings - Fork 0
feat(v0.2): add LinearAdapter, plugins, dedup, audit CLI, ConfigApprovalGate #103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
YiWang24
wants to merge
7
commits into
main
Choose a base branch
from
feat/v02-linear-plugins-dedup
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
f990744
feat(audit): add show/export/budget-reset CLI subcommands
YiWang24 382d4c0
feat(plugins): plugin framework + 3 builtin plugins
YiWang24 2729770
feat(linear): LinearAdapter — webhook verification + event parsing + …
YiWang24 bbbce2e
feat(safety): ConfigApprovalGate — block config PRs without approval …
YiWang24 bca5206
feat(dedup): issue deduplication — embedding + pgvector search + use …
YiWang24 7adb5fc
Merge remote-tracking branch 'origin/main' into feat/v02-linear-plugi…
claude 604214e
fix(v0.2): wire Linear channel, fix dedup engine leak, harden config …
claude File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # v0.2 Execution Plan | ||
|
|
||
| ## Status | ||
|
|
||
| | Feature | Status | Notes | | ||
| |---|---|---| | ||
| | F1: Per-task budget | ✅ DONE | BudgetGuard already in runtime.py line 121 | | ||
| | F2: Config approval gate | TODO | New middleware | | ||
| | F3: Audit CLI | TODO | New CLI commands | | ||
| | F4: LinearAdapter | TODO | New channel adapter | | ||
| | F5: Plugin system | TODO | New directory + CI | | ||
| | F6: Issue dedup | TODO | pgvector + embedding | | ||
|
|
||
| ## Execution Order | ||
|
|
||
| 1. F3: Audit CLI (reads existing data, no schema changes) | ||
| 2. F2: Config approval gate (new middleware, small scope) | ||
| 3. F5: Plugin system (scaffolding + 3 builtin plugins) | ||
| 4. F4: LinearAdapter (new channel, webhook + write-back) | ||
| 5. F6: Issue dedup (pgvector migration + embedding pipeline) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| # F1: Per-Task Budget Fix | ||
|
|
||
| > Sprint 1, Task T1. Fix: per-task budget only checked at preflight (input-side), not during agent loop execution. | ||
|
|
||
| ## Problem | ||
|
|
||
| `BudgetEnforcement` middleware checks monthly/global caps before the task starts, but the per-task cap ($3.00 for fix) is never enforced during the agent loop. A runaway agent can exceed its budget. | ||
|
|
||
| ## Solution | ||
|
|
||
| Add `BudgetStepGuard` — a middleware that checks cumulative cost before each agent step. | ||
|
|
||
| ### File: `openbot/infrastructure/agents/_budget_middleware.py` | ||
|
|
||
| ```python | ||
| class BudgetStepGuard: | ||
| """Check per-task cost before each agent step. | ||
|
|
||
| Queries cost_meter for cumulative spend on this task_id. | ||
| If exceeds per_task_cap_usd → raise BudgetExceeded. | ||
| """ | ||
|
|
||
| def __init__(self, task_id: str, cap_usd: float, session_factory): | ||
| ... | ||
|
|
||
| async def before_step(self, state) -> None: | ||
| total = await self._query_cost() | ||
| if total >= self.cap_usd: | ||
| raise BudgetExceeded(f"Hit per-task budget (${self.cap_usd:.2f})") | ||
| ``` | ||
|
|
||
| ### Integration Point | ||
|
|
||
| Wire into DeepAgents runtime middleware stack in `openbot/infrastructure/agents/runtime.py`. | ||
|
|
||
| ### Tests | ||
|
|
||
| - `tests/unit/infrastructure/agents/test_budget_step_guard.py` | ||
| - Under budget → continues | ||
| - At/over budget → raises BudgetExceeded | ||
| - cost_meter query failure → degrades gracefully (log warning, continue) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # v0.2 Overall Spec | ||
|
|
||
| > 6 features, 3 sprints, 6 weeks. PRD: `docs/prd/openbot-v02-prd.md` | ||
|
|
||
| ## Feature List | ||
|
|
||
| | # | Feature | Sprint | Effort | Files | | ||
| |---|---|---|---|---| | ||
| | F1 | Per-task budget fix | 1 | 2d | `_budget_middleware.py`, tests | | ||
| | F2 | Config approval gate | 1 | 1d | `config_approval.py`, tests | | ||
| | F3 | Audit CLI | 1 | 2d | `openbot/cli/audit.py`, tests | | ||
| | F4 | LinearAdapter | 2 | 5d | `linear.py`, `channel_credentials.py`, webhook route, tests | | ||
| | F5 | Plugin system | 2 | 3d | `openbot_plugins/`, CI gate, CONTRIBUTING.md | | ||
| | F6 | Issue dedup | 3 | 4d | `dedup.py`, pgvector migration, embedding + rerank, tests | | ||
|
|
||
| ## Execution Order | ||
|
|
||
| F1 → F2 → F3 → F4 → F5 → F6 | ||
|
|
||
| ## Shared Constraints | ||
|
|
||
| - `make check` must pass after each feature | ||
| - No hardcoded values — env vars via `Settings` | ||
| - Immutable data patterns (frozen dataclasses) | ||
| - Hexagonal architecture — domain ← application ← infrastructure | ||
| - Egress scanning on all bot-authored output |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,147 @@ | ||
| """Config approval gate — PRD v0.2 §3.2. | ||
|
|
||
| When a PR actually modifies `.openbot/config.yaml`, the bot checks for a | ||
| `config-approved` label. Without it, the bot comments and blocks. Detection | ||
| is based on the PR **diff** (the real changed files), not on free-text | ||
| mentions in the title/body — otherwise the gate is trivially bypassed by | ||
| simply not naming the file. | ||
|
|
||
| When the touched lines include high-risk keys (budget / safety / channels / | ||
| cancel / rate_limit / egress caps), they're surfaced in the block comment so | ||
| the reviewer knows why approval is required. | ||
|
|
||
| This is a preflight-level check — it runs before the workflow starts, | ||
| same as budget/rate-limit/cancel gates. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
|
|
||
| from openbot.application.middleware.preflight import ( | ||
| MiddlewareDecision, | ||
| MiddlewareResult, | ||
| PreflightContext, | ||
| WorkflowPhase, | ||
| ) | ||
|
|
||
| _logger = logging.getLogger(__name__) | ||
|
|
||
| # High-risk YAML keys. When a changed line in the config diff starts with one | ||
| # of these keys, the change is flagged in the block comment. The gate blocks | ||
| # on ANY config.yaml change regardless (fail closed); this list only enriches | ||
| # the reason surfaced to the reviewer. | ||
| _HIGH_RISK_KEYS = ( | ||
| "budget", | ||
| "safety", | ||
| "channels", | ||
| "cancel", | ||
| "rate_limit", | ||
| "egress_action", | ||
| "global_hard_kill_usd", | ||
| "monthly_soft_cap_usd", | ||
| "per_task", | ||
| ) | ||
|
|
||
| _APPROVAL_LABEL = "config-approved" | ||
| _CONFIG_PATH = ".openbot/config.yaml" | ||
|
|
||
|
|
||
| def _diff_touches_config(diff: str) -> bool: | ||
| """True if the unified diff modifies ``.openbot/config.yaml``.""" | ||
| for line in diff.splitlines(): | ||
| if line.startswith(("diff --git", "+++ ", "--- ")) and _CONFIG_PATH in line: | ||
| return True | ||
| return False | ||
|
|
||
|
|
||
| def _changed_high_risk_keys(diff: str) -> list[str]: | ||
| """Return high-risk YAML keys appearing on added/removed lines inside the | ||
| ``.openbot/config.yaml`` section of the diff. | ||
|
|
||
| Best-effort: scans the file section between its ``diff --git`` header and | ||
| the next file header, inspecting ``+``/``-`` content lines (ignoring the | ||
| ``+++``/``---`` file markers) for a leading ``key:`` token. | ||
| """ | ||
| found: set[str] = set() | ||
| in_config = False | ||
| for line in diff.splitlines(): | ||
| if line.startswith("diff --git"): | ||
| in_config = _CONFIG_PATH in line | ||
| continue | ||
| if not in_config: | ||
| continue | ||
| if line.startswith(("+++", "---")): | ||
| continue | ||
| if not line.startswith(("+", "-")): | ||
| continue | ||
| stripped = line[1:].strip() | ||
| key = stripped.split(":", 1)[0].strip() | ||
| if key in _HIGH_RISK_KEYS: | ||
| found.add(key) | ||
| return sorted(found) | ||
|
|
||
|
|
||
| class ConfigApprovalGate: | ||
| """Block config-change PRs that lack the approval label.""" | ||
|
|
||
| name = "config_approval" | ||
|
|
||
| async def __call__(self, ctx: PreflightContext) -> MiddlewareDecision: | ||
| """Block PRs that modify .openbot/config.yaml without the approval label.""" | ||
| event = ctx.event | ||
|
|
||
| # Only applies to PRs. | ||
| if not event.pr_number: | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| # Inspect the actual diff to decide whether config.yaml is touched. | ||
| # Fail open on fetch errors (same posture as other soft gates) — a | ||
| # transient API failure shouldn't wedge every PR. | ||
| try: | ||
| diff = await ctx.adapter.get_pr_diff(event, event.pr_number) | ||
| except Exception: | ||
| _logger.warning( | ||
| "config_approval_diff_unavailable", | ||
| extra={"pr_number": event.pr_number}, | ||
| ) | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| if not _diff_touches_config(diff): | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| # Approval label present → allow. | ||
| labels = event.raw.get("pull_request", {}).get("labels", []) | ||
| label_names = {lbl.get("name", "") for lbl in labels} | ||
| if _APPROVAL_LABEL in label_names: | ||
| _logger.info( | ||
| "config_approval_granted", | ||
| extra={"pr_number": event.pr_number, "labels": sorted(label_names)}, | ||
| ) | ||
| return MiddlewareDecision.proceed() | ||
|
|
||
| # Config change without approval — block. | ||
| high_risk = _changed_high_risk_keys(diff) | ||
| _logger.warning( | ||
| "config_approval_blocked", | ||
| extra={ | ||
| "pr_number": event.pr_number, | ||
| "labels": sorted(label_names), | ||
| "high_risk_keys": high_risk, | ||
| }, | ||
| ) | ||
| risk_note = ( | ||
| f" High-risk fields changed: {', '.join(high_risk)}." | ||
| if high_risk | ||
| else " (budget, safety, channels, cancel)." | ||
| ) | ||
| return MiddlewareDecision( | ||
| result=MiddlewareResult.BLOCKED, | ||
| reason="config_approval_required", | ||
| comment=( | ||
| f"Config change requires the `{_APPROVAL_LABEL}` label. " | ||
| f"This PR modifies `{_CONFIG_PATH}`.{risk_note} " | ||
| f"Add the `{_APPROVAL_LABEL}` label to proceed." | ||
| ), | ||
| audit_phase=WorkflowPhase.SKIPPED, | ||
| ) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| """Issue deduplication — PRD v0.2 §2.3. | ||
|
|
||
| Pipeline: | ||
| 1. Embed new issue title + body | ||
| 2. pgvector ANN search (top-K by cosine similarity) | ||
| 3. LLM rerank → top-3 candidates | ||
| 4. Bot comment with candidates (never auto-close) | ||
|
|
||
| Dependencies: pgvector extension on Postgres, embedding provider (Voyage/OpenAI). | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| from dataclasses import dataclass | ||
| from typing import TYPE_CHECKING, Any | ||
|
|
||
| if TYPE_CHECKING: | ||
| from openbot.application.ports.channel_adapter import ChannelAdapterPort | ||
| from openbot.domain.events import UnifiedEvent | ||
|
|
||
| _logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| @dataclass(frozen=True, slots=True) | ||
| class DedupCandidate: | ||
| """A potential duplicate issue.""" | ||
|
|
||
| issue_number: int | ||
| title: str | ||
| similarity: float | ||
|
|
||
|
|
||
| @dataclass(frozen=True, slots=True) | ||
| class DedupResult: | ||
| """Result of dedup analysis.""" | ||
|
|
||
| candidates: list[DedupCandidate] | ||
| comment: str | ||
|
|
||
|
|
||
| def render_dedup_comment(candidates: list[DedupCandidate]) -> str: | ||
| """Render a user-facing comment listing potential duplicates.""" | ||
| if not candidates: | ||
| return "" | ||
|
|
||
| parts: list[str] = [ | ||
| ":mag: **Potential duplicates found:**\n", | ||
| ] | ||
| for c in candidates: | ||
| pct = f"{c.similarity * 100:.0f}%" | ||
| parts.append(f"- #{c.issue_number} — {c.title} ({pct} similar)") | ||
|
|
||
| parts.append("") | ||
| parts.append( | ||
| "_Please check if this issue is a duplicate. " | ||
| "If so, close this one and continue the discussion in the existing issue._" | ||
| ) | ||
| return "\n".join(parts) | ||
|
|
||
|
|
||
| async def check_duplicates( | ||
| event: UnifiedEvent, | ||
| adapter: ChannelAdapterPort, | ||
|
Check warning on line 64 in openbot/application/use_cases/dedup.py
|
||
| *, | ||
| issue_title: str, | ||
| issue_body: str, | ||
| embedding_func: Any | None = None, | ||
| search_func: Any | None = None, | ||
| rerank_func: Any | None = None, | ||
| min_similarity: float = 0.75, | ||
| top_k: int = 10, | ||
| rerank_top: int = 3, | ||
| ) -> DedupResult: | ||
| """Check for duplicate issues and return candidates. | ||
|
|
||
| This is the use-case entry point. It orchestrates: | ||
| 1. Embed the issue | ||
| 2. Search for similar issues | ||
| 3. Rerank with LLM | ||
| 4. Render the comment | ||
|
|
||
| When embedding/search/rerank functions are None (not configured), | ||
| returns an empty result gracefully. | ||
| """ | ||
| if embedding_func is None or search_func is None: | ||
| _logger.info("dedup_skipped_not_configured") | ||
| return DedupResult(candidates=[], comment="") | ||
|
|
||
| try: | ||
| # Step 1: Embed | ||
| text = f"{issue_title}\n\n{issue_body}"[:4000] # truncate for embedding | ||
| embedding = await embedding_func(text) | ||
|
|
||
| # Step 2: Search | ||
| similar = await search_func( | ||
| embedding=embedding, | ||
| repo=event.repo, | ||
| top_k=top_k, | ||
| min_similarity=min_similarity, | ||
| ) | ||
|
|
||
| if not similar: | ||
| return DedupResult(candidates=[], comment="") | ||
|
|
||
| # Step 3: Rerank (optional). When a rerank_func is supplied, let it | ||
| # reorder/trim the raw ANN hits (LLM rerank, PRD v0.2 §2.3); otherwise | ||
| # fall back to similarity order. Either way, keep the top `rerank_top`. | ||
| if rerank_func is not None: | ||
| reranked = await rerank_func( | ||
| issue_title=issue_title, | ||
| issue_body=issue_body, | ||
| candidates=similar, | ||
| ) | ||
| else: | ||
| reranked = similar | ||
|
|
||
| candidates = [ | ||
| DedupCandidate( | ||
| issue_number=s["issue_number"], | ||
| title=s["title"], | ||
| similarity=s["similarity"], | ||
| ) | ||
| for s in reranked[:rerank_top] | ||
| ] | ||
|
|
||
| # Step 4: Render comment | ||
| comment = render_dedup_comment(candidates) | ||
|
|
||
| _logger.info( | ||
| "dedup_completed", | ||
| extra={ | ||
| "repo": event.repo, | ||
| "candidates": len(candidates), | ||
| "max_similarity": max(c.similarity for c in candidates) if candidates else 0, | ||
| }, | ||
| ) | ||
|
|
||
| return DedupResult(candidates=candidates, comment=comment) | ||
|
|
||
| except Exception: | ||
| _logger.exception("dedup_failed") | ||
| return DedupResult(candidates=[], comment="") | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
medium — The
rerank_funcparameter is accepted but never called — the pipeline skips directly from search results to rendering. The docstring says 'LLM rerank → top-3 candidates' but the code just takessimilar[:rerank_top]without reranking.