From a22613544f24d399a6a56b563c95994b8f521c71 Mon Sep 17 00:00:00 2001
From: Spyros
Date: Sat, 6 Jun 2026 02:27:46 +0300
Subject: [PATCH] Add Cursor agent instructions and modular workflow rules.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Track AGENTS.md and .cursor/rules so agents follow a consistent
human-plan-then-execute flow: requirements gates, plan approval,
make lint/mypy/test verification, TDD for behavioral changes,
builder≠judge review, and OpenCRE-specific prod DB / Alembic guardrails.

Co-authored-by: Cursor
---
 .cursor/rules/alembic-deploy-guardrail.mdc | 14 +++++
 .cursor/rules/autonomous-workflow.mdc      | 19 ++++++
 .cursor/rules/complete-ticket.mdc          | 26 ++++++++
 .cursor/rules/context-management.mdc       | 29 +++++++++
 .cursor/rules/multi-agent-workflow.mdc     | 72 ++++++++++++++++++++++
 .cursor/rules/never-assume.mdc             | 30 +++++++++
 .cursor/rules/plan-first-workflow.mdc      | 54 ++++++++++++++++
 .cursor/rules/production-db-ops-safety.mdc | 14 +++++
 .cursor/rules/requirements-gate.mdc        | 53 ++++++++++++++++
 .cursor/rules/tdd-workflow.mdc             | 30 +++++++++
 .cursor/rules/verifiable-goals.mdc         | 49 +++++++++++++++
 .gitignore                                 |  7 ++-
 AGENTS.md                                  | 40 ++++++++++++
 13 files changed, 436 insertions(+), 1 deletion(-)
 create mode 100644 .cursor/rules/alembic-deploy-guardrail.mdc
 create mode 100644 .cursor/rules/autonomous-workflow.mdc
 create mode 100644 .cursor/rules/complete-ticket.mdc
 create mode 100644 .cursor/rules/context-management.mdc
 create mode 100644 .cursor/rules/multi-agent-workflow.mdc
 create mode 100644 .cursor/rules/never-assume.mdc
 create mode 100644 .cursor/rules/plan-first-workflow.mdc
 create mode 100644 .cursor/rules/production-db-ops-safety.mdc
 create mode 100644 .cursor/rules/requirements-gate.mdc
 create mode 100644 .cursor/rules/tdd-workflow.mdc
 create mode 100644 .cursor/rules/verifiable-goals.mdc
 create mode 100644 AGENTS.md

diff --git a/.cursor/rules/alembic-deploy-guardrail.mdc
b/.cursor/rules/alembic-deploy-guardrail.mdc
new file mode 100644
index 000000000..1374dc14b
--- /dev/null
+++ b/.cursor/rules/alembic-deploy-guardrail.mdc
@@ -0,0 +1,14 @@
+---
+description: Enforce Alembic DB-revision guardrail before deploy/migration operations
+alwaysApply: true
+---
+
+# Alembic Deploy Guardrail
+
+- Before any production/staging deploy or migration operation, run the Alembic guardrail check:
+  - `python scripts/check_alembic_revision_guardrail.py`
+  - or `make alembic-guardrail`
+- If the guardrail reports unknown DB revision(s), stop immediately and do not run migrations until lineage is reconciled.
+- For Heroku deploys, keep `Procfile` `release:` wired to the guardrail so incompatible slugs fail before web/worker rollout.
+- After backup restore operations, re-run the guardrail before any `flask db upgrade`.
+
diff --git a/.cursor/rules/autonomous-workflow.mdc b/.cursor/rules/autonomous-workflow.mdc
new file mode 100644
index 000000000..3376479af
--- /dev/null
+++ b/.cursor/rules/autonomous-workflow.mdc
@@ -0,0 +1,19 @@
+---
+description: End-to-end execution policy after plan approval
+alwaysApply: true
+---
+
+# OpenCRE Autonomous Workflow
+
+Policy: see `AGENTS.md`. Verification: see `verifiable-goals.mdc`.
+
+- For big changes, wait for human plan approval (`multi-agent-workflow.mdc` Phase 1) before editing.
+- After approval, execute end-to-end unless blocked by auth/secrets, destructive actions, or material plan deviations.
+- Prefer small, safe changes; avoid unrelated refactors.
+- Use `Makefile` targets when possible.
+- **Always run lint, mypy, and tests after substantive changes; iterate until green.** Show evidence in handoff.
+- If commit verification depends on shell initialization, use a zsh-compatible shell context.
+- Run long commands in background; monitor until completion.
+- Do NOT commit or push unless explicitly asked.
+- In commits do not add "made with cursor" lines.
+- On substantive changes, recommend or run judge/subagent review before handoff.
diff --git a/.cursor/rules/complete-ticket.mdc b/.cursor/rules/complete-ticket.mdc
new file mode 100644
index 000000000..80fb63d79
--- /dev/null
+++ b/.cursor/rules/complete-ticket.mdc
@@ -0,0 +1,26 @@
+---
+description: Requires complete tickets before coding - asks clarifying questions if requirements are missing
+globs: "**/*.{md,txt}"
+alwaysApply: false
+---
+
+# Complete Ticket Requirement
+
+When the user submits a ticket or task via a `.md` or `.txt` file (including `AGENTS.md`):
+
+1. **Check** for: goal, success criteria, context, constraints — same rules as `requirements-gate.mdc`
+2. **If missing** → ask clarifying questions; offer the **Requirements template** in `requirements-gate.mdc`; do NOT code
+3. **If complete** → follow the decision tree in `multi-agent-workflow.mdc`:
+   - Trivial (typo, rename, one-liner) → implement directly
+   - Non-trivial → Plan Mode per `plan-first-workflow.mdc` or Phase 1 per `multi-agent-workflow.mdc` when both apply
+   - Wait for approval before implementation when planning is required
+4. **Never assume** missing details — see `never-assume.mdc`
+
+Verification after implementation: `verifiable-goals.mdc`. Tests for new behavior: `tdd-workflow.mdc`.
+
+## Coding standards (not covered elsewhere)
+
+- **Stack:** Python for backend/application code; TypeScript for new frontend code (`application/frontend/`).
+- **Error handling:** Handle expected failure paths explicitly; avoid silent failures; return or raise meaningful errors.
+- **Async (frontend):** Prefer `async`/`await` over raw Promise chains or callbacks in new TypeScript.
+- **Clean code:** Small functions, clear names, match surrounding module conventions; no drive-by refactors.
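
The ticket check in step 1 of `complete-ticket.mdc` above (goal, success criteria, context, constraints) could be sketched roughly as follows. This is only an illustration, assuming a ticket has already been parsed into a plain dictionary; the field keys, the question wording, and both function names are hypothetical and are not part of this patch.

```python
from typing import Dict, List, Optional

# Hypothetical field keys. The four required items come from the rule text;
# the questions paraphrase what an agent might ask when one is missing.
REQUIRED_FIELDS: Dict[str, str] = {
    "goal": "What outcome do you want, and what problem are we solving?",
    "success_criteria": "Which checks must pass (tests, typecheck, linter, CI)?",
    "context": "Which files, modules, or prior chats apply?",
    "constraints": "What is out of scope, and are there compatibility or DB limits?",
}


def missing_requirements(ticket: Dict[str, Optional[str]]) -> List[str]:
    """Return one clarifying question per required field that is absent or blank."""
    return [
        question
        for field, question in REQUIRED_FIELDS.items()
        if not (ticket.get(field) or "").strip()
    ]


def next_action(ticket: Dict[str, Optional[str]]) -> str:
    """'ask' when anything is missing (do not code yet), otherwise 'proceed'."""
    return "ask" if missing_requirements(ticket) else "proceed"
```

A blank or whitespace-only field is treated the same as a missing one, which matches the rule's intent that the agent never assumes missing details.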
diff --git a/.cursor/rules/context-management.mdc b/.cursor/rules/context-management.mdc
new file mode 100644
index 000000000..6fec0191b
--- /dev/null
+++ b/.cursor/rules/context-management.mdc
@@ -0,0 +1,29 @@
+---
+description: Keep agent context focused across tasks and stale threads
+alwaysApply: true
+---
+
+# Context Management
+
+## Do
+
+- Use `/clear` between unrelated tasks or features.
+- Reference files with `@path` instead of pasting entire file contents.
+- Use `@Past Chats` to pull in prior work instead of copy-pasting old conversations.
+- Start fresh after **2 failed corrections** on the same issue: `/clear`, then write a better prompt that incorporates what you learned.
+
+## Don't
+
+- Let context accumulate across unrelated features in one long thread.
+- Describe files vaguely when an `@` reference exists.
+- Keep correcting the same mistake in a degrading context — reset instead.
+
+## When context is stale
+
+Signs you should `/clear` and re-prompt:
+
+- Repeated fixes on the same bug without progress
+- Agent confuses requirements from an earlier, unrelated task
+- Plan has drifted significantly from what was approved
+
+After clearing, restate: goal, approved plan (or link to `.cursor/plans/*.md`), relevant `@` files, and acceptance criteria.
diff --git a/.cursor/rules/multi-agent-workflow.mdc b/.cursor/rules/multi-agent-workflow.mdc
new file mode 100644
index 000000000..1dad91ea6
--- /dev/null
+++ b/.cursor/rules/multi-agent-workflow.mdc
@@ -0,0 +1,72 @@
+---
+description: Two-phase planning for big changes; builder must not be sole judge
+alwaysApply: true
+---
+
+# Multi-Agent Workflow (Human Plan → Agent Execute)
+
+For **big changes**, split work into two phases. Do not skip Phase 1.
+
+## What counts as a big change
+
+- New feature, new standard importer, or new user-facing capability
+- Expected to touch **3+ files** or **>500 lines** of diff
+- Refactor or migration with behavioral risk
+- Touches critical paths (auth, secrets, production data, payments)
+- Incomplete requirements or meaningful product/design choices
+
+Small fixes: single-file bugfix, typo, test-only tweak, clear one-liner → `plan-first-workflow.mdc` only.
+
+## Phase 1 — Human-led planning (no code)
+
+**Stop before editing.**
+
+1. Acknowledge this is a big change; planning comes first.
+2. Ask minimum questions: goal, data sources, pattern to mirror, acceptance criteria, out of scope.
+3. Draft a plan: steps, `@` file paths, similar code, test/validation plan, risks.
+4. Wait for explicit approval ("proceed", "approved", or confirmed edited plan).
+5. No commits, push, or implementation code in Phase 1. Read-only research is fine.
+   Tests and production code begin in Phase 2 after approval (see `tdd-workflow.mdc`).
+
+## Phase 2 — Agent execution
+
+After approval:
+
+1. Execute per `autonomous-workflow.mdc`.
+2. Follow the approved plan; pause if material deviation is needed.
+3. Implement incrementally; verify as you go (`verifiable-goals.mdc`).
+4. Hand off with checklist including test evidence.
+
+## Builder ≠ Judge (required on substantive work)
+
+| Role | Responsibility |
+|------|----------------|
+| **Builder** | Implements approved plan |
+| **Judge** | Independent review — edge cases, security, test gaps |
+
+Invoke judge via subagent, parallel agent, or fresh context:
+
+- "Use a subagent to review this change for edge cases and security issues."
+
+Do not mark substantive work complete without independent review or documented reason to skip.
+
+**Re-review:** Required only when judge findings change behavior, tests, or security posture materially. Style-only nits fixed in-place do not require a second judge pass.
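
The "What counts as a big change" criteria in `multi-agent-workflow.mdc` above can be read as a simple any-of predicate. The following is a minimal sketch under that reading; `ChangeEstimate` and `is_big_change` are hypothetical names introduced for illustration, and the inputs are assumed to be the agent's own up-front estimates rather than anything the rules compute automatically.

```python
from dataclasses import dataclass


@dataclass
class ChangeEstimate:
    """Hypothetical up-front estimate of a requested change."""
    files_touched: int
    diff_lines: int
    new_feature: bool = False          # new feature, importer, or user-facing capability
    risky_refactor: bool = False       # refactor or migration with behavioral risk
    critical_path: bool = False        # auth, secrets, production data, payments
    open_design_choices: bool = False  # incomplete requirements or design choices


def is_big_change(change: ChangeEstimate) -> bool:
    """True when any single criterion from the rule applies (Phase 1 planning required)."""
    return any([
        change.new_feature,
        change.files_touched >= 3,   # "3+ files"
        change.diff_lines > 500,     # ">500 lines" of diff
        change.risky_refactor,
        change.critical_path,
        change.open_design_choices,
    ])
```

Note the asymmetric thresholds taken from the rule text: three files is already "big", while exactly 500 lines of diff is not.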
+
+## Decision tree
+
+```
+User request received
+  │
+  ├─ Missing goal / criteria / context / constraints?
+  │     └─ STOP → ask questions OR offer Requirements template (requirements-gate.mdc)
+  │
+  ├─ Typo / rename / one-sentence fix?
+  │     └─ Implement → quality checks → show evidence
+  │
+  ├─ Multi-file / new feature / refactor / critical path?
+  │     └─ Plan Mode → detailed plan → WAIT for approval → implement
+  │           → quality checks → subagent review → handoff with evidence
+  │
+  └─ Otherwise
+        └─ Brief plan → implement → quality checks → handoff with evidence
+```
diff --git a/.cursor/rules/never-assume.mdc b/.cursor/rules/never-assume.mdc
new file mode 100644
index 000000000..22450b00a
--- /dev/null
+++ b/.cursor/rules/never-assume.mdc
@@ -0,0 +1,30 @@
+---
+description: Verify or ask — no guessing packages, APIs, patterns, or incomplete code
+alwaysApply: true
+---
+
+# Never Assume
+
+| Do NOT | Instead |
+|--------|---------|
+| Assume package names, API endpoints, DB schema, or file locations | Read the codebase; grep; ask |
+| Add new dependencies without explanation | State why, alternatives considered, and get approval |
+| Use mocks unless explicitly requested | Prefer integration-style tests matching `@application/tests/` patterns |
+| Replace code with placeholders, TODOs, or stubs | Ship complete, working code |
+| Write incomplete implementations | Finish the feature or stop and explain what's blocked |
+| Guess production/staging targets for DB ops | Confirm target app; follow `production-db-ops-safety.mdc` |
+| Commit or push unless asked | Wait for explicit user request |
+
+## Scope and diff discipline
+
+- Minimize scope — smallest correct diff; no drive-by refactors.
+- Match surrounding naming, types, imports, and documentation level.
+- Comments only for non-obvious business logic.
+- Do not add markdown/docs files the user did not ask for.
+- Do not use "made with cursor" or similar in commits.
+
+## OpenCRE conventions
+
+- Mirror patterns in existing code (importers, tests, web routes).
+- Use Makefile targets over ad-hoc commands when available.
+- Prefer `scripts/db/` for production DB operations over raw SQL.
diff --git a/.cursor/rules/plan-first-workflow.mdc b/.cursor/rules/plan-first-workflow.mdc
new file mode 100644
index 000000000..05510f1ad
--- /dev/null
+++ b/.cursor/rules/plan-first-workflow.mdc
@@ -0,0 +1,54 @@
+---
+description: Require Plan Mode and user approval before non-trivial implementation
+alwaysApply: true
+---
+
+# Plan-First Workflow
+
+Apply before non-trivial edits. When criteria overlap with `multi-agent-workflow.mdc` (e.g. new feature, 3+ files), follow **multi-agent** Phase 1 — it is the stricter superset.
+
+## Skip planning only for
+
+- Typos, renames, obvious single-line fixes
+- Tasks fully specified with no design choices
+
+## MUST plan before implementing when ANY apply
+
+- New feature or new importer
+- Multi-file change (3+ files) or >500 lines expected
+- Refactor or migration with behavioral risk
+- Touches critical paths (auth, secrets, production data, payments, deploy)
+- Requirements have meaningful design choices
+
+## Plan output MUST include
+
+1. **Goal** — restated in one sentence
+2. **Files** — exact paths to create/modify/delete
+3. **Dependencies** — imports, DB migrations, external APIs, new packages
+4. **Steps** — ordered implementation sequence
+5. **Tests** — new/updated test files and what they assert
+6. **Edge cases** — failure modes, empty input, backwards compatibility
+7. **Verification** — exact commands (`make lint`, `make mypy`, `make test`, etc.)
+8. **Risks** — what could break and how to detect it
+
+## Approval gate
+
+After presenting the plan, **wait for explicit user approval** ("proceed", "approved",
+or an edited plan confirmed by the user) before writing implementation code.
+
+After approval, **tests may be written first** per `tdd-workflow.mdc` — failing tests are not "implementation code" for Phase 1 purposes.
+
+Read-only research (grep, read files, explore codebase) is allowed during planning.
+
+Save approved plans to `.cursor/plans/.md` when scope is substantial.
+
+## Before editing (Agent Mode)
+
+- Brief plan with reasoning: goal, steps, files touched, validation approach.
+- Break into smaller steps; check `scripts/db/` and `scripts/` for deploy/DB/import ops.
+
+## While editing
+
+- Only modify code relevant to the request.
+- Never use placeholders — include complete, working code.
+- **Reference patterns specifically:** e.g. mirror `@application/utils/external_project_parsers/parsers/pci_dss.py`, tests like `@application/tests/pci_dss_parser_test.py`.
diff --git a/.cursor/rules/production-db-ops-safety.mdc b/.cursor/rules/production-db-ops-safety.mdc
new file mode 100644
index 000000000..c37168766
--- /dev/null
+++ b/.cursor/rules/production-db-ops-safety.mdc
@@ -0,0 +1,14 @@
+---
+description: Require all-caps confirmation for destructive production DB operations and prefer pre-op backups
+alwaysApply: true
+---
+
+# Production DB Operations Safety
+
+- For production database operations, treat destructive actions (`DELETE`, `DROP`, `TRUNCATE`, irreversible `ALTER`) as high risk.
+- Prefer using `scripts/db/` operations instead of ad-hoc production DB commands whenever those scripts cover the use case.
+- Before proposing or executing destructive production DB actions, require explicit all-caps confirmation from the user.
+- Confirmation should be exact and unambiguous (for OpenCRE scripts: `I_UNDERSTAND_OPENCREORG_PROD_DB_DESTRUCTIVE_ACTION`).
+- Prefer capturing a fresh backup before destructive production DB actions; if a backup is skipped, clearly explain risk and ask for confirmation again.
+- If app/environment target is ambiguous, stop and ask to confirm target app first.
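
The exact-confirmation requirement in `production-db-ops-safety.mdc` above might be enforced with a gate along these lines. This is a hedged sketch, not how the repository's `scripts/db/` tooling works (this patch does not show it): the function names are hypothetical, the keyword-prefix match is deliberately crude, and every `ALTER` is treated as destructive because reversibility cannot be judged from the statement text alone.

```python
import re
from typing import Optional

# Token quoted in the rule text for OpenCRE scripts.
CONFIRMATION_TOKEN = "I_UNDERSTAND_OPENCREORG_PROD_DB_DESTRUCTIVE_ACTION"

# Assumption: a statement is "destructive" if it starts with one of these keywords.
# The rule only lists *irreversible* ALTER; this sketch is conservative and flags all ALTERs.
_DESTRUCTIVE = re.compile(r"^\s*(DELETE|DROP|TRUNCATE|ALTER)\b", re.IGNORECASE)


def is_destructive(sql: str) -> bool:
    """Crude keyword check on the leading SQL verb."""
    return _DESTRUCTIVE.match(sql) is not None


def may_run(sql: str, confirmation: Optional[str]) -> bool:
    """Allow non-destructive statements; require the exact token otherwise."""
    if not is_destructive(sql):
        return True
    # Exact, case-sensitive comparison: a near miss must not count as consent.
    return confirmation == CONFIRMATION_TOKEN
```

The comparison is intentionally strict, so a lowercase or partially typed token is rejected rather than normalized.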
+
diff --git a/.cursor/rules/requirements-gate.mdc b/.cursor/rules/requirements-gate.mdc
new file mode 100644
index 000000000..bfdf070dc
--- /dev/null
+++ b/.cursor/rules/requirements-gate.mdc
@@ -0,0 +1,53 @@
+---
+description: Stop and ask clarifying questions before coding when requirements are incomplete
+alwaysApply: true
+---
+
+# Requirements Gate
+
+If ANY of the following is missing or ambiguous, **STOP and ask clarifying questions**.
+Do NOT write, edit, or delete code until the gaps are filled.
+
+| Required | What to ask |
+|----------|-------------|
+| **Clear goal** | What outcome does the user want? What problem are we solving? |
+| **Verifiable success criteria** | Which checks must pass? (tests, typecheck, linter, CI, manual steps) |
+| **Context references** | Which files, modules, or prior chats apply? Prefer `@path/to/file` refs when ambiguous. |
+| **Constraints** | Out of scope, performance, compatibility, no new deps, prod DB safety, etc. |
+
+## Context references — when `@` refs are required
+
+- **Satisfied without `@`** when the message names specific paths, modules, or patterns clearly (e.g. "add tests for `application/utils/gap_analysis.py`").
+- **Ask for `@` refs** when multiple files could apply, the pattern to mirror is unclear, or prior chat/issue context is needed.
+
+## Skip the gate only when
+
+The request is fully specified in one sentence with obvious success criteria and unambiguous context, e.g.
+"Fix typo in README line 42" or "Rename `foo` to `bar` in `application/utils/gap_analysis.py`."
+
+## Requirements template (offer when info is missing)
+
+```
+## Task
+
+
+## Success criteria (all must pass)
+- [ ] `make lint`
+- [ ] `make mypy`
+- [ ] `make test` (or targeted: `python -m unittest application/tests/_test.py`)
+- [ ] `make frontend` / `yarn build` (if frontend touched)
+- [ ] CI green (`gh pr checks` or Actions UI)
+- [ ] Other: ___
+
+## Context
+- Files: @path/to/relevant/file.py
+- Pattern to mirror: @path/to/similar/implementation.py
+- Prior work: @Past Chats / issue # / PR #
+
+## Constraints
+- In scope: ___
+- Out of scope: ___
+- Dependencies: none / explain before adding
+- Mocks: allowed / not allowed
+- Production DB: N/A / read-only / destructive (requires explicit confirmation)
+```
diff --git a/.cursor/rules/tdd-workflow.mdc b/.cursor/rules/tdd-workflow.mdc
new file mode 100644
index 000000000..673ec7b79
--- /dev/null
+++ b/.cursor/rules/tdd-workflow.mdc
@@ -0,0 +1,30 @@
+---
+description: Test-first workflow for new behavior, importers, and API changes
+alwaysApply: true
+---
+
+# TDD Workflow
+
+Use for new behavior, bug fixes with clear reproduction, and importers/API changes where expected I/O is known.
+
+Skip for: typos, pure refactors with existing test coverage, config-only changes.
+
+## Timing relative to planning
+
+- **After plan approval** (or immediately for trivial fixes that skip planning): write failing tests, then implementation.
+- Do not write production code before plan approval when planning is required.
+
+## Loop
+
+1. **Write tests first** from expected input/output or acceptance criteria.
+2. **Run tests and confirm they fail** for the right reason. Do not write implementation yet.
+3. **Commit tests** only when the user explicitly asks to commit mid-flow.
+4. **Implement** the minimum code to pass tests. Do not modify tests to make them pass unless requirements changed (say so explicitly).
+5. **Iterate** until `make test` (and lint/mypy) pass.
+6. **Commit implementation** when the user asks.
+
+## OpenCRE conventions
+
+- Follow patterns in `@application/tests/` (e.g. `@application/tests/pci_dss_parser_test.py` for importers).
+- Use test DB setup from existing parser/web tests; avoid unnecessary mocks when integration-style tests match the codebase.
+- Prefer one focused test class per behavior area.
diff --git a/.cursor/rules/verifiable-goals.mdc b/.cursor/rules/verifiable-goals.mdc
new file mode 100644
index 000000000..98e26dc71
--- /dev/null
+++ b/.cursor/rules/verifiable-goals.mdc
@@ -0,0 +1,49 @@
+---
+description: Non-negotiable lint, mypy, test, and CI checks with evidence in handoff
+alwaysApply: true
+---
+
+# Verifiable Goals
+
+Agents need pass/fail checks to close the loop. Without verification, the human becomes the verification loop.
+
+## Required checks (code changes)
+
+Run in order unless clearly irrelevant (explain why if skipped):
+
+1. `make lint`
+2. `make mypy`
+3. `make test` — full suite, or a targeted module when scope is narrow
+4. `make frontend` / `yarn build` — when frontend/TS/TSX touched
+5. `make alembic-guardrail` — before deploy/migration ops
+
+## CI
+
+When preparing or fixing a PR: all workflow jobs must be green.
+Use `gh pr checks` or the GitHub Actions UI. Fix failures iteratively.
+
+## Frontend TypeScript (when TS/TSX changed)
+
+- Webpack production build must succeed (`make frontend` / `yarn build`)
+- TypeScript must compile with no errors
+- Prettier must pass (`make lint` runs black for Python and prettier for frontend)
+
+## Iterate until green
+
+- If any check fails, fix the failure and rerun from the failed step.
+- Do not hand off with failing checks unless blocked; document the blocker explicitly.
+- Before commit (when user asks): all relevant checks must pass.
+
+## Handoff evidence (mandatory)
+
+Report what you ran and the outcome — not assertions:
+
+```
+make lint — passed
+make mypy — passed
+make test — passed (N tests, 0 failures)
+make frontend — passed (if applicable)
+gh pr checks — all green (if PR-related)
+```
+
+If a check failed, show the error, the fix, and the rerun result.
diff --git a/.gitignore b/.gitignore
index 831738958..ecc6b7d04 100644
--- a/.gitignore
+++ b/.gitignore
@@ -46,7 +46,9 @@ v/
 .venv/
 
 ### Local AI/editor workspaces ###
-.cursor/
+.cursor/*
+!.cursor/rules/
+!.cursor/rules/**
 .claude/
 
 ### Frontend
@@ -64,6 +66,7 @@ standards_cache.sqlite
 
 ### Docs
 *.md
+!AGENTS.md
 
 ### Dev DBDumps
 *.sql
@@ -78,3 +81,5 @@ tmp/
 
 ### CREs dir
 cres/*
+### Local project management tooling
+project management scripts/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..38c171524
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,40 @@
+# OpenCRE Agent Instructions
+
+Cursor agents working in this repo must follow the rules in `.cursor/rules/`.
+
+## Quick start
+
+1. **Requirements gate** — If goal, success criteria, `@` file refs, or constraints are missing, stop and ask (`requirements-gate.mdc`).
+2. **Plan first** — Non-trivial or multi-file work requires a plan and user approval before coding (`plan-first-workflow.mdc`, `multi-agent-workflow.mdc`).
+3. **Verify** — After code changes, run checks and iterate until green (`verifiable-goals.mdc`):
+   - `make lint`
+   - `make mypy`
+   - `make test`
+   - `make frontend` (if frontend touched)
+4. **Review** — Substantive work needs independent judge/subagent review (`multi-agent-workflow.mdc`).
+
+## Rule index
+
+| Rule | Purpose |
+|------|---------|
+| `requirements-gate.mdc` | Clarifying questions + requirements template |
+| `complete-ticket.mdc` | Ticket gate for `.md`/`.txt` files; uses `requirements-gate` template + coding standards |
+| `plan-first-workflow.mdc` | Plan Mode before non-trivial edits |
+| `multi-agent-workflow.mdc` | Big changes, approval gates, builder ≠ judge |
+| `verifiable-goals.mdc` | Lint, mypy, test, CI — show evidence |
+| `never-assume.mdc` | No guessing; complete code; minimal scope |
+| `tdd-workflow.mdc` | Test-first for new behavior and importers |
+| `autonomous-workflow.mdc` | Execute after approval; no unsolicited commits |
+| `context-management.mdc` | `/clear`, `@` refs, stale context recovery |
+| `production-db-ops-safety.mdc` | Destructive prod DB confirmation |
+| `alembic-deploy-guardrail.mdc` | Pre-deploy migration guardrail |
+
+## OpenCRE commands
+
+```bash
+make lint # black + frontend prettier
+make mypy # Python typecheck
+make test # Python unittest suite
+make frontend # yarn build (when TS/TSX changed)
+make alembic-guardrail # before deploy/migration ops
+```
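
The "run in order, rerun from the failed step, report evidence" loop that `verifiable-goals.mdc` describes could be driven by a small helper such as the one below. It is only a sketch under stated assumptions: `run_checks` and `run_shell` are hypothetical names, the `make` targets are the ones listed in this patch, and the runner is injectable so the ordering logic can be exercised without invoking `make`.

```python
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple

# Core checks in the order the rule lists them (frontend and guardrail are conditional).
CHECKS: List[str] = ["make lint", "make mypy", "make test"]


def run_shell(command: str) -> bool:
    """Default runner: execute the command in a shell and report success."""
    return subprocess.run(command, shell=True).returncode == 0


def run_checks(
    checks: Sequence[str],
    runner: Callable[[str], bool] = run_shell,
    start: int = 0,
) -> Tuple[List[str], Optional[int]]:
    """Run checks in order from index `start`, stopping at the first failure.

    Returns the evidence lines and the index of the failed check (None if all passed),
    so a caller can fix the problem and resume from that index.
    """
    evidence: List[str] = []
    for index in range(start, len(checks)):
        passed = runner(checks[index])
        evidence.append(f"{checks[index]}: {'passed' if passed else 'FAILED'}")
        if not passed:
            return evidence, index
    return evidence, None
```

After fixing a failure, calling `run_checks(CHECKS, start=failed_index)` resumes at the failed step instead of from the top, and the concatenated evidence lines can be pasted into the handoff.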