From ae4cd549dc70c47b705f1c51b1fc899af88685fb Mon Sep 17 00:00:00 2001
From: Evgeny Kiriyak <224408464+evkir@users.noreply.github.com>
Date: Fri, 19 Jun 2026 00:06:25 +0300
Subject: [PATCH 1/4] docs: comprehensive README rewrite for v1.0 positioning

---
 README.md | 223 +++++++++++++++++++++++++++---------------------------
 1 file changed, 112 insertions(+), 111 deletions(-)
diff --git a/README.md b/README.md
index 8f2e2d3..73c7c30 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,14 @@
 <div align="center">
 
-
-![CI](https://github.com/evkir/CyberAI/actions/workflows/ci.yml/badge.svg) ![Python](https://img.shields.io/badge/python-3.11%20%7C%203.12-blue) ![License](https://img.shields.io/badge/license-MIT-green)
+![CI](https://github.com/evkir/CyberAI/actions/workflows/ci.yml/badge.svg)
+![Python](https://img.shields.io/badge/python-3.11%2B-blue)
+![License](https://img.shields.io/badge/license-MIT-green)
+![Status](https://img.shields.io/badge/status-v0.5.0-orange)
+![LLM](https://img.shields.io/badge/LLM-OpenAI%20%7C%20Anthropic-blueviolet)
 
 # 🤖 CyberAI
 
-**AI-powered pentest orchestration platform**
-
-![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=flat-square&logo=python&logoColor=white)
-![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)
-![Status](https://img.shields.io/badge/Status-Active%20Development-orange?style=flat-square)
-![LLM](https://img.shields.io/badge/LLM-OpenAI%20%7C%20Anthropic-blueviolet?style=flat-square)
+**OOB-driven, agent-trust-aware AI pentest platform**
 
 > Built by someone who red-teams AI, not just with it.
 
@@ -20,121 +18,132 @@
 
 ## What is CyberAI?
 
-CyberAI is a multi-agent orchestration layer for offensive security workflows.
-It connects the **phantom toolchain** — OOB detection, CVE intelligence, TLS analysis —
-and routes findings through an AI pipeline that surfaces actionable attack paths.
+CyberAI is a multi-agent orchestration layer for offensive security. Five
+specialized agents — **Recon, Intel, Exploit, Report, Web3** — run a typed,
+auditable pipeline that turns a target into actionable attack paths and a
+validated report.
+
+Two things set it apart from "LLM wrapper over nmap":
 
-This is not a chatbot wrapper for pentesters.
-It's an agentic system where specialized AI agents handle recon, correlation,
-and reporting autonomously — while you focus on what matters: exploitation.
+- **OOB-driven exploitation.** Blind vulns (SSRF, XXE, blind injection) are
+  confirmed through out-of-band callbacks captured by
+  [phantom-grid](https://github.com/evkir/phantom-grid), not guessed from
+  response diffs.
+- **Agent-trust-aware design.** Every banner and tool output is treated as
+  untrusted input: sanitized, injection-scanned, and parsed before it ever
+  reaches the LLM context. Adversarial thinking is a design input, not a
+  disclaimer.
+
+Reach beyond the network: the **Web3 agent** runs Slither static analysis and
+maps detectors to Immunefi severity tiers for smart-contract audits.
 
 ---
 
-## Architecture
+## Architecture                                                                            +------------------+                                                                       target -----------> |   Orchestrator   |  typed pipeline, dry-run, budget
 
-```
-┌──────────────────────────────────────────────────────────┐
-│                        CyberAI Core                      │
-│                                                          │
-│   ┌──────────────────┐       ┌────────────────────────┐  │
-│   │   Orchestrator   │──────▶│      Agent Pool        │  │
-│   │      Agent       │       │  ┌─────────────────┐   │  │
-│   └──────────────────┘       │  │  Recon Agent    │   │  │
-│           │                  │  │  Intel Agent    │   │  │
-│           │                  │  │  Exploit Agent  │   │  │
-│           │                  │  │  Report Agent   │   │  │
-│           │                  │  └─────────────────┘   │  │
-│           │                  └────────────────────────┘  │
-│           ▼                                              │
-│   ┌──────────────────────────────────────────────────┐   │
-│   │                  Phantom Stack                   │   │
-│   │        phantom-grid  ·  phantom-intel            │   │
-│   │                  reality-probe                   │   │
-│   └──────────────────────────────────────────────────┘   │
-└──────────────────────────────────────────────────────────┘
-```
++--------+---------+  injection-scan at phase boundaries
 
-### Agent responsibilities
+|
 
-| Agent | Role |
-|-------|------|
-| **Orchestrator** | Routes tasks, manages agent lifecycle, aggregates results |
-| **Recon** | Target enumeration — DNS, WHOIS, subdomains, open ports |
-| **Intel** | CVE lookups, CVSS scoring, exploit availability |
-| **Exploit** | CVE → PoC mapping, attack surface analysis |
-| **Report** | Findings aggregation → structured Markdown / PDF output |
++-----------+----------+-----------+------------+
 
----
+v           v          v           v            v
 
-## Security design
++------+   +------+   +--------+  +--------+   +------+
+
+|Recon |-->|Intel |-->|Exploit |->|Report  |   | Web3 | (standalone)
+
++------+   +------+   +---+----+  +--------+   +--+---+
 
-Multi-agent security is a first-class concern, not an afterthought:
+DNS       NVD/CVE     OOB |  PoC  judge         | Slither
 
-- **Agent trust boundaries** — each agent operates with minimal necessary permissions
-- **Input validation** — all external data sanitized before entering the LLM context
-- **Prompt injection resistance** — structured prompts, output parsing, no raw passthrough
-- **Audit trail** — every agent action logged with full inputs and outputs
+nmap      EPSS        nuclei H1-export          | Immunefi
 
-> The irony of building an AI pentest tool while studying AI attack surfaces
-> is intentional. Adversarial thinking is a design input.
+subdom    prioritize      |                     | severity
+
+v
+
++-------------+
+
+| phantom-grid|  OOB callback capture
+
++-------------+
+Observability:  SQLite audit log . session export/import . cyberai replay
+
+Interfaces:     CLI . FastAPI dashboard (SSE) . MCP server (Claude Desktop)                ### Agents
+
+| Agent | Input | Output | Key tools |
+|-------|-------|--------|-----------|
+| **Recon** | target | open ports, DNS, WHOIS, subdomains | nmap (flag-whitelisted), async DNS, subdomain enum |
+| **Intel** | recon kb | ranked CVEs | NVD client, EPSS enrichment, risk prioritizer |
+| **Exploit** | intel kb | attack paths, OOB findings | nuclei, searchsploit, OOB/SSRF/XXE workflows |
+| **Report** | session kb | structured Markdown / H1 export | LLM summary + LLM-as-judge validation |
+| **Web3** | .sol path / address | severity-tiered findings | Slither, Etherscan, Immunefi classifier |
 
 ---
 
-## Project structure
+## Security design
 
-```
-CyberAI/
-├── cyberai/
-│   ├── core/               # Orchestrator, config, LLM client
-│   ├── agents/
-│   │   ├── recon/          # Target enumeration pipeline
-│   │   ├── intel/          # CVE intelligence feed
-│   │   ├── exploit/        # CVE → PoC mapping
-│   │   └── report/         # Report generation
-│   ├── integrations/       # Phantom stack connectors
-│   └── utils/              # Shared helpers
-├── templates/              # Jinja2 report templates
-├── tests/
-│   ├── unit/
-│   └── integration/
-├── config.example.yml
-├── .env.example
-├── requirements.txt
-└── setup.py
-```
+- **Agent trust boundaries** — each agent runs with minimal permissions.
+- **Untrusted input handling** — banners sanitized, length-capped, marked
+  `UNTRUSTED` before LLM context.
+- **Prompt-injection detection** — 33-pattern detector at every phase boundary;
+  hits become MEDIUM findings, visible in the report.
+- **Scope enforcement** — wildcard + `!`-exclusion matching honors HackerOne /
+  Bugcrowd briefs (`cyberai scope import`).
+- **Audit trail** — every agent action logged (JSONL or SQLite) with full
+  inputs/outputs; sessions are replayable.
 
 ---
 
 ## Quick start
 
-**1. Clone and install**
-
 ```bash
 git clone https://github.com/evkir/CyberAI.git
 cd CyberAI
 pip install -e .
 ```
 
-> Prefer isolation? Run `python -m venv venv && source venv/bin/activate` first.
-
-**2. Configure**
-
 ```bash
 cp config.example.yml config.yml
 cp .env.example .env
-# Edit .env -- add your OPENAI_API_KEY or ANTHROPIC_API_KEY
+# Edit .env — add OPENAI_API_KEY or ANTHROPIC_API_KEY (not needed for --dry-run)
 ```
 
-**3. Run a scan**
-
 ```bash
-# Dry-run: walks all 4 phases, no network calls, no API key needed
+# Dry-run: walks all 4 phases, no network, no API key
 python -m cyberai scan example.com --dry-run
 
-# Real scan
-python -m cyberai scan target.htb
+# Real scan, scope-restricted
+python -m cyberai scan target.htb --scope '*.target.htb'
+
+# Replay a saved session deterministically
+python -m cyberai replay <session_id>
+
+# Import a bug-bounty scope
+python -m cyberai scope import h1 --program acme
+
+# Status / config
+python -m cyberai status
+```
+
+### Web dashboard
+
+```bash
+uvicorn cyberai.web.app:app --reload
+# http://127.0.0.1:8000  — session list, live SSE progress, report view
+```
+
+### MCP server (Claude Desktop / Cursor)
+
+```bash
+python -m cyberai.mcp.server
 ```
 
+Exposes recon/intel tools (`nmap_scan`, `dns_enum`, `cve_search`,
+`epss_score`, …) over the Model Context Protocol. See
+[docs/mcp/integration.md](docs/mcp/integration.md).
+
 ---
 
 ## Configuration
@@ -142,37 +151,31 @@ python -m cyberai scan target.htb
 ```yaml
 # config.yml
 llm:
-  provider: openai       # openai | anthropic
+  provider: openai        # openai | anthropic
   model: gpt-4o
   max_tokens: 4096
   temperature: 0.2
 
 phantom:
-  grid_url: http://127.0.0.1:8080
-  intel_db: ~/.phantom/intel.db
+  grid_url: http://127.0.0.1:9090
 
 output_dir: reports/
-verbose: false
-timeout: 60
+max_cost_usd: 0.0         # 0 = disabled; set to enforce a budget
 ```
 
+Optional feature flags (default off, no-regression):
+`use_native_tools`, `use_nuclei`, `use_llm_summary`, `use_judge`.
+
 ---
 
-## Roadmap
+## Documentation
 
-```
-[x] Project structure & scaffolding
-[x] Config system (.env + YAML)
-[ ] LLM client abstraction (OpenAI / Anthropic)
-[ ] Orchestrator agent core loop
-[ ] Recon agent — DNS, WHOIS, subdomain enum
-[ ] phantom-intel integration — CVE context injection
-[ ] phantom-grid integration — OOB result correlation
-[ ] Exploit suggestion agent — CVE → PoC mapping
-[ ] Report generation — Markdown + PDF output
-[ ] Multi-agent safety protocol layer
-[ ] CLI interface (click)
-```
+| Doc | What |
+|-----|------|
+| [docs/api/agents.md](docs/api/agents.md) | Agent API reference |
+| [docs/exploit/oob-exploitation-workflow.md](docs/exploit/oob-exploitation-workflow.md) | OOB / SSRF walkthrough |
+| [docs/web3/web3-audit.md](docs/web3/web3-audit.md) | Smart-contract audit for Immunefi |
+| [docs/mcp/integration.md](docs/mcp/integration.md) | MCP server setup |
 
 ---
 
@@ -180,7 +183,7 @@ timeout: 60
 
 | Tool | Role |
 |------|------|
-| [phantom-grid](https://github.com/evkir/phantom-grid) | OOB interaction capture & analysis |
+| [phantom-grid](https://github.com/evkir/phantom-grid) | OOB interaction capture |
 | [phantom-intel](https://github.com/evkir/phantom-intel) | CVE intelligence feed |
 | [reality-probe](https://github.com/evkir/reality-probe) | TLS analysis & config auditing |
 
@@ -188,9 +191,9 @@ timeout: 60
 
 ## Requirements
 
-- Python 3.10+
-- OpenAI API key **or** Anthropic API key
-- phantom-grid (optional, for OOB correlation)
+- Python 3.11+
+- OpenAI **or** Anthropic API key (not required for `--dry-run`)
+- Optional: phantom-grid (OOB), nuclei, slither, NVD API key
 
 ---
 
@@ -198,8 +201,6 @@ timeout: 60
 
 MIT — see [LICENSE](LICENSE)
 
----
-
 <div align="center">
 <sub>Part of the <a href="https://github.com/evkir">evkir</a> security toolchain.</sub>
 </div>

From 5af221e1764e3577533dd2e236c07ce17c9557ad Mon Sep 17 00:00:00 2001
From: Evgeny Kiriyak <224408464+evkir@users.noreply.github.com>
Date: Fri, 19 Jun 2026 00:07:46 +0300
Subject: [PATCH 2/4] docs: rewrite agent API reference with real contract

---
 docs/api/agents.md | 112 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 95 insertions(+), 17 deletions(-)

diff --git a/docs/api/agents.md b/docs/api/agents.md
index 3507dd4..304edc2 100644
--- a/docs/api/agents.md
+++ b/docs/api/agents.md
@@ -1,22 +1,100 @@
-# CyberAI Agent API
+# Agent API Reference
 
-## AsyncPipeline
-from cyberai.core.pipeline import AsyncPipeline
-result = AsyncPipeline.execute("10.10.10.1")
-print(result.success, result.recon, result.intel, result.exploit)
+All pipeline agents share the `BaseAgent` contract. The orchestrator constructs
+each agent with explicit dependencies and calls `run()`:
 
-## AsyncReconAgent
-from cyberai.agents.recon.async_agent import AsyncReconAgent
-result = await AsyncReconAgent().run("10.10.10.1")
+```python
+agent = ReconAgent(config, session, llm, audit)
+result = agent.run(target, context=None)  # -> dict; data also written to session.kb
+```
 
-## AsyncIntelAgent
-from cyberai.agents.recon.async_agent import AsyncIntelAgent
-result = await AsyncIntelAgent().run(recon_result)
+- `config` — `CyberAIConfig` (feature flags, budget, output_dir)
+- `session` — `ScanSession`; agents read/write findings and `session.kb`
+- `llm` — `LLMClient` (may be `None` for deterministic / dry-run paths)
+- `audit` — `AuditLogger`; every action is recorded
 
-## AsyncExploitAgent
-from cyberai.agents.recon.async_agent import AsyncExploitAgent
-result = await AsyncExploitAgent().run(intel_result)
+`run()` returns a status dict and persists structured data into `session.kb`
+under the agent's key. Agents never mutate each other directly — the knowledge
+base is the single source of truth between phases.
 
-## Safety
-from cyberai.core.safety import InputSanitizer, ScopeValidator, ScopeConfig
-clean = InputSanitizer.sanitize(untrusted_string)
+---
+
+## ReconAgent
+
+`cyberai/agents/recon/agent.py`
+
+- **Input:** `target` (host / domain / IP)
+- **Output dict + kb key `recon`:** open ports, DNS records, WHOIS, subdomains
+- **Tools:** `nmap_scan` (flag-whitelisted), `dns_lookup`, `whois_lookup`,
+  `subdomain_enum`
+- **Edge cases:**
+  - nmap flags are validated against a whitelist; unknown flags are rejected
+    before subprocess (no shell, argv list).
+  - Results are cached by `target + flags` hash (TTL); failed scans (rc != 0)
+    are not cached.
+  - Async variant (`AsyncReconAgent`) gathers DNS + subdomain enumeration
+    concurrently; nmap/TLS stay on executor (blocking subprocess).
+
+## IntelAgent
+
+`cyberai/agents/intel/agent.py`
+
+- **Input:** `target` + recon data from `session.kb`
+- **Output dict + kb key `intel`:** ranked CVEs with CVSS, EPSS, exploit factor
+- **Tools:** `cve_search` (NVD), `epss_score`
+- **Edge cases:**
+  - NVD rate-limited (50/30s with API key, 5/30s without); 429/503 →
+    exponential backoff, max 3 retries.
+  - EPSS HTTP failure → silent `0.0`, pipeline survives `api.first.org` outage.
+  - Composite score boosts EPSS non-linearly (EPSS > 0.5 → 🔥, > 0.2 → ⚠).
+
+## ExploitAgent
+
+`cyberai/agents/exploit/agent.py`
+
+- **Input:** `target` + intel data from `session.kb`
+- **Output dict + kb key `exploit`:** attack paths, PoC mappings, OOB findings
+- **Tools:** `build_chain`, `map_poc`; optional nuclei / OOB workflows
+- **Flags:** `use_native_tools` (LLM-driven chain via native tool calling),
+  `use_nuclei` (nuclei engine + OOB wiring)
+- **Edge cases:**
+  - OOB workflows (SSRF/XXE) confirm blind vulns via phantom-grid callbacks —
+    see [../exploit/oob-exploitation-workflow.md](../exploit/oob-exploitation-workflow.md).
+  - Native tool args carry identifiers (`cve_id`/`target`), not full CVE dicts;
+    real data is resolved agent-side (anti-hallucination, fewer tokens).
+  - Falls back to the deterministic path if the model never calls `build_chain`.
+  - searchsploit / nuclei absent → graceful (`available = False`), not fatal.
+
+## ReportAgent
+
+`cyberai/agents/report/agent.py`
+
+- **Input:** `target` + full `session.kb`
+- **Output dict + kb key `report`:** Markdown report path; optional H1 export
+- **Tools:** deterministic renderer; optional LLM summary + judge
+- **Flags:** `use_llm_summary` (structured LLM summary), `use_judge`
+  (LLM-as-judge validation)
+- **Edge cases:**
+  - Deterministic report never fails on LLM error (fail-safe try/except).
+  - Judge validates each claim against kb evidence; score < 0.7 triggers a
+    regeneration with feedback. Hallucinated CVEs (not in kb) are caught.
+  - HackerOne export follows the H1 template (title / severity / steps /
+    impact / recommendation).
+
+## SmartContractAgent (Web3)
+
+`cyberai/agents/web3/agent.py`
+
+- **Standalone** — not part of the network pipeline; a contract is not a
+  network target.
+- **Input:** `target` = local `.sol` path **or** contract address
+- **Output dict + kb key `web3`:** findings, `highest_severity`,
+  `slither_available`; for addresses, `source_meta` from Etherscan
+- **Tools:** `slither_scan`, `fetch_source` (Etherscan)
+- **Edge cases:**
+  - Local `.sol` is the primary path; Etherscan is graceful without an API key.
+  - Slither absent → `available = False`, findings empty, no crash.
+  - Detectors map to Immunefi tiers (reentrancy-eth / arbitrary-send /
+    suicidal / delegatecall → Critical); unknown detectors fall back to
+    impact × confidence.
+  - See [../web3/web3-audit.md](../web3/web3-audit.md).

From 6934f431c970c2e355b397006db7507e5d92eb1b Mon Sep 17 00:00:00 2001
From: Evgeny Kiriyak <224408464+evkir@users.noreply.github.com>
Date: Fri, 19 Jun 2026 00:09:31 +0300
Subject: [PATCH 3/4] docs: rewrite OOB exploitation walkthrough for
 phantom-grid v2 token-flow

---
 docs/exploit/oob-exploitation-workflow.md | 107 +++++++++++++++-------
 1 file changed, 74 insertions(+), 33 deletions(-)

diff --git a/docs/exploit/oob-exploitation-workflow.md b/docs/exploit/oob-exploitation-workflow.md
index d2924a7..cb81e26 100644
--- a/docs/exploit/oob-exploitation-workflow.md
+++ b/docs/exploit/oob-exploitation-workflow.md
@@ -2,46 +2,87 @@
 
 ## Overview
 
-Out-of-band (OOB) techniques confirm blind vulnerabilities where the
-application response gives no direct feedback. CyberAI routes OOB
-payloads through phantom-grid, which captures DNS/HTTP callbacks.
+Out-of-band (OOB) techniques confirm **blind** vulnerabilities — cases where
+the application response gives no direct feedback (blind SSRF, blind XXE, blind
+injection). Instead of diffing responses, CyberAI plants a payload that forces
+the target to call back to [phantom-grid](https://github.com/evkir/phantom-grid),
+which captures the DNS/HTTP interaction out of band. A captured callback is
+proof of execution.
 
-## Architecture
+## Components                                                                              ExploitAgent
 
-ExploitAgent
-│
-├── SSRFWorkflow ──► target app ──► phantom-grid (OOB callback)
-│                                        │
-└── XXEWorkflow ──► target XML parser ───┘
-│
-PhantomGridPoller
-(polls for callback)
++-- SSRFWorkflow  --> target app  ----+
 
-## SSRF Detection Flow
++-- XXEWorkflow   --> XML parser  ----+--> phantom-grid (captures callback)
 
-1. Generate unique `interaction_id`
-2. Build payload: `http://<phantom-domain>/<interaction_id>`
-3. Inject into URL parameter via GET or POST
-4. Poll phantom-grid `/api/interactions/<id>` for DNS/HTTP hit
-5. Confirmed hit → HIGH severity finding
++-- OOBWorkflow   --> generic inject -+
 
-## Blind XXE Flow
+|
 
-1. Generate XXE payload referencing phantom-grid domain
-2. Deliver via POST body, SOAP envelope, or file upload
-3. Parser resolves external entity → OOB DNS/HTTP to phantom-grid
-4. Poll for callback → confirmed blind XXE
+PhantomGridClient.get_interactions(id) <+   (polls captured interactions)                  - `cyberai/integrations/phantom_grid.py` — `PhantomGridClient` (token-flow, v2 API)
+- `cyberai/agents/exploit/ssrf_workflow.py` — `SSRFWorkflow`
+- `cyberai/agents/exploit/xxe_workflow.py` — `XXEWorkflow`
+- `cyberai/agents/exploit/oob_workflow.py` — `OOBWorkflow` (generic orchestration)
 
-## Payload Types
+## phantom-grid v2 token-flow
 
-| Type | Technique | Confirms |
-|------|-----------|---------|
-| Basic OOB | `SYSTEM "http://phantom/id"` | HTTP callback |
-| Parameter entity | `%remote` DTD load | DNS + HTTP |
+The grid runs on port **9090**. The capture URL is derived from a server-issued
+**token**, not a client-generated id:
 
-## Operational Notes
+```python
+from cyberai.integrations.phantom_grid import PhantomGridClient
 
-- Set `max_wait` based on target response time (default 30s)
-- Use per-test `interaction_id` — never reuse across targets
-- phantom-grid must be reachable from target server, not just attacker
-- All payloads logged in AuditTrail automatically via decorators
+grid = PhantomGridClient(base_url="http://127.0.0.1:9090")
+if not grid.available():           # health check; graceful if grid is down
+    ...
+
+token = grid.create_token(label="ssrf-example")   # POST /api/tokens -> token
+url   = grid.capture_url(token)                    # http://<host>/c/<token>
+# ... inject `url` into the target ...
+hits  = grid.get_interactions(token)               # GET captured interactions
+```
+
+`OOBInteraction.confirmed` is `True` once a DNS/HTTP hit lands on the token.
+
+## SSRF detection flow
+
+1. `SSRFWorkflow` requests a capture URL from phantom-grid (server token).
+2. Build the SSRF payload pointing at that URL (`_make_payload`).
+3. Inject into the candidate parameter via GET or POST (`test` / `test_batch`).
+4. Poll `get_interactions(token)` for a DNS/HTTP callback.
+5. Confirmed hit → `SSRFResult` with HIGH severity; recorded as a finding.
+
+## Blind XXE flow
+
+1. `XXEWorkflow` builds an XML payload with an external entity referencing the
+   phantom-grid capture URL.
+2. Submit to the XML-parsing endpoint.
+3. A parser that resolves the entity triggers the OOB callback.
+4. Poll for the interaction → confirmed blind XXE.
+
+## Worked example — blind SSRF on example.com
+
+> Authorized targets only. Confirm scope before running
+> (`cyberai scope import`).
+
+1. Start phantom-grid locally (or point `phantom.grid_url` at your instance).
+2. Run the exploit phase with OOB enabled:
+
+```bash
+   python -m cyberai scan example.com --scope example.com
+```
+
+3. ExploitAgent picks an SSRF-candidate parameter, requests a token from the
+   grid, and injects `http://<host>/c/<token>` into it.
+4. If `example.com` fetches the URL server-side, phantom-grid records the hit.
+5. `get_interactions(token)` returns a confirmed `OOBInteraction`; the agent
+   raises a HIGH-severity SSRF finding into the report.
+
+## Notes
+
+- No callback within the poll window → reported as **unconfirmed**, never a
+  false HIGH. Absence of evidence is not evidence.
+- phantom-grid absent / unreachable → workflows degrade gracefully
+  (`available = False`); the deterministic pipeline still completes.
+- WebSocket push is on the phantom-grid roadmap; the current client polls over
+  HTTP.

From 4314d522f0b1e0cbb67a3b366ab9451c50205021 Mon Sep 17 00:00:00 2001
From: Evgeny Kiriyak <224408464+evkir@users.noreply.github.com>
Date: Fri, 19 Jun 2026 00:11:05 +0300
Subject: [PATCH 4/4] docs: add Web3 / Immunefi audit workflow guide

---
 docs/web3/web3-audit.md | 81 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 docs/web3/web3-audit.md

diff --git a/docs/web3/web3-audit.md b/docs/web3/web3-audit.md
new file mode 100644
index 0000000..a452efa
--- /dev/null
+++ b/docs/web3/web3-audit.md
@@ -0,0 +1,81 @@
+# Web3 Audit Workflow — SmartContractAgent → Slither → Immunefi
+
+## Overview
+
+The `SmartContractAgent` runs static analysis on a Solidity contract and maps
+each finding to an [Immunefi](https://immunefi.com/) severity tier — so you can
+triage before submitting to a bounty program. It is **standalone**: a contract
+is not a network target, so this agent runs outside the recon→intel→exploit
+pipeline.
+
+## Components                                                                              SmartContractAgent.run(target)
+
++-- local .sol  --> SlitherTool.analyze() --> parse_slither_json()
+
+|                                                |
+
+|                                                v
+
+|                                   immunefi_severity.classify_all()
+
+|                                   immunefi_severity.highest_tier()
+
++-- address     --> EtherscanClient.fetch_source()   (graceful w/o API key)                - `cyberai/agents/web3/agent.py` — `SmartContractAgent`
+- `cyberai/agents/web3/slither_tool.py` — `SlitherTool`, `parse_slither_json`,
+  `SlitherFinding`
+- `cyberai/agents/web3/immunefi_severity.py` — `classify`, `classify_all`,
+  `highest_tier`
+- `cyberai/agents/web3/etherscan.py` — `EtherscanClient`
+
+## Input modes
+
+| `target` | Mode | Path |
+|----------|------|------|
+| local `*.sol` file | `local` | Slither analysis (primary) |
+| contract address | `address` | Etherscan source fetch (needs API key) |
+
+Local `.sol` is the primary, fully-offline path. Address mode is graceful
+without `ETHERSCAN_API_KEY` (returns source metadata only).
+
+## Severity mapping
+
+Slither detectors are mapped to Immunefi tiers by a per-check table; unknown
+detectors fall back to `impact × confidence`:
+
+| Slither detector | Immunefi tier |
+|------------------|---------------|
+| `reentrancy-eth` | Critical |
+| `arbitrary-send` | Critical |
+| `suicidal` | Critical |
+| `controlled-delegatecall` | Critical |
+| (unknown) | impact × confidence fallback |
+
+`highest_tier(findings)` returns the worst tier across all findings — your
+headline severity for a submission.
+
+## Worked example — reentrancy audit
+
+1. Point the agent at a local contract:
+
+```python
+   from cyberai.agents.web3.agent import SmartContractAgent
+
+   agent = SmartContractAgent(config, session, llm=None, audit=audit)
+   result = agent.run("contracts/Vault.sol")
+   print(result["highest_severity"], len(result["findings"]))
+```
+
+2. On a TheDAO-style reentrant contract, Slither reports `reentrancy-eth`
+   (alongside `solc-version`, `low-level-calls`).
+3. `reentrancy-eth` maps to **Critical** → `highest_severity = "Critical"`.
+4. Triage the finding against the program's scope and PoC requirements before
+   submitting to Immunefi.
+
+## Notes
+
+- Slither absent → `slither_available = False`, findings empty, no crash;
+  CI covers the logic with mocked Slither output.
+- JSON parsing is verified against Slither 0.11.5
+  (`results.detectors[].{check,impact,confidence,description}`).
+- This is a triage aid, **not** a substitute for manual review — static
+  analysis has false positives; confirm exploitability before submission.