🤖 CyberAI

OOB-driven, agent-trust-aware AI pentest platform

Built by someone who red-teams AI, not just with it.

What is CyberAI?

CyberAI is a multi-agent orchestration layer for offensive security. Five specialized agents — Recon, Intel, Exploit, Report, Web3 — run a typed, auditable pipeline that turns a target into actionable attack paths and a validated report.

Two things set it apart from "LLM wrapper over nmap":

OOB-driven exploitation. Blind vulns (SSRF, XXE, blind injection) are confirmed through out-of-band callbacks captured by phantom-grid, not guessed from response diffs.
Agent-trust-aware design. Every banner and tool output is treated as untrusted input: sanitized, injection-scanned, and parsed before it ever reaches the LLM context. Adversarial thinking is a design input, not a disclaimer.

Reach beyond the network: the Web3 agent runs Slither static analysis and maps detectors to Immunefi severity tiers for smart-contract audits.

Architecture +------------------+ target -----------> | Orchestrator | typed pipeline, dry-run, budget

+--------+---------+ injection-scan at phase boundaries

|

+-----------+----------+-----------+------------+

v v v v v

+------+ +------+ +--------+ +--------+ +------+

|Recon |-->|Intel |-->|Exploit |->|Report | | Web3 | (standalone)

+------+ +------+ +---+----+ +--------+ +--+---+

DNS NVD/CVE OOB | PoC judge | Slither

nmap EPSS nuclei H1-export | Immunefi

subdom prioritize | | severity

v

+-------------+

| phantom-grid| OOB callback capture

+-------------+ Observability: SQLite audit log . session export/import . cyberai replay

Interfaces: CLI . FastAPI dashboard (SSE) . MCP server (Claude Desktop) ### Agents

Agent	Input	Output	Key tools
Recon	target	open ports, DNS, WHOIS, subdomains	nmap (flag-whitelisted), async DNS, subdomain enum
Intel	recon kb	ranked CVEs	NVD client, EPSS enrichment, risk prioritizer
Exploit	intel kb	attack paths, OOB findings	nuclei, searchsploit, OOB/SSRF/XXE workflows
Report	session kb	structured Markdown / H1 export	LLM summary + LLM-as-judge validation
Web3	.sol path / address	severity-tiered findings	Slither, Etherscan, Immunefi classifier

Security design

Agent trust boundaries — each agent runs with minimal permissions.
Untrusted input handling — banners sanitized, length-capped, marked UNTRUSTED before LLM context.
Prompt-injection detection — 33-pattern detector at every phase boundary; hits become MEDIUM findings, visible in the report.
Scope enforcement — wildcard + !-exclusion matching honors HackerOne / Bugcrowd briefs (cyberai scope import).
Audit trail — every agent action logged (JSONL or SQLite) with full inputs/outputs; sessions are replayable.

Quick start

git clone https://github.com/evkir/CyberAI.git
cd CyberAI
pip install -e .

cp config.example.yml config.yml
cp .env.example .env
# Edit .env — add OPENAI_API_KEY or ANTHROPIC_API_KEY (not needed for --dry-run)

# Dry-run: walks all 4 phases, no network, no API key
python -m cyberai scan example.com --dry-run

# Real scan, scope-restricted
python -m cyberai scan target.htb --scope '*.target.htb'

# Replay a saved session deterministically
python -m cyberai replay <session_id>

# Import a bug-bounty scope
python -m cyberai scope import h1 --program acme

# Status / config
python -m cyberai status

Web dashboard

uvicorn cyberai.web.app:app --reload
# http://127.0.0.1:8000  — session list, live SSE progress, report view

MCP server (Claude Desktop / Cursor)

python -m cyberai.mcp.server

Exposes recon/intel tools (nmap_scan, dns_enum, cve_search, epss_score, …) over the Model Context Protocol. See docs/mcp/integration.md.

Configuration

# config.yml
llm:
  provider: openai        # openai | anthropic
  model: gpt-4o
  max_tokens: 4096
  temperature: 0.2

phantom:
  grid_url: http://127.0.0.1:9090

output_dir: reports/
max_cost_usd: 0.0         # 0 = disabled; set to enforce a budget

Optional feature flags (default off, no-regression): use_native_tools, use_nuclei, use_llm_summary, use_judge.

Documentation

Doc	What
docs/api/agents.md	Agent API reference
docs/exploit/oob-exploitation-workflow.md	OOB / SSRF walkthrough
docs/web3/web3-audit.md	Smart-contract audit for Immunefi
docs/mcp/integration.md	MCP server setup

Related tools

Tool	Role
phantom-grid	OOB interaction capture
phantom-intel	CVE intelligence feed
reality-probe	TLS analysis & config auditing

Requirements

Python 3.11+
OpenAI or Anthropic API key (not required for --dry-run)
Optional: phantom-grid (OOB), nuclei, slither, NVD API key

License

MIT — see LICENSE

_{Part of the evkir security toolchain.}

Name		Name	Last commit message	Last commit date
Latest commit History 316 Commits
.github		.github
cyberai		cyberai
docs		docs
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
STANDOFF.md		STANDOFF.md
config.example.yml		config.example.yml
main.py		main.py
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 CyberAI

What is CyberAI?

Architecture +------------------+ target -----------> | Orchestrator | typed pipeline, dry-run, budget

Security design

Quick start

Web dashboard

MCP server (Claude Desktop / Cursor)

Configuration

Documentation

Related tools

Requirements

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🤖 CyberAI

What is CyberAI?

Architecture +------------------+ target -----------> | Orchestrator | typed pipeline, dry-run, budget

Security design

Quick start

Web dashboard

MCP server (Claude Desktop / Cursor)

Configuration

Documentation

Related tools

Requirements

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages