MVP rewrite: OpenClaw + OpenBao + Kashif + Aga#1
Conversation
- Rename Sentinel -> Aga across PRDs and analysis docs. Ottoman metaphor: Aga (Agha of the Janissaries) commands the guard corps and directs Kashif's inspections. - Replace Infisical placeholder with OpenBao as the Secret Vault. Grant records now carry OpenBao lease IDs; missed revocations on province destroy are bounded by TTL, not by Aga's reliability. - Add KASHIF_NOTES.md: three-layer CPU-only screener design (LLM Guard regex + Prompt Guard 2 22M + Llama Guard 3 1B Q4). - Add CLAUDE.md: guidance for future Claude Code sessions, including MVP threat model (only hostile Pasha in scope). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nClaw update) Adopted from origin/archive-hermes-infisical:MOTIVATION.md. Only edit: the "How it's secured" bullet now names all three security-perimeter components (Janissary + Kashif + Aga) and lists OpenClaw alongside OpenHands and CrewAI as interchangeable runtime examples. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/Kashif edits) Adopted from origin/archive-hermes-infisical:ARCHITECTURE.md. Edits: - Sentinel renamed to Aga throughout - Hermes runtime -> OpenClaw (diagrams, province lifecycle step 17, config file paths) - hermes-firman / hermes-coding-berat -> openclaw-firman / openclaw-coding-berat - Infisical -> OpenBao; grant records carry openbao_lease_id and lease_expires_at; destroy flow notes that missed revocations are bounded by lease TTL server-side - Kashif added to system overview diagram as trusted-core component alongside Janissary - Appeal flow diagram extended: Janissary forwards appeal payload to Kashif /screen/appeal; Kashif returns allow/block/escalate with fail-closed semantics on timeout - US-4 and US-6 updated to include Kashif screening step - US-10 boot order adds OpenBao (first), Kashif, Aga - Divan box in overview shows dashboard on port 8601; US-10 asserts dashboard reachability via SSH tunnel Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Kashif-in-MVP) Rewrite of origin/archive-hermes-infisical:SULTANATE_MVP.md. Significant changes from the archived version: - Runtime: Hermes-native -> OpenClaw-native; one firman (openclaw-firman) and one berat (openclaw-coding-berat) - Secret vault: Infisical -> OpenBao (local binary, 127.0.0.1, manual unseal); Aga is sole client; grants carry lease IDs - Aga naming (was Sentinel) - Kashif promoted from deferred to in-MVP as the three-layer CPU-only content inspector; appeal flow now screens through Kashif first - Dashboard promoted from deferred to in-MVP; server-rendered Jinja2/HTMX in Divan's FastAPI process, 8 pages, 127.0.0.1:8601 - Added target-host section (Hetzner AX41-NVMe) and core RAM budget (~6-10 GB, leaving ~50 GB for provinces) - Added explicit threat-model section: only hostile Pasha in scope; trusted-core compromise deferred to Phase 2 - Startup order updated: OpenBao first, Kashif between Janissary and Aga, Aga authenticates via AppRole - Updated Deferred list to match Phase 2 backlog (dual audit sinks, AppRole rotation, Shamir split, auto-unseal, GPU upgrade, etc.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New file. Split Divan out into its own PRD rather than bundling it inside Janissary's PRD. Covers: - Scope: SQLite + FastAPI + Jinja2/HTMX dashboard in one process - Data held (province registry, grants with OpenBao lease IDs, whitelists, blacklist, appeals, port_requests, audit) - Role-based access control via pre-shared API keys - 8-page dashboard list (all read-only in MVP; mutations go through Aga) - HTTP basic auth + 127.0.0.1 binding (Sultan tunnels via SSH) - Startup order, fail-closed behavior - Deferred list (dashboard mutations, charts, multi-operator, signed audit, dual sinks, OAuth/OIDC) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ashif verdict + audit + dashboard routes)
Adopted from origin/archive-hermes-infisical:DIVAN_API_SPEC.md with edits:
Authentication roles:
- Renamed sentinel role to aga
- Added kashif role (reads appeals, writes kashif_verdict, writes audit)
- Added dashboard role (reads everything with inject.value always masked,
no writes)
Grant schema:
- Added openbao_lease_id and lease_expires_at fields
- Added lease expiry semantics section (Janissary fail-closed skip)
- Added PATCH /grants/{id} for Aga-driven lease renewal
- Added SQL columns and idx_grants_lease_expires_at index
Appeals:
- Added kashif_verdict, kashif_notes, screened_at fields
- Added PATCH /appeals/{id}/kashif_verdict endpoint (Kashif role)
- Documented verdict transitions: allow -> approved, block -> denied,
escalate -> stays pending for Sultan
- Added SQL columns and idx_appeals_kashif_verdict index
Audit endpoint (new):
- POST /audit for Janissary/Aga/Kashif/Vizier
- GET /audit for dashboard (with filters: component, province_id, since, limit)
- New audit table with created_at desc index
Dashboard routes section (new): 8 HTML pages listed, auth model
documented (basic auth + 127.0.0.1 binding on port 8601).
Firman/berat example payloads updated: hermes-firman ->
openclaw-firman, hermes-coding-berat -> openclaw-coding-berat. IP
example changed to WireGuard range (10.13.13.5).
Configuration split: API host/port and dashboard host/port as
separate env vars.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…late; clarify OpenBao KV vs dynamic User feedback triggered three related updates: 1. ARCHITECTURE.md: replaced the short Kashif-verdict footnote with a T+0 timeline walkthrough (Setup, T+0.0s through T+∞) covering all three Kashif outcomes (allow / block / escalate) and the rule that Case B and Case C both notify Sultan AND Aga. Rationale inline: streaks of Kashif blocks signal Pasha drift; the operator needs visibility even when the decision is already automatic. 2. SULTANATE_MVP.md: replaced prose appeal-flow section with a summary that points at ARCHITECTURE.md for the full timeline. Explicitly calls out that only Kashif=allow stays silent. 3. SULTANATE_MVP.md credential model: split into KV mode (Phase 1 default, Sultan-pasted PATs, no lease, no auto-expiry) and dynamic mode (Phase 2, GitHub App / DB creds / SSH CA, true OpenBao leases with TTL-bounded server-side revocation). Earlier doc implied all grants had leases; that was wrong for KV-stored PATs. 4. DIVAN_API_SPEC.md: added `severity` field (info/alert/error) to audit entries. Kashif=allow -> severity info; Kashif=block and Kashif=escalate -> severity alert. Vizier and Aga poll severity=alert for Sultan notification. Updated audit SQL table and added idx_audit_severity index. Updated kashif_verdict PATCH endpoint to document the notification behaviour per verdict. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lease-aware injection) Adopted from origin/archive-hermes-infisical:JANISSARY_MVP_PRD.md. Edits: - Removed "deferred: Kashif" and "automated triage" lines from the does-not-do list; replaced with pointer to KASHIF_MVP_PRD.md. - Appeal system rewritten around Kashif triage: Janissary forwards payload + justification to Kashif /screen/appeal in parallel with the Divan write. Covered: Kashif-allow auto-approve, Kashif-block auto-deny, Kashif-escalate notifies Sultan + Aga. - Credential injection section gained a lease-awareness step: Janissary checks lease_expires_at; if past, skips injection and writes an audit entry with severity=alert. KV-mode grants (openbao_lease_id=null) inject unconditionally. - Divan integration table switched to /janissary/state bulk snapshot and added /audit for Janissary writes. - Sentinel renamed to Aga throughout. - Deferred list cleaned up to match current roadmap (Kashif removed from deferred; added "automated appeal approval beyond Kashif" to deferred to prevent scope creep). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…the App once User feedback: Sultan should not paste PATs. Aga should create and manage tokens. Rewrote SULTANATE_MVP.md: - Credential Model: GitHub App dynamic mode is now the Phase 1 default (not Phase 2). KV mode demoted to fallback for services without a dynamic mint path. - GitHub Token Strategy: replaced the "Sultan pre-creates PATs" flow with a one-time App setup (Sultan creates/installs the GitHub App, hands the private key to Aga once) followed by automatic per-province minting. Documented renewal loop (every ~15 min, refresh any grant expiring within 20 min, stop on province stop/destroy). - Tokens have GitHub's hard-capped 1h TTL; Aga tracks expiry in the grant as lease_expires_at. Janissary's existing lease-aware fail-closed check handles expired tokens. MVP mints via direct GitHub API calls from Aga using the App private key held in OpenBao KV. Phase 2 may adopt community vault-plugin-secrets-github for a proper OpenBao secret-engine plugin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, lease-aware injection, OpenClaw/OpenBao/Aga)
Adopted from origin/archive-hermes-infisical:JANISSARY_SPEC.md
(1231 lines). Preserved all Sandcat fork specifics, WireGuard setup,
mitmproxy addon pattern, CA cert lifecycle, traffic rule evaluation,
Divan polling, and Docker deployment topology.
Edits:
- Sentinel renamed to Aga throughout
- Hermes image references -> openclaw/openclaw:vYYYY.M.D
- Grant cache and lookup: lease-aware. Grant dataclass gains
openbao_lease_id and lease_expires_at fields. JanissaryStateCache
parses ISO 8601 timestamps. _inject_credentials() checks expiry
before injecting; expired lease -> skip + audit severity=alert;
request continues without header (upstream 401 signals staleness
to agent; Aga re-issues).
- New section "Kashif Integration" (§8): KashifClient class, endpoints
/screen/appeal and /screen/ingress, 5s timeout with configurable
escalate_on_timeout, fail-closed (Kashif down -> verdict escalate,
never auto-approve).
- Appeal API rewritten: /api/appeal now takes payload, forwards to
Kashif in parallel with Divan write, returns 202 Accepted to agent.
Kashif itself writes the verdict to Divan
(PATCH /appeals/{id}/kashif_verdict); Divan auto-transitions on
allow/block. Janissary only writes kashif_verdict=escalate if
Kashif times out.
- /api/request_access also screens via Kashif /screen/ingress.
- Audit logging canonicalised to Divan /audit endpoint with severity
field (info/alert/error). Local JSONL file kept as optional mirror
for recovery/debugging. Info-severity allow decisions, alert for
blocks and lease-expired skips, error for component failures.
- Docker topology section updated: added openbao service (2.5.3,
127.0.0.1 only, manual unseal), kashif service (model pre-warm on
startup), aga service (AppRole auth to OpenBao, bootstrap private
key via OpenBao KV).
- Startup sequence expanded to seven steps (was four): OpenBao ->
Divan -> Janissary -> Kashif -> Aga -> Vizier -> Provinces on
demand. Janissary entrypoint now waits for both Divan and Kashif
health before starting mitmdump.
- Added failure modes for Kashif down, OpenBao sealed, and expanded
the existing matrix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New PRD for Kashif, the content inspector that was "deferred" in the archived MVP docs. Consolidated from KASHIF_NOTES.md and the Kashif sections previously bundled in JANISSARY_PRD_V2.md. Key decisions documented: - Three-layer pipeline on CPU: LLM Guard regex scanners -> Prompt Guard 2 22M classifier -> Llama Guard 3 1B Q4 LLM judge. Any layer can block; all must agree to allow. 5 s hard total timeout. - Two endpoints: POST /screen/appeal (Janissary role), POST /screen/ingress (Janissary + Aga roles). GET /health for Docker. - Kashif writes kashif_verdict directly to Divan; Divan auto- transitions the appeal record (allow -> approved, block -> denied). On escalate/timeout, appeal stays pending for Sultan. - Fail-closed on timeout or layer error -> always "escalate", never auto-approve/block on failure. - Resource budget: ~2.5-4 GB RAM, 2 CPU threads under load. Fits on Hetzner AX41-NVMe alongside other core components. - Model weights baked into Docker image at build time (no first-run network dependency). Llama Community License acceptance required at build, not runtime. - Stateless between requests. No memory, no history, no province- specific configuration -- pattern detection is Aga's job. - Weights baked into image, pre-warmed at startup (~10-30 s cold boot before /health goes 200). - Deferred list: Llama Guard 3 8B (GPU), per-province thresholds, content-hash caching, red-team evaluation harness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…b App, Kashif-screened ingress) Rewrite of origin/archive-hermes-infisical:SENTINEL_MVP_PRD.md. Structural and substantive changes: - Renamed Sentinel -> Aga; Ottoman framing (Agha of the Janissaries) inline in the intro. - Runtime: Hermes -> OpenClaw. Includes the OpenClaw config notes (openclaw.json, built-in tools list, custom aga-ops MCP server). - Secret vault: Infisical -> OpenBao. AppRole authentication at startup. Least-privilege policy with explicit denies on sys/audit, sys/policies, sys/seal, sys/generate-root, and auth/approle/role/aga (so Aga cannot rotate itself). - GitHub App is the Phase 1 default path (not KV-pasted PATs). Sultan sets up the App once; Aga mints per-province installation tokens, renews every ~15 min while province is running, stops on destroy. GitHub's 1-hour TTL acts as the safety net. - KV fallback documented for non-GitHub services where Sultan still pastes a token. - Ingress screening: all Pasha-originated content and fetched web content is pre-screened by Kashif before reaching Aga's LLM context. Removed the "no Kashif needed in MVP" rationale. - Divan integration section lists five polling loops (province, lease renewal, appeal escalation, Kashif-block alert counter, port request). - Sultan interactions section shows day-to-day behaviour where Sultan never handles tokens manually; added examples of Aga's unsolicited alerts based on Kashif-block counters and lease renewal failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…App minting, OpenClaw) Rewrite of origin/archive-hermes-infisical:SENTINEL_SPEC.md (803 lines). Substantive rewrite rather than mechanical rename; the core Infisical-CLI assumption is replaced throughout. Structural changes: - §1 OpenBao Integration (replaces Infisical Integration): HTTP API instead of CLI. AppRole auth flow with token refresh. Explicit HCL policy with least-privilege paths and deny list (sys/audit, sys/policies, sys/seal, sys/generate-root, sys/rotate, auth/approle/role/aga). - §2 GitHub App Integration (new): private-key storage in OpenBao KV, bootstrap workflow, JWT + installation-token minting flow, installation discovery. Phase 1 default path. - §3 Grant Provisioning Workflow: dynamic-mode default (GitHub App tokens). KV fallback documented separately for Sultan-pasted tokens. WireGuard peer IPs (10.13.13.x) instead of Docker bridge IPs (172.18.x). - §4 Grant Revocation and Renewal: explicit renewal loop for dynamic grants (every 15 min, renew if within 20 min of expiry). Revocation handles both dynamic (GitHub-TTL takes care of it) and KV (explicit delete). - §5-6 Port request / whitelist / blacklist: mechanical rename only. - §7 Appeal Escalation and Kashif-Block Alerts (new): covers Aga's role as Sultan's context-builder. Polls escalated appeals, composes advisory messages, maintains per-province Kashif-block counter with threshold alerting. - §8 OpenClaw Agent Configuration (replaces Hermes): SOUL.md + AGENTS.md + IDENTITY.md (OpenClaw auto-loaded files); openclaw.json (JSON not YAML); sandbox.mode=off (Aga is trusted, not sandboxed); aga-ops custom MCP server for structured ops tools. - §9 Divan Polling Loop: five loops instead of one (provinces, lease renewal, appeal escalation, Kashif-block counter, port requests). Three state files persisted to disk for restart recovery. - §10 Error Handling: added OpenBao unreachable/sealed case and GitHub API rate-limit / not-installed / transient cases. - §11 Deployment: docker-compose includes no-new-privileges, depends_on chain includes openbao/janissary/kashif healthy. Pre-deployment checklist grew from 10 to 13 items (OpenBao setup, Kashif health, GitHub App creation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, Aga appeal relay rules) Adopted from origin/archive-hermes-infisical:VIZIER_MVP_PRD.md. Edits: - Hermes agent -> OpenClaw agent; hermes-firman/hermes-coding-berat -> openclaw-firman/openclaw-coding-berat. - CLI renamed vizier -> vizier-cli (the user clarified distinction between vizier-cli tool and Vizier agent earlier in the archive history; this rewrite keeps that). - Province creation flow expanded to show WireGuard peer IP creation, OpenBao/Aga handoff for grant provisioning (dynamic GitHub App primary + KV fallback), and OpenClaw gateway startup command. - Appeal relay section updated: Vizier only forwards actionable appeals (kashif_verdict=escalate or null/timeout) for Sultan decision, plus informational notices for auto-blocks. Auto-allows never reach Sultan. Aligns with the notification rules documented in SULTANATE_MVP.md and DIVAN_API_SPEC.md. - Privileges table clarified: Vizier writes whitelist defaults and appeal decisions but never modifies grants or OpenBao. - OpenClaw Configuration section replaces the Hermes one: built-in bash/read/write/edit tools, SOUL.md + AGENTS.md auto-load, channels.telegram config. - Deferred list: additional runtimes (OpenHands, CrewAI) are Phase 2 firmans; not a runtime-level concern anymore. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eGuard alloc, Kashif-aware appeal relay) Adopted from origin/archive-hermes-infisical:VIZIER_SPEC.md (1044 lines). Structural changes: - CLI binary: vizier -> vizier-cli throughout. - Hermes image references -> openclaw/openclaw:vYYYY.M.D. HERMES_HOME -> OPENCLAW_HOME. hermes gateway -> openclaw gateway --port 18789. terminal tool -> bash tool (OpenClaw built-in name). - Firman schema: startup.command example switched to openclaw gateway with --port 18789 args. - Berat schema: added templates.identity (OpenClaw auto-loads IDENTITY.md from workspace root) and templates.config renamed to openclaw.json. Grant templates gained a `kind` field (dynamic|kv) reflecting the OpenBao KV-vs-dynamic modes. - Template rendering: build_template_context now takes pasha_bot_token (Vizier provisions bots from a pool) and janissary_api explicitly, since berat templates need to embed them into openclaw.json. - Province creation flow rewritten into 13 steps: added WireGuard peer-IP allocation + keygen, Pasha Telegram bot provisioning from a pool, waiting for Aga to write the GitHub grant before git clone, OpenClaw-specific workspace layout (SOUL.md/AGENTS.md/IDENTITY.md in workspace/, openclaw.json in .openclaw/), and openclaw gateway startup command. - Docker integration section updated: docker network reshaped (sultanate-internal for Vizier only, not provinces; provinces share wg-client sidecar netns), volumes simplified (no more HERMES_HOME, just OPENCLAW_HOME), CLI examples rewritten for openclaw. - Appeal relay: Kashif-aware. Polls /appeals?status=pending&kashif_verdict=escalate (actionable) and /audit?severity=alert&component=kashif (informational). Two distinct Telegram message formats, four Sultan-response handlers (approve once / approve forever / deny / kill province). - OpenClaw Agent Wrapper section replaces Hermes Agent Wrapper: SOUL.md + AGENTS.md, openclaw.json (JSON, with channels.telegram, mcp_servers.janissary_security, sandbox mode off). - Deployment: security-opt no-new-privileges added. Startup order extended to seven steps (OpenBao -> Divan -> Janissary -> Kashif -> Aga -> Vizier -> Provinces). - Module map added wireguard.py (peer alloc + wg0.conf emit) and bot_pool.py (pre-provisioned Telegram bot assignment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te from hermes-firman) Rewrite of HERMES_FIRMAN_MVP_PRD.md + HERMES_FIRMAN_SPEC.md from origin/archive-hermes-infisical. Changes: - Image: nousresearch/hermes-agent:vYYYY.M.D -> openclaw/openclaw:vYYYY.M.D - Env: HERMES_HOME -> OPENCLAW_HOME; config path /opt/data/config.yaml -> /opt/data/.openclaw/openclaw.json (JSON, not YAML) - Workspace auto-loaded files: OpenClaw loads SOUL.md, AGENTS.md, IDENTITY.md, USER.md, TOOLS.md, BOOTSTRAP.md from the workspace root at first session turn. Berat templates for MVP write SOUL.md, AGENTS.md, (optional) IDENTITY.md in workspace and openclaw.json in .openclaw/. - Startup: hermes gateway -> openclaw gateway --port 18789 - Gateway HTTP endpoint: port 18789; Vizier health-checks via curl http://127.0.0.1:18789/health during provision - Bootstrap commands support {{repo_name}} / {{branch}} / {{workspace_dir}} substitution, same engine as berat templates - Sandbox mode explicitly "off" in MVP -- Sultanate's outer container (WireGuard + kill-switch) is the isolation boundary; nested Docker-in-Docker sandboxing is a Phase 2 option - CA cert installation retained (docker cp + update-ca-certificates + NODE_EXTRA_CA_CERTS for Node.js-based OpenClaw runtime) - Upstream image section rewritten for OpenClaw's capabilities (Node.js 22+, multi-provider models, MCP support, channels) - Image versioning section adds note on digest pinning and renovate-driven update PRs Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…EC.md (rewrite from hermes-coding-berat) Rewrite of HERMES_CODING_BERAT_MVP_PRD.md + HERMES_CODING_BERAT_SPEC.md from origin/archive-hermes-infisical. Changes: - OpenClaw-specific layout: templates/SOUL.md, AGENTS.md, IDENTITY.md (optional), openclaw.json (JSON, not YAML) - Workspace auto-loaded files written to /opt/data/workspace/; openclaw.json written to /opt/data/.openclaw/ - Tool list swapped from Hermes (9 curated) to OpenClaw built-ins (bash, read, write, edit, browser, canvas, nodes). Excluded by default: process, cron, discord, gateway, sessions_* - openclaw.json template: model, sandbox off, channels.telegram with botToken+allowFrom, mcp_servers.janissary_security (HTTP transport). Sandbox off because Sultanate's outer container is the isolation boundary - Grant templates gained `service` and `kind` fields (dynamic|kv) to tell Aga which mint path to use. For GitHub in MVP: kind=dynamic, Aga mints App installation tokens - Grants now specify `domains` (list) instead of single `domain`, and Aga writes one Divan grant per domain, all sharing the same token value and lease metadata (OpenBao lease ID + expires_at) - Secret provisioning flow rewritten: Aga reads berat, mints GitHub App token via OpenBao KV private key + GitHub API, writes lease-bound grants to Divan. Aga's renewal loop keeps grants fresh while province is running - Template rendering: validates openclaw.json as JSON (was YAML for Hermes); variable catalog expanded to include pasha_telegram_bot_token (Vizier provisions from a pool) and sultan_telegram_user_id (from deploy env) - Cross-reference table (firman <-> berat boundary) updated: OPENCLAW_HOME, IDENTITY.md, sandbox mode, dynamic vs kv grant kind Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All content migrated to the new MVP PRD + SPEC tier during Phase A-C
of the Sultanate MVP rewrite. Original content preserved on branch
origin/archive-hermes-infisical (which also retains the Hermes +
Infisical + Sentinel naming for historical reference).
Deleted:
- SULTANATE.md -> superseded by SULTANATE_MVP.md
- JANISSARY_PRD_V2.md -> superseded by JANISSARY_MVP_PRD.md
+ JANISSARY_SPEC.md + KASHIF_MVP_PRD.md
+ AGA_MVP_PRD.md
- VIZIER_PRD_V3.md -> superseded by VIZIER_MVP_PRD.md
+ VIZIER_SPEC.md
- HERMES_FIRMAN_PRD_V1.md -> superseded by OPENCLAW_FIRMAN_MVP_PRD.md
+ OPENCLAW_FIRMAN_SPEC.md
- HERMES_CODING_BERAT_PRD_V1.md -> superseded by
OPENCLAW_CODING_BERAT_MVP_PRD.md +
OPENCLAW_CODING_BERAT_SPEC.md
- KASHIF_NOTES.md -> superseded by KASHIF_MVP_PRD.md
SENTINELGATE_ANALYSIS.md retained -- it remains a useful reference
for Phase 2 (session tracking, tool-level RBAC, ECDSA-signed audit
records). Consolidation into a Phase 2 roadmap is a future step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…if/Aga)
Post-rewrite README. Replaces the legacy V1/V2/V3 index with the new
MVP PRD + SPEC tier. Sections:
- Where to Start (reading order: MOTIVATION -> ARCHITECTURE ->
SULTANATE_MVP -> component PRDs)
- Document Index, organised by layer:
- Cross-cutting (MOTIVATION, ARCHITECTURE, SULTANATE_MVP, CLAUDE.md)
- Security perimeter (JANISSARY_MVP_PRD, JANISSARY_SPEC,
KASHIF_MVP_PRD, AGA_MVP_PRD, AGA_SPEC, DIVAN_MVP_PRD,
DIVAN_API_SPEC)
- Orchestration (VIZIER_MVP_PRD, VIZIER_SPEC)
- Province runtime (OPENCLAW_FIRMAN_MVP_PRD + SPEC,
OPENCLAW_CODING_BERAT_MVP_PRD + SPEC)
- Exploratory (SENTINELGATE_ANALYSIS for Phase 2 reference)
- Notes for archive-hermes-infisical branch as history source
- Implementation Notes updated for:
- Sandcat (VirtusLab, not SoftwareMill -- research agent found
the active fork)
- Kashif three-layer design
- OpenBao (dynamic GitHub App minting, Sultan no longer pastes
tokens in common path)
- OpenClaw as the Phase 1 agent runtime
- Hetzner AX41-NVMe hardware target + core RAM budget
- Planned Repo Structure: renamed hermes-* -> openclaw-*, added
divan content under janissary repo (ships together per the
security perimeter rule)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-rewrite CLAUDE.md reflecting the new doc tier (MVP PRDs + SPECs). Changes: - Document hierarchy: SULTANATE.md -> SULTANATE_MVP.md as the umbrella. - Reading order: expanded from 4 steps to 9, ordered by layer: README -> MOTIVATION -> ARCHITECTURE -> SULTANATE_MVP -> DIVAN_API_SPEC -> security perimeter (Janissary/Kashif/Aga/Divan PRDs + SPECs) -> Vizier -> OpenClaw firman/berat -> SENTINELGATE_ANALYSIS. Archive branch noted as historical reference. - Architecture invariants: updated for OpenClaw Phase 1 (not Hermes), GitHub App as primary credential path, Hetzner AX41-NVMe target, OpenBao sealed failure case. - New section: "Kashif verdict -> Divan auto-transition" codifies the rule that only Kashif=allow is silent; block and escalate both reach Sultan and Aga via their own polling loops. - Editing conventions: added SPEC doc convention (concrete impl detail, links back to MVP PRD for scope) and file-naming rules (_MVP_PRD.md vs _SPEC.md suffix). - Planned submodule name update: hermes-* -> openclaw-*. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… radius Three additions to SULTANATE_MVP.md based on user feedback that the Pasha Telegram channel was under-emphasized: - Architecture diagram: added "Pasha <-> Sultan via dedicated per-province Telegram bot" line under Provinces. Also clarified port 18789 is loopback only. - New "Communication Channels" section between Architecture and Trust Model: enumerates the 2 + N bots (Vizier, Aga, one per province), bot lifecycle (pool-allocated, returned on destroy), and -- most importantly -- the trust property: a leaked Pasha bot token has per-province blast radius. Cannot reach other Pashas, cannot reach Vizier/Aga, cannot bypass Janissary, cannot mint or read OpenBao secrets. Recovery is destroy-and-recreate the province after revoking in BotFather. - Credential Model "low-risk config" line strengthened to reference the new Communication Channels section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… credential flow, cross-refs)
Four small follow-ups identified during user review of the MVP rewrite,
applied in one commit to keep the PR clean.
A1 -- Tailscale as primary dashboard access:
- DIVAN_MVP_PRD.md: dashboard binds to host's Tailscale interface IP;
Sultan reaches it from any device on the tailnet (mobile-friendly).
HTTP basic auth retained as second factor. SSH-tunnel + 127.0.0.1
documented as a fallback for non-Tailscale environments.
- DIVAN_API_SPEC.md: Configuration table now describes
DIVAN_DASHBOARD_HOST as operator-supplied (Tailscale IP recommended);
Overview section updated.
- SULTANATE_MVP.md and ARCHITECTURE.md US-10: cross-references updated
to match.
A2 -- Vizier ingress / egress documented explicitly:
- VIZIER_MVP_PRD.md: Privileges table now has separate
"Network -- egress" and "Network -- ingress" rows. Inbound = none
(gateway port 18789 bound to 127.0.0.1 inside the container).
- New "Vizier Egress" subsection: Telegram polling + documentation
reads via Vizier's whitelist key ("vizier") in Divan. Default
whitelist includes docs.python.org, docs.openclaw.ai, github.com,
pypi.org, registry.npmjs.org, docs.docker.com, stackoverflow.com,
api.telegram.org. Janissary read-only passthrough covers unlisted
GETs.
A3 -- New "Mid-task credential request flow" in ARCHITECTURE.md:
- §2.3 added between the appeal flow and the kill-switch.
- T+0.0s timeline format parallel to the appeal flow, covering
Pasha calls request_access -> Janissary forwards to Kashif
/screen/ingress -> Kashif verdict (allow/block/escalate).
- All three cases require Sultan involvement on the allow path
(granting a long-lived credential is more consequential than
letting one write through). Block case auto-denies + alerts both
Sultan and Aga; escalate case parallels Case A but with escalate
framing.
A4 -- Cross-references to ARCHITECTURE.md from MVP PRDs:
- KASHIF_MVP_PRD.md: header pointer to ARCHITECTURE.md §2.
- AGA_MVP_PRD.md: header pointer to ARCHITECTURE.md §2 (appeal flow)
and §2.3 (credential request flow).
- (JANISSARY_MVP_PRD.md and OPENCLAW_CODING_BERAT_MVP_PRD.md already
reference ARCHITECTURE.md from the appeal flow sections; left as
is.)
- KASHIF_NOTES.md reference removed from KASHIF_MVP_PRD.md (the
notes file was deleted in commit f0d057d).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds extensive PRDs and technical specifications defining Sultanate’s MVP components (Aga, Divan, Janissary, Kashif, Vizier, OpenBao, OpenClaw firmans/berats), API contracts, Divan SQLite schema, credential lifecycle (OpenBao/GitHub), Janissary proxy rules, Kashif screening, Vizier orchestration, docs index updates, and deletes legacy PRDs. ChangesSultanate MVP & Component Specs
sequenceDiagram
participant Sultan as rgba(0,102,204,0.5)
participant Vizier as rgba(0,153,51,0.5)
participant Divan as rgba(204,102,0,0.5)
participant Aga as rgba(153,51,255,0.5)
participant OpenBao as rgba(255,153,51,0.5)
participant Janissary as rgba(51,153,204,0.5)
participant Province as rgba(102,102,102,0.5)
Sultan->>Vizier: create province (firman + berat)
Vizier->>Divan: POST /provinces (status=creating)
Divan-->>Vizier: stored
Vizier->>Aga: province recorded (polling trigger)
Aga->>OpenBao: AppRole login & read private key
Aga->>OpenBao: mint GitHub installation token
OpenBao-->>Aga: token + lease metadata
Aga->>Divan: POST /grants (inject header, lease_expires_at)
Janissary->>Divan: GET /janissary/state (poll)
Divan-->>Janissary: grants/whitelist/cache
Province->>Janissary: outbound request
Janissary->>Janissary: evaluate rules, inject header (if lease valid)
Janissary->>Upstream: proxied request
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 22
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
MOTIVATION.md (1)
58-60:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winClarify Phase 1 runtime boundary in this section.
Line 59 reads as if OpenHands/CrewAI are active MVP options, which conflicts with the locked Phase 1 runtime scope elsewhere. Add a short qualifier like “Phase 2” here to avoid scope drift.
Based on learnings: “Phase 1 runtime is OpenClaw only — openclaw-firman is the only firman and openclaw-coding-berat is the only berat; additional runtimes (OpenHands, CrewAI) are Phase 2 concerns”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MOTIVATION.md` around lines 58 - 60, Update the sentence "runtime-agnostic: it works the same whether the province runs OpenClaw, OpenHands, CrewAI, or anything else." to explicitly limit Phase 1 by adding a Phase 2 qualifier (e.g., "...OpenClaw, OpenHands, CrewAI (Phase 2), or anything else") and, if helpful, add a short clarifying clause that "Phase 1 runtime is OpenClaw only — openclaw-firman and openclaw-coding-berat are the Phase 1 components" so the section no longer implies OpenHands/CrewAI are Phase 1 options.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AGA_MVP_PRD.md`:
- Around line 234-237: The MVP spec forbids direct Aga→Kashif calls, so update
the aga-ops MCP action screen_and_reply to route all coordination through Divan
instead of invoking Kashif directly: remove any direct call or callback to
Kashif inside screen_and_reply and replace it with a Divan-facing request (e.g.,
emit a Divan operation or message containing the screening payload and intended
Kashif action), ensure Divan is the component that subsequently invokes Kashif,
and update any related MCP endpoints/handlers (screen_and_reply in aga-ops, and
any consumers expecting direct Kashif responses) to use the Divan-mediated flow.
In `@AGA_SPEC.md`:
- Around line 386-387: Update the spec to consistently require Kashif screening
for all Sultan-originated token text by changing the source label in the
/screen/ingress step to source=`sultan` (instead of `pasha`) and remove or
reconcile the exception in §6 that exempts Sultan-authored messages; ensure the
flow explicitly states "abort if Kashif=`block`" for Sultan-originated/pasted
text and add a clarifying sentence in §6 that aligns with §3 (or replace the §6
exception with a precise condition when screening is not required).
- Around line 411-412: Update the HTTP method for the GitHub installation token
revocation in the spec: change the line currently stating "POST
https://api.github.com/installation/token" to use "DELETE
https://api.github.com/installation/token" (the correct REST API endpoint that
returns HTTP 204 No Content) so the revocation call is accurate and will
succeed.
In `@ARCHITECTURE.md`:
- Around line 398-401: The flow currently requires Sultan to paste a GitHub
token; change it so Aga performs GitHub App dynamic-minting and writes
lease-aware grants to Divan instead: remove the manual "Ask Sultan for GitHub
token" and "Sultan provides token" steps and replace them with "Aga mints
installation token via GitHub App, receives 1-hour TTL and schedules ~15min
auto-renew, then writes the lease-aware grant to Divan (POST /grants)"; also
state OpenBao is the sole secret vault with Aga as its sole client and Pashas
authenticate via Janissary-injected headers so no manual secret entry is needed.
- Around line 424-437: The architecture doc currently mixes two lifecycle styles
(direct DELETE /provinces/{id} and status transitions like destroying); pick and
enforce one authoritative province lifecycle contract across the document and
update the flow diagram and all references accordingly: either replace the
DELETE /provinces/{id} step and “record deleted (or status=destroyed)” with
explicit status transitions (e.g., PATCH /provinces/{id} -> {status: destroying}
then {status: destroyed}) in the Vizier "destroy" path and elsewhere (also
update the "Stop" path that uses PATCH /provinces/{id} -> {status: stopped}), or
conversely change the PATCH usages to a single DELETE approach everywhere;
ensure all occurrences (including the lines referencing "destroying", DELETE
/provinces/{id}, PATCH /provinces/{id}, and status=destroyed/stopped) are
consistent and document the chosen lifecycle state names and transitions.
- Around line 275-337: The documentation introduces a new API domain
(/access_request and /access_requests) and Divan state transitions that are not
defined in DIVAN_API_SPEC.md; update the spec to formally document the new
resources, request/response schemas, status values (pending/approved/denied),
fields like kashif_verdict, grant/lease semantics, and relevant endpoints used
by Vizier/Aga/Janissary, or alternatively rewrite this ARCHITECTURE.md section
to reference only currently supported endpoints; locate references to
"/access_request(s)" and "Divan" in the diff and either add the corresponding
definitions to DIVAN_API_SPEC.md or change the flow to map to existing API
surfaces.
In `@DIVAN_API_SPEC.md`:
- Around line 12-16: The spec currently lists both binding to the host Tailscale
interface (port 8601) and binding to 127.0.0.1:8601 as runtime defaults—make one
authoritative and describe the other as an explicit fallback: update the wording
that mentions "Tailscale interface bind" and "bind to `127.0.0.1:8601`" to state
a single primary mode (e.g., "By default the dashboard binds to the host's
Tailscale interface on port 8601") and then add a clear fallback sentence (e.g.,
"If Tailscale is not available, the dashboard will bind to 127.0.0.1:8601 and
must be accessed via SSH tunnel or similar"). Apply the same unification to the
other occurrence referenced (lines ~678-680) and add a short config/flag note
(dashboard bind mode or explicit --bind-address) so runtime behavior is
deterministic and documented.
In `@DIVAN_MVP_PRD.md`:
- Around line 135-136: Replace the sentence that says "Kashif continues to
screen (it's stateless) but its verdicts are buffered in-process until Divan
returns" with a fail-closed escalation policy: specify that when Divan is
unreachable Kashif should use last-cached rules if available, otherwise block
all requests; on Kashif timeouts escalate (do not buffer deferred writes);
document that OpenBao sealed means existing leases remain valid but expired
leases are treated as fail-closed; and remove any mention of in-process
buffering to avoid stale/reordered decisions.
- Around line 46-48: The PRD and API spec disagree on who can read the secret
field `inject.value` (PRD says only role Janissary; API spec allows Aga), so
pick the authoritative access-control contract and make both docs consistent: if
Janissary-only is correct, update the API spec role matrix to remove Aga's read
permission for `inject.value`; if Aga should also read it, update
DIVAN_MVP_PRD.md to include Aga alongside Janissary for `inject.value`. Ensure
the chosen contract is explicitly documented in both places (mention
`inject.value` and the roles Janissary and Aga) and adjust any related wording
or tables so the access-control statement is identical across the PRD and API
spec.
- Around line 76-90: The dashboard networking text is self-contradictory: it
claims "Primary access: Tailscale" and "dashboard listener is never bound to
0.0.0.0" but elsewhere locks startup/scope to 127.0.0.1; unify by picking one
clear binding policy (e.g., Tailscale-first with a documented fallback to
127.0.0.1) and update all references ("Primary access: Tailscale", "dashboard
listener", instances of "127.0.0.1" and the "never bound to 0.0.0.0" sentence)
so the document consistently states the default bind (Tailscale interface IP
like 100.x.y.z:8601), the fallback bind (127.0.0.1) and the
precedence/conditions that trigger the fallback.
In `@JANISSARY_MVP_PRD.md`:
- Around line 29-32: Replace non-Ottoman terminology ("agent") used in PRD prose
with repository-standard Ottoman terms (e.g., Pasha, Aga, Province, Firman) to
ensure consistency; specifically update the "Write block" rule and its
description (the sentence beginning "POST, PUT, PATCH, DELETE to a
non-whitelisted domain? Block. Return 403...") to use the chosen Ottoman term
(e.g., "Pasha" or "Aga") and reword accompanying guidance to reference the
appeal tool in Ottoman vocabulary if applicable; apply the same replacement to
the other flagged occurrences (the prose at the other noted locations) so all
mentions conform to the Ottoman glossary.
In `@JANISSARY_SPEC.md`:
- Around line 1133-1137: The docker-compose/service orchestration currently
lists a depends_on block requiring kashif (and divan) to be healthy before
Janissary but the documented startup sequence says Janissary starts before
kashif; fix the contradiction by making the declared startup order and the
depends_on directive consistent: either change the depends_on entries (the
depends_on: divan / kashif / condition: service_healthy) so Janissary does not
require kashif (remove or reorder kashif) if Janissary must start first, or
update the startup sequence/runbook to start kashif (and divan) before Janissary
so depends_on remains correct; apply the same change to the second occurrence
referenced (the block around the later section) so both the depends_on directive
and the operator runbook for Janissary vs kashif/divan are aligned.
- Around line 1347-1356: The health-check snippet for Divan and Kashif currently
loops forever (the until curl ... loops) but the prose states a 60s timeout;
pick one and make them consistent—either update the prose to say the script
blocks indefinitely, or (recommended) add a timeout implementation to the
script: introduce a WAIT_TIMEOUT (or MAX_ATTEMPTS) variable and replace the
infinite until loops for Divan and Kashif with a bounded loop that exits with a
non-zero status after the timeout, ensuring the same behavior is documented in
the startup section; update the corresponding prose lines (also mentioned around
1393-1394) to reflect the chosen timeout behavior.
In `@KASHIF_MVP_PRD.md`:
- Around line 156-162: The table row for "Kashif container unreachable (from
Janissary's perspective)" is missing the Verdict and Reason columns; update that
row so it has three cells: Condition = "Kashif container unreachable (from
Janissary's perspective)", Verdict = "escalate", and Reason = "Janissary writes
`kashif_verdict=escalate` itself with `kashif_timeout=true`" (ensuring the
backtick-wrapped token stays in the Reason cell).
In `@OPENCLAW_CODING_BERAT_MVP_PRD.md`:
- Around line 7-23: The PRD uses non‑canonical terms like "agent" and "template"
in the prose; update the section headings and body to use Ottoman naming
consistently (e.g., replace "agent" -> "Pasha" or "Berat" where referring to the
employee/profile, "container"/"template" -> "Firman" or "Province/Realm" as
appropriate) and ensure references to the five defined items use canonical names
(SOUL.md, AGENTS.md -> verify AGENTS.md maps to "Instructions"/"Divan" language,
IDENTITY.md stays as Identity, openclaw.json -> "OpenClaw configuration",
security policy -> use "whitelist, grants, non‑HTTP port declarations" but adopt
Sultanate terms like Firman/Province); keep meaning identical but eliminate
alternate vocabulary to conform to the guideline.
In `@OPENCLAW_FIRMAN_MVP_PRD.md`:
- Around line 1-10: Update the document to use Ottoman-first terminology:
replace occurrences of "container template", "container", "template", and
"agent" in the title and core definition sections with Firman and Province
(e.g., change the title and the sentence "The default container template
(firman) for Sultanate" to prioritize "Firman" and describe it as the default
Firman for the Province), keep only minimal parenthetical glosses if absolutely
necessary (e.g., "Firman (container template)") and ensure other core terms use
Sultan/Pasha/Berat/Divan/Realm/Aga consistently; also apply the same
replacements and consistency fixes in the later block referenced (lines 33-43)
so all prose uses Ottoman naming exclusively.
In `@OPENCLAW_FIRMAN_SPEC.md`:
- Around line 262-263: The spec currently references a non-existent field
channels.telegram.botTokenEnv; update the documentation to use the correct
OpenClaw config key channels.telegram.botToken (replace any occurrence of
channels.telegram.botTokenEnv with channels.telegram.botToken) so the SPEC
matches the actual configuration contract and the berat artifacts already using
channels.telegram.botToken remain consistent.
In `@SENTINELGATE_ANALYSIS.md`:
- Line 43: Update the flow text/diagram that currently shows "Kashif -> Aga ->
Sultan" to route all hops through Divan (e.g., "Kashif -> Divan -> Aga -> Divan
-> Sultan") so component-to-component communication is always mediated by Divan;
specifically modify the occurrences referencing Kashif, Aga, and Sultan (lines
around the flow diagram entries at the shown diff) and any other occurrences
(notably the similar chain at lines ~242-243) to reflect Divan as the
intermediary and adjust wording to state “All component inter-communication must
be routed through Divan — no direct component calls.”
In `@SULTANATE_MVP.md`:
- Around line 71-78: Update the Kashif timeout semantics: change the fail-closed
behavior for Kashif (the "Kashif" component handling POST /screen/appeal and
POST /screen/ingress) so that a timeout or unavailable LLM results in "escalate
+ alert Sultan" instead of "block + alert Sultan"; edit the line that currently
reads "Fail-closed: timeout or LLM down -> block + alert Sultan" to
"Fail-closed: timeout or LLM down -> escalate + alert Sultan" and ensure related
references in the appeal-flow and fail-closed contract sections consistently use
"escalate" for Kashif timeouts/unavailability.
In `@VIZIER_MVP_PRD.md`:
- Around line 8-12: Rewrite the PRD prose to use Ottoman terms as the primary
nouns: replace occurrences of "agent", "container", "template", and similar
alternates with "Pasha", "Province", "Firman", "Berat", etc., keeping a single
one-time gloss only at the first mention (e.g., "Pasha (agent)"). Ensure all
core definitions and sentences that currently read "CLI tool + OpenClaw agent",
"isolated containers", "container templates", "agent profiles", and any other
alternates are converted to the Ottoman vocabulary (Vizier, Pasha, Province,
Firman, Berat, Divan, Realm, Aga, Sultan, Janissary) and remove repeated
alternate terms; apply the same normalization to the other affected paragraphs
referenced in the comment.
In `@VIZIER_SPEC.md`:
- Around line 116-123: The spec incorrectly instructs vizier-cli destroy to call
Divan's DELETE /grants?province_id={id}; remove that step and update the Vizier
behavior text so vizier-cli destroy does not attempt to call the Divan grants
endpoint (DELETE /grants) and instead documents that grant cleanup is performed
exclusively by Aga; also make vizier-cli destroy resilient by noting it should
ignore or gracefully handle any 403s if such a call is attempted elsewhere
(e.g., in code paths named vizier-cli destroy or steps referencing DELETE
/grants) so the cleanup semantics remain unambiguous.
- Around line 155-181: The VIZIER_SPEC.md firman.yaml example uses an
object-structured schema (nested keys like image.repository, bootstrap.commands,
startup.command, defaults.branch) that conflicts with the flat-field shape
defined in OPENCLAW_FIRMAN_SPEC.md; choose one canonical schema and update both
files to match: either convert VIZIER_SPEC.md to the flat field names used by
OPENCLAW_FIRMAN_SPEC (or vice versa), ensuring keys such as image.repository,
image.tag, bootstrap.commands, startup.command/args and defaults.branch are
present and consistently documented, and run a quick validation of the chosen
spec against any schema validator or examples referenced by OPENCLAW_FIRMAN_SPEC
to prevent artifact drift.
---
Outside diff comments:
In `@MOTIVATION.md`:
- Around line 58-60: Update the sentence "runtime-agnostic: it works the same
whether the province runs OpenClaw, OpenHands, CrewAI, or anything else." to
explicitly limit Phase 1 by adding a Phase 2 qualifier (e.g., "...OpenClaw,
OpenHands, CrewAI (Phase 2), or anything else") and, if helpful, add a short
clarifying clause that "Phase 1 runtime is OpenClaw only — openclaw-firman and
openclaw-coding-berat are the Phase 1 components" so the section no longer
implies OpenHands/CrewAI are Phase 1 options.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f265a701-e4cd-4917-91ae-e09066e2d18e
📒 Files selected for processing (24)
AGA_MVP_PRD.mdAGA_SPEC.mdARCHITECTURE.mdCLAUDE.mdDIVAN_API_SPEC.mdDIVAN_MVP_PRD.mdHERMES_CODING_BERAT_PRD_V1.mdHERMES_FIRMAN_PRD_V1.mdJANISSARY_MVP_PRD.mdJANISSARY_PRD_V2.mdJANISSARY_SPEC.mdKASHIF_MVP_PRD.mdMOTIVATION.mdOPENCLAW_CODING_BERAT_MVP_PRD.mdOPENCLAW_CODING_BERAT_SPEC.mdOPENCLAW_FIRMAN_MVP_PRD.mdOPENCLAW_FIRMAN_SPEC.mdREADME.mdSENTINELGATE_ANALYSIS.mdSULTANATE.mdSULTANATE_MVP.mdVIZIER_MVP_PRD.mdVIZIER_PRD_V3.mdVIZIER_SPEC.md
💤 Files with no reviewable changes (5)
- HERMES_FIRMAN_PRD_V1.md
- VIZIER_PRD_V3.md
- SULTANATE.md
- HERMES_CODING_BERAT_PRD_V1.md
- JANISSARY_PRD_V2.md
| -> Layer 2: Size gate (block large outbound payloads) | ||
| -> Layer 3: Blacklist (block known bad) | ||
| -> Layer 4: Appeal -> Kashif (LLM triage) -> Sentinel -> Sultan | ||
| -> Layer 4: Appeal -> Kashif (LLM triage) -> Aga -> Sultan |
There was a problem hiding this comment.
Route these flow diagrams through Divan, not direct component hops.
Line 43 and Line 242 depict Kashif -> Aga -> Sultan as a direct chain. In this repo’s MVP contract, component-to-component communication must be mediated by Divan; otherwise this introduces architecture drift in a document that readers will treat as design guidance.
Suggested wording update
- -> Layer 4: Appeal -> Kashif (LLM triage) -> Aga -> Sultan
+ -> Layer 4: Appeal -> Kashif (LLM triage) -> Divan -> (Aga + Vizier) -> Sultan
...
- -> Blocked? -> Kashif (LLM triage) -> Aga -> Sultan
+ -> Blocked? -> Kashif (LLM triage) -> Divan -> (Aga + Vizier) -> SultanBased on learnings: “All component inter-communication must be routed through Divan — no component should call another directly”.
Also applies to: 242-243
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@SENTINELGATE_ANALYSIS.md` at line 43, Update the flow text/diagram that
currently shows "Kashif -> Aga -> Sultan" to route all hops through Divan (e.g.,
"Kashif -> Divan -> Aga -> Divan -> Sultan") so component-to-component
communication is always mediated by Divan; specifically modify the occurrences
referencing Kashif, Aga, and Sultan (lines around the flow diagram entries at
the shown diff) and any other occurrences (notably the similar chain at lines
~242-243) to reflect Divan as the intermediary and adjust wording to state “All
component inter-communication must be routed through Divan — no direct component
calls.”
There was a problem hiding this comment.
Declining. SENTINELGATE_ANALYSIS.md is marked "exploratory, not normative" in CLAUDE.md and preserved as Phase-2 reference material. The diagrams compare Sultanate's appeal flow against SentinelGate's at a simplified level; injecting "via Divan" hops into a SentinelGate-comparison diagram would clutter without clarifying the comparison.
The authoritative appeal-flow diagram is in ARCHITECTURE.md §2 — which DOES show Kashif writing verdicts to Divan via PATCH /appeals/{id}/kashif_verdict. This file is contextual reading, not the source of truth.
There was a problem hiding this comment.
@stranma, understood — the simplified flow in SENTINELGATE_ANALYSIS.md is intentionally abstracted for comparison purposes, and ARCHITECTURE.md §2 is the correct normative reference for the Divan-mediated appeal flow. I'll defer to that document for architecture accuracy.
✏️ Learnings added
Learnt from: stranma
Repo: stranma/sultanate PR: 1
File: SENTINELGATE_ANALYSIS.md:43-43
Timestamp: 2026-05-05T13:05:06.435Z
Learning: `SENTINELGATE_ANALYSIS.md` in the stranma/sultanate repo is explicitly marked "exploratory, not normative" in `CLAUDE.md` and is preserved as Phase-2 reference material. Its simplified flow diagrams (e.g., `Kashif -> Aga -> Sultan`) are intentional comparisons against SentinelGate's architecture and should not be flagged for missing Divan-mediation hops. The authoritative appeal-flow diagram showing Divan mediation via `PATCH /appeals/{id}/kashif_verdict` lives in `ARCHITECTURE.md` §2.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| CLI tool + OpenClaw agent that manages provinces (isolated containers). | ||
| Creates provinces from firmans (container templates) and berats (agent | ||
| profiles), tracks the realm, writes province state to Divan. Talks to | ||
| Sultan via Telegram. Routes through Janissary for all outbound traffic. | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Normalize PRD prose to Ottoman terms as the primary vocabulary.
This _MVP_PRD uses “agent”/“container” as primary nouns in core definitions (Line 8 onward). Keep Ottoman terms primary (Vizier/Pasha/Province/Firman/Berat), and only use one-time glosses where needed. This doc is likely to be copied into other PRDs, so terminology drift here propagates quickly.
As per coding guidelines: “Use Ottoman naming consistently (Sultan, Pasha, Province, Firman, Berat, Divan, Realm, Aga) — do not introduce alternate names like 'agent', 'container', or 'template' in PRD prose”.
Also applies to: 18-23
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@VIZIER_MVP_PRD.md` around lines 8 - 12, Rewrite the PRD prose to use Ottoman
terms as the primary nouns: replace occurrences of "agent", "container",
"template", and similar alternates with "Pasha", "Province", "Firman", "Berat",
etc., keeping a single one-time gloss only at the first mention (e.g., "Pasha
(agent)"). Ensure all core definitions and sentences that currently read "CLI
tool + OpenClaw agent", "isolated containers", "container templates", "agent
profiles", and any other alternates are converted to the Ottoman vocabulary
(Vizier, Pasha, Province, Firman, Berat, Divan, Realm, Aga, Sultan, Janissary)
and remove repeated alternate terms; apply the same normalization to the other
affected paragraphs referenced in the comment.
There was a problem hiding this comment.
Pushing back. CLAUDE.md's Ottoman-naming invariant explicitly permits first-use gloss: "First use per document may gloss the term, e.g. 'Pasha (agent inside a province)'." The flagged passages follow exactly that pattern — "agent profile (berat)", "container template (firman)", "firmans (container templates) and berats (agent profiles)" — gloss on first mention, then Ottoman terms downstream.
A strict "no alternate names ever" reading would require updating CLAUDE.md to forbid the gloss. Current state matches the invariant as written. Closing as "matches documented invariant".
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check CLAUDE.md for the Ottoman naming invariant and gloss rules
cat CLAUDE.md
Length of output: 8857
@stranma, you are correct — I misread the invariant. CLAUDE.md explicitly states: "First use per document may gloss the term, e.g. 'Pasha (agent inside a province)'." The flagged passages (firmans (container templates), berats (agent profiles), provinces (isolated containers)) all follow exactly that pattern: Ottoman term primary, alternate in parentheses, on first mention only. My original comment applied the "no alternate names" clause as an absolute prohibition while ignoring the gloss carve-out in the same sentence. The current prose matches the documented invariant. Withdrawing the suggestion.
Applied via 6 parallel agents working on disjoint file sets, then
two carry-over reconciliations done in this commit (startup order
swap, VIZIER_SPEC code blocks).
Tier 1 -- factual fixes:
- MOTIVATION.md: add (Phase 2) qualifier next to OpenHands/CrewAI
examples so Phase-1 OpenClaw-only scope reads correctly.
- AGA_SPEC.md: source label in KV-fallback Kashif call is `sultan`
not `pasha`; reconcile §6 (routine commands skip Kashif, secret
paste does not). GitHub installation revoke uses DELETE not POST
(correct REST verb, returns 204).
- ARCHITECTURE.md: province destroy uses PATCH /provinces/{id} with
status transitions (destroying -> destroyed), not DELETE
(DIVAN_API_SPEC has no DELETE for provinces). US-10 boot order
asserts Kashif healthy before Janissary.
- DIVAN_API_SPEC.md: Tailscale/127.0.0.1 wording unified across
Overview, Dashboard Routes, and Configuration table (primary =
Tailscale, fallback = 127.0.0.1+SSH tunnel).
- DIVAN_MVP_PRD.md: inject.value plaintext readable by `aga` AND
`janissary` (was: only janissary, contradicted API spec). Same
Tailscale/fallback unification end-to-end (Auth + network,
Startup, Phase 1 Scope). "Buffered in-process" verdict storage
removed; replaced with fail-closed escalation + degraded
Vizier-polling. OpenBao-sealed bullet added.
- JANISSARY_SPEC.md: bounded wait loops with WAIT_TIMEOUT=60 in
janissary-entrypoint.sh; startup sequence narrative says Kashif
starts before Janissary (Janissary forwards /screen/appeal so
Kashif must be healthy first); depends_on: kashif healthy.
- KASHIF_MVP_PRD.md: Kashif-unreachable row in fail-closed table
gets its missing Verdict column (=`escalate`).
- OPENCLAW_FIRMAN_SPEC.md: channels.telegram.botTokenEnv ->
channels.telegram.botToken (matches berat artifacts).
- SULTANATE_MVP.md: Kashif fail-closed line says
`escalate + alert Sultan` (was: block + alert; contradicted the
appeal-flow narrative right below).
- VIZIER_SPEC.md: vizier-cli destroy no longer calls DELETE /grants
(Aga's job, polls status=destroying). firman.yaml schema in §2
matches OPENCLAW_FIRMAN_SPEC (flat image string,
bootstrap as list of {command,description}, openclaw_home).
Tier 2 -- substantive but bounded:
- ARCHITECTURE.md: Province creation flow replaced manual
Sultan-pastes-token steps with Aga's GitHub App dynamic minting
(1-hour TTL, ~15 min auto-renewal, lease-bound grant write).
- DIVAN_API_SPEC.md: new "Access Requests" resource added between
Appeals and Port Requests -- POST /access_requests, GET, PATCH
Sultan decision, PATCH kashif_verdict (Kashif role only). New
access_requests SQL table with three indexes. Role permissions
matrix updated to include access_requests R/W matrix.
- JANISSARY_SPEC.md: docker-compose depends_on chain corrected so
Janissary depends on Kashif healthy (and Divan healthy).
Carry-over reconciliations (added in this commit by main session):
- SULTANATE_MVP.md, AGA_SPEC.md, VIZIER_SPEC.md, ARCHITECTURE.md
US-10: startup order swapped Kashif (now #3) before Janissary
(now #4) -- consistent with JANISSARY_SPEC.md narrative.
- VIZIER_SPEC.md §5 code blocks: image string flat (was
image['repository']:image['tag']); bootstrap iterates list of
{command,description} dicts (was nested commands list).
CodeRabbit Tier 3 (5 items: Aga->Kashif direct call, Ottoman
naming enforcement x4) deferred for explicit per-item discussion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 9
♻️ Duplicate comments (2)
ARCHITECTURE.md (1)
447-448:⚠️ Potential issue | 🟠 Major | ⚡ Quick winProvince destroy lifecycle still conflicts with the documented status model.
This section still mixes “deleted record” /
status=destroyedsemantics, while the API lifecycle contract is status-driven and does not document deletion as the normal terminal behavior. Please make the lifecycle wording fully consistent.Also applies to: 663-664
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ARCHITECTURE.md` around lines 447 - 448, Update the lifecycle wording so it reflects the status-driven API contract instead of implying record deletion: replace occurrences under the PATCH /provinces/{id} example that show "{status: destroyed}" or language implying deletion with the documented terminal status term used by the spec (e.g., "{status: archived}" or the canonical terminal status name), and adjust the surrounding sentence to describe a status transition to the terminal state rather than physical deletion; apply the same change to the other occurrence that currently mixes deletion semantics (the second instance referenced).AGA_SPEC.md (1)
741-742:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSOUL policy contradicts the mandatory Kashif screening path for Sultan-pasted secret material.
Line 741-742 says Sultan content bypasses Kashif, but Line 386-387 and Section 6 require screening for Sultan-pasted secret payloads. Keep one rule to avoid divergent implementations.
As per coding guidelines: “All Aga ingress … must be pre-screened by Kashif …”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@AGA_SPEC.md` around lines 741 - 742, The document currently contradicts itself by exempting "Sultan's Telegram channel" content from Kashif screening while Section 6 and the SOUL policy mandate Kashif pre-screening for Sultan-pasted secret payloads; update the sentence that claims Sultan content is trusted to instead state that all Sultan-originating ingress (including Telegram/pasted content) must pass Kashif screening, and remove or reword any language that creates an exception so the AGA_SPEC and SOUL policy/Section 6 are consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@AGA_SPEC.md`:
- Around line 292-295: The spec currently treats openbao_lease_id as an
opaque/fake metadata field; change the model and any code that writes or reads
grant records so openbao_lease_id holds the real backing lease reference (or
null when no lease exists) to match the Divan grant contract—update validation,
serializers/deserializers, and any places that generate values for
openbao_lease_id (and related logic using lease_expires_at) to accept a real
lease identifier format or null and ensure consumers expect a real lease ref
rather than an opaque placeholder.
In `@ARCHITECTURE.md`:
- Around line 269-276: Update the flow diagram to use the canonical
access-request endpoint name everywhere: replace occurrences of "POST /api/req"
with "POST /access_requests" (and ensure any related arrows/labels that
reference /api/req are changed as well) so the sequence diagram consistently
matches the API contract; verify symbols like "POST /access_requests",
"screen/ingress(text)", "regex", "PromptGuard", and "LlamaGuard" remain
unchanged aside from the endpoint rename.
In `@DIVAN_API_SPEC.md`:
- Around line 233-235: Update the doc text to enforce fail-closed semantics:
replace the current sentence that implies forwarding without credentials with
language that an expired lease causes Janissary to deny/block the request for
that credentialed path (no pass-through), log the audit entry for Aga to detect
and re-issue credentials, and return the appropriate 401/403 to the client;
reference the components "Janissary" and "Aga" and the "expired lease" /
"credentialed path" concepts so readers know where the behavior applies and that
the system never falls back to pass-through.
In `@DIVAN_MVP_PRD.md`:
- Around line 3-5: Replace the current top-line subordination text with the
exact mandated sentence: "For shared glossary and architecture see
[SULTANATE_MVP.md](SULTANATE_MVP.md)." — update the header in DIVAN_MVP_PRD.md
so that the very first line matches that sentence exactly (remove or replace the
existing line "For shared glossary, deployment model, and component overview see
[SULTANATE_MVP.md](SULTANATE_MVP.md). For HTTP contract detail see
[DIVAN_API_SPEC.md](DIVAN_API_SPEC.md)." with the required canonical
subordination sentence), preserving the rest of the document.
In `@JANISSARY_SPEC.md`:
- Around line 816-818: The spec and client currently describe Janissary making a
direct POST to Kashif at /screen/appeal with {appeal_id, payload, justification,
url, method, province_id}; change this so Janissary routes the request through
Divan instead of calling Kashif directly: update the spec text and client code
that constructs the POST to call Divan's forwarding endpoint (e.g.,
Divan.forward or Divan proxy API) with the same payload and configurable timeout
handling, remove the hardcoded "http://127.0.0.1:8082/screen/appeal" target and
any direct Kashif URL usage, and ensure downstream handlers (Kashif) expect the
forwarded envelope format and tests/docs are updated to reflect
Janissary→Divan→Kashif flow.
- Around line 868-870: The spec currently offers two incompatible persistence
paths ("POST /access_requests" vs overloading "/appeals" with
"access-request://{service}"), causing divergent implementations; pick a single
explicit path (recommend using POST /access_requests), remove the alternative
overloading of /appeals and the "access-request://{service}" scheme, and update
all occurrences, examples, and API descriptions (e.g., the section referencing
"POST /access_requests" and the paragraph mentioning "/appeals" and
"access-request://{service}") to use only the chosen endpoint and its
payload/response shape.
In `@SULTANATE_MVP.md`:
- Around line 170-172: The doc inconsistently states two GitHub App renewal
intervals (~30 min here vs ~15 min later); pick one canonical interval (e.g., 15
or 30 minutes) and update every occurrence so they match: change the sentence
referencing `lease_expires_at` and the description of the "background renewal
loop in Aga" (and any other mention of renewal cadence) to use the chosen
interval, and ensure notes about stopping on `destroy` and GitHub token lifetime
("GitHub kills the token within 1 hour") remain accurate and consistent with
that interval.
- Around line 62-77: Update the architecture text so Janissary does not call
Kashif directly: make Divan the authoritative contract for routing screening
state transitions and describe Janissary as sending appeal/ingress screening
requests to Divan (which then forwards to Kashif), remove any language implying
direct Janissary→Kashif coupling, and clarify endpoints (e.g., POST
/screen/appeal, POST /screen/ingress) and fail-closed semantics under the
Divan-mediated flow so all inter-component coordination references point to
Divan as the single router.
In `@VIZIER_SPEC.md`:
- Around line 479-482: The code currently mutates Janissary config via
add_peer_to_janissary_config(public_key, wg_peer_ip) and signals the container
with subprocess.run(["docker", "kill", "-s", "HUP", "janissary"], check=True);
instead, remove these direct actions and record the peer lifecycle intent in
Divan's state store (e.g., create or update a peer-intent record containing
public_key and wg_peer_ip and desired state like "add"), using the existing
Divan state API/functionality your repo provides (replace references to
add_peer_to_janissary_config and the subprocess.run call with a single Divan
state mutation call such as publish_peer_intent or update_divan_state). Ensure
the intent includes sufficient metadata for Janissary to reconcile (public_key,
wg_peer_ip, desired_action, caller/ts) and return or log the Divan operation
result; Janissary will be responsible for watching Divan state and applying the
actual wg0.conf changes and HUP.
---
Duplicate comments:
In `@AGA_SPEC.md`:
- Around line 741-742: The document currently contradicts itself by exempting
"Sultan's Telegram channel" content from Kashif screening while Section 6 and
the SOUL policy mandate Kashif pre-screening for Sultan-pasted secret payloads;
update the sentence that claims Sultan content is trusted to instead state that
all Sultan-originating ingress (including Telegram/pasted content) must pass
Kashif screening, and remove or reword any language that creates an exception so
the AGA_SPEC and SOUL policy/Section 6 are consistent.
In `@ARCHITECTURE.md`:
- Around line 447-448: Update the lifecycle wording so it reflects the
status-driven API contract instead of implying record deletion: replace
occurrences under the PATCH /provinces/{id} example that show "{status:
destroyed}" or language implying deletion with the documented terminal status
term used by the spec (e.g., "{status: archived}" or the canonical terminal
status name), and adjust the surrounding sentence to describe a status
transition to the terminal state rather than physical deletion; apply the same
change to the other occurrence that currently mixes deletion semantics (the
second instance referenced).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2d4461b5-ee8e-406a-8bee-c1499805a398
📒 Files selected for processing (10)
AGA_SPEC.mdARCHITECTURE.mdDIVAN_API_SPEC.mdDIVAN_MVP_PRD.mdJANISSARY_SPEC.mdKASHIF_MVP_PRD.mdMOTIVATION.mdOPENCLAW_FIRMAN_SPEC.mdSULTANATE_MVP.mdVIZIER_SPEC.md
| |--POST /api/req->| | | | | | | ||
| | |--screen/ingress(text)-->| | | | | ||
| | | |--regex | | | | | ||
| | | |--PromptGuard | | | | ||
| | | |--LlamaGuard | | | | ||
| | |<--verdict--| | | | | | ||
| | |--POST /access_requests ->| | | | | ||
| |<-- pending -----| | | | | | |
There was a problem hiding this comment.
Use the canonical access-request endpoint name consistently.
This sequence uses POST /api/req, but the API contract elsewhere specifies /access_requests. Please align this flow to the canonical endpoint to prevent implementation drift.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ARCHITECTURE.md` around lines 269 - 276, Update the flow diagram to use the
canonical access-request endpoint name everywhere: replace occurrences of "POST
/api/req" with "POST /access_requests" (and ensure any related arrows/labels
that reference /api/req are changed as well) so the sequence diagram
consistently matches the API contract; verify symbols like "POST
/access_requests", "screen/ingress(text)", "regex", "PromptGuard", and
"LlamaGuard" remain unchanged aside from the endpoint rename.
| 4. **Forward to Kashif:** `POST http://127.0.0.1:8082/screen/appeal` with | ||
| `{ appeal_id, payload, justification, url, method, province_id }`. | ||
| Wait for Kashif's response with `timeout=5s` (configurable). |
There was a problem hiding this comment.
Route Janissary↔Kashif interactions through Divan, not direct HTTP calls.
Line 816 and Line 862 define direct Janissary→Kashif calls, and the client code (Line 970 onward) hardcodes that path. This breaks the documented one-way dependency rule and couples two components at runtime.
As per coding guidelines: “Enforce one-way component dependencies via Divan: no component should call another directly. Route all inter-component communication through Divan instead”.
Also applies to: 862-864, 970-1013
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@JANISSARY_SPEC.md` around lines 816 - 818, The spec and client currently
describe Janissary making a direct POST to Kashif at /screen/appeal with
{appeal_id, payload, justification, url, method, province_id}; change this so
Janissary routes the request through Divan instead of calling Kashif directly:
update the spec text and client code that constructs the POST to call Divan's
forwarding endpoint (e.g., Divan.forward or Divan proxy API) with the same
payload and configurable timeout handling, remove the hardcoded
"http://127.0.0.1:8082/screen/appeal" target and any direct Kashif URL usage,
and ensure downstream handlers (Kashif) expect the forwarded envelope format and
tests/docs are updated to reflect Janissary→Divan→Kashif flow.
There was a problem hiding this comment.
Pushing back. The "one-way dependencies via Divan" invariant in CLAUDE.md is about state coordination — records that live in Divan (provinces, grants, appeals, audit, etc.). Kashif's /screen/appeal and /screen/ingress are stateless function calls (input → verdict). Routing them through Divan would add ~5-15s poll-cycle latency per appeal (caller→Divan→Kashif poll→verdict→Divan→caller poll) for the same trust boundary.
Codified the exception in CLAUDE.md (commit 8e413c5): the invariant now explicitly distinguishes state coordination (must go through Divan) from synchronous capability calls to stateless services (direct HTTP allowed). Kashif still writes verdicts to Divan for audit and appeal-state; only the request/response path is direct. Janissary→Kashif and Aga→Kashif both fall under this exception consistently.
There was a problem hiding this comment.
@stranma, that's a well-drawn distinction — the invariant applies to stateful coordination (records that live in Divan), not to synchronous capability calls against inherently stateless services. Routing a blocking content-inspection call through a 5s poll cycle would be architecturally harmful, not safe. The pattern where Kashif answers the request/response path directly while still writing the verdict and audit entry to Divan keeps the right things in the right place.
Acknowledged — my original comment was wrong on this one.
✏️ Learnings added
Learnt from: stranma
Repo: stranma/sultanate PR: 1
File: JANISSARY_SPEC.md:816-818
Timestamp: 2026-05-05T13:05:13.353Z
Learning: In the Sultanate architecture (repository: stranma/sultanate), the "one-way dependency via Divan" invariant applies only to *state coordination* — records that live in Divan (provinces, grants, appeals, audit, etc.). Direct HTTP calls between components are explicitly allowed for *synchronous capability calls to stateless services*. Kashif (`/screen/appeal`, `/screen/ingress`) is a stateless input→verdict service; Janissary→Kashif and Aga→Kashif direct HTTP calls are the correct pattern. Kashif still writes verdicts and audit entries to Divan; only the synchronous request/response path is direct. This exception is codified in CLAUDE.md (commit 8e413c5).
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| +-- Janissary (transparent proxy via WireGuard, dumb) | ||
| | Reads all state from Divan | ||
| | Whitelist: pass all traffic | ||
| | Non-whitelist: read-only (GET/HEAD pass, writes blocked) | ||
| | Blocked + appeal: forwarded to Kashif for triage; unclear | ||
| | cases escalate to Aga and Sultan | ||
| | Credential injection: reads grants from Divan; skips if | ||
| | OpenBao lease expired (fail closed) | ||
| | | ||
| +-- Kashif (paranoid content inspector, local LLM on host) | ||
| | Job: screen appeal payloads and Aga ingress for malice / | ||
| | prompt injection | ||
| | Models (all CPU): LLM Guard regex scanners + Prompt Guard 2 | ||
| | 22M + Llama Guard 3 1B Q4 | ||
| | Endpoints: POST /screen/appeal, POST /screen/ingress | ||
| | Fail-closed: timeout or LLM down -> escalate + alert Sultan |
There was a problem hiding this comment.
Route screening state transitions through Divan instead of direct Janissary→Kashif coupling.
This section defines direct Janissary-to-Kashif workflow semantics in the architecture core, which conflicts with the repository rule that inter-component coordination must be routed via Divan. Please make one contract authoritative and align this section accordingly to avoid split orchestration logic.
As per coding guidelines: "Enforce one-way component dependencies via Divan: no component should call another directly. Route all inter-component communication through Divan instead".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@SULTANATE_MVP.md` around lines 62 - 77, Update the architecture text so
Janissary does not call Kashif directly: make Divan the authoritative contract
for routing screening state transitions and describe Janissary as sending
appeal/ingress screening requests to Divan (which then forwards to Kashif),
remove any language implying direct Janissary→Kashif coupling, and clarify
endpoints (e.g., POST /screen/appeal, POST /screen/ingress) and fail-closed
semantics under the Divan-mediated flow so all inter-component coordination
references point to Divan as the single router.
There was a problem hiding this comment.
Pushing back. The "one-way dependencies via Divan" invariant in CLAUDE.md is about state coordination — records that live in Divan (provinces, grants, appeals, audit, etc.). Kashif's /screen/appeal and /screen/ingress are stateless function calls (input → verdict). Routing them through Divan would add ~5-15s poll-cycle latency per appeal (caller→Divan→Kashif poll→verdict→Divan→caller poll) for the same trust boundary.
Codified the exception in CLAUDE.md (commit 8e413c5): the invariant now explicitly distinguishes state coordination (must go through Divan) from synchronous capability calls to stateless services (direct HTTP allowed). Kashif still writes verdicts to Divan for audit and appeal-state; only the request/response path is direct. Janissary→Kashif and Aga→Kashif both fall under this exception consistently.
There was a problem hiding this comment.
@stranma, that's a well-reasoned distinction and I agree with it. The Divan invariant is correctly scoped to state coordination — durable records (provinces, grants, appeals, audit entries, etc.) that multiple components need to read over time. Kashif's /screen/appeal and /screen/ingress are synchronous RPC-style capability calls with ephemeral request/response semantics; routing those through a poll-cycle intermediary would be both high-latency and architecturally backward (Divan has no business orchestrating compute). The fact that Kashif still writes verdicts back to Divan for appeal-state transitions and audit entries means the state layer remains authoritative — only the synchronous invocation path is direct, which is the right split.
Retracting the comment. The codification in CLAUDE.md is the right call.
✏️ Learnings added
Learnt from: stranma
Repo: stranma/sultanate PR: 1
File: SULTANATE_MVP.md:62-77
Timestamp: 2026-05-05T13:05:10.753Z
Learning: In the Sultanate project (docs-only repo, Markdown architecture specs), the "route all inter-component coordination through Divan" invariant applies exclusively to *state coordination* — durable records such as provinces, grants, appeals, whitelists, audit entries. It does NOT apply to synchronous, stateless capability calls (e.g., Janissary→Kashif POST /screen/appeal or Aga→Kashif POST /screen/ingress). Those are direct HTTP calls to avoid poll-cycle latency (~5-15s). Kashif still writes verdicts to Divan for audit and appeal-state transitions; only the synchronous request/response path bypasses Divan. This exception is codified in CLAUDE.md.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| # Add peer to Janissary server config and HUP | ||
| add_peer_to_janissary_config(public_key, wg_peer_ip) | ||
| subprocess.run(["docker", "kill", "-s", "HUP", "janissary"], check=True) | ||
| ``` |
There was a problem hiding this comment.
Avoid direct Janissary config mutation from Vizier.
Having Vizier edit Janissary wg0.conf and signal Janissary directly creates a cross-component control path outside Divan. Move peer lifecycle intent into Divan state and let Janissary reconcile from that state.
As per coding guidelines: "Enforce one-way component dependencies via Divan: no component should call another directly. Route all inter-component communication through Divan instead".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@VIZIER_SPEC.md` around lines 479 - 482, The code currently mutates Janissary
config via add_peer_to_janissary_config(public_key, wg_peer_ip) and signals the
container with subprocess.run(["docker", "kill", "-s", "HUP", "janissary"],
check=True); instead, remove these direct actions and record the peer lifecycle
intent in Divan's state store (e.g., create or update a peer-intent record
containing public_key and wg_peer_ip and desired state like "add"), using the
existing Divan state API/functionality your repo provides (replace references to
add_peer_to_janissary_config and the subprocess.run call with a single Divan
state mutation call such as publish_peer_intent or update_divan_state). Ensure
the intent includes sufficient metadata for Janissary to reconcile (public_key,
wg_peer_ip, desired_action, caller/ts) and return or log the Divan operation
result; Janissary will be responsible for watching Divan state and applying the
actual wg0.conf changes and HUP.
Tier 3 (5 items, decisions):
- CLAUDE.md "One-way dependencies via Divan" invariant: added
exception for synchronous capability calls to stateless services
(Kashif /screen/*). Reasoning: routing Janissary->Kashif and
Aga->Kashif through Divan would add 5-15s poll-cycle latency per
appeal without security benefit. Kashif's verdict still goes
through Divan (audit + appeal record); only the synchronous
request/response is direct. Codifies the actual current behaviour
in JANISSARY_SPEC.md and AGA_SPEC.md.
- JANISSARY_MVP_PRD.md rules 3-4: rephrased two generic-noun "agent"
usages ("Agents can browse..." -> "Used for browsing..."; "the
agent" -> ""). Other Ottoman-naming items rejected: they already
follow CLAUDE.md's explicit "first-use gloss is allowed" rule
(e.g., "agent profile (berat)" then Ottoman downstream).
Second-pass CodeRabbit quick fixes:
- ARCHITECTURE.md L269 (mid-task credential request flow diagram):
POST /api/req -> POST /api/request_access (canonical name).
- DIVAN_MVP_PRD.md header subordination line: matches the canonical
"For shared glossary and architecture see [SULTANATE_MVP.md]"
phrasing used in other component PRDs.
- JANISSARY_SPEC.md /api/request_access flow: removed the "or
overload /appeals" ambiguity; canonical persistence path is now
POST /access_requests with cross-ref to DIVAN_API_SPEC.
- SULTANATE_MVP.md credential-model section: GitHub App renewal
interval unified to ~15 min everywhere (was ~30 min in one
place); GitHub TTL is 1h, so 15 min gives 4x safety margin.
Two new substantive items from the second pass surfaced for
explicit user discussion (not addressed in this commit):
- AGA_SPEC.md L295: openbao_lease_id as fake opaque ID
- DIVAN_API_SPEC.md L235: expired-lease pass-through vs fail-closed-at-proxy
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Item A -- openbao_lease_id semantics honest:
- For GitHub App grants (Phase 1 default), openbao_lease_id is now
null. The field is reserved for grants backed by real OpenBao
lease-issuing secret engines (DB creds, SSH CA, PKI, future
dynamic plugins). GitHub App tokens are minted by Aga directly
via the GitHub API; the token itself never goes through OpenBao's
lease machinery, so claiming a lease ID would be misleading.
- lease_expires_at is populated either way: GitHub's expires_at for
App tokens, OpenBao lease TTL for dynamic-engine credentials,
null for KV-fallback tokens with no expiry.
- Updated SULTANATE_MVP.md credential model, AGA_MVP_PRD.md and
AGA_SPEC.md grant-provisioning examples, OPENCLAW_CODING_BERAT_SPEC.md
examples, ARCHITECTURE.md province-creation narrative + US-3
test assertion.
Item B -- expired-lease fail-closed at proxy:
- DIVAN_API_SPEC.md "Lease Expiry Semantics": Janissary now BLOCKS
the request with 503 + body
{error: credential_renewing, domain, lease_expired_at, message}
instead of pass-through-without-header. Aligns with CLAUDE.md
fail-closed-everywhere invariant ("never propose pass-through on
degraded service").
- JANISSARY_SPEC.md §6 _inject_credentials() pseudocode rewritten:
on expired lease, sets flow.response = 503 directly, writes
audit severity=alert with action=lease_expired_block. Added
prose explaining the recovery handshake (Janissary blocks ->
audit alert -> Aga renews -> Pasha retries -> succeeds).
- AGA_SPEC.md §9 polling-loop pseudocode: unified the
Kashif-block-counter poll into a general audit-alert poll.
Now handles both component=kashif (drift counter) and
component=janissary action=lease_expired_block (expedited
renewal). Recovery time: ~30-60 s (Aga's 30 s poll + GitHub
mint round-trip + Janissary 5 s grant-cache refresh).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 8
♻️ Duplicate comments (3)
AGA_SPEC.md (1)
746-747:⚠️ Potential issue | 🟠 Major | ⚡ Quick winResolve contradiction on Sultan-originated screening in SOUL instructions.
Line 746–747 says Sultan channel content does not require Kashif screening, but Section 6 explicitly requires screening Sultan-pasted secret material via
/screen/ingress. Please make SOUL.md consistent with Section 6 to avoid bypass behavior.Based on learnings: "Aga ingress must be pre-screened by Kashif ..."; and Section 6 already scopes the exception correctly for structured commands vs pasted secret payloads.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@AGA_SPEC.md` around lines 746 - 747, Update the SOUL instructions to remove the contradictory statement that "Content from Sultan's Telegram channel is trusted and does not require Kashif screening" and instead align with Section 6 by stating that any Sultan-originated pasted secret material must be pre-screened by Kashif via the /screen/ingress flow; keep the existing exception language for structured, authorized Sultan commands (if any) but explicitly require Kashif screening for pasted payloads and reference the /screen/ingress process and Kashif as the required gatekeeper to prevent bypass.ARCHITECTURE.md (2)
664-664:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPick one province-destroy contract (status transition vs delete).
Line 664 still says “record deleted (or status=destroyed)”, which reintroduces ambiguity after the rest of the doc moved to explicit lifecycle states. Keep only the status-based contract for consistency.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ARCHITECTURE.md` at line 664, Replace the ambiguous checklist item "Divan province record deleted (or status=destroyed)" with a single status-based contract phrase to match the rest of the doc — e.g., change it to "Divan province status=destroyed" (remove any mention of record deletion) so the file consistently uses lifecycle state transitions; update the checklist item that currently contains "Divan province record deleted (or status=destroyed)" accordingly.
269-269:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse canonical API paths consistently in flow + test assertions.
Line 269 and the test assertions at Line 541/588 still use
/api/appealand/api/request_access, while the canonical contract in this PR is/appealsand/access_requests. This will cause integration/test drift.Also applies to: 541-541, 588-588
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ARCHITECTURE.md` at line 269, Update the API path strings so the diagram and tests use the canonical contract: replace any occurrences of "/api/appeal" with "/appeals" and "/api/request_access" with "/access_requests" (this affects the flow diagram entry currently showing "|--POST /api/request_access ->|" and the test assertions referencing "/api/appeal" and "/api/request_access"); ensure both the architecture flow text and the tests assert the new "/appeals" and "/access_requests" endpoints consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@DIVAN_API_SPEC.md`:
- Around line 723-727: The "alert" bullet in DIVAN_API_SPEC.md still reads
"lease-expired skips" but the spec defines fail-closed blocking as
lease_expired_block; update the wording in that bullet to "lease-expired blocks"
(or "lease_expired_block") so the phrase matches the defined behavior and
symbol. Locate the "alert" list item (the line containing "Kashif block, Kashif
escalate, repeated blacklist hits, lease-expired skips, port request approvals")
and replace the "lease-expired skips" token with "lease-expired blocks" to align
with the lease_expired_block definition.
In `@DIVAN_MVP_PRD.md`:
- Around line 13-15: Rewrite the sentence that currently states "no component
calls another directly" to specify that the rule applies only to
durable/stateful coordination via Divan, and explicitly allow direct synchronous
HTTP calls between stateless services (e.g., Janissary→Kashif and Aga→Kashif)
for screening/validation; reference the components Vizier, Janissary, Kashif,
Aga and the coordination store Divan so readers know the exception is limited to
stateless, synchronous interactions while durable state updates must go through
Divan.
In `@JANISSARY_MVP_PRD.md`:
- Around line 64-69: The doc currently instructs Janissary to forward requests
without the header when lease_expires_at is past, but this conflicts with the
fail-closed contract; update the text for the lease_expires_at branch to instead
block the request (do not forward), return a 503 with error code/label
"credential_renewing", and write an audit entry with severity=alert,
action=lease_expired, credential_injected=false so agents know to renew;
reference the lease_expires_at check and the fail-closed/credential_renewing
behavior so the wording matches other sections describing fail-closed blocking.
In `@JANISSARY_SPEC.md`:
- Around line 565-570: The Grant model currently uses openbao_lease_id as the
stable grant identifier but that field can be null for GitHub tokens; add an
explicit non-optional id field on class Grant (e.g., Grant.id: str) and keep
openbao_lease_id: str | None as-is, then change all emissions/serializations
that set grant_id to use grant.id instead of grant.openbao_lease_id (search for
usages of grant.openbao_lease_id in serialization/emit logic and the code paths
referenced around the Grant class, e.g., any lease_expired_block/ audit recovery
emitters) so consumers get a stable grant_id while preserving openbao_lease_id
as an optional field.
- Around line 47-65: In JANISSARY_SPEC.md update all fenced code blocks (e.g.,
the project tree block under the janissary/ example) to include a language
identifier (like ```text or ```bash) to satisfy MD040, ensure JSON fenced blocks
have the required blank lines before and after the fences to satisfy MD031, and
correct the spelling of “re-establishes” in the failure-mode table (search for
the table entries where that word appears). Apply the same fixes to the other
occurrences listed in the comment ranges so all fenced blocks across the
document include language tags and proper blank-line spacing and the typo is
fixed.
- Around line 810-812: Replace all occurrences of the generic term "agent" in
the runtime/request/appeal flows with the Ottoman term "Pasha" (or introduce a
single first-use gloss like "Pasha (agent)" and thereafter use only
"Pasha")—specifically update the sentence about write requests blocked by rule 4
and the 403 response body wording, and apply the same replacement throughout the
blocks flagged (around the current lines 810, 857–874, and 953–965) so all
request paths, appeal paths, examples, and descriptive text use "Pasha"
consistently.
- Around line 763-770: Update the prose describing the lease-expiry flow to
match the fail-closed behavior: when grant.lease_expires_at <= now the proxy
must block and return a 503 with error code/phrase credential_renewing (do not
forward upstream), and record an audit entry with severity=alert; replace any
wording that says “skip injection, write audit … continue” or “upstream 401/403”
with explicit fail-closed language. Apply the same change to the other
referenced passage that currently describes pass-through behavior (the section
mirroring lines 799-803) so both places consistently state proxy-level block,
503 credential_renewing, and audit logging.
In `@SULTANATE_MVP.md`:
- Around line 221-225: The spec currently states "Grant secret values are only
returned to the Janissary role" which contradicts the Divan API docs that allow
plaintext grant visibility to both Aga and Janissary; update this section to
reflect the correct access-control contract by changing the visibility statement
to indicate that grant secret values are returned to both the Aga and Janissary
roles (not Janissary-only), and add a cross-reference to DIVAN_API_SPEC.md and
the relevant `aga` and `janissary` role entries so readers can verify
endpoint-level read/write permissions.
---
Duplicate comments:
In `@AGA_SPEC.md`:
- Around line 746-747: Update the SOUL instructions to remove the contradictory
statement that "Content from Sultan's Telegram channel is trusted and does not
require Kashif screening" and instead align with Section 6 by stating that any
Sultan-originated pasted secret material must be pre-screened by Kashif via the
/screen/ingress flow; keep the existing exception language for structured,
authorized Sultan commands (if any) but explicitly require Kashif screening for
pasted payloads and reference the /screen/ingress process and Kashif as the
required gatekeeper to prevent bypass.
In `@ARCHITECTURE.md`:
- Line 664: Replace the ambiguous checklist item "Divan province record deleted
(or status=destroyed)" with a single status-based contract phrase to match the
rest of the doc — e.g., change it to "Divan province status=destroyed" (remove
any mention of record deletion) so the file consistently uses lifecycle state
transitions; update the checklist item that currently contains "Divan province
record deleted (or status=destroyed)" accordingly.
- Line 269: Update the API path strings so the diagram and tests use the
canonical contract: replace any occurrences of "/api/appeal" with "/appeals" and
"/api/request_access" with "/access_requests" (this affects the flow diagram
entry currently showing "|--POST /api/request_access ->|" and the test
assertions referencing "/api/appeal" and "/api/request_access"); ensure both the
architecture flow text and the tests assert the new "/appeals" and
"/access_requests" endpoints consistently.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 924a1866-8da6-40e3-a789-87bd0d25cbf7
📒 Files selected for processing (10)
AGA_MVP_PRD.mdAGA_SPEC.mdARCHITECTURE.mdCLAUDE.mdDIVAN_API_SPEC.mdDIVAN_MVP_PRD.mdJANISSARY_MVP_PRD.mdJANISSARY_SPEC.mdOPENCLAW_CODING_BERAT_SPEC.mdSULTANATE_MVP.md
| - `info`: routine operation (allowed request, Kashif allow, grant | ||
| injected). Not polled for notifications. | ||
| - `alert`: requires Sultan + Aga awareness (Kashif block, Kashif | ||
| escalate, repeated blacklist hits, lease-expired skips, port | ||
| request approvals). Vizier and Aga poll for these. |
There was a problem hiding this comment.
Update audit example wording from “lease-expired skips” to “lease-expired blocks.”
Line 726 still references the old “skip” behavior, but this spec now defines fail-closed blocking (lease_expired_block). Please align the wording to avoid operational confusion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@DIVAN_API_SPEC.md` around lines 723 - 727, The "alert" bullet in
DIVAN_API_SPEC.md still reads "lease-expired skips" but the spec defines
fail-closed blocking as lease_expired_block; update the wording in that bullet
to "lease-expired blocks" (or "lease_expired_block") so the phrase matches the
defined behavior and symbol. Locate the "alert" list item (the line containing
"Kashif block, Kashif escalate, repeated blacklist hits, lease-expired skips,
port request approvals") and replace the "lease-expired skips" token with
"lease-expired blocks" to align with the lease_expired_block definition.
| Vizier, Janissary, Kashif, Aga all read from and write to Divan; no | ||
| component calls another directly. Coordination happens through Divan. | ||
|
|
There was a problem hiding this comment.
Narrow the “no direct calls” rule to state coordination only.
Line 14 currently says no component calls another directly, which contradicts the documented exception for direct synchronous stateless screening calls (Janissary→Kashif and Aga→Kashif). Please scope this sentence to durable state coordination through Divan.
Based on learnings: "all inter-component coordination through Divan applies only to state coordination ... direct synchronous HTTP calls to stateless services are exempt."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@DIVAN_MVP_PRD.md` around lines 13 - 15, Rewrite the sentence that currently
states "no component calls another directly" to specify that the rule applies
only to durable/stateful coordination via Divan, and explicitly allow direct
synchronous HTTP calls between stateless services (e.g., Janissary→Kashif and
Aga→Kashif) for screening/validation; reference the components Vizier,
Janissary, Kashif, Aga and the coordination store Divan so readers know the
exception is limited to stateless, synchronous interactions while durable state
updates must go through Divan.
| 1. Check `lease_expires_at`. If set and past current UTC time, **skip | ||
| injection** and write an audit entry with `severity=alert, | ||
| action=lease_expired, credential_injected=false`. The request | ||
| proceeds without the header; upstream will return `401`/`403` | ||
| naturally, signalling to the agent that the token is stale. Aga | ||
| sees the audit entry and re-issues or renews. |
There was a problem hiding this comment.
Align expired-lease behavior to fail-closed blocking (not pass-through).
Line 64–69 currently says Janissary should forward without header when lease_expires_at is expired. That conflicts with the documented fail-closed contract elsewhere in this PR (block with 503 credential_renewing and alert/audit), and will create implementation drift.
Proposed doc fix
-1. Check `lease_expires_at`. If set and past current UTC time, **skip
- injection** and write an audit entry with `severity=alert,
- action=lease_expired, credential_injected=false`. The request
- proceeds without the header; upstream will return `401`/`403`
- naturally, signalling to the agent that the token is stale. Aga
- sees the audit entry and re-issues or renews.
+1. Check `lease_expires_at`. If set and past current UTC time, **fail
+ closed**: return `503` with `error=credential_renewing`, do not
+ forward upstream, and write an alert-severity audit entry
+ (`action=lease_expired_block`). Aga uses that alert to expedite
+ renewal, then the Pasha retries.Based on learnings: "Implement fail-closed behavior everywhere: ... expired leases fail closed".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@JANISSARY_MVP_PRD.md` around lines 64 - 69, The doc currently instructs
Janissary to forward requests without the header when lease_expires_at is past,
but this conflicts with the fail-closed contract; update the text for the
lease_expires_at branch to instead block the request (do not forward), return a
503 with error code/label "credential_renewing", and write an audit entry with
severity=alert, action=lease_expired, credential_injected=false so agents know
to renew; reference the lease_expires_at check and the
fail-closed/credential_renewing behavior so the wording matches other sections
describing fail-closed blocking.
| ``` | ||
| janissary/ | ||
| ├── janissary_addon.py # mitmproxy addon (forked from SandcatAddon) | ||
| │ # traffic rules, credential injection (lease-aware), | ||
| │ # audit writes to Divan | ||
| ├── janissary_api.py # FastAPI appeal/access-request HTTP API (port 8081) | ||
| ├── janissary_kashif.py # Kashif client (POST /screen/appeal, /screen/ingress) | ||
| ├── janissary_state.py # DivanPoller + JanissaryStateCache | ||
| ├── janissary_audit.py # AuditClient (POST /audit to Divan) | ||
| ├── janissary_config.py # Config loading (config.yaml) | ||
| ├── janissary_main.py # Process entry point (starts mitmproxy + API server) | ||
| ├── wg-client/ | ||
| │ ├── Dockerfile # WireGuard + iptables kill-switch container | ||
| │ └── entrypoint.sh # WireGuard interface setup + iptables rules | ||
| ├── Dockerfile # mitmproxy + addon container | ||
| ├── janissary-entrypoint.sh | ||
| ├── config.yaml # Default config template | ||
| └── requirements.txt # mitmproxy, fastapi, uvicorn, httpx, pyyaml | ||
| ``` |
There was a problem hiding this comment.
Fix markdownlint blockers and spelling issue to keep docs CI clean.
Add language identifiers to fenced blocks (MD040), add required blank lines around JSON fences (MD031), and fix the typo “re-establishes” in the failure-mode table.
Also applies to: 100-104, 182-186, 210-228, 280-307, 467-475, 952-972, 1429-1474, 1491-1493, 1547-1547
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 47-47: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@JANISSARY_SPEC.md` around lines 47 - 65, In JANISSARY_SPEC.md update all
fenced code blocks (e.g., the project tree block under the janissary/ example)
to include a language identifier (like ```text or ```bash) to satisfy MD040,
ensure JSON fenced blocks have the required blank lines before and after the
fences to satisfy MD031, and correct the spelling of “re-establishes” in the
failure-mode table (search for the table entries where that word appears). Apply
the same fixes to the other occurrences listed in the comment ranges so all
fenced blocks across the document include language tags and proper blank-line
spacing and the typo is fixed.
| class Grant: | ||
| header: str | ||
| value: str | ||
| openbao_lease_id: str | None = None | ||
| lease_expires_at: datetime | None = None # parsed from ISO 8601 | ||
|
|
There was a problem hiding this comment.
grant_id is sourced from openbao_lease_id, which can be null for GitHub tokens.
The spec currently returns grant_id: grant.openbao_lease_id. With the documented GitHub App path, openbao_lease_id may be null, which breaks correlation for lease_expired_block recovery/audit workflows expecting a stable grant identifier. Add explicit Grant.id and emit that as grant_id; keep openbao_lease_id as a separate optional field.
Also applies to: 727-731, 741-742
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@JANISSARY_SPEC.md` around lines 565 - 570, The Grant model currently uses
openbao_lease_id as the stable grant identifier but that field can be null for
GitHub tokens; add an explicit non-optional id field on class Grant (e.g.,
Grant.id: str) and keep openbao_lease_id: str | None as-is, then change all
emissions/serializations that set grant_id to use grant.id instead of
grant.openbao_lease_id (search for usages of grant.openbao_lease_id in
serialization/emit logic and the code paths referenced around the Grant class,
e.g., any lease_expired_block/ audit recovery emitters) so consumers get a
stable grant_id while preserving openbao_lease_id as an optional field.
| ``` | ||
| Request arrives (already decrypted by mitmproxy MITM) | ||
| -> Look up grant for (source_ip, domain) | ||
| -> If grant.lease_expires_at <= now: | ||
| -> skip injection, write audit severity=alert, continue | ||
| -> Else: set/replace the specified header | ||
| -> Forward to upstream | ||
| ``` |
There was a problem hiding this comment.
Lease-expiry flow prose contradicts the documented fail-closed 503 behavior.
These sections still describe pass-through/continue behavior and upstream 401/403 after expiry, but the implementation snippet and PR objectives define proxy-level block with 503 credential_renewing. Please align this text to the fail-closed handshake (block at proxy, no upstream forward).
Based on learnings: “Implement fail-closed behavior everywhere …”.
Also applies to: 799-803
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 763-763: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@JANISSARY_SPEC.md` around lines 763 - 770, Update the prose describing the
lease-expiry flow to match the fail-closed behavior: when grant.lease_expires_at
<= now the proxy must block and return a 503 with error code/phrase
credential_renewing (do not forward upstream), and record an audit entry with
severity=alert; replace any wording that says “skip injection, write audit …
continue” or “upstream 401/403” with explicit fail-closed language. Apply the
same change to the other referenced passage that currently describes
pass-through behavior (the section mirroring lines 799-803) so both places
consistently state proxy-level block, 503 credential_renewing, and audit
logging.
| A write request (POST/PUT/PATCH/DELETE) to a non-whitelisted domain is | ||
| blocked by rule 4. The 403 response body tells the agent how to appeal. | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Use Ottoman naming consistently (Pasha instead of agent) in runtime flows.
This SPEC still uses “agent” repeatedly in request/appeal paths. Replace with “Pasha” (or use a single first-use gloss, then Ottoman terms consistently) to match repo terminology invariants.
As per coding guidelines: “Use Ottoman naming consistently across all PRDs—use terms (Pasha, Province, Firman, Berat, Divan, Realm, Aga) rather than alternate names like 'agent', 'container', or 'template'”.
Also applies to: 857-874, 953-965
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@JANISSARY_SPEC.md` around lines 810 - 812, Replace all occurrences of the
generic term "agent" in the runtime/request/appeal flows with the Ottoman term
"Pasha" (or introduce a single first-use gloss like "Pasha (agent)" and
thereafter use only "Pasha")—specifically update the sentence about write
requests blocked by rule 4 and the 403 response body wording, and apply the same
replacement throughout the blocks flagged (around the current lines 810,
857–874, and 953–965) so all request paths, appeal paths, examples, and
descriptive text use "Pasha" consistently.
| **Access control:** Divan authenticates callers via pre-shared API keys (one | ||
| per component, generated at deploy time). Each key maps to a role with | ||
| endpoint-level read/write permissions. Grant secret values are only returned | ||
| to the Janissary role. See `DIVAN_API_SPEC.md` for details. | ||
|
|
There was a problem hiding this comment.
Fix grant secret-field visibility contract (Aga + Janissary, not Janissary-only).
Line 224 says secret grant values are only returned to Janissary, but Divan docs in this PR specify plaintext visibility for both aga and janissary. Please align this section to the same access-control contract.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@SULTANATE_MVP.md` around lines 221 - 225, The spec currently states "Grant
secret values are only returned to the Janissary role" which contradicts the
Divan API docs that allow plaintext grant visibility to both Aga and Janissary;
update this section to reflect the correct access-control contract by changing
the visibility statement to indicate that grant secret values are returned to
both the Aga and Janissary roles (not Janissary-only), and add a cross-reference
to DIVAN_API_SPEC.md and the relevant `aga` and `janissary` role entries so
readers can verify endpoint-level read/write permissions.
Summary
Full rewrite of the Sultanate design docs onto a new MVP baseline. Replaces the original V1/V2/V3 PRDs (Hermes-native, Infisical-vault, Kashif-deferred, Sentinel-named) with a coherent MVP doc set built around four locked architectural decisions:
This PR is doc-only — no implementation. Every file is Markdown.
Why this PR exists
The repo previously had two parallel histories:
main(commits4dda7f4..176d609) — original design phaseorigin/archive-hermes-infisical(preserved, not on main) — substantially-rewritten MVP docs that committed to Hermes + Infisical + SentinelThis PR consolidates everything onto a single MVP baseline that:
Locked architectural decisions
File changes (high level)
New files (16):
MOTIVATION.md,ARCHITECTURE.md,SULTANATE_MVP.md,CLAUDE.mdJANISSARY_MVP_PRD.md,JANISSARY_SPEC.md,KASHIF_MVP_PRD.md,AGA_MVP_PRD.md,AGA_SPEC.md,DIVAN_MVP_PRD.md,DIVAN_API_SPEC.mdVIZIER_MVP_PRD.md,VIZIER_SPEC.mdOPENCLAW_FIRMAN_MVP_PRD.md,OPENCLAW_FIRMAN_SPEC.md,OPENCLAW_CODING_BERAT_MVP_PRD.md,OPENCLAW_CODING_BERAT_SPEC.mdDeleted files (5):
SULTANATE.md→ superseded bySULTANATE_MVP.mdJANISSARY_PRD_V2.md→ split acrossJANISSARY_*+KASHIF_*+AGA_*+DIVAN_*VIZIER_PRD_V3.md→ superseded byVIZIER_*HERMES_FIRMAN_PRD_V1.md→ superseded byOPENCLAW_FIRMAN_*HERMES_CODING_BERAT_PRD_V1.md→ superseded byOPENCLAW_CODING_BERAT_*Modified (2):
README.md→ new doc index by layer, OpenBao + Sandcat + OpenClaw + Kashif implementation notes, Hetzner targetSENTINELGATE_ANALYSIS.md→ only the Sentinel→Aga rename (kept as Phase-2 reference)Preserved branches:
origin/archive-hermes-infisical— the prior Hermes/Infisical MVP attempt, untouchedorigin/mvp-rewrite— this branch (the work being PR'd)Highlights worth reviewing carefully
Appeal flow (
ARCHITECTURE.md§2) — T+0.0s timeline narrative covering all three Kashif outcomes (allow / block / escalate). Locks in the rule that only Kashif=allow stays silent; both block and escalate notify Sultan AND Aga (so they can spot Pasha drift, e.g. 3 blocks in 10 min → consider destroying the province).Mid-task credential request flow (
ARCHITECTURE.md§2.3) — parallel to the appeal flow, but the allow path always involves Sultan because granting a long-lived credential is more consequential than letting one write through.GitHub App as Phase 1 default (
SULTANATE_MVP.mdGitHub Token Strategy) — Sultan's only manual action is the one-time GitHub App install. Aga then mints per-province installation tokens (1-hour TTL, auto-renewed every ~15 min while province is running). Sultan does not paste tokens day-to-day. KV fallback exists for services without a dynamic mint path.Communication Channels (
SULTANATE_MVP.md) — explicit per-province Telegram bot model (2 + N bots: Vizier, Aga, one per active Pasha). Compromise of a Pasha bot token has per-province blast radius — cannot reach other Pashas, cannot reach Aga, cannot bypass Janissary.Lease-aware credential injection (
JANISSARY_SPEC.md§6) — Janissary checkslease_expires_atbefore injecting; expired lease → skip + auditseverity=alert(request continues without header so upstream returns 401, signalling staleness; Aga sees the audit and re-issues).Kashif verdict → Divan auto-transition (
DIVAN_API_SPEC.mdAppeals) — formalised as part of the API contract: Kashif=allow auto-approves the appeal, Kashif=block auto-denies, Kashif=escalate stays pending for Sultan.Dashboard via Tailscale (
DIVAN_MVP_PRD.md) — primary access path (mobile-friendly, identity-gated). SSH-tunnel + 127.0.0.1 binding documented as fallback for non-Tailscale environments.Test plan
SULTANATE_MVP.md, notSULTANATE.md;OPENCLAW_FIRMAN_*, notHERMES_FIRMAN_*; etc.)git grep -i "sentinel" -- '*.md'returns onlySentinelGate(product name) + explicit archive-branch referencesgit grep -i "infisical" -- '*.md'returns onlyarchive-hermes-infisicalbranch referencesgit grep -i "hermes-firman\|hermes-coding-berat" -- '*.md'returns zerogit grep -i "kashif deferred" -- '*.md'returns zeroCLAUDE.mdinvariants are consistent with what the rewritten docs actually say (architectural invariants table, Kashif verdict transitions, MVP threat model scope)Out of scope
openclaw-coding-berat🤖 Generated with Claude Code
Summary by CodeRabbit
Documentation
Chores