docs: Add production guide #302

Open
zeljkoX wants to merge 6 commits into main from 299-production-guide


@zeljkoX zeljkoX commented Jun 24, 2026

Collaborator

Implements #299 — an end-to-end production deployment guide under docs/guides/production/.

⚠️ Depends on #301: merge after it gets merged

This guide is written as if #293 and #301 (horizontal scaling) are merged. It deliberately documents features not yet on main:

  • horizontal scaling: Postgres-backed shared coordination, the coordination mode=shared … startup log line, GUARDIAN_MAX_REPLICAS, and the prod-stage cursor-secret/filesystem guards (from feat: 010 scalability improvements #301).

Automated reviewers (Copilot) flag these as "not in the codebase" — that is expected and by design. Each was cross-checked against the #293/#301 contracts and is correct for the post-merge world. This PR should merge only after #293 and #301 land.

Wording that is enforced by the committed Compose stacks (the ${VAR:?} cursor-secret check) rather than the server has been corrected to say so, so the Compose guides are accurate against main today.
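
For readers unfamiliar with the `${VAR:?}` form: Compose's required-variable interpolation mirrors POSIX shell parameter expansion, which fails when the variable is unset or empty. The shell sketch below illustrates that behavior only. `check_cursor_secret` and the `*_result` variables are hypothetical names, the placeholder key is not a real secret, and the actual check lives in the committed Compose files rather than in a script like this.

```shell
#!/bin/sh
# Sketch: how a ${VAR:?message} required-variable check behaves.
# GUARDIAN_DASHBOARD_CURSOR_SECRET is the variable discussed in this PR;
# check_cursor_secret is a hypothetical helper used only for illustration.

check_cursor_secret() {
  # Run the expansion in a subshell so a failure does not terminate the caller.
  ( : "${GUARDIAN_DASHBOARD_CURSOR_SECRET:?must be set}" ) 2>/dev/null
}

# Case 1: variable unset, the expansion fails.
unset GUARDIAN_DASHBOARD_CURSOR_SECRET
if check_cursor_secret; then unset_result=ok; else unset_result=blocked; fi

# Case 2: variable set but empty, the ":" in ":?" makes this fail as well.
GUARDIAN_DASHBOARD_CURSOR_SECRET=""
if check_cursor_secret; then empty_result=ok; else empty_result=blocked; fi

# Case 3: variable set to a non-empty placeholder, the expansion succeeds.
GUARDIAN_DASHBOARD_CURSOR_SECRET="0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
if check_cursor_secret; then set_result=ok; else set_result=blocked; fi

echo "unset=$unset_result empty=$empty_result set=$set_result"
```

The point of the sketch is that the failure happens at expansion time, before any container starts, which is why the guides attribute the requirement to the Compose stack rather than to the server.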


Summary by CodeRabbit

  • New Features

    • Added a full production deployment guide with step-by-step instructions for running Guardian in AWS or via Docker Compose.
    • Added a self-hosted production setup with example environment files and container configuration.
  • Documentation

    • Expanded production checklist links and added clearer guidance on required secrets, validation steps, and troubleshooting.
    • Updated guide index pages to surface the new production deployment path and related setup details.

@zeljkoX zeljkoX requested a review from Copilot June 24, 2026 12:13
coderabbitai Bot commented Jun 24, 2026


Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 332173a3-b6e5-4c30-b895-1884d1cb3517

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/guides/aws-signers/.env.example`:
- Around line 34-36: Reword the cursor signing key note in the .env.example
guidance so it does not claim the prod stage itself refuses to start; the
enforcement happens through Compose variable expansion in the docker-compose
setup. Update the comment near the pagination cursor secret description to match
this behavior, keeping the requirement for a 32-byte hex key but removing any
runtime-startup wording tied to prod.

In `@docs/guides/aws-signers/README.md`:
- Around line 65-67: The README wording for GUARDIAN_DASHBOARD_CURSOR_SECRET
should be updated because the current phrasing incorrectly implies the server
itself enforces the startup failure. Rephrase the explanation in the aws-signers
guide to make it clear that the immediate check happens in the Compose
required-variable validation driven by GUARDIAN_ENV=prod, and that this is what
blocks startup when the secret is missing.

In `@docs/guides/production/docker-compose.yml`:
- Line 24: The docker-compose service is publishing the metrics port externally
via the 9464 mapping, which exposes the endpoint on all interfaces by default.
Update the compose configuration for the affected service entries to bind
metrics to loopback only or remove the published port entirely, and keep the
change consistent across all referenced instances in the docker-compose file.
- Around line 19-20: The production docker compose service is defaulting to an
unstable image tag via the image field that references GUARDIAN_VERSION with a
latest fallback, which can cause non-reproducible deployments. Update the
compose configuration to require an explicit version tag for the guardian image
and remove the latest default from the image reference in the docker-compose
setup, keeping the change localized to the service definition that uses
pull_policy.

In `@docs/superpowers/specs/2026-06-24-production-guide-design.md`:
- Around line 17-25: The spec currently states an AWS-only scope and explicitly
says there is no committed Compose track, which conflicts with the new Docker
Compose deliverables. Update the scope/non-goals text in the production guide
spec to match the implemented `docs/guides/production/docker-compose.yml` and
related README content, using the existing “Scope” section and any references to
`PRODUCTION.md`/Compose so acceptance criteria are consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 497b1ace-da82-47c9-9755-a3f8306bdef8

📥 Commits

Reviewing files that changed from the base of the PR and between 57a43d1 and e2dacda.

📒 Files selected for processing (9)
  • docs/PRODUCTION.md
  • docs/guides/README.md
  • docs/guides/aws-signers/.env.example
  • docs/guides/aws-signers/README.md
  • docs/guides/aws-signers/docker-compose.yml
  • docs/guides/production/.env.example
  • docs/guides/production/README.md
  • docs/guides/production/docker-compose.yml
  • docs/superpowers/specs/2026-06-24-production-guide-design.md

Comment thread docs/guides/aws-signers/.env.example Outdated
Comment on lines +34 to +36
# Required (GUARDIAN_ENV=prod): 32-byte hex (64 chars) signing key for dashboard
# pagination cursors. The prod stage refuses to start if this is unset. Generate:
# openssl rand -hex 32


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Fix enforcement wording for missing cursor secret

Lines 35-36 currently imply the prod stage itself refuses startup. In this guide, the hard requirement is enforced by Compose variable expansion (see docs/guides/aws-signers/docker-compose.yml Line 45). Please reword to avoid runtime-behavior mismatch.

Suggested doc tweak
-# pagination cursors. The prod stage refuses to start if this is unset. Generate:
+# pagination cursors. This Compose stack requires it to be set before startup. Generate:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Required (GUARDIAN_ENV=prod): 32-byte hex (64 chars) signing key for dashboard
-# pagination cursors. The prod stage refuses to start if this is unset. Generate:
-# openssl rand -hex 32
+# Required (GUARDIAN_ENV=prod): 32-byte hex (64 chars) signing key for dashboard
+# pagination cursors. This Compose stack requires it to be set before startup. Generate:
+# openssl rand -hex 32
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/aws-signers/.env.example` around lines 34 - 36, Reword the cursor
signing key note in the .env.example guidance so it does not claim the prod
stage itself refuses to start; the enforcement happens through Compose variable
expansion in the docker-compose setup. Update the comment near the pagination
cursor secret description to match this behavior, keeping the requirement for a
32-byte hex key but removing any runtime-startup wording tied to prod.
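
As a companion to the `.env.example` comment in this thread, here is a small shell sketch that generates a key with `openssl rand -hex 32` and confirms it is 64 hexadecimal characters. `is_cursor_secret` is a hypothetical helper, not part of the repository, and the sketch assumes nothing about how the server itself validates the value.

```shell
#!/bin/sh
# Sketch: validate that a value looks like a 32-byte hex key (64 hex chars).
# is_cursor_secret is a hypothetical helper for illustration only.

is_cursor_secret() {
  # Reject any value that contains a non-hex character.
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  # Accept only values that are exactly 64 characters long.
  [ "${#1}" -eq 64 ]
}

# Generate a key the way the .env.example comment suggests, if openssl exists.
if command -v openssl >/dev/null 2>&1; then
  secret=$(openssl rand -hex 32)
  if is_cursor_secret "$secret"; then
    echo "generated a 64-character hex key"
  fi
fi
```

A check like this could run before `docker compose up` to catch a truncated or mistyped value early, though the Compose `${VAR:?}` expansion only verifies that the variable is non-empty.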

Comment thread docs/guides/aws-signers/README.md Outdated
Comment thread docs/guides/production/docker-compose.yml Outdated
Comment thread docs/guides/production/docker-compose.yml Outdated
Comment thread docs/superpowers/specs/2026-06-24-production-guide-design.md Outdated

Copilot AI left a comment

Contributor


Pull request overview

Adds a new end-to-end “Production deployment” guide under docs/guides/production/, and wires it into the docs entry points so operators can follow a single step-by-step walkthrough from docs/PRODUCTION.md / docs/guides/README.md.

Changes:

  • Add docs/guides/production/README.md plus a companion Compose stack and .env.example.
  • Link the new guide from docs/PRODUCTION.md and list it in docs/guides/README.md.
  • Update the existing aws-signers guide’s Compose setup to include the dashboard cursor secret.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 22 comments.

Show a summary per file

| File | Description |
| --- | --- |
| docs/superpowers/specs/2026-06-24-production-guide-design.md | Design/spec notes for the production guide deliverable and scope. |
| docs/PRODUCTION.md | Adds prominent link to the new production deployment guide and includes it in the “Where details live” table. |
| docs/guides/README.md | Adds the new Production deployment guide to the guides index and explains its artifacts. |
| docs/guides/production/README.md | New step-by-step production walkthrough (AWS ECS/Fargate + optional Compose track). |
| docs/guides/production/docker-compose.yml | New Compose stack for a self-hosted, single-replica run using AWS-managed secrets. |
| docs/guides/production/.env.example | Example environment file for the new production Compose track. |
| docs/guides/aws-signers/README.md | Documents the new required env var for the aws-signers Compose setup and points readers to the production guide. |
| docs/guides/aws-signers/docker-compose.yml | Adds GUARDIAN_DASHBOARD_CURSOR_SECRET to the aws-signers Compose environment. |
| docs/guides/aws-signers/.env.example | Adds GUARDIAN_DASHBOARD_CURSOR_SECRET to the aws-signers example env file. |


Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md Outdated
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
Comment thread docs/guides/production/README.md
codecov-commenter commented Jun 24, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.95%. Comparing base (5b3f9e9) to head (babeb54).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #302      +/-   ##
==========================================
+ Coverage   76.64%   76.95%   +0.30%     
==========================================
  Files         155      160       +5     
  Lines       27745    28565     +820     
==========================================
+ Hits        21264    21981     +717     
- Misses       6481     6584     +103     

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 89591a3...babeb54. Read the comment docs.


- Attribute cursor-secret enforcement to the Compose ${VAR:?} expansion rather
  than a server-side prod guard (the server cursor secret is optional on main;
  the hard requirement lands with #301).
- Production compose: require an explicit GUARDIAN_VERSION (drop the :latest
  default) and bind the metrics port to loopback (127.0.0.1:9464).
- Track B smoke: drop the unconfirmed "storage encryption" log grep; rely on
  ECDSA-signer-ready + clean startup.
- Remove the design-spec artifact from the PR (brainstorming doc, not repo
  content).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
zeljkoX commented Jun 24, 2026

Collaborator Author

Review response

Fixed (valid regardless of merge order):

  • Cursor-secret wording in aws-signers/ and production/ now attributes the requirement to the Compose ${VAR:?} expansion, not a server-side prod guard (the server cursor secret is optional on main; the hard requirement arrives with feat: 010 scalability improvements #301).
  • production/docker-compose.yml: dropped the :latest default in favor of a required GUARDIAN_VERSION, and bound the metrics port to loopback (127.0.0.1:9464).
  • Track B smoke no longer greps an unconfirmed storage encryption log line.
  • Removed the design-spec artifact from the PR (resolves the spec-scope findings).
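
To illustrate the two `production/docker-compose.yml` fixes above, here is a hedged shell sketch of a pre-flight check. Both helper names are hypothetical and not part of this PR. The sketch assumes the Compose short port syntax `HOST_IP:HOST_PORT:CONTAINER_PORT`, and the container-side port in the example mappings is an assumption, since only `127.0.0.1:9464` is stated here.

```shell
#!/bin/sh
# Sketch: pre-flight checks mirroring the two Compose fixes.
# require_pinned_version and is_loopback_mapping are hypothetical helpers.

require_pinned_version() {
  # Fail when the version is empty or the floating "latest" tag.
  case "${1:-}" in
    ""|latest) return 1 ;;
    *) return 0 ;;
  esac
}

is_loopback_mapping() {
  # Succeed only for mappings that bind the host side to 127.0.0.1.
  case "$1" in
    127.0.0.1:*:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: a pinned tag with a loopback-only metrics mapping passes both checks.
if require_pinned_version "v1.2.3" && is_loopback_mapping "127.0.0.1:9464:9464"; then
  echo "pre-flight checks passed"
fi
```

A mapping such as `9464:9464` has no host IP and is published on all interfaces by default, which is the exposure the review flagged.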

Copilot findings re: missing features (storage encryption envs/commands, GUARDIAN_MAX_REPLICAS, coordination mode=shared … log line, Postgres-backed sessions, prod-stage guards): these are by design — the guide is written as if #293 and #301 are merged, and each item was cross-checked against those PRs' contracts (none are post-merge errors). See the PR description; this lands after #293 + #301.

zeljkoX and others added 2 commits June 24, 2026 15:18
The allowlist section now states there is no operator-key bootstrap (the server
only holds operator public keys) and points at DASHBOARD.md "Enrolling an
operator" for how an operator generates their own Falcon keypair.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Show both allowlist options: Terraform-managed from a public-key JSON list
(dashboard:read only) vs. an externally-managed Secrets Manager secret via
GUARDIAN_OPERATOR_PUBLIC_KEYS_SECRET_ARN (runtime _SECRET_ID), which is the only
path that can grant accounts:pause via object entries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@zeljkoX zeljkoX moved this from Backlog to In Progress in OZ Development for Miden Jun 24, 2026
@zeljkoX zeljkoX marked this pull request as ready for review June 25, 2026 10:22
@zeljkoX zeljkoX requested a review from haseebrabbani as a code owner June 25, 2026 10:22

Labels

cla: allowlist
documentation (Improvements or additions to documentation)

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Add docs/guides/production/ — end-to-end production walkthrough

3 participants