Portal admin seeding, Agent Management rename, spiffe-proxy admin-key fix, README portal tour #12

Merged
brandwe merged 9 commits into main from dev
May 27, 2026

@brandwe commented May 27, 2026

Summary

Roll-up of the dev work behind the recent demo deployments — naming, portal admin seeding, a real enforcement bug fix, and visual polish in the README.

Highlights

Naming

  • Repo-wide rename aim-* → isp-* for resource prefixes (d4eaa20).
  • Shortened Entra display names + portal branding from "Identity Research for Agent Management Using SPIFFE …" to "Agent Management …" (8a11dbe, 611263d). Long-form name preserved everywhere in docs/ prose.

Portal admin seeding

  • New deploy.sh --with-admin=<upn> flag (repeatable) + ISP_INITIAL_ADMINS env var. Falls back to the signed-in az user when neither is set (2855699).
  • New scripts/portal-members.sh helper: add-admin / add-viewer / remove-admin / remove-viewer / list.
  • Auto-invites missing external emails as B2B guests via Microsoft Graph /v1.0/invitations (7d41dde). Pass --no-invite to disable.
  • README + quickstart updated with the new flag and the "you'll be denied access if you sign in as a different UPN than az login" callout.
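
The precedence described in this list (explicit flag or env var first, signed-in user as the fallback) can be sketched as follows. This is an illustration in Python, not the logic in deploy.sh itself, which is a shell script. The function name `resolve_initial_admins` is hypothetical, and two details are assumptions the PR does not state: that `ISP_INITIAL_ADMINS` is comma-separated, and that flag and env values are merged when both are set.

```python
from typing import Iterable, Mapping


def resolve_initial_admins(
    with_admin_flags: Iterable[str],
    env: Mapping[str, str],
    signed_in_user: str,
) -> list[str]:
    """Return the users to seed into the portal Administrators group."""
    # Values from repeated --with-admin=<upn> flags come first.
    candidates = [value.strip() for value in with_admin_flags]
    # Assumption: ISP_INITIAL_ADMINS holds a comma-separated list.
    candidates += [v.strip() for v in env.get("ISP_INITIAL_ADMINS", "").split(",")]

    admins: list[str] = []
    seen: set[str] = set()
    for candidate in candidates:
        key = candidate.lower()  # de-duplicate case-insensitively
        if candidate and key not in seen:
            seen.add(key)
            admins.append(candidate)

    # Neither source supplied anything: fall back to the signed-in az user.
    return admins or [signed_in_user]
```

With no flag and no env var the result contains only the signed-in user, which is why signing in to the portal as a different UPN leads to the access-denied case the callout warns about.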

Bug fix: portal flips to LIVE FAILED after deploy

  • spiffe-proxy ingress was stripping every X-Spiffe-* header on inbound requests to prevent caller-ID spoofing — but that also removed X-Spiffe-Admin-Key, the shared-secret credential that admin-control-plane forwards on every /mgmt/* call.
  • End result: portal /system-status → ACP /admin/health → budget-backend /mgmt/health returned 401 "Invalid or missing X-Spiffe-Admin-Key header" and the portal badge flipped to LIVE FAILED even though the deployment was healthy.
  • Fix in 3e9397e: allow-list X-Spiffe-Admin-Key while still stripping the spoofable identity-bearing headers. Regression test added in src/spiffe-proxy/internal/inspect/http_test.go.
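
The shape of the fix can be sketched as a prefix strip with a one-entry allow-list. The real change lives in the Go proxy under src/spiffe-proxy; the Python below is only an illustration, and the names `strip_spoofable_headers` and `ALLOWED_SPIFFE_HEADERS` are hypothetical. The header names come from the commit message for the fix.

```python
# Prefix that the ingress strips so callers cannot spoof identity headers
# such as X-SPIFFE-Caller-ID or X-SPIFFE-Trust-Domain.
SPOOFABLE_PREFIX = "x-spiffe-"

# Headers that share the prefix but must survive the strip.
ALLOWED_SPIFFE_HEADERS = frozenset({"x-spiffe-admin-key"})


def strip_spoofable_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers without caller-supplied X-Spiffe-* identity headers."""
    kept: dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered.startswith(SPOOFABLE_PREFIX) and lowered not in ALLOWED_SPIFFE_HEADERS:
            continue  # drop the spoofable header
        kept[name] = value
    return kept
```

Before the fix, the equivalent of the allow-list was empty, so the admin key was dropped along with the identity headers and every /mgmt/* call arrived at budget-backend without its credential.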

README + docs visual polish

  • New enforcement flow hero image rendered full-width under the Enforcement model table (a74e63a).
  • New Portal tour section with the dashboard overview as a hero and clickable thumbnails for Test Calls, Enforcement Layers, and the A2A direct-call example (bf958a7) showing JWT-valid / risk-low traffic correctly denied at the target by a Conditional Access tag mismatch.
  • Same enforcement-flow hero added to docs/index.md so the published GitHub Pages site leads with the visual after this merge.

Why one PR

All eight commits were authored against the same live demo env this evening. Merging them together rebuilds the docs site (.github/workflows/docs.yml only deploys on push to main) so the renamed prefixes, screenshots, and quickstart copy land in one consistent published cut.

Validation done

  • go test ./internal/inspect/... ./internal/rbac/... in src/spiffe-proxy/: pass.
  • python3 -m unittest discover -s scripts/tests: pass.
  • bash -n deploy.sh and bash -n scripts/portal-members.sh: clean.
  • Verified the LIVE FAILED bug against the live gentlesea-c0fa16e3 deployment: /admin/agents returned 200 and /admin/health returned 401 with the exact error message above. The fix was derived directly from that trace.
  • Ran ./scripts/portal-members.sh add-admin <upn> end-to-end including the B2B invite path.

Co-author

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

brandwe and others added 9 commits May 26, 2026 17:57
Replaces the legacy 'aim-' resource/identifier prefix and 'AIM_' env var
prefix with the shorter 'isp-' / 'ISP_' to align with the repo's new
identity-spiffe branding.

- deploy.sh: env-name guard now accepts ^(identity-spiffe|isp-); RG
  discovery searches both rg-identity-spiffe* and rg-isp-*; fixed the
  misleading suggestion that previously recommended a non-conforming name
- infra/main.bicep + main.parameters.bicepparam: resource prefixes, tag
- scripts/, portal/, securityportal-mock/, src/budget-backend/: renamed
  identifiers and env vars
- Docs and CLAUDE.md files updated
- Tests updated (sanitizer input)

Deliberately NOT renamed (separate, deeper refactor):
- SPIFFE trust domains aim.microsoft.com / gcp.aim.microsoft.com /
  aws.aim.microsoft.com
- Go package aimtls in src/spiffe-proxy
- Frontend localStorage keys (aim_log_*, aim_pinned_agents)
- agency.toml [agents.aim]

Validated: bash -n clean, Python compile clean, 16/16 targeted tests pass,
deploy.sh runs past the env guard into azd provisioning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
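
For reference, the env-name guard and resource-group patterns quoted in this commit behave as in the sketch below. deploy.sh is a shell script, so this Python is only an illustration of the two patterns; the function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Pattern quoted in the commit message for the env-name guard.
ENV_NAME_GUARD = re.compile(r"^(identity-spiffe|isp-)")

# Resource-group globs searched during RG discovery.
RG_PATTERNS = ("rg-identity-spiffe*", "rg-isp-*")


def is_valid_env_name(name: str) -> bool:
    """True when the environment name starts with an accepted prefix."""
    return ENV_NAME_GUARD.search(name) is not None


def is_project_resource_group(name: str) -> bool:
    """True when the resource group matches either discovery glob."""
    return any(fnmatchcase(name, pattern) for pattern in RG_PATTERNS)
```

Note that the guard accepts `identity-spiffe` with or without a suffix, while the short form requires the trailing hyphen (`isp-`), so a legacy `aim-` name is rejected.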
The verbose 'Identity Research for Agent Management Using SPIFFE *'
prefix on Entra-stored objects (blueprint, apps, groups, CA policy,
provisioner) was unreadably long in Azure Portal and in preflight
output, e.g.:

  Identity Research for Agent Management Using SPIFFE Budget Backend
    Agents [identity-spiffe]

Renamed Entra-display constants only — the project's prose name stays
'Identity Research for Agent Management Using SPIFFE' in README, docs,
mkdocs, and CLI banners. Only the strings Entra stores get the short
form.

New display names (preflight will now show):
- Blueprint:        Agent Management Budget Backend Agents [<env>]
- Admin group:      Agent Management Administrators
- Viewer group:     Agent Management Viewers
- Management app:   Agent Management Portal - Management [<env>]
- Security app:     Agent Management Portal - Security Portal Mock [<env>]
- Provisioner:      Agent Management Agent ID Provisioner
- CA policy:        Agent Management: Block agents based on risk

Files:
- scripts/entra_scope.py: 5 LEGACY_*_DISPLAY_NAME constants
- scripts/entra_provisioning.py: PROVISIONER_APP_DISPLAY_NAME
- scripts/create-entra-agent-ids.py: PROVISIONER_APP_DISPLAY_NAME
- scripts/create-custom-attributes.py: CA_POLICY_NAME, OLD_CA_POLICY_NAME
  (OLD points at the long-form name so any internal-test deployment
  gets the policy renamed in-place on next provision),
  ATTRIBUTE_SET_DESCRIPTION
- portal/app/clients/graph.py: fetch_ca_policies default display_name_filter
  updated to 'Agent Management:' to match the new CA policy prefix
- deploy.sh: updated user-facing echo referencing the admin group name
- scripts/tests/test_entra_scope.py: 6 expected-string updates

Tests: scripts/tests/test_entra_scope (6) and portal/tests (39) pass.
deploy.sh passes bash -n.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
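
A small sketch of how the new names compose and how a prefix filter would select the renamed CA policy. The helper names are hypothetical, and treating `display_name_filter` as a client-side prefix match is an assumption; the actual `fetch_ca_policies` in portal/app/clients/graph.py may filter differently.

```python
CA_POLICY_PREFIX = "Agent Management:"


def blueprint_display_name(env: str) -> str:
    """Build the short Blueprint display name for an environment."""
    return f"Agent Management Budget Backend Agents [{env}]"


def filter_ca_policies(
    policies: list[dict],
    display_name_filter: str = CA_POLICY_PREFIX,
) -> list[dict]:
    """Keep policies whose displayName starts with the filter (assumed semantics)."""
    return [
        policy
        for policy in policies
        if policy.get("displayName", "").startswith(display_name_filter)
    ]
```

A filter that still used the long-form prefix would silently return no policies after the rename, which is why the default had to change in the same commit.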
Rename Entra-tracked object display names from the long 'Identity Research
for Agent Management Using SPIFFE …' prefix to a shorter 'Agent Management …'
prefix to avoid name-length blowup in scoped-mode environments.

Renamed:
- Blueprint, Portal management app, Security Portal app
- Provisioner app, CA policy, CA-policy filter prefix
- Portal auth groups (Administrators, Viewers)

Portal/security-portal UI also updated to match (titles, sidebar brand,
auth splash, access-denied messages, JWT validator role hint).

Docs retain the long 'Identity Research for Agent Management Using SPIFFE'
name where used as the project/repo description.

The previous CA policy display name is kept as OLD_CA_POLICY_NAME so the
existing cleanup path will delete the prior policy on next provisioning run.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
deploy.sh
- New --with-admin=<upn|oid> flag (repeatable, accepts UPN/email or
  object ID) and ISP_INITIAL_ADMINS env var to seed the portal
  Administrators group with one or more tenant users during deploy.
- Falls back to the signed-in az CLI user when neither is provided,
  preserving prior behavior, and prints a tip pointing at the new
  flag/env so first-time users discover it.
- Resolves UPN→OID via az ad user show, treats 'already a member' as
  success, and surfaces clear messages for skipped/failed entries.
- Updated --help and top-of-file usage comments.

scripts/portal-members.sh
- New helper for post-deploy group membership management:
    add-admin / add-viewer / remove-admin / remove-viewer / list
- Reads ISP_ADMIN_GROUP_ID / ISP_VIEWER_GROUP_ID from azd env.

Docs
- README.md and docs/getting-started/quickstart.md call out the
  --with-admin requirement up front so portal sign-in doesn't fail
  with 'Access Denied' on day one.
- scripts/CLAUDE.md scripts table lists portal-members.sh.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
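
The resolve-then-add step described in this commit can be sketched as below. It is a Python illustration of a flow that deploy.sh implements in shell; the function names and the returned status strings are hypothetical, and the substring used to detect the "already a member" error is an assumption about the CLI's error text. The `az ad user show` and `az ad group member add` commands are the standard Azure CLI ones.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def run_az(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run an az CLI command and capture its output as text."""
    return subprocess.run(["az", *args], capture_output=True, text=True)


def add_admin(user: str, group_id: str, run: Runner = run_az) -> str:
    """Return 'added', 'already-member', 'skipped', or 'failed'."""
    # Resolve UPN (or object ID) to an object ID.
    lookup = run(["ad", "user", "show", "--id", user, "--query", "id", "-o", "tsv"])
    if lookup.returncode != 0:
        return "skipped"  # user could not be resolved in the tenant

    object_id = lookup.stdout.strip()
    add = run(["ad", "group", "member", "add", "--group", group_id, "--member-id", object_id])
    if add.returncode == 0:
        return "added"
    # Assumption: the duplicate-member error contains this phrase.
    if "already exist" in add.stderr:
        return "already-member"  # treated as success by the caller
    return "failed"
```

Passing the runner as a parameter keeps the sketch testable without a signed-in az session.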
When ./scripts/portal-members.sh add-admin/add-viewer or deploy.sh
--with-admin=<email> targets a UPN that doesn't exist in the tenant,
send a Microsoft Graph B2B invitation via /v1.0/invitations and then
add the new guest object ID to the portal group.

Pass --no-invite to portal-members.sh to disable. Requires the caller
to hold the User.Invite.All Graph permission.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
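
A minimal sketch of the invitation call, assuming a bearer token is already available. The endpoint and the `invitedUserEmailAddress`, `inviteRedirectUrl`, and `sendInvitationMessage` fields are part of the Microsoft Graph invitation resource; the function names and the default redirect URL are illustrative choices, not taken from the repository.

```python
import json
import urllib.request

GRAPH_INVITATIONS_URL = "https://graph.microsoft.com/v1.0/invitations"


def build_invitation_request(
    email: str,
    access_token: str,
    redirect_url: str = "https://myapplications.microsoft.com",  # assumed default
) -> urllib.request.Request:
    """Build (but do not send) the POST that invites an external user as a B2B guest."""
    payload = {
        "invitedUserEmailAddress": email,
        "inviteRedirectUrl": redirect_url,
        "sendInvitationMessage": True,
    }
    return urllib.request.Request(
        GRAPH_INVITATIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def guest_object_id(invitation_response: dict) -> str:
    """Extract the new guest's object ID from the invitation response body."""
    return invitation_response["invitedUser"]["id"]
```

The object ID returned by `guest_object_id` is what would then be added to the portal group. Sending the request with `urllib.request.urlopen` requires a token that carries User.Invite.All.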
…aders

The ingress proxy was stripping every X-Spiffe-* header to prevent a
caller from spoofing X-SPIFFE-Caller-ID / X-SPIFFE-Trust-Domain. The
overly-broad prefix match also stripped X-Spiffe-Admin-Key — the
shared-secret header that admin-control-plane forwards to
budget-backend on every /mgmt/* request.

End result: portal /system-status calls admin-control-plane
/admin/health, which proxies to budget-backend /mgmt/health, which
returns 401 "Invalid or missing X-Spiffe-Admin-Key header" because
the sidecar removed the credential mid-flight. The portal badge
flips to LIVE FAILED and the dashboard cards show '?'.

Fix: allow-list X-Spiffe-Admin-Key while still stripping the
identity-bearing X-SPIFFE-* headers. Added a regression test in
internal/inspect/http_test.go.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
README:
- Place the live 'Request Enforcement Flow' diagram, taken from the portal's
  Enforcement Layers page, directly under the 'Enforcement model' table so
  readers see all four layers (mTLS → RBAC → OAuth/JWT → CA) visually
  before reading the words.
- Add a 'Portal tour' section with the dashboard overview rendered
  full-width as the hero, plus clickable thumbnails for Test Calls and
  Enforcement Layers. Both link back to the full-size images.

docs/index.md:
- Mirror the enforcement-flow hero on the published landing page so the
  GitHub Pages site leads with the same visual.

Images stored under docs/assets/portal/ so they ship with both the
README and the mkdocs site.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Shows BudgetReport (finance) → EmployeeMenus (HR) blocked at the target
by Conditional Access tag mismatch — JWT validates, risk is low, but
the agent tags don't match and the response is a clean
'403 agent_tag_mismatch' that names the deciding enforcement layer.

Rounds out the Portal tour: SPIFFE-tunneled call (Test Calls) +
per-layer status (Enforcement Layers) + cross-agent HTTPS path (A2A).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
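
The behaviour in the screenshot (earlier layers pass, the Conditional Access layer denies and is named in the response) can be modelled as an ordered chain that stops at the first failing layer. Everything below is a hypothetical sketch: the function names, the response shape, and the pass-through checks for the first three layers are assumptions; only the layer order, the 403 status, and the `agent_tag_mismatch` code come from the PR.

```python
from typing import Callable, Optional

# A check returns None on success or (status, error_code) on denial.
Check = Callable[[dict], Optional[tuple[int, str]]]


def passes(call: dict) -> Optional[tuple[int, str]]:
    """Placeholder for layers that succeed in this example."""
    return None


def ca_tag_check(call: dict) -> Optional[tuple[int, str]]:
    """Deny when the caller's agent tag differs from the target's."""
    if call["caller_tag"] != call["target_tag"]:
        return (403, "agent_tag_mismatch")
    return None


# Layer order as listed in the README commit: mTLS, RBAC, OAuth/JWT, CA.
LAYERS: list[tuple[str, Check]] = [
    ("mTLS", passes),
    ("RBAC", passes),
    ("OAuth/JWT", passes),
    ("CA", ca_tag_check),
]


def evaluate(call: dict, layers: list[tuple[str, Check]] = LAYERS) -> dict:
    """Run the layers in order and report the first one that denies."""
    for name, check in layers:
        denial = check(call)
        if denial is not None:
            status, error = denial
            return {"status": status, "error": error, "layer": name}
    return {"status": 200}
```

For the BudgetReport (finance) to EmployeeMenus (HR) call, the first three layers pass and the result names CA as the deciding layer.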
By default azd down only removes Azure resources. The Blueprint app,
its child Agent Identities + federated credentials, the Provisioner,
Portal, and Security Portal Mock apps, and the
Administrators/Viewers groups are tenant directory objects and
survive teardown so the next ./deploy.sh can reuse them.

Pass --purge-entra to also delete those objects (idempotent — skips
anything already tombstoned) and clear the matching azd env vars so
the next deploy provisions fresh objects from scratch. Required for
a true clean-room first-run test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
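
The idempotent purge described here can be sketched as below. This is a hypothetical Python illustration, not the script's code: `purge_entra` and the `delete` callback are invented for the example, and signalling an already-deleted object with `LookupError` is an assumption. The env var names are the ones portal-members.sh reads.

```python
from typing import Callable


def purge_entra(
    tracked: dict[str, str],
    delete: Callable[[str], None],
    env: dict[str, str],
) -> None:
    """Delete each tracked directory object and clear its azd env var.

    tracked maps an env var name (e.g. ISP_ADMIN_GROUP_ID) to an object ID.
    """
    for env_var, object_id in tracked.items():
        try:
            delete(object_id)
        except LookupError:
            # Object is already gone (tombstoned): skip, do not fail.
            pass
        # Clear the variable so the next deploy provisions a fresh object.
        env.pop(env_var, None)
```

Because missing objects and missing variables are both tolerated, running the purge twice leaves the same end state as running it once.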
@brandwe merged commit a4e588b into main May 27, 2026
14 checks passed
@brandwe deleted the dev branch May 27, 2026 18:00