Skip to content

fix(api): bind v1 role guard to HMAC token, not just header#222

Closed
antfleet-ops wants to merge 1 commit into
rennf93:masterfrom
antfleet-ops:fix/v1-role-auth-bypass
Closed

fix(api): bind v1 role guard to HMAC token, not just header#222
antfleet-ops wants to merge 1 commit into
rennf93:masterfrom
antfleet-ops:fix/v1-role-auth-bypass

Conversation

@antfleet-ops

Copy link
Copy Markdown

Summary

roboco/api/routes/v1/_role_dep.py::_require_roles trusts the X-Agent-Role header without verifying the agent's HMAC token. The main agent-context dep (get_agent_context in roboco/api/deps.py) does enforce HMAC via _check_agent_auth_token when ROBOCO_AGENT_AUTH_REQUIRED=true, but the v1 flow routers use _require_roles instead and bypass that path entirely.

If a caller can reach a v1 endpoint, they can pick any allowed role with just X-Agent-Role: developer (or any other allowed value) and pass the guard without proving they hold the matching token.

What changed

_require_roles now calls the same _check_agent_auth_token primitive that get_agent_context uses, before comparing the role to the allowed set:

  • ROBOCO_AGENT_AUTH_REQUIRED=true + missing token → 401
  • Any token presented + invalid HMAC → 401 (verified in dev too)
  • Valid token + role not in allowed set → 403 (unchanged behavior)
  • Auth-required unset (dev default) + role header alone → still accepted

This keeps the HMAC contract centralized in roboco.api.deps (the model's recommendation and is the minimum change that closes the gap. The import is deferred inside the closure to avoid pulling the full deps module at router-registration time.

Tests

  • New: — auth required + missing token → 401
  • New: — invalid token → 401 regardless of
  • Existing tests pass unchanged (they don't set , so the token check is a no-op).

How it was found

AntFleet's two-model security review (Claude Opus 4.7 + GPT-5) on a mirror of this repo. Both models independently flagged the same defect.

`_require_roles` previously trusted only the `X-Agent-Role` header.
When ROBOCO_AGENT_AUTH_REQUIRED is enabled (prod), the main agent
context dep (`get_agent_context`) enforces HMAC via
`_check_agent_auth_token` — but the v1 role guards bypass that path
entirely and check the raw header. A caller with network reach to a v1
flow endpoint could pick any allowed role by setting just `X-Agent-Role`
and pass the guard without proving they hold the matching token.

Add the same token-validation step `get_agent_context` uses to
`_require_roles` so the HMAC binding is enforced uniformly:
- ROBOCO_AGENT_AUTH_REQUIRED=true + no token → 401
- Token presented + invalid HMAC → 401 (verified in dev too)
- Token presented + valid HMAC + wrong role → 403
- Auth required-off + role header alone → preserved (dev parity)

The role comparison itself is unchanged. The token validation primitive
stays centralized in `roboco.api.deps` — `_role_dep` defers the import
inside the closure to avoid pulling the full deps module at router
registration time.

Regression tests:
- New: auth-required + missing token → 401
- New: token presented + invalid HMAC → 401
- Existing: dev mode without auth required still accepts role-only

Found via AntFleet two-model bench review (Claude Opus 4.7 + GPT-5):
AntFleet/bench-roboco#3 (comment)
@github-actions

Copy link
Copy Markdown

Thanks for opening your first pull request on RoboCo!

Quick checklist before review (most of these are enforced by CI, but worth a glance):

  • make quality — ruff format check, ruff check, mypy, pytest (≥80% coverage), and the rest of the gate
  • Panel changes pass pnpm lint and pnpm exec tsc --noEmit (run from panel/)
  • No # noqa / # type: ignore shortcuts; pre-existing violations in touched files are fixed
  • Added an entry under ## [Unreleased] in CHANGELOG.md
  • Signed the CLA (the bot will prompt you on this PR)
  • Signed your commits — master requires verified signatures (SSH signing setup)
  • Updated any affected docs under docs/

See CONTRIBUTING.md for the full workflow and the Code of Conduct for the community standards we follow.

Welcome aboard — a maintainer will review shortly.

@github-actions

Copy link
Copy Markdown


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@github-actions github-actions Bot added area: api Touches roboco/api/ (FastAPI routes, schemas, app) tests Test suite changes labels Jun 19, 2026

@rennf93 rennf93 left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change Request for external PR #222: fix(api): bind v1 role guard to HMAC token, not just header

Summary: The diff correctly identifies that the v1 role dependencies (require_dev, require_qa, etc.) in _role_dep.py only validated the X-Agent-Role header string and performed no token binding. This is a bypass vector because these route-level deps were the sole gate for the /api/v1/flow/* endpoints (unlike get_agent_context which already calls _check_agent_auth_token).

The fix hoists the token check into _require_roles before the membership test and reuses the centralized primitive. In header-trust mode (no ROBOCO_AGENT_AUTH_REQUIRED) a missing token is a no-op (legacy preserved); any presented token is still verified and bad ones rejected. This matches the documented contract and deps.py:215 / agents_config.py:79 behavior.

Findings (per-criterion / concrete evidence)

  1. File: roboco/api/routes/v1/_role_dep.py (patched _require_roles)

    • Expected: the role guard must cryptographically bind X-Agent-Role to a valid X-Agent-Token (HMAC over id:role:team) when a token is presented, and require the token when ROBOCO_AGENT_AUTH_REQUIRED=true, before allowing the role membership check to pass.
    • Actual (after patch): _check now declares x_agent_id, x_agent_role, optional team/token; calls _check_agent_auth_token(...) (deferred import) then if x_agent_role.lower() not in allowed: 403. Token failures raise 401 from the shared helper. Correct and minimal.
    • Note: deferred import placement inside the closure (not top-of-_require_roles) properly avoids import-time cycles with routers that import both this module and roboco.api.deps. Good.
  2. File: tests/unit/api/test_v1_role_dep.py (new tests after line ~82)

    • test_dev_route_requires_token_when_auth_enabled: sets env true, posts to /api/v1/flow/developer/give_me_work with only ID+Role (no Token) → asserts 401. Matches _check_agent_auth_token:215 (if _auth_required() and not x_agent_token: 401).
    • test_dev_route_rejects_invalid_token_even_in_dev: unsets required, sets a secret, sends ID+Role+bad-token → asserts 401. Matches verify_agent_token:86 (non-matching or sentinel → False) and the if x_agent_token and not verify: 401 path.
    • Observation (actionable): both tests are negative cases only. For a security-binding fix, add at least one positive case that constructs a valid token via issue_agent_token (or the panel token path) for an allowed role and asserts the route returns 200 (or the expected envelope) rather than 401/403. This proves the "good" path still works after the guard change.
  3. Test helper dependency (tests/unit/api/test_v1_role_dep.py)

    • The new tests call TestClient(_build_app()) and hit the exact v1 developer router path.
    • Expected: _build_app (or equivalent fixture) must be included or clearly defined in this PR so the regression is reproducible and the unit test can execute without full integration DB/lifespan side effects.
    • Actual: only the test bodies appear in the provided diff; _build_app definition is not visible. Provide the helper (or a minimal router-mounted app with dep overrides) as part of the change.
  4. Header declaration duplication (multiple files)

    • v1 flow routers (e.g. flow_dev.py:29, flow_pr_reviewer.py:25, flow_qa.py:28, flow_board.py:27, ...) declare _AgentIdHeader = Annotated[UUID, Header(alias="X-Agent-ID")] on handlers.
    • The patched guard now also declares x_agent_id: Annotated[str, Header(alias="X-Agent-ID")] (plus team/token).
    • Expected vs actual: works (FastAPI merges same alias), but the ID is validated twice (str vs UUID coercion) and declared in two places. Not a correctness bug for this fix, but consider a shared auth header dependency in a follow-up to keep the binding logic and declarations co-located.
  5. Scope and blast radius

    • Only the role-guarded v1 flow routers are affected (require_* created by _require_roles). v1/do.py has no role dep on its router (intentionally different surface). No change to get_agent_context or middleware. No new dependencies or secret handling introduced.
    • No evidence of injection, secret-in-code, or supply-chain changes in the diff.
  6. Error semantics preserved

    • Missing required headers (ID or Role) continue to yield FastAPI 422 (pre-existing behavior asserted by test_dev_route_rejects_missing_role_header).
    • Auth failures from token check yield 401; role-not-allowed yields 403. Consistent with deps.py and HTTP conventions.

Recommendation

Address items 2 and 3 (positive test + visible test helper) so the security regression coverage is complete and reviewable. Once those are present, this is a clean, correctly-scoped defense-in-depth improvement that aligns the v1 role path with the rest of the HMAC contract.

Requesting changes for the completeness items above.

rennf93 added a commit that referenced this pull request Jun 19, 2026
The v1 flow role guards (_require_roles) are the sole gate for /api/v1/flow/*
but checked only the X-Agent-Role string — unlike get_agent_context, which
verifies the token. So a forged role header passed, and in strict mode
(ROBOCO_AGENT_AUTH_REQUIRED) the token was never required on these routes.
Call _check_agent_auth_token before the role membership test (deferred import
to avoid the cycle): missing token stays a no-op in header-trust mode, any
presented token is verified, and strict mode now requires it.

Supersedes #222 (antfleet-ops flagged the bypass); re-done as our own change so
it lands without the contributor CLA, and adds the positive valid-token test
their PR was missing.

Co-authored-by: Renn F <rennf93@users.noreply.github.com>
@rennf93

rennf93 commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Superseded by #228 (merged to master): same token-binding fix on the v1 role guard (plus the positive valid-token test), re-done as our own change so it could land without the contributor CLA. Thanks for flagging the bypass.

@rennf93 rennf93 closed this Jun 19, 2026
@github-project-automation github-project-automation Bot moved this from Backlog to Done in RoboCo Kanban Jun 19, 2026
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area: api Touches roboco/api/ (FastAPI routes, schemas, app) tests Test suite changes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants