Knowledge Loop deploy blocked: bootstrap knowledge-loop Vault AppRole (engineering-loop apply 403) #306

Description

@Svaag

Summary

The Knowledge Loop runtime code is fully landed and Codex-approved (knowledge#18, network-operations#302 both merged), but the first app-promotion-deploy apply of engineering-loop failed because live Vault has not been bootstrapped for the new knowledge-loop AppRole. This is a one-time operator prerequisite, not a code regression. Blocked on operator VPN access to the internal Vault endpoint; resume when VPN is up.

Current state

Root cause

The run failed at the workflow step "Mint knowledge-loop Vault bootstrap" (added in #302). This happened before Ansible ran, so the Apply step was skipped:

Error reading auth/approle/role/knowledge-loop/role-id: ...
URL: GET http://[2a0c:b641:b50:2::c0]:8200/v1/auth/approle/role/knowledge-loop/role-id
Code: 403. permission denied

The CI runner's github-runner Vault policy in live Vault does not yet include the new auth/approle/role/knowledge-loop/* paths: they exist in configs/vault/policies/github-runner.hcl on main but were never written to the server. In addition, the knowledge-loop AppRole, the knowledge-loop policy, and the kv/knowledge-loop secrets have not been created. See docs/runbooks/bootstrap-knowledge-loop-vault.md.
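To confirm that the 403 comes from the policy gap rather than from something else, an operator could inspect the capabilities a token holds on the failing path. The sketch below is a hypothetical helper, not part of the repo: `can_read` is an invented name, and it assumes `vault token capabilities` prints a comma-separated list such as `read` or `deny`.

```shell
# Sketch: decide whether a capabilities string permits reading a path.
# Assumption: input looks like the output of `vault token capabilities <path>`,
# for example "deny" or "create, read, update".
can_read() {
  # Strip spaces and wrap in commas so each capability can be matched whole.
  caps=",$(printf '%s' "$1" | tr -d ' '),"
  case "$caps" in
    *,deny,*) return 1 ;;            # an explicit deny always wins
    *,read,*|*,root,*) return 0 ;;   # read (or a root token) is enough
    *) return 1 ;;
  esac
}

# Usage against live Vault (needs VAULT_ADDR and the token under test):
#   caps="$(vault token capabilities auth/approle/role/knowledge-loop/role-id)"
#   if can_read "$caps"; then echo "role-id readable"; else echo "still denied"; fi
```

Before step 1 of the remediation this would be expected to report a denial for the runner's token; after the policy is rewritten it should report the path as readable.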

⚠️ Because #302 wired the knowledge-loop mint into the engineering-loop apply job, this currently blocks all engineering-loop deploys, not just the Knowledge Loop, until the bootstrap is done.

Remediation (operator, from a Vault-reachable host, requires VPN + privileged token)

export VAULT_ADDR="http://[2a0c:b641:b50:2::c0]:8200"
# vault login ...

# 1) Refresh the CI runner policy (adds knowledge-loop AppRole paths; fixes the 403)
vault policy write github-runner  configs/vault/policies/github-runner.hcl
# 2) Knowledge Loop runtime policy
vault policy write knowledge-loop configs/vault/policies/knowledge-loop.hcl
# 3) Create the AppRole
vault write auth/approle/role/knowledge-loop \
  token_policies="knowledge-loop" token_ttl=1h token_max_ttl=4h \
  secret_id_ttl=24h secret_id_num_uses=0
# 4) Store runtime secrets (operator)
vault kv put kv/knowledge-loop github_app_id=... github_app_installation_id=... \
  github_app_private_key=@knowledge-loop-app.pem openrouter_api_key=... \
  create_pr="1" enrich_live="0" max_openrouter_calls_per_day="0"

Verify:

vault policy read github-runner | grep knowledge-loop      # shows the two new paths
vault read auth/approle/role/knowledge-loop/role-id        # returns a role_id
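The policy check above can be made scriptable so it fails loudly instead of relying on a visual scan. This is a minimal sketch: `count_knowledge_loop_paths` is a hypothetical helper name, and it counts matching lines in the policy text, which equals the number of paths only if each path stanza starts on its own line.

```shell
# Sketch: count policy lines that mention the knowledge-loop AppRole paths.
# Reads policy HCL on stdin and prints the number of matching lines.
count_knowledge_loop_paths() {
  # grep -c exits non-zero when the count is 0; keep the function's status clean.
  grep -c 'auth/approle/role/knowledge-loop' || true
}

# Usage (needs VAULT_ADDR and a logged-in token):
#   n="$(vault policy read github-runner | count_knowledge_loop_paths)"
#   [ "$n" -ge 2 ] || { echo "github-runner policy is missing knowledge-loop paths" >&2; exit 1; }
```

The threshold of 2 follows the "two new paths" noted in the verify step; adjust it if the policy file changes.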

Resume checklist

  • Run Vault bootstrap steps 1–4 above (needs VPN to 2a0c:b641:b50:2::/64)
  • Re-run app-promotion-deploy (or the apply workflow for engineering-loop) — should pass the bootstrap step and provision the runtime on loop with the timer disabled
  • Manual one-shot smoke: run a single bounded cycle by hand (timer off) and confirm it opens a reviewable refresh PR against knowledge main from the workspace clone
  • Canary-enable PR: flip knowledge_loop_timer_enabled: true on loop + add passive run-status / timer monitoring checks
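Since the first checklist item depends on VPN reachability, a quick preflight before running the bootstrap may save a failed attempt. The sketch below interprets the exit status of `vault status`; `vault_preflight_message` is a hypothetical helper, and the mapping assumes the CLI's usual convention of 0 for unsealed, 2 for sealed, and any other value for an error such as an unreachable address.

```shell
# Sketch: turn the exit status of `vault status` into a readable preflight result.
# Assumption: 0 = reachable and unsealed, 2 = reachable but sealed,
# anything else = error (for example no route to VAULT_ADDR).
vault_preflight_message() {
  case "$1" in
    0) echo "vault reachable and unsealed" ;;
    2) echo "vault reachable but sealed" ;;
    *) echo "vault unreachable or errored; check VPN and VAULT_ADDR" ;;
  esac
}

# Usage:
#   export VAULT_ADDR="http://[2a0c:b641:b50:2::c0]:8200"
#   rc=0; vault status >/dev/null 2>&1 || rc=$?
#   vault_preflight_message "$rc"
```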

Notes

  • Could not run the bootstrap from the dev workstation: no IPv6 route to the internal Vault network (Network is unreachable).
  • Optional hardening discussed: make the knowledge-loop mint step in apply.yml non-blocking for engineering-loop until VAULT_KNOWLEDGE_LOOP_* is provisioned, so a missing bootstrap doesn't gate engineering-loop deploys. Decide whether to do this or just complete the bootstrap.
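If the optional hardening is chosen, one way to express "non-blocking" at the shell level is a wrapper that downgrades a failed mint to a warning. This is only a sketch under assumptions: `run_non_blocking` and the `mint-knowledge-loop.sh` script name are hypothetical, and the actual step in apply.yml may be structured differently. GitHub Actions also supports `continue-on-error: true` on a step, which achieves a similar effect without a wrapper.

```shell
# Sketch: run a command; if it fails, warn on stderr and report success
# so the surrounding job keeps going.
run_non_blocking() {
  if "$@"; then
    return 0
  fi
  echo "warning: '$*' failed; continuing without knowledge-loop bootstrap" >&2
  return 0
}

# Usage inside the workflow step (script name is hypothetical):
#   run_non_blocking ./mint-knowledge-loop.sh
```

The trade-off is that a genuinely broken mint would no longer fail the job, so a later step that needs VAULT_KNOWLEDGE_LOOP_* should check for it explicitly.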
