feat: full §10.2 HDKD agent bootstrap — broker link-code endpoints + daemon redeem (#144) by hanwencheng · Pull Request #149 · litentry/agentKeys

hanwencheng · 2026-05-31T07:53:34Z

Issue #144 — Full arch.md §10.2 agent-bootstrap (HDKD omni + broker link-code endpoints)

Closes #144. Converges the PR #141 interim §10.2 (agent omni derived from the agent's own wallet; openssl rand link-code stub) to the literal ceremony.

The flow (the "install an app → approve its permissions" story)

P.0 create (master) — agentkeys agent create --label agent-a --services memory → broker mints a one-time link code bound to the HDKD child omni O_agent = SHA256("agentkeys-hdkd-v1" ‖ O_master ‖ "//label").
P.1 install (agent, in the sandbox) — agentkeys-daemon --init-link-code <code> generates its own K10 device key (never on the master), proves possession, redeems → J1_agent. The broker records a pending binding.
P.2 bind (master) — submits registerAgentDevice (no biometric).
P.3 grant (master) — setScopeWithWebauthn (one Touch ID). P.2+P.3 are conceptually one approval; two steps for deterministic test automation.
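
The child-omni derivation in P.0 can be sketched as follows. This is a minimal illustration, not the code in agentkeys-core: it assumes O_master is 32 raw bytes, that the label is appended as UTF-8 after a literal `//`, and that labels are short lowercase slugs (the real `validate_label` rule is not stated in this PR). The hash is injected as a closure so the sketch stays dependency-free; the PR names SHA-256.

```rust
/// Domain tag quoted in the PR description.
const HDKD_DOMAIN: &[u8] = b"agentkeys-hdkd-v1";

/// Hypothetical label rule (an assumption): 1..=32 chars of [a-z0-9-].
fn validate_label(label: &str) -> Result<(), String> {
    let ok_len = (1..=32).contains(&label.len());
    let ok_chars = label
        .chars()
        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-');
    if ok_len && ok_chars {
        Ok(())
    } else {
        Err(format!("invalid label: {:?}", label))
    }
}

/// Builds domain || O_master || "//" || label.
/// Assumes O_master is 32 raw bytes (the real encoding may differ).
fn hdkd_preimage(master_omni: &[u8; 32], label: &str) -> Result<Vec<u8>, String> {
    validate_label(label)?;
    let mut buf = Vec::with_capacity(HDKD_DOMAIN.len() + 32 + 2 + label.len());
    buf.extend_from_slice(HDKD_DOMAIN);
    buf.extend_from_slice(master_omni);
    buf.extend_from_slice(b"//");
    buf.extend_from_slice(label.as_bytes());
    Ok(buf)
}

/// Derives the child omni with an injected 32-byte hash (SHA-256 in the PR).
fn child_omni<H>(hash: H, master_omni: &[u8; 32], label: &str) -> Result<[u8; 32], String>
where
    H: Fn(&[u8]) -> [u8; 32],
{
    Ok(hash(&hdkd_preimage(master_omni, label)?))
}

/// Un-prefixed lowercase hex, in the spirit of `child_omni_hex`.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Because every input is public, anyone holding O_master and the label can recompute O_agent; the derivation only fixes which omni a label maps to, and does not by itself authorize anything.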

Decisions (asked + answered)

Master submits the on-chain binding — broker mints the code + J1_agent + records the pending binding; the master pulls it and binds. No Heima-mainnet contract change, no broker chain key. Async/push model (master = phone).
Child omni is PUBLIC + recomputable — unforgeability = the J1_master-gated /v1/agent/create + the master-submitted binding, NOT a secret. Agent keeps a K10 device key only (omni decoupled).
Daemon owns keygen + redeem (--init-link-code), sharing agentkeys-core::device_crypto with the CLI.

What landed

core: device_crypto (shared K10 keygen / EIP-191 / ecrecover / pop_sig + DeviceKey) + HDKD child_omni/child_omni_hex + validate_label (frozen vectors).
broker: POST /v1/agent/create, POST /v1/auth/link-code/redeem (pop_sig verified before consume → retryable), GET /v1/agent/pending-bindings; SQLite link-code + pending-binding store; AgentKeysClaims + mint_agent_session_jwt; mint-oidc-jwt reads actor_omni from the verified claim (STS-relay prerequisite; wallet sessions byte-identical, regression-tested).
daemon: --init-link-code one-shot.
cli: agent create + agent pending (master-side).
harness: Phase P rework (P.0→P.3); builds + uploads the daemon binary.
ci / broker-setup: §10.2 route smoke (401 = live, 404 = stale binary) + nm symbol check in setup-broker-host.sh; bumped the deploy job timeout 15→25min + SSM executionTimeout 900→1500 for the larger broker build closure (agentkeys-core pulls aws-sdk-s3/keyring/aes-gcm).
docs: arch.md §10.2 / §5 / §6.2 / route list; runbook Phase P + troubleshooting; docs/spec/plans/issue-144-hdkd-agent-bootstrap.md.
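
The "pop_sig verified before consume → retryable" property of the redeem endpoint can be illustrated with a small in-memory stand-in for the SQLite link-code store. The type and method names here are hypothetical, and the signature check is passed in as a closure rather than performing real EIP-191 recovery.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum RedeemError {
    UnknownOrUsedCode,
    BadPopSig,
}

/// In-memory stand-in for the link-code store (hypothetical shape).
struct LinkCodeStore {
    /// link code -> child omni the code was bound to at create time
    unredeemed: HashMap<String, String>,
}

impl LinkCodeStore {
    /// Redeems `code`, checking proof of possession BEFORE consuming it,
    /// so a bad signature leaves the code usable for a retry.
    fn redeem<V>(&mut self, code: &str, pop_sig_ok: V) -> Result<String, RedeemError>
    where
        V: FnOnce(&str) -> bool,
    {
        let omni = self
            .unredeemed
            .get(code)
            .ok_or(RedeemError::UnknownOrUsedCode)?;
        if !pop_sig_ok(omni.as_str()) {
            // The code is NOT consumed on this path.
            return Err(RedeemError::BadPopSig);
        }
        // Only now is the one-time code burned.
        self.unredeemed
            .remove(code)
            .ok_or(RedeemError::UnknownOrUsedCode)
    }
}
```

Consuming the code first would let one malformed request burn it; checking the signature first means a bad attempt costs nothing.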

Deviation (vs the asked plan)

CLI agent bind/agent grant Rust subcommands are not added — chain submission lives in shell + cast, and the two existing chain helpers (heima-agent-create.sh --from-pubkey = bind, heima-scope-set.sh --webauthn = grant) already are the deterministic two-step split. The CLI ships the genuinely-new master surfaces instead (create + pending, incl. the production rendezvous). Recorded in the plan doc.

Out of scope (deferred)

Broker chain-write / meta-tx; secret-keyed HDKD; HDKD sub-actors; broker-side K11 verify (stays on-chain); production APNs/FCM push transport (the pending-binding data model + endpoint ship now).

Tests

cargo test -p agentkeys-core — 141 (HDKD frozen vectors, pop_sig sign→ecrecover round-trip).
cargo test -p agentkeys-broker-server --features auth-email-link — 179 lib + agent_bootstrap_flow (create-gated, bad-label, full create→redeem→pending, bad-pop_sig-retryable) + oidc byte-identical regression.
cargo clippy --workspace --all-targets -- -D warnings (default features) — clean. cargo fmt --all --check — clean. bash -n harness/phase1-wire-demo.sh + scripts/setup-broker-host.sh — OK.
The full --real --webauthn end-to-end needs the redeployed broker + Touch ID + a live sandbox — exercised after this deploy (CI deploy + the route smoke confirm the §10.2 code is live on the test broker).

🤖 Generated with Claude Code

…daemon redeem (#144)

Converges the PR #141 interim §10.2 to the literal ceremony: the master mints a one-time link code bound to a hard-derived child omni O_agent = SHA256("agentkeys-hdkd-v1" || O_master || "//label"); the agent daemon generates its own K10 in the sandbox, redeems the code (pop_sig), and the broker mints J1_agent carrying the HDKD omni + parent lineage. The master then approves the binding + scope async (push → one Touch ID), iOS/Android-style.

- core: device_crypto (shared K10 keygen / EIP-191 / ecrecover / pop_sig + DeviceKey) + HDKD child_omni/child_omni_hex + validate_label (frozen vectors)
- broker: POST /v1/agent/create (J1_master-gated), POST /v1/auth/link-code/redeem (pop_sig verified before consume → retryable), GET /v1/agent/pending-bindings; SQLite link-code + pending-binding store; AgentKeysClaims + mint_agent_session_jwt; mint-oidc-jwt now reads actor_omni from the verified claim (STS-relay prerequisite; wallet sessions byte-identical, regression-tested)
- daemon: --init-link-code one-shot (in-sandbox keygen → redeem → persist J1_agent → emit binding artifact)
- cli: agentkeys agent create + agent pending (master-side)
- harness: Phase P rework: P.0 create (real broker code) → P.1 install (daemon --init-link-code) → P.2 bind → P.3 grant; build + upload the daemon binary
- ci / broker-setup: §10.2 route smoke (401 = live, 404 = stale binary) + nm symbol check in setup-broker-host.sh; bump the deploy job timeout 15→25min + SSM executionTimeout 900→1500 for the larger broker build closure (agentkeys-core)
- docs: arch.md §10.2 (async master-submits ceremony), §5 agent_omni row, §6.2, route list; operator-runbook-wire.md Phase P + troubleshooting; new docs/spec/plans/issue-144-hdkd-agent-bootstrap.md

Decisions (master submits the on-chain binding, no contract change; child omni is public + recomputable; daemon owns keygen+redeem with shared core) and deviations are recorded in docs/spec/plans/issue-144-hdkd-agent-bootstrap.md.

Tests: core 141; broker 179 lib + agent_bootstrap_flow integration + oidc regression; clippy -D warnings clean (default features); cargo fmt clean; harness bash -n OK.

…rade prereq + P.0 in the install step

The Phase P rewrite landed in the prior commit; this closes the two gaps an operator would hit:
(1) a Real-mode prereq that the broker must be running the issue-144 code (else Phase P P.0 agent create 404s; folds in the setup-broker-host.sh --upgrade fix + the 401-not-404 deploy self-check), and
(2) P.0 (master mints the link code) in the install-step summary bullet.

…us (idempotent)

- demo (harness Phase P): P.0 now drives the agentkeys agent create CLI (was raw curl); P.1b drives agentkeys agent pending (the master-pull rendezvous); P.2 acks the broker after binding so pending self-cleans. Falls back to a raw POST only when no local host binary exists.
- broker: new POST /v1/agent/pending-bindings/ack (J1_master-gated, operator-scoped mark_bound). The rendezvous never cleared before (bound_at was never set), so pending listed redeemed agents forever; the ack fixes that AND makes the list idempotent. agent_bootstrap_flow now asserts pending=1 then ack then pending=0.
- idempotency: AGENT_LABEL stays stable (deterministic HDKD omni) and the K10 file persists in the long-lived sandbox (clean_slate never wipes ~/.agentkeys), so registerAgentDevice hits already-registered, scope re-set + seed overwrite are no-ops, and the ack keeps pending clean; re-runs converge.
- docs: arch.md route list + 10.2 ack step; runbook Phase P (P.1b + ack + stable-label note).

broker 179 lib + agent_bootstrap_flow (incl. ack) green; clippy -D warnings clean.
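
The pending → ack → pending behaviour described in this commit can be sketched with an in-memory store. The struct and field names are illustrative assumptions (the real store is SQLite-backed, and `bound` here stands in for the `bound_at` timestamp); only `mark_bound` is a name taken from the commit message.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
struct PendingBinding {
    agent_omni: String,
    device_key_hash: String,
    /// Stands in for the `bound_at` column: false until the master acks.
    bound: bool,
}

#[derive(Default)]
struct PendingStore {
    /// (operator_omni, agent_omni) -> binding row
    rows: BTreeMap<(String, String), PendingBinding>,
}

impl PendingStore {
    /// Recorded by the broker when an agent redeems its code.
    fn record(&mut self, operator: &str, agent_omni: &str, device_key_hash: &str) {
        self.rows.insert(
            (operator.to_string(), agent_omni.to_string()),
            PendingBinding {
                agent_omni: agent_omni.to_string(),
                device_key_hash: device_key_hash.to_string(),
                bound: false,
            },
        );
    }

    /// Lists only this operator's rows that have not been acked yet.
    fn pending(&self, operator: &str) -> Vec<&PendingBinding> {
        self.rows
            .iter()
            .filter(|(key, row)| key.0 == operator && !row.bound)
            .map(|(_, row)| row)
            .collect()
    }

    /// Operator-scoped ack. Returns whether a row matched; acking twice is harmless.
    fn mark_bound(&mut self, operator: &str, agent_omni: &str) -> bool {
        match self
            .rows
            .get_mut(&(operator.to_string(), agent_omni.to_string()))
        {
            Some(row) => {
                row.bound = true;
                true
            }
            None => false,
        }
    }
}
```

Keying the ack on the operator as well as the agent means one master cannot clear another master's pending rows.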

… references

Per the CLAUDE.md never-pass---upgrade rule (it is a back-compat no-op; the script is idempotent):
- operator-runbook-wire.md broker-version prereq: --ref main instead of --upgrade.
- setup-broker-host.sh: idempotency comment no longer illustrates with --upgrade.

…test brokers)

setup-cloud.sh step 15 (SSM SendCommand to bring up the MCP server) assumed the broker EC2 was already a registered SSM managed instance but never ensured it. Operators hit `SendCommand -> InvalidInstanceId` because the broker-host role was created WITHOUT AmazonSSMManagedInstanceCore, so the on-host amazon-ssm-agent can't register. (And separately, a caller lacking ssm:SendCommand got a misleading "does the instance have the agent?" message.)

- ensure_ssm_managed(): runs before SendCommand. Resolves the role from the INSTANCE's attached profile (naming-agnostic, so the SAME code fixes BOTH the prod `agentkeys-broker-host` and the test broker's own profile), idempotently attaches AmazonSSMManagedInstanceCore if missing, then polls describe-instance-information until PingStatus=Online. If the agent never registers (role now correct => the agent itself isn't running), it dies with the exact restart remediation (ssh-broker.sh + setup-broker-host.sh --upgrade, or reboot). Idempotent: a re-run with the policy already attached skips.
- SendCommand now captures stderr and distinguishes a CALLER ssm:SendCommand AccessDenied (identity-based policy gap) from a real instance problem, with a precise remediation (put-user-policy; see provision-ci-deploy-role.sh for the policy shape) instead of the misleading instance-agent message.
- aws iam calls are global (no --region); ec2/ssm reads pass --region "$REGION" per the agentkeys-admin-defaults-to-us-west-2 trap (CLAUDE.md).

Env-agnostic + idempotent: works for both broker envs and converges on re-run.

… refs + add CLAUDE.md rule

setup-broker-host.sh treats --upgrade (and --skip-pull) as back-compat NO-OPS (it is idempotent + auto-detects bootstrap vs upgrade), so emitting it is misleading. Replace active-path references with --ref main (the canonical idempotent deploy invocation per CLAUDE.md) and codify the rule:
- setup-cloud.sh: ensure_ssm_managed remediation suggests --ref main.
- docs/ci-setup.md: prod-broker manual deploy uses --ref main.
- CLAUDE.md (Remote broker host): explicit never-pass---upgrade rule.

…-u unbound-var)

SSM RunShellScript executes the step-15 mcp-bring-up script as root with a MINIMAL env (no HOME). Under set -euo pipefail the first $HOME use (export PATH=$HOME/.cargo/bin) aborted with 'HOME: unbound variable', a latent bug that only surfaced once SSM delivery started working (previously it failed at send-command). Default HOME to /root before any $HOME use. Also document --only-step 15 in the script's re-run examples.

…etup script

Broaden the rule from setup-broker-host.sh to all idempotent setup scripts (setup-cloud.sh, setup-heima.sh, heima-* helpers) and restore the actionable guidance (invoke plain / --only-step N, or --ref main for a broker redeploy; replace existing active-path --upgrade references).

…ld is not a hang)

setup-mcp-host.sh runs 'cargo install --git' to build agentkeys-mcp-server FROM SOURCE; a cold build on the t3.medium broker takes 10-20 min, but step 15 polled SILENTLY with a 10-min cap, so it looked hung and timed out mid-build. Add a ~30s heartbeat (status + elapsed) and raise the cap to 25 min. On timeout, state the build is likely still running on the broker (not a failure) and point to --from-step 15 to resume (cargo cache => fast) plus the get-command-invocation watch command.

…argo build, not cargo install --git

Step 15 took ~10 min because setup-mcp-host.sh used 'cargo install --git --force': it re-clones the repo and builds agentkeys-mcp-server + its WHOLE dep tree (aws-sdk-s3, tokio, k256, reqwest, ...) in a THROWAWAY target every run, which is a cold release build on the 2-vCPU t3.medium broker. setup-broker-host.sh already compiled those same deps into $REPO_ROOT/target/release (persistent, incremental), and the harness builds the MCP server with persistent cargo caches too; the cargo-install path was the lone build throwing the cache away.

Build it the same way the broker/workers (and harness) do: cd $REPO_ROOT (/opt/agentkeys-src, already checked out by step 15's clone/reset) and 'cargo build --release --locked -p agentkeys-mcp-server', install from target/release. Reuses the shared dep cache => incremental (seconds to 2 min), not a 10-20 min cold build. Worst case (no prior build) is no slower than before, and all re-runs are fast since target/ persists.

…up, drop operator flags

- harness/phase1-wire-demo.sh: cross-build target -> named docker volume (off the macOS bind-mount); copy only the 3 binaries out to the host path the upload step reads.
- setup-cloud.sh: step 15 (hosted MCP on broker) is now a no-op redirect. It is a broker-host concern + the deferred Hosted-LLM path (#152), not cloud/IAM.
- setup-broker-host.sh / setup-mcp-host.sh: state-driven idempotency over operator flags. Drop --without-workers / --without-build / --clean / --no-clean; unattended by default (--yes/--non-interactive kept as accepted no-ops for CI). Hosted MCP auto-converges when its binary is already installed.
- docs/operator-runbook-wire.md: wrap-up section -- three setup entry points (run only what changed), no-flag idempotent posture, right test order; build-cache + troubleshooting updates.
- docs/spec/plans/issue-107-mcp-demo-runbook.md: correct the B.5 deploy path (setup-mcp-host.sh, no --with-mcp).

Refs #149, #152.

…-broker-host.sh step

operator-runbook-wire.md now tells a from-nothing operator exactly what to run and ON WHICH MACHINE:
(1) setup-cloud.sh on the laptop,
(2) ssh-broker.sh + sudo setup-broker-host.sh --ref <branch> ON the broker host,
(3) setup-heima.sh,
(4) the harness.

Explains the --ref branch choice (claude/impl-144-hdkd-bootstrap until #149 merges) and why a 'git pull' on main shows nothing (the work is on the feature branch). Keeps the re-run-only-what-changed table. Refs #149.

…registerAgentDevice

Before: AGENT_LABEL is stable + the sandbox K10 persisted, so P.2 hit the registerAgentDevice already-registered skip and the pairing path was never exercised ('ok ... or already-active'). The contract refuses to re-register a hash (registeredAt stays set even after revoke), so a real re-pair needs a NEW key.

Now, in fresh_pairing mode (--real without --reuse-agent): P.depair revokes the prior device (recorded as a PUBLIC-hash sidecar in the sandbox, so the key never leaves; heima-device-revoke.sh self-checks isActive + is idempotent, agent-tier so no Touch ID), wipes the sandbox K10 so P.1 mints a fresh key, and P.2 does a REAL registerAgentDevice. P.1 re-stashes the new hash for the next run's depair.

First run after this lands wipes the pre-existing ('hardcoded') key and registers fresh; that prior device stays an orphan (revoke it once by hand if you want it gone). Runbook tx-count notes updated (now ~2 txs/run: revoke + register). Refs #149.

…-device vs stable-omni

The depair/re-pair was only in the TL;DR. Add P.depair to the 'What happens, in order' walkthrough, the Install-(pair) flag note, and the Phase-P troubleshooting row (which wrongly said 'the agent reuses its K10'; it now mints a fresh K10 each run). Also reconcile the contradiction the depair surfaced: the label-derived omni is STABLE (same agent), only the device key is fresh, so memory persists and step 1.5 overwrites each run. Fixed the 'fresh/new agent identity ⇒ empty memory' wording in 4 spots. Refs #149.

…f a bare 'no link code'

P.0 hitting the §10.2 routes on a stale broker returned a confusing 'agent/create returned no link code: HTTP 404'. Detect the 404 and name the exact fix (setup-broker-host.sh --ref claude/impl-144-hdkd-bootstrap on the broker) + the 401-not-404 verify. Notes that 1.4/1.5 are cascades. Refs #149.
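
The 401-not-404 check amounts to classifying the status code of an unauthenticated probe against a §10.2 route. A minimal sketch of that decision follows; the enum, function names and message text are hypothetical, and the actual check lives in shell in the harness and setup-broker-host.sh.

```rust
#[derive(Debug, PartialEq)]
enum RouteState {
    /// 401: the route exists and rejected the missing bearer.
    Live,
    /// 404: the running broker binary predates the route.
    StaleBinary,
    /// Anything else is reported as-is for a human to inspect.
    Unexpected(u16),
}

/// Classifies the HTTP status of a no-bearer request to a gated route.
fn classify_unauthenticated_probe(status: u16) -> RouteState {
    match status {
        401 => RouteState::Live,
        404 => RouteState::StaleBinary,
        other => RouteState::Unexpected(other),
    }
}

/// Returns an operator-facing hint, or None when the route is live.
fn remediation(state: &RouteState, git_ref: &str) -> Option<String> {
    match state {
        RouteState::Live => None,
        RouteState::StaleBinary => Some(format!(
            "stale broker binary: run setup-broker-host.sh --ref {} on the broker host",
            git_ref
        )),
        RouteState::Unexpected(code) => {
            Some(format!("unexpected HTTP {} from the route probe", code))
        }
    }
}
```

A 401 is the healthy answer here because the probe deliberately omits the bearer: it proves the handler is compiled in without needing credentials.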

… + root-owned fix

Operator hit 'unable to unlink … Permission denied' on a manual git pull because a prior 'sudo bash setup-broker-host.sh' ran git/cargo as root → root-owned repo. Note: use --ref (its git runs consistently), chown -R agentkey to restore manual git, and that the broker uses git not jj. Refs #149.

…git-pull Permission denied)

Running 'sudo bash setup-broker-host.sh' executes the script's plain git (--ref) + cargo as root, leaving root-owned files in the checkout that block the operator's next 'git pull' (error: unable to unlink … Permission denied). New §8c: when SUDO_USER is set, chown -R the repo back to the invoking user at the end: idempotent, scoped to $REPO_ROOT only (system files stay root/agentkeys-owned), no-op when run as the user directly or as root with no SUDO_USER (SSM/root-managed /opt clone). Runbook note updated. Refs #149.

…h registration (Codex finding #3)

P.depair swallowed revoke + K10-wipe failures with '|| true' then unconditionally reported 'clean slate', and P.2 grepped only '"ok":true', which heima-agent-create.sh also returns for the already-registered SKIP (no tx_hash). So a stale-sidecar / unreachable-sandbox / failed-revoke run could greenlight 'a real registration' while only exercising the already-active path.

Now:
(1) P.depair hard-fails if revoke is not ok:true, and confirms the sandbox K10 is actually gone (test -e) before claiming a clean slate;
(2) P.1 hard-fails if the new device_key_hash == the prior sidecar hash (wipe didn't take);
(3) P.2 distinguishes skipped:already-registered (FAIL, not a real registration) from a real tx_hash (ok) before acking.

Refs #149.
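
The P.2 distinction in item (3) can be written as a small classifier over the helper's already-parsed result. The harness does this in shell with jq; the struct below is an assumed shape built from the field names the commit message mentions (`ok`, `tx_hash`, `skipped`), and the enum and function names are hypothetical.

```rust
/// Assumed shape of the JSON line emitted by the registration helper.
struct RegisterResult {
    ok: bool,
    tx_hash: Option<String>,
    skipped: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// A transaction was actually submitted.
    Registered { tx_hash: String },
    /// The helper reported ok but skipped because the device already existed.
    AlreadyRegistered,
    Failed,
}

fn classify(r: &RegisterResult) -> Outcome {
    if !r.ok {
        return Outcome::Failed;
    }
    if r.skipped.is_some() {
        return Outcome::AlreadyRegistered;
    }
    match &r.tx_hash {
        Some(h) if !h.is_empty() => Outcome::Registered { tx_hash: h.clone() },
        // ok:true with neither a skip marker nor a tx hash is treated as a failure.
        _ => Outcome::Failed,
    }
}

/// In fresh-pairing mode only a real transaction counts as a pass.
fn fresh_pairing_passes(outcome: &Outcome) -> bool {
    matches!(outcome, Outcome::Registered { .. })
}
```

The point of the extra variant is that `ok: true` alone is ambiguous: both the real registration and the skip report it.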

…device-active (Codex finding #1)

mint_oidc_jwt accepted ANY valid session and signed an AssumeRoleWithWebIdentity JWT with no binding/scope check. Since /v1/auth/link-code/redeem mints J1_agent pre-binding, a redeemed-but-unapproved agent could get STS creds to its own actor prefix before the master's registerAgentDevice.

Now: an agent_hdkd session (device_pubkey present) must pass the SAME on-chain SidecarRegistry.getDevice check the cap-mint path uses (registered_at!=0, !revoked, and device.actor_omni == session omni), or it gets 403 (audited as AuthFailed). Wallet/master sessions (no device_pubkey) are unaffected. cap.rs ChainContracts/DeviceEntry/call_get_device exposed pub(crate) for reuse. cargo test -p agentkeys-broker-server green. Refs #149.
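
The agent gate reduces to a pure predicate over the on-chain device entry and the session claims. The sketch below includes the operator and role checks that a later hardening commit in this PR adds to the three listed here. Struct layouts, the error enum, the bit value of `ROLE_CAP_MINT` and the `u8` role type are all assumptions; only the field and constant names come from the commit messages.

```rust
/// Placeholder bit value; the real constant lives in cap.rs.
const ROLE_CAP_MINT: u8 = 0b0000_0001;

/// Assumed shape of the SidecarRegistry.getDevice result.
struct DeviceEntry {
    registered_at: u64,
    revoked: bool,
    operator_omni: String,
    actor_omni: String,
    roles: u8,
}

/// Assumed subset of the agent session claims.
struct AgentSession {
    omni_account: String,
    parent_omni: String,
}

#[derive(Debug, PartialEq)]
enum GateError {
    NotRegistered,
    Revoked,
    WrongActor,
    WrongOperator,
    MissingRole,
}

/// Passes only for a registered, unrevoked device bound by the session's
/// own parent (master) to the session's own omni, with the cap-mint role.
/// Omni strings are assumed to be normalized the same way on both sides.
fn check_agent_gate(dev: &DeviceEntry, session: &AgentSession) -> Result<(), GateError> {
    if dev.registered_at == 0 {
        return Err(GateError::NotRegistered);
    }
    if dev.revoked {
        return Err(GateError::Revoked);
    }
    if dev.actor_omni != session.omni_account {
        return Err(GateError::WrongActor);
    }
    if dev.operator_omni != session.parent_omni {
        return Err(GateError::WrongOperator);
    }
    if dev.roles & ROLE_CAP_MINT == 0 {
        return Err(GateError::MissingRole);
    }
    Ok(())
}
```

A caller would map any `Err` to the 403 described above and skip the check entirely for wallet or master sessions, which carry no device key.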

…ly file (Codex finding #2)

The daemon's --init-link-code printed session_jwt in its stdout artifact; the harness captured it on the master and passed it to the sandbox MCP as --agent-session-bearer (a CLI arg), exposing the bearer in the master shell + the sandbox process list (ps), readable by co-resident untrusted code.

Now: the daemon writes J1_agent to an owner-only ~/.agentkeys/agent-session.jwt (0600) and emits the PUBLIC session_file path instead of the JWT. The MCP server reads the bearer from --agent-session-bearer-file (env MCP_AGENT_SESSION_BEARER_FILE); a direct --agent-session-bearer still wins when set. The harness (fresh-pairing) passes the file path, never the value; --reuse-agent keeps the value path. Bearer now travels daemon->file->MCP entirely in the sandbox. K10 private-key custody was already intact; this closes the derived-bearer leak. cargo build (broker+daemon+mcp) green; bash -n clean. Refs #149.
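
The precedence rule ("a direct --agent-session-bearer still wins when set", otherwise read the file) can be sketched like this. The function name and signature are hypothetical, and trimming trailing whitespace from the file content is an assumption about how the MCP server treats the file.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Resolves the agent bearer: an explicit value wins; otherwise the
/// bearer file is read; otherwise there is no bearer.
fn resolve_bearer(direct: Option<&str>, bearer_file: Option<&Path>) -> io::Result<Option<String>> {
    if let Some(value) = direct {
        return Ok(Some(value.to_string()));
    }
    match bearer_file {
        Some(path) => {
            // Trim so a trailing newline in the file does not end up in a header.
            let token = fs::read_to_string(path)?.trim().to_string();
            Ok(Some(token))
        }
        None => Ok(None),
    }
}
```

Passing the path rather than the value keeps the token out of argv, so it does not appear in `ps` output for other processes in the sandbox.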

…CTOR_OMNI for cap-mint

Two issues surfaced on a full §10.2 fresh-pairing run:

- P.2 false-FAIL (my finding-3 regression): the check ran 'jq -e' on $reg = heima-agent-create.sh 2>&1 (stderr logs + the JSON line). jq can't parse the mixed text → silently fails → P.2 reported FAIL even though registerAgentDevice SUCCEEDED (real tx 0x90f7…, block 9690030). Now extract the JSON line (grep -oE '{.*}' | tail -1) before jq.
- cap-mint 400 'actor_omni must start with 0x' (1.5/3.1/4.2): the §10.2 child omni is un-prefixed by design (child_omni_hex), but cap-mint's validate_hex32 requires 0x (like OPERATOR_OMNI, which is already 0x). ACTOR_OMNI was set to the bare child omni → cap_mint rejected it. Now ACTOR_OMNI="0x${ds_actor#0x}" (exactly one 0x); ds_actor stays un-prefixed for the chain helpers + the child_omni== check.

Follow-up (not demo-blocking): a production agent whose session omni is un-prefixed hits the same cap-mint 400; the durable fix is cap-mint accepting un-prefixed omnis (validate_hex32 -> normalize_hex32) or the MCP normalizing the request. Tracked separately. Refs #149.
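
The follow-up mentions a `normalize_hex32` that accepts both prefixed and un-prefixed omnis. A minimal sketch of what such a function could look like is below; its signature and error strings are assumptions, since the commit only names it. It mirrors the shell expansion `0x${ds_actor#0x}` by stripping at most one leading `0x` and then emitting exactly one.

```rust
/// Accepts a 32-byte hex value with or without a single leading "0x"
/// and returns it with exactly one "0x" prefix.
fn normalize_hex32(input: &str) -> Result<String, String> {
    // Strip at most one prefix, like the shell expansion ${var#0x}.
    let body = input.strip_prefix("0x").unwrap_or(input);
    if body.len() != 64 {
        return Err(format!("expected 64 hex chars, got {}", body.len()));
    }
    if !body.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("non-hex character in value".to_string());
    }
    Ok(format!("0x{}", body))
}
```

Normalizing at the boundary lets the un-prefixed form stay canonical for the chain helpers while the cap-mint path always sees the prefixed form.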

…hardened key writes, reuse bearer custody

A [high] oidc agent gate (oidc.rs) only checked active+actor; now mirrors the FULL cap-mint invariant: device.operator_omni == session parent_omni AND actor_omni == omni_account AND roles & ROLE_CAP_MINT AND active. Without the operator+role checks, any other registered operator could bind (this device hash, this actor) and the agent would pass, bypassing the master that issued the link code. Exposed cap.rs DeviceEntry.operator_omni/roles + ROLE_CAP_MINT as pub(crate).

B [high] write_key_0600 (agentkeys-core/device_crypto.rs): mode(0o600) only applies on CREATE, so a pre-existing loose-perms file kept its mode, and open() followed symlinks. Now rejects a pre-existing symlink/non-regular target and force-chmods 0600 after open. Residual TOCTOU (needs O_NOFOLLOW/libc) noted as follow-up.

C [medium] harness --reuse-agent passed --agent-session-bearer <value> (in the MCP argv/ps); now stages the master-minted bearer into the same owner-only sandbox file (umask 077) and passes only --agent-session-bearer-file, matching fresh-pairing. Neither mode leaves the JWT in ps.

cargo test -p agentkeys-core green (141+3); broker+daemon+mcp build clean; bash -n harness clean. Refs #149.
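
Item B describes two hardening steps for owner-only key writes: refuse a pre-existing symlink or non-regular target, and force the mode after open because `mode(0o600)` only takes effect when the file is created. A minimal Unix-only sketch using the standard library follows. It reuses the name `write_key_0600` from the commit, but the body is an illustration rather than the code in device_crypto.rs, and it keeps the same residual check-then-open race the commit notes.

```rust
use std::fs::{self, OpenOptions, Permissions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Writes `key` to `path` with owner-only permissions (Unix only).
fn write_key_0600(path: &Path, key: &[u8]) -> io::Result<()> {
    // symlink_metadata does not follow links, so a symlink is seen as such.
    match fs::symlink_metadata(path) {
        Ok(meta) if !meta.file_type().is_file() => {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "refusing to write key to a symlink or non-regular file",
            ));
        }
        Ok(_) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600) // only honoured when the file is newly created
        .open(path)?;

    // Tighten a pre-existing file BEFORE any secret bytes are written.
    file.set_permissions(Permissions::from_mode(0o600))?;
    file.write_all(key)?;
    file.sync_all()
}
```

Between the metadata check and the open, another process could still swap the path for a symlink; closing that window needs `O_NOFOLLOW`, which the standard library does not expose directly.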

The agent-gate edit in d7a4c01 wasn't rustfmt-clean → CI 'cargo fmt --all -- --check' failed at 23s (parent_omni method chain + dkh map_err block). Ran cargo fmt --all (only oidc.rs affected). Verified locally with the EXACT CI commands: fmt --all --check clean, 'clippy --workspace --all-targets -- -D warnings' exit 0, 'test --workspace --test-threads=1' exit 0. Refs #149.

Ports the Claude Design "agentkeyweb" handoff into apps/parent-control as the primary, demoable operator experience. Maps 1:1 to the 9 user workflows. Plan + verification + pushback: docs/plan/web-flow/issue-9step-flow.md.

Flow (workflow → component):
1. WebAuthn login + onboarding ceremony → ceremony.tsx (OnboardingScreen + CeremonyRunner)
2. Memory panel: plant preserved memory → memory.tsx (empty-state + plant ceremony, auto-detect existing, dedup guard: plant hidden + blocked once planted)
3-4. Agent connects + master notified → App bell + pairing request (post-#149: a pending binding)
5. Request detail (agent + permissions) → pairing.tsx request card
6-7. Accept + Touch ID + ceremony → WebAuthnModal → CeremonyRunner (PAIRING_STEPS)
8. Device view + permission view → pairing.tsx (device-grid) + permissions.tsx (mobile-style scoped PermissionList; replaces tables, the "won't scale" ask)
9. Audit + decodable Heima TXs → dashboard.tsx AuditFeed → EventDecodeModal (decodeCalldata mock; real decode tracked in #153)

New files:
- lib/demoData.ts: seed actors/events + ONBOARDING_STEPS, PAIRING_STEPS, PRESERVED_MEMORY, INCOMING_PAIRING, CHAIN_PROFILE, VAULT_ITEMS, txHash, decodeCalldata (mock), ONCHAIN_KINDS
- _components/ceremony.tsx: CeremonyRunner (progress bar + live step log + tx hashes) + OnboardingScreen
- _components/memory.tsx: MemoryPage (plant / dedup / per-namespace listing)
- _components/pairing.tsx: PairingPage (request → accept → device/permission view toggle)
- _components/permissions.tsx: PermSeg/PermSwitch/PermissionList/PermissionView (mobile scoped)
- _components/dashboard.tsx: ActorsList, ActorDetail (editable PermissionList), AuditFeed

Changed:
- _components/App.tsx: rewritten as the self-contained flow orchestrator: onboarding gate (localStorage), header bell + badge, memory/pairing/audit/chain routes, WebAuthn + pairing-ceremony + tx-decode + memory-view modals
- _components/types.ts: + CeremonyStep/PreservedMemory/PairingRequest/ChainProfile/ContractInfo/RequestedPerm; Actor.justPaired; ChipKind +scope/device/k11; Route +memory/pairing/chain
- lib/constants.ts: CHIP_STYLES covers the new chip kinds
- app/globals.css: ceremony/onboard/empty-memory/pair-req/view-toggle/device-grid/bell/tx-decode/mem-body/perm-* blocks (no rounded corners, hairline rules)

Scope note: M1 visible flow is seed-data + local ceremony state (the prototype's model). Real-daemon wiring stays behind the lib/client seam and is Phase 2: #149 endpoints (agent create / pending-bindings / bind / grant), onboarding + master-memory endpoints, and the #153 audit decoder. The old client-based pages (pages.tsx/workers.tsx/onboarding.tsx) remain on disk for that wiring; App no longer imports them.

Verified: npx tsc --noEmit clean · npm run build ok (4 static pages, 20.1 kB route) · dev smoke serves the onboarding screen (HTTP 200, all markers present).

… lowercase) (#154)

The image job tagged ghcr.io/${{ github.repository }}/agentkeys-mcp-server, but github.repository keeps the repo's real casing (litentry/agentKeys, capital K), so docker buildx rejected it: 'invalid tag ... repository name must be lowercase'. The job runs only on push to main (skips on PRs), so it first failed on the #149 merge (Actions run 26719167517).

Fix: a 'Resolve lowercase image name' step lowercases $GITHUB_REPOSITORY via tr (portable to bash 3.2 per CLAUDE.md, not the bash-4 ${,,}) → ghcr.io/litentry/agentkeys/agentkeys-mcp-server, fed to both :latest and :$sha tags. YAML validated; tr output verified locally.

…r-issued link codes (#159)

* plan: method-A agent-initiated pairing design (replaces #149 front-half)

Flip §10.2 from master-mints-link-code → agent-submits-request + master-claims-by-code (the IoT scan-the-device-QR model). Reuses #149's on-chain bind+scope tail. Unbind/factory-reset deferred → #156 (client) + #155 (on-chain self-revoke).

* agentkeys: §10.2 method-A pairing - broker request/claim/poll endpoints

Flip the agent bootstrap from master-initiated (link code) to agent-initiated (the agent shows a code, the master claims it; the Matter/HomeKit IoT model). Replaces #149's master-mint front-half; reuses the on-chain bind + scope tail unchanged.

Broker:
- NEW storage/pairing_requests.rs: unbound, agent-created request pool (issue/claim/poll/pending_bindings/mark_bound/purge). J1 is NOT stored at rest; minted fresh at poll time on a re-proved pop_sig.
- NEW handlers/agent/request.rs (agent, pop_sig-gated): open an unbound request, return {request_id (secret), pairing_code (display)}.
- NEW handlers/agent/claim.rs (master, J1-gated): claim by code, derive O_agent=HDKD(O_master,//label), record pending binding.
- NEW handlers/agent/poll.rs (agent, pop_sig-gated): once claimed, mint + return J1_agent.
- REMOVE handlers/agent/{create,redeem}.rs + storage/link_codes.rs.
- Rename link_code_store -> pairing_request_store across state/boot/main.
- Rewire routes: /v1/agent/pairing/{request,claim,poll}; keep pending-bindings + /ack (now keyed by request_id).

Tests: 14 store unit tests + agent_bootstrap_flow rewritten for the request->claim->poll flow (5 cases incl. cross-device/bad-pop_sig poll rejection). clippy --all-features --all-targets -D warnings clean.

Unbind/factory-reset re-pair deferred -> #156; on-chain self-revoke -> #155.

* agentkeys: §10.2 method-A pairing - daemon + CLI + harness + broker-host smoke

Flip the client + wire harness to agent-initiated pairing (issue #144, method A), matching the broker request/claim/poll endpoints.

Daemon (--init-link-code → two one-shots mirroring the two endpoints):
- --request-pairing: in-sandbox K10 keygen → POST /v1/agent/pairing/request → print {request_id, pairing_code, …}; persist a 0600 state file so --retrieve-pairing can resolve request_id (--request-id overrides).
- --retrieve-pairing: poll /v1/agent/pairing/poll until claimed (bounded by --init-poll-timeout-seconds), mint+persist J1_agent (0600), emit artifact.

CLI: agent create → agent claim --pairing-code <code> --label … --services … (POST /v1/agent/pairing/claim). agent pending unchanged (rows now keyed by request_id).

Harness phase1-wire-demo.sh Phase P inverts: P.0 agent --request-pairing (shows code) → P.1 master agent claim → P.1b agent --retrieve-pairing (J1) → P.1c pending → P.2 bind + ack-by-request_id → P.3 grant. P.depair unchanged. 404 trap + route names updated.

setup-broker-host.sh (runbook-fix-fold-back): nm symbol grep → pairing_{request,claim,poll}|pending_bindings; route smoke → no-bearer POST /v1/agent/pairing/claim must be 401 not 404.

cli+daemon: clippy --all-targets -D warnings clean, fmt clean, tests pass (38/38 single-threaded; the 1 parallel-suite failure is a pre-existing k11 enroll test race on a shared HOME path, unrelated). bash -n both scripts OK.

* agentkeys: §10.2 method-A pairing - docs + terminology (arch.md, runbooks)

Reconcile every doc + code-comment surface with the agent-initiated pairing flip (issue #144, method A).

arch.md (single source of truth):
- §10.2 ceremony fully rewritten for method A (agent requests → master claims → agent retrieves), incl. the IoT/Sybil-safety rationale + the deferred unbind notes (#155/#156).
- §6.2 route list: /v1/agent/pairing/{request,claim,poll} replace create + link-code/redeem; pending-bindings ack now by request_id.
- §10.4 re-bootstrap inverted; §5 agent_omni row, §10.6 threat row, trust-boundary + actor-role tables, CLI inventory → pairing terms.

Solidity link_code_redemption calldata param kept (contract unchanged).

operator-runbook-wire.md: Phase P walkthrough (P.0 request → P.1 claim → P.1b retrieve → P.1c pending → P.2 bind+ack → P.3 grant), 404 trap + route checks, troubleshooting rows.

v2-stage1-migration-and-demo.md §7: rewritten for method A (also fixes pre-existing drift: the master, not the broker, submits registerAgentDevice).

issue-144 plan: superseded-front-half banner → method-A doc. issue-74 ephemeral-rebootstrap paragraph corrected.

Code doc-comments (mcp-server config, core actor_omni, cli device_session) → pairing terms. fmt + clippy --all-targets -D warnings clean on the 3 comment-edited crates.

* agentkeys: method-A plan doc - mark implemented (unbind deferred #155/#156)

* agentkeys: setup-broker-host --ref uses git checkout -f (survive shadowing untracked files)

The --ref deploy path ran a plain `git checkout $PULL_REF`, which ABORTS when an untracked working-tree file shadows a file the target ref tracks ("untracked working tree files would be overwritten by checkout"; hit on the broker host when switching to a branch that tracks docs/wiki/*.md the prior branch did not). The broker host is a deploy target, not a dev checkout: -f overwrites the colliding files with the tracked version + discards local edits to TRACKED files, while LEAVING unrelated untracked files (env, keys, certs, all gitignored) intact. Folds the fix back into the deploy runbook so the next operator does not hit the same abort.
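
The method-A request → claim → poll flow from the #159 message can be modelled as a small state machine. Everything below is a simplified, hypothetical stand-in for storage/pairing_requests.rs: identifiers are deterministic counters where the real ones would be random, the pop_sig check is reduced to matching the device public key that opened the request, and no session token is minted.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum PollResult {
    /// Opened by the agent, not yet claimed by a master.
    Pending,
    /// Claimed; the broker would mint J1_agent for this omni at this point.
    Claimed { agent_omni: String },
    /// Unknown request, or the caller is not the device that opened it.
    Rejected,
}

struct Request {
    device_pubkey: String,
    pairing_code: String,
    claimed_omni: Option<String>,
}

#[derive(Default)]
struct PairingRequests {
    by_id: HashMap<String, Request>,
    next: u32,
}

impl PairingRequests {
    /// Agent side: opens an unbound request.
    /// Returns (request_id kept secret by the agent, pairing_code shown to the user).
    fn request(&mut self, device_pubkey: &str) -> (String, String) {
        self.next += 1;
        let request_id = format!("req-{}", self.next); // placeholder, not random
        let pairing_code = format!("{:06}", self.next); // placeholder, not random
        self.by_id.insert(
            request_id.clone(),
            Request {
                device_pubkey: device_pubkey.to_string(),
                pairing_code: pairing_code.clone(),
                claimed_omni: None,
            },
        );
        (request_id, pairing_code)
    }

    /// Master side: claims an open request by its displayed code.
    /// `agent_omni` is the child omni the master derived for its chosen label.
    fn claim(&mut self, pairing_code: &str, agent_omni: &str) -> bool {
        for req in self.by_id.values_mut() {
            if req.pairing_code == pairing_code && req.claimed_omni.is_none() {
                req.claimed_omni = Some(agent_omni.to_string());
                return true;
            }
        }
        false
    }

    /// Agent side: only the device that opened the request may retrieve the result.
    fn poll(&self, request_id: &str, device_pubkey: &str) -> PollResult {
        match self.by_id.get(request_id) {
            Some(req) if req.device_pubkey == device_pubkey => match &req.claimed_omni {
                Some(omni) => PollResult::Claimed { agent_omni: omni.clone() },
                None => PollResult::Pending,
            },
            _ => PollResult::Rejected,
        }
    }
}
```

Nothing session-like is stored in this model: the claimed state is all that persists, which matches the note that J1 is minted at poll time rather than kept at rest.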

hanwencheng added 8 commits May 31, 2026 15:52

hanwencheng mentioned this pull request May 31, 2026

fix(setup-cloud): self-heal the SSM precondition for step 15 (prod + test brokers) #151

Closed

hanwencheng added 2 commits May 31, 2026 18:45

hanwencheng mentioned this pull request May 31, 2026

Phase 3b: Hosted-LLM MCP deployment (xiaozhi / vendor-cloud) — broker-hosted mcp-endpoint #152

Open

5 tasks

hanwencheng added 13 commits May 31, 2026 21:53

hanwencheng merged commit dc94fba into main May 31, 2026
7 checks passed

hanwencheng mentioned this pull request May 31, 2026

ci: lowercase the GHCR image tag for the mcp-server publish job #154

Merged

This was referenced Jun 1, 2026

SidecarRegistry: agent/device-initiated on-chain self-revocation (device-key-signed revoke) #155

Open

Full arch.md §10.2 agent-bootstrap ceremony (HDKD omni + broker link-code endpoints) #144

Closed

This was referenced Jun 1, 2026

Demo: aiosandbox + Hermes + AgentKeys on ESP32 hardware #103

Open

docs: add 2026-06-01 AgentKeys progress report #158

Closed

hanwencheng mentioned this pull request Jun 1, 2026

agentkeys: §10.2 agent-initiated pairing (method A) — flip from master-issued link codes #159

Merged

hanwencheng merged 23 commits into main from claude/impl-144-hdkd-bootstrap