feat(runner): add flashduty_exec operation for capability-bound auth#50

Open
ysyneu wants to merge 4 commits into
mainfrom
feat/flashduty-exec

@ysyneu ysyneu commented May 27, 2026

Summary

Adds a new runner WebSocket operation flashduty_exec that fork-execs the flashduty CLI with per-call auth env, separate from the generic bash code path. This is the runner-side counterpart to fc-safari's FlashdutyExec(verb, args) (PR flashcatcloud/fc-safari#74).

  • protocol/messages.go: TaskOpFlashdutyExec + FlashdutyExecArgs + FlashdutyExecResult = BashResult (alias keeps safari's existing decoder reusable)
  • environment/environment.go: executeFlashdutyExec mirrors executeBashCommand shape (timeout, env merge, output capture, truncation, .outputs/ spill) — only differences are direct exec.CommandContext("flashduty", verbTokens..., args...) instead of bash -c, and $FLASHDUTY_BINARY env hook for testability. Factored a shared runCapturedCommand helper from executeBashCommand to avoid duplication.
  • ws/handler.go: routes TaskOpFlashdutyExec through Environment.FlashdutyExec.

Why this exists

fc-safari currently injects FLASHDUTY_APP_KEY into every generic bash invocation (see PR #74 commit f1d4a06a). That env is now removed from the bash path — audited callers route through this new op instead, so the LLM can no longer leak the key via env, echo $FLASHDUTY_APP_KEY, or any accidental echo in skill output.

Test plan

  • go test ./environment -count=1: TestExecuteFlashdutyExec_EnvIsolatedToSubprocess (env confinement) + TestExecuteFlashdutyExec_VerbSplitsOnWhitespace (verb tokenization via /bin/echo substitution)
  • Full suite: go test ./... -count=1 green
  • go build ./... && go vet ./... clean
  • CI green
  • Local BYOC E2E: safari worktree + this runner + flashduty CLI on PATH → exercise an AI-SRE incident query, verify (a) tool call is flashduty(...), (b) env | grep FLASHDUTY shows no values

Wire shape (for future safari callers)

{"verb":"incident list","args":["--json","--limit","10"],"workdir":"","timeout":120,"env":{"FLASHDUTY_APP_KEY":"","FLASHDUTY_API_BASE":""}}

Result is BashResult-shaped (exit_code, stdout/stderr with truncation + spill).
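For callers that want to see how the payload above might map onto a Go type, here is a minimal sketch. The JSON keys are taken from the wire shape; the struct name, the Go field types, and the "verb is required" check are assumptions, and the actual `FlashdutyExecArgs` in protocol/messages.go may differ. The unit of `timeout` is not stated in this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// flashdutyExecArgs is a hypothetical mirror of the wire payload. Only the
// JSON keys come from the PR description; the Go types are assumed.
type flashdutyExecArgs struct {
	Verb    string            `json:"verb"`
	Args    []string          `json:"args"`
	Workdir string            `json:"workdir"`
	Timeout int               `json:"timeout"` // unit not stated in the PR
	Env     map[string]string `json:"env"`
}

// decodeExecArgs parses a payload and rejects an empty verb. The validation
// rule is an illustrative assumption, not documented behavior.
func decodeExecArgs(raw []byte) (flashdutyExecArgs, error) {
	var a flashdutyExecArgs
	if err := json.Unmarshal(raw, &a); err != nil {
		return a, fmt.Errorf("decode flashduty_exec args: %w", err)
	}
	if a.Verb == "" {
		return a, errors.New("flashduty_exec: verb is required")
	}
	return a, nil
}

func main() {
	raw := []byte(`{"verb":"incident list","args":["--json","--limit","10"],` +
		`"workdir":"","timeout":120,` +
		`"env":{"FLASHDUTY_APP_KEY":"","FLASHDUTY_API_BASE":""}}`)

	a, err := decodeExecArgs(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(a.Verb, a.Args, a.Timeout)
}
```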

ysyneu added 4 commits May 28, 2026 00:16
…ction

Adds executeFlashdutyExec that fork-execs the flashduty CLI with per-call
auth env, isolated from the generic bash code path. Routes TaskOpFlashdutyExec
through the WebSocket handler. Phase 2 of fc-safari CLI adoption.

Phase 2 contract is that only the `flashduty_exec` path carries per-user
Flashduty credentials (FLASHDUTY_APP_KEY, FLASHDUTY_API_BASE, ...). The
generic `bash` tool must not see them. The previous executeBashCommand
inherited the runner's full os.Environ(), which leaked any FLASHDUTY_*
keys present in the runner process — easy to hit on a dev workstation
that exports them for safari, and impossible to audit in cloud sandboxes
when sandbox-manager forwards arbitrary entries.

Fix: scrubFlashdutySecrets() drops every FLASHDUTY_* entry from the
inherited env before bash sees it. Caller-supplied extraEnv layers on
top, so explicit hand-offs still work (test fixtures + non-secret
overrides unchanged). The flashduty_exec path is untouched — it relies
on safari's per-call extraEnv to re-add FLASHDUTY_APP_KEY in-place.

Caught in Phase 2 E2E: bash subprocess reported FLASHDUTY_APP_KEY=<key>
even though Phase 2 was supposed to remove it. Re-verified with the
scrub: bash sees zero FLASHDUTY_* entries; flashduty tool still returns
real incident data via its own auth pathway.
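
The scrub-then-layer behavior described in this commit message could look roughly like the sketch below. Only the name `scrubFlashdutySecrets` and the "drop every FLASHDUTY_* entry, then apply caller-supplied extraEnv" rule come from the message; the signatures and the `bashEnv` helper are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scrubFlashdutySecrets returns a copy of environ without any FLASHDUTY_*
// entries. The signature is assumed; the PR only names the function.
func scrubFlashdutySecrets(environ []string) []string {
	out := make([]string, 0, len(environ))
	for _, kv := range environ {
		if strings.HasPrefix(kv, "FLASHDUTY_") {
			continue // drop inherited credentials and related settings
		}
		out = append(out, kv)
	}
	return out
}

// bashEnv is a hypothetical helper: scrub the inherited env first, then
// layer caller-supplied entries on top so explicit hand-offs still work.
func bashEnv(inherited []string, extraEnv map[string]string) []string {
	env := scrubFlashdutySecrets(inherited)

	// Sort keys so the result is deterministic.
	keys := make([]string, 0, len(extraEnv))
	for k := range extraEnv {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		env = append(env, k+"="+extraEnv[k])
	}
	return env
}

func main() {
	inherited := []string{"PATH=/usr/bin", "FLASHDUTY_APP_KEY=secret", "HOME=/home/dev"}
	fmt.Println(bashEnv(inherited, map[string]string{"CI": "true"}))
}
```

Scrubbing before the merge, not after, is what keeps explicit extraEnv entries intact while still removing anything inherited from the runner process.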

Two CI issues caught after the flashduty_exec PR push:

1. gofmt complained about the FlashdutyExecArgs struct alignment in
   protocol/messages.go — comment-column re-flow after the new fields.
2. windows-latest unit-test job failed because TestExecuteFlashdutyExec_
   VerbSplitsOnWhitespace shells out to /bin/echo to inspect argv. There
   is no PowerShell equivalent that preserves the same stdout shape, so
   the test is now skipped on GOOS=windows (mirrors the envd POSIX-only
   skips landed earlier).