CodeAct Pattern with Foundry Hosted Agents and ACA Sandboxes

A standalone sample that deploys a Foundry Hosted Agent (FHA) which executes Python code and shell commands inside an Azure Container Apps Sandbox (ACAS) — the "CodeAct" pattern.

The agent is built using the Microsoft Agent Framework (MAF), and is deployed to FHA. It uses the acas-toolkit to route execute_code and run_shell tool calls into a sandbox you own.

More Info:

CodeAct
ACA Sandboxes

Prerequisites

Azure subscription with quota for:
- Cognitive Services (Foundry account) in eastus2 or westus2
- One gpt-5.4 model deployment (default; configurable — see Bicep params, you can always use a different model)
- One ACA sandbox group (preview) in your chosen region
CLI tooling on your laptop:
- Azure CLI — az login once
- Azure Developer CLI (azd) — for azd up
- uv — for host-side Python
Python 3.12 (uv will fetch it if missing — see .python-version)

Deploy from scratch

git clone <this-repo>
cd fha-acas-codeact

# 1. Authenticate
az login
azd auth login

# 2. Provision infra + deploy agent (single command)
azd up

azd up will:

Prompt for an environment name (e.g. dev), location, and Foundry model choice
Provision a fresh resource group containing:
- ACAS sandbox group
- Foundry account + project + chosen model deployment
- Container Registry (for the agent image)
- User-assigned Managed Identity + role assignments
- Log Analytics + Application Insights
Build the agent container in ACR via remote build (no local Docker needed)
Register a new agent version in your Foundry project

See docs/deploy.md for the step-by-step breakdown and troubleshooting.

Invoke the deployed agent

After azd up completes, the post-deploy hook writes the agent endpoint into .azure/<env>/.env. Load it and run the orchestrator:

# Pull the deployed values into your shell
source .azure/$(azd env get-value AZURE_ENV_NAME)/.env

# Sync the host-side venv (uv reads pyproject.toml + uv.lock)
uv sync

# Run a prompt end-to-end (creates a sandbox, invokes the agent, deletes it)
uv run python scripts/orchestrate_codeact.py \
    "compute the first 20 fibonacci numbers"

Expected output: the assistant reports running Python in the sandbox and prints the list [0, 1, 1, 2, 3, ..., 4181].

More invocation modes

# Reuse an existing sandbox (skip create + delete)
uv run python scripts/orchestrate_codeact.py "show /etc/os-release" \
    --sandbox-id <existing-sandbox-id>

# Keep the sandbox around for the next call
uv run python scripts/orchestrate_codeact.py "install requests" \
    --keep-sandbox

# Force a fresh chat session (new per-conversation isolation)
uv run python scripts/orchestrate_codeact.py "import this" --new-session

Validate the deployment

Three scripts verify the agent is healthy after azd up or azd deploy. Run them in this order — each one is independent, prints its own results to stdout, and exits non-zero on failure.

1. Three-case latency probe (FS retention + same-microVM check)

uv run python scripts/three_case_latency.py --gap-s 1

Chains three calls via previous_response_id:

Case	What it tests
1. Cold FHA, warm sandbox	First call after microVM boot; plants a token in `/work/probe.txt`.
2. Warm FHA, same session	Same microVM, same sandbox; should be ~3–5× faster than case 1.
3. Sandbox FS retention	Re-reads the planted token via `cat /work/probe.txt`.

Expected: all three cases land on the same boot_id (FHA microVM pinning works), token is read back verbatim (sandbox FS persists). Exits non-zero if the microVM hops mid-session OR if the token doesn't round-trip.

2. Cold-vs-warm latency profile

uv run python scripts/timing_probe.py --calls 3 --gap-s 1

Makes N chained calls and reports:

Per-call wall-clock latency
Same-boot_id assertion across all calls
Cold-call vs warm-avg delta (≈ FHA cold-start cost)

Typical numbers: cold 12–25s, warm 5s, delta 7–20s. Exits non-zero if the microVM hops (warm measurement invalid).

3. Telemetry / App Insights check

uv run python scripts/query_appinsights.py --minutes 10

Row counts per table (AppTraces, AppRequests, AppDependencies, AppExceptions) for the last 10 min. Zero rows in AppTraces means telemetry isn't wired — see infra/modules/foundry.bicep appInsightsConnection resource. Drill into a table with --table AppExceptions --top 5.

Repo layout

fha-acas-codeact/
├── agent/                    # Code that ships inside the FHA container
│   ├── main.py               # ResponsesHostServer + tool wiring
│   ├── agent.yaml            # Foundry hosted-agent deployment spec
│   ├── agent.manifest.yaml   # Foundry agent manifest (template form)
│   ├── requirements.txt      # Human-readable dep list (source of truth)
│   ├── requirements.lock     # Flat pinned lock (what pip installs in the image)
│   └── Dockerfile
├── infra/                    # Bicep templates for `azd up` (phase 2)
├── scripts/
│   ├── orchestrate_codeact.py     # Single-call smoke test
│   ├── three_case_latency.py      # FS retention + microVM-pinning probe
│   ├── timing_probe.py            # Cold-vs-warm latency profile
│   ├── query_appinsights.py       # Post-deploy telemetry forensics
│   └── grant_agent_roles.sh       # azd postdeploy hook (runtime RBAC)
├── docs/
│   ├── architecture.md
│   ├── deploy.md
│   └── cleanup.md                 # The full teardown procedure
├── pyproject.toml            # Host-side dev deps (orchestrator only)
├── uv.lock
├── .python-version           # 3.12
└── azure.yaml                # azd service definition (phase 2)

The container's deps and the orchestrator's deps are intentionally separate stacks — see docs/architecture.md for why.

Tear down

See docs/cleanup.md for the full procedure.

azd down --purge alone is not enough. It leaks:

The Agent Blueprint Entra app registration + service principal (persists in your tenant after the resource group is gone).
A live FHA agent with active Responses sessions causes the Foundry account purge to return HTTP 409 Conflict.
The Log Analytics workspace soft-delete may need explicit purging.

docs/cleanup.md walks through the 7-step procedure with reasoning, gotchas, and a verification checklist that confirms everything is actually gone.

License

MIT — see LICENSE.

Other

Expected baseline noise: one requests.exceptions.ConnectionError per microVM cold-boot, target host 169.254.169.254:80 (IMDS). FHA microVMs don't expose IMDS; identity flows through the Foundry token broker. Always present, always benign. Any other exception type deserves triage.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CodeAct Pattern with Foundry Hosted Agents and ACA Sandboxes

More Info:

Prerequisites

Deploy from scratch

Invoke the deployed agent

More invocation modes

Validate the deployment

1. Three-case latency probe (FS retention + same-microVM check)

2. Cold-vs-warm latency profile

3. Telemetry / App Insights check

Repo layout

Tear down

License

Other

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
agent		agent
docs		docs
infra		infra
scripts		scripts
.gitignore		.gitignore
.python-version		.python-version
LICENSE		LICENSE
README.md		README.md
azure.yaml		azure.yaml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

CodeAct Pattern with Foundry Hosted Agents and ACA Sandboxes

More Info:

Prerequisites

Deploy from scratch

Invoke the deployed agent

More invocation modes

Validate the deployment

1. Three-case latency probe (FS retention + same-microVM check)

2. Cold-vs-warm latency profile

3. Telemetry / App Insights check

Repo layout

Tear down

License

Other

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages