fix(template-15): deploy fails with HTTP 409 because we PUT a second account-level capability host that the Cognitive Services resource provider already auto-created (closes #312) #753
Open
KazuOnuki wants to merge 1 commit into
…lityHost call
Network-injected accounts (properties.networkInjections[].scenario='agent')
cause the Cognitive Services resource provider (Microsoft.CognitiveServices)
to auto-create an account-level capabilityHost named
{accountName}@aml_aiagentservice ~5s after the account PUT. This template
is network-secured-only and always passes that property, so any subsequent
PUT of a second account-level capabilityHost (caphostacct in microsoft-foundry#261) fails
with 409 'for the same ClientId'.
Remove the module call, related vars, and dependsOn entry from main.bicep,
and update README.md to describe the auto-create flow. The .bicep module
file is retained for non-network-secured sibling templates.
Verified twice 2026-06-04 with fresh deploys in a clean subscription
(japaneast): both variants Succeeded end-to-end, caphostacct never PUT,
project capabilityHost bound to all three connections.
Closes microsoft-foundry#312
TL;DR
- The `15-private-network-standard-agent-setup` template fails with HTTP 409 on the `caphostacct` step.
- The Cognitive Services resource provider (`Microsoft.CognitiveServices`) silently auto-creates an account-level capability host named `{account}@aml_aiagentservice` whenever the account is created with `networkInjections.scenario='agent'` (which this template always does). Our explicit `addAccountCapabilityHost` module then tries to PUT a second one called `caphostacct`, but only one capability host per account is allowed, so the resource provider rejects it.
- This PR removes the `addAccountCapabilityHost` call from this template's `main.bicep`. The auto-created `@aml_aiagentservice` capability host is sufficient: the project capability host binds to it cleanly and the whole deploy succeeds.

The bug you'll hit
Deploy this template against a clean subscription today, and the `caphostacct` sub-deployment fails with an HTTP 409 from the Cognitive Services API; the error text includes `for the same ClientId`.

This is the symptom reported in #312 and several other open issues (#254, #255, #265 all show the same `for the same ClientId` text).

Why it happens
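The resource provider has a hidden behavior: when an account is created with `networkInjections.scenario='agent'`, it auto-creates an account-level capability host named `{account}@aml_aiagentservice`.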
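As a rough illustration of the trigger, the account-level property in question might be declared as in the sketch below. The parameter names (`accountName`, `agentSubnetId`), the `kind`/`sku` values, and the surrounding properties are assumptions for illustration and are not copied from `ai-account-identity.bicep`; only the `networkInjections` fields and the `2025-04-01-preview` API version come from this PR.

```bicep
// Hypothetical parameters for illustration; the template's real names may differ.
param accountName string
param location string = resourceGroup().location
param agentSubnetId string

resource account 'Microsoft.CognitiveServices/accounts@2025-04-01-preview' = {
  name: accountName
  location: location
  kind: 'AIServices'
  sku: {
    name: 'S0'
  }
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    customSubDomainName: accountName
    allowProjectManagement: true
    publicNetworkAccess: 'Disabled'
    // The property this PR is about: with scenario 'agent', the resource
    // provider was observed to auto-create {accountName}@aml_aiagentservice.
    networkInjections: [
      {
        scenario: 'agent'
        subnetArmId: agentSubnetId
        useMicrosoftManagedNetwork: false
      }
    ]
  }
}
```

Note that the sketch declares no capability host resource at all; per the observations in this PR, the account-level capability host appears shortly after the account PUT anyway.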
Once that auto-created capability host exists, any other account-level capability host PUT, even with a different name like `caphostacct`, returns 409, because each account can hold only one capability host (documented constraint: "Each account can have only one active capability host. If you try to create a second capability host with a different name at the same scope, you'll receive a 409 error").

I verified the auto-create trigger on a clean subscription:
- Account created with `networkInjections.scenario='agent'` (as in this template's `ai-account-identity.bicep`): `{name}@aml_aiagentservice` appears.

So the moment this template's `ai-account-identity.bicep` finishes, we already have an account capability host, and the `addAccountCapabilityHost` step is guaranteed to collide.

Why the existing code looked correct
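For reference, an explicit account-level capability host declaration of the kind that collides might look like the sketch below. The resource shape (`capabilityHostKind: 'Agents'`) and the `accountName` parameter are assumptions for illustration; the actual contents of `add-account-capability-host.bicep` may differ.

```bicep
// Hypothetical parameter; the real module's parameters may differ.
param accountName string

// Reference to the already-deployed account.
resource account 'Microsoft.CognitiveServices/accounts@2025-04-01-preview' existing = {
  name: accountName
}

// Explicit account-level capability host. On an account created with
// networkInjections.scenario = 'agent', this is the PUT that returns 409,
// because {accountName}@aml_aiagentservice already occupies the single slot.
resource accountCapabilityHost 'Microsoft.CognitiveServices/accounts/capabilityHosts@2025-04-01-preview' = {
  parent: account
  name: 'caphostacct'
  properties: {
    capabilityHostKind: 'Agents'
  }
}
```

Nothing in a declaration like this is wrong in isolation, which is part of why the conflict was easy to miss.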
`addAccountCapabilityHost` was added in #261 to support BYO / basic-setup scenarios where the account does not auto-create a capability host. That intent is still valid for sibling templates. The author had no way to know that this template, because of its `networkInjections.scenario='agent'` flag, already gets one for free, because:

- The `networkInjections` schema (`scenario`, `subnetArmId`, `useMicrosoftManagedNetwork`) is documented in the Bicep / ARM reference and in the REST API spec (2025-04-01-preview, PR #32877 Mar 2025, PR #36395 Aug 2025).
- The side effect of `scenario='agent'` (auto-creating `@aml_aiagentservice`) is not documented anywhere public: not in virtual-networks, not in use-your-own-resources.

Two unrelated edits (adding `networkInjections` here, and adding `addAccountCapabilityHost` in #261) therefore silently collide. This PR also adds an inline block comment in `main.bicep` so the next maintainer doesn't re-introduce the same conflict.

The fix (concrete changes)
In `infrastructure/infrastructure-setup-bicep/15-private-network-standard-agent-setup/`:

`main.bicep` (33 ins / 25 del):
- Removed the `addAccountCapabilityHost` module block (the one that PUTs `caphostacct`).
- Removed the variables that only served its `scope:`: `useExistingAccount`, `existingAccountIdParts`, `existingAccountSubscriptionId`, `existingAccountResourceGroupName`.
- Removed the `addAccountCapabilityHost` entry from `addProjectCapabilityHost.dependsOn`.
- Added an inline block comment noting that the resource provider auto-creates `{account}@aml_aiagentservice` because we set `networkInjections.scenario='agent'`, and that the project capability host below binds to that auto-created one.

`README.md` (6 lines changed):
- Changed the description of the account capability host from a reference to `add-account-capability-host.bicep` to "auto-created by the resource provider via `networkInjections.scenario='agent'`".

After this PR, `addProjectCapabilityHost` still depends on `aiProjectModule`, which depends on `aiAccount` (which is when the auto-create fires). So the ordering is preserved without `addAccountCapabilityHost` in the chain.

What I intentionally did NOT delete
`modules-network-secured/add-account-capability-host.bicep` is kept even though this template no longer calls it. Reason: PR #261 added it for BYO / basic-setup templates that do not pass `networkInjections.scenario='agent'` and therefore really do need to declare their own account capability host. Removing the file would regress those templates. `grep -r add-account-capability-host` confirms no other current external callers, but the file is preserved for the documented BYO scenario.

Validation
- Auto-create fires with only `networkInjections.scenario='agent'` (no Portal, no SDK, no prior deploy).
- The 409 `for the same ClientId` error was reproduced.
- An account without `networkInjections` does NOT auto-create any capability host (4 control accounts).
- Observed sequence: account PUT → `VirtualWorkspace/CreateOrUpdateWorkspace` 202 → `NotifyCapabilityHost` 202 (same `operation_Id`) → `{name}@aml_aiagentservice` visible via `GET /capabilityHosts`.
- Fresh deploy (`japaneast`, 2026-06-04), variant A (`if (false)` guard): `{account}@aml_aiagentservice` exists, the project capability host is bound to the vectorStore/storage/threadStorage connections, and a GET on `caphostacct` returns 404 (because we never PUT it).
- `az bicep build main.bicep`: `main.json` contains no `caphostacct` / `addAccountCapabilityHost` / `add-account-capability-host` references.

Related issues
- #312, #254, #255, #265: same `for the same ClientId` 409 error string.
- #261 ("Force-replace project capability host on changes"): this PR keeps the `add-account-capability-host.bicep` module file, preserving #261's BYO / basic-setup intent for sibling templates.