fix(swift-vnet): reland #5834 with shellIdentity on mgmt Shell steps (ARO-28045)#5889
Conversation
|
Skipping CI for Draft Pull Request. |
d3fa1ee to
cc2a49f
Compare
…teps Every Shell step's shellIdentity is effectively mandatory: sdp-pipelines EV2RA manifest generation validates it unconditionally. ARO-HCP CI does not run that generation, so Shell steps missing shellIdentity slip through and only fail the bump after merge (as happened in Azure#5834). Declare shellIdentity on the dev-ci offenders so the repo is fully compliant ahead of making shellIdentity required in the ARO-Tools pipeline schema (AROSLSRE-1380): - dev-ci/e2e-subscription-rbac 'ci-bot-secret-{int,stg,prod}': reference a new outputOnly 'global-identity' step (output-opstool-global-identity.bicep, which already existed with its bicepparam) to expose globalMSIId. shellIdentity is only consumed by EV2; templatize's local executor ignores it, so dev/CI behaviour is unchanged. The swift-vnet and mgmt-solo 'fixes' Shell steps are handled in the reland PR Azure#5889, which already edits both mgmt pipeline files. Validated with 'templatize pipeline validate' against topology-dev-ci.yaml. AROSLSRE-1380
cc2a49f to
c734a2b
Compare
Relands Azure#5834 (reverted in Azure#5888) unchanged. Creates and tags the Swift management VNet via a managed-identity ACI launched from a synchronous `swift-vnet` Shell step, replacing the Azure-Policy-banned deploymentScripts approach. The follow-up commit adds the shellIdentity that was missing from the Shell step and caused the original revert.
A Shell step's shellIdentity is validated unconditionally by sdp-pipelines
EV2RA manifest generation:
step Management.Infra.management.swift-vnet: error validating step:
invalid shell identity: variable must specify config ref, value, or input chain
ARO-HCP CI does not run EV2RA generation, so the omission in Azure#5834 passed
all PR checks and only broke the Global bump after merge (the reason for
the Azure#5888 revert).
Declare shellIdentity: globalMSIId on the Shell steps in both
mgmt-pipeline.yaml and mgmt-solo-pipeline.yaml (swift-vnet in both, plus
the pre-existing `fixes` step in mgmt-solo-pipeline.yaml which had the
same omission). In EV2 the outer shell runs as globalMSI (harmless: the
container still performs the create/tag); templatize's local executor
ignores shellIdentity, so dev/CI behaviour is unchanged.
AROSLSRE-1380 makes shellIdentity required in the ARO-Tools pipeline
schema so this class of bug fails at PR time.
c734a2b to
3b3c6ed
Compare
There was a problem hiding this comment.
Pull request overview
This PR relands the Swift management VNet create/tag flow by replacing the Azure-Policy-banned deploymentScripts approach with a synchronous Shell step (swift-vnet) that launches an ACI container group assigned globalMSI to perform the Swift stamp tag write, and it adds the missing shellIdentity wiring to satisfy EV2RA manifest validation.
Changes:
- Introduce
scripts/swift-vnet.shto create/tag the Swift mgmt VNet by running the write inside an ACI container group assignedglobalMSI. - Split Swift VNet RBAC role assignments into a dedicated Bicep module/template (
vnet-rbac.bicep,swift-vnet-permissions.bicep) and update VNet module behavior for Swift to treat the VNet asexisting. - Update
mgmt-pipeline.yamlandmgmt-solo-pipeline.yamlto addswift-vnet-permissions+swift-vnetsteps and declareshellIdentityfor Shell steps that require it.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| dev-infrastructure/templates/swift-vnet-permissions.bicep | New template that conditionally applies RBAC grants for the Swift VNet identity. |
| dev-infrastructure/configurations/swift-vnet-permissions.tmpl.bicepparam | New bicepparam template wiring deploymentMsiId + enableSwift into the RBAC template. |
| dev-infrastructure/modules/network/vnet-rbac.bicep | New module that creates Network Contributor + Tag Contributor assignments for the Swift identity. |
| dev-infrastructure/modules/network/vnet.bicep | Removes deployment-script-based Swift path; for Swift, treats VNet as existing (created/tagged out-of-band). |
| dev-infrastructure/scripts/swift-vnet.sh | New script to launch ACI under globalMSI, poll for completion, stream logs, and clean up. |
| dev-infrastructure/mgmt-pipeline.yaml | Adds swift-vnet-permissions + swift-vnet steps and fixes EV2RA validation via shellIdentity. |
| dev-infrastructure/mgmt-solo-pipeline.yaml | Adds subscription lookup + Swift VNet steps and adds missing shellIdentity on fixes. |
| dev-infrastructure/templates/mgmt-cluster.bicep | Stops passing deploymentMsiId into the VNet module now that it no longer accepts it. |
| dev-infrastructure/templates/svc-cluster.bicep | Stops passing deploymentMsiId into the VNet module for non-Swift svc deployments. |
| dev-infrastructure/templates/opstool-cluster.bicep | Stops passing deploymentMsiId into the VNet module for non-Swift opstool deployments. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: geoberle, raelga The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/test e2e-parallel |
…ns (ARO-28100) The swift-vnet step creates an ACI (az container create) in the management subscription, but Microsoft.ContainerInstance was not in mgmt.subscription.providers, so a fresh management subscription with Swift enabled could fail az container create with MissingSubscriptionRegistration. Add Microsoft.ContainerInstance to the mgmt providers list so the rpRegistration step registers it before swift-vnet runs, making fresh-subscription rollouts deterministic. Rendered dev configs regenerated via 'make -C config/ materialize'. Existing int/stg/prod mgmt subscriptions already have the provider registered (#5834's e2e ran the container), so this is a fast-follow hardening, not a fix for the current reland (#5889). Flagged by Copilot review on #5889.
…SLSRE-1380) Bumps the github.com/Azure/ARO-Tools modules from 2277df76598b (2026-06-17) to 4612291d5420 (2026-07-03), which makes shellIdentity a required field on shellStepBase in the pipeline schema (Azure/ARO-Tools#262). This lands the enforcement so that a Shell step missing shellIdentity now fails ARO-HCP 'make validate-config-pipelines' at PR time, instead of only breaking the sdp-pipelines EV2 bump after merge (the regression that hit ARO-28045 / Azure#5834). All in-repo pipelines validate cleanly against the stricter schema; the known offenders were already fixed in Azure#5889 and Azure#5890.
ARO-28045 (epic ARO-28038; follow-ups ARO-28066, ARO-28082)
Relands #5834 (reverted in #5888) with the EV2 fix. Rebased onto
main— now a single clean commit.What
Re-applies #5834 (Swift mgmt-VNet create/tag via a managed-identity ACI launched from a single synchronous
swift-vnetShell step, replacing the Azure-Policy-banneddeploymentScripts) and adds the missingshellIdentityon the Shell steps in bothmgmt-pipeline.yamlandmgmt-solo-pipeline.yaml:swift-vnetin both mgmt pipelines (the step reverted from fix: create/tag Swift VNet via a managed-identity container (ARO-28045) #5834).fixesinmgmt-solo-pipeline.yaml, which had the same omission — folded in here so allshellIdentitychanges to that file live in a single PR (the dev-ci offenders are handled separately in #5890).Why
#5834 was reverted (#5888) because its
swift-vnetShell step omittedshellIdentity, which sdp-pipelines EV2RA manifest generation validates unconditionally. The bump failed at manifest resolution:ARO-HCP CI does not exercise EV2RA generation, so it slipped through. Declaring
shellIdentity: globalMSIId(same pattern asupgrade-aks-cluster/cleanup-prometheus-pvc, and matching the step's ownDEPLOYMENT_MSI_IDinput) satisfies the validator. In EV2 the outer shell runs as globalMSI (harmless — the container still performs the create/tag); templatize's local executor ignoresshellIdentity, so e2e behaviour is identical to the proven #5834 run.Testing
yamllint -c .yamllint.ymlon both pipelines (clean).Special notes for your reviewer
Now rebased onto
mainafter the revert (#5888) merged — the diff is exactly the #5834 reintroduction plus theshellIdentityadditions, in one commit. The schema change that makesshellIdentityrequired lives in Azure/ARO-Tools#262 (AROSLSRE-1380), held until this and #5890 merge.PR Checklist