Skip to content

fix(swift-vnet): reland #5834 with shellIdentity on mgmt Shell steps (ARO-28045)#5889

Merged
openshift-merge-bot[bot] merged 2 commits into
Azure:mainfrom
raelga:rael/reland-5834-swift-vnet-shellidentity-ARO-28045
Jul 2, 2026
Merged

fix(swift-vnet): reland #5834 with shellIdentity on mgmt Shell steps (ARO-28045)#5889
openshift-merge-bot[bot] merged 2 commits into
Azure:mainfrom
raelga:rael/reland-5834-swift-vnet-shellidentity-ARO-28045

Conversation

@raelga

@raelga raelga commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

ARO-28045 (epic ARO-28038; follow-ups ARO-28066, ARO-28082)

Relands #5834 (reverted in #5888) with the EV2 fix. Rebased onto main — now a single clean commit.

What

Re-applies #5834 (Swift mgmt-VNet create/tag via a managed-identity ACI launched from a single synchronous swift-vnet Shell step, replacing the Azure-Policy-banned deploymentScripts) and adds the missing shellIdentity on the Shell steps in both mgmt-pipeline.yaml and mgmt-solo-pipeline.yaml:

    shellIdentity:
      input:
        resourceGroup: global
        step: output
        name: globalMSIId

Why

#5834 was reverted (#5888) because its swift-vnet Shell step omitted shellIdentity, which sdp-pipelines EV2RA manifest generation validates unconditionally. The bump failed at manifest resolution:

step Management.Infra.management.swift-vnet: error validating step:
invalid shell identity: variable must specify config ref, value, or input chain

ARO-HCP CI does not exercise EV2RA generation, so it slipped through. Declaring shellIdentity: globalMSIId (same pattern as upgrade-aks-cluster / cleanup-prometheus-pvc, and matching the step's own DEPLOYMENT_MSI_ID input) satisfies the validator. In EV2 the outer shell runs as globalMSI (harmless — the container still performs the create/tag); templatize's local executor ignores shellIdentity, so e2e behaviour is identical to the proven #5834 run.

Testing

Special notes for your reviewer

Now rebased onto main after the revert (#5888) merged — the diff is exactly the #5834 reintroduction plus the shellIdentity additions, in one commit. The schema change that makes shellIdentity required lives in Azure/ARO-Tools#262 (AROSLSRE-1380), held until this and #5890 merge.

PR Checklist

  • PR is scoped to a single task
  • Title follows Conventional Commits
  • Summary explains the "Why"
  • Linked to relevant ticket/issue
  • Self-reviewed the diff
  • CI/CD checks are passing (ignore Tide)
  • Commit history is clean
  • Tricky code blocks are commented

@openshift-ci

openshift-ci Bot commented Jul 2, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@raelga raelga force-pushed the rael/reland-5834-swift-vnet-shellidentity-ARO-28045 branch from d3fa1ee to cc2a49f Compare July 2, 2026 10:28
raelga added a commit to raelga/ARO-HCP that referenced this pull request Jul 2, 2026
…teps

Every Shell step's shellIdentity is effectively mandatory: sdp-pipelines
EV2RA manifest generation validates it unconditionally. ARO-HCP CI does
not run that generation, so Shell steps missing shellIdentity slip
through and only fail the bump after merge (as happened in Azure#5834).

Declare shellIdentity on the dev-ci offenders so the repo is fully
compliant ahead of making shellIdentity required in the ARO-Tools
pipeline schema (AROSLSRE-1380):

- dev-ci/e2e-subscription-rbac 'ci-bot-secret-{int,stg,prod}': reference
  a new outputOnly 'global-identity' step (output-opstool-global-identity.bicep,
  which already existed with its bicepparam) to expose globalMSIId.

shellIdentity is only consumed by EV2; templatize's local executor
ignores it, so dev/CI behaviour is unchanged.

The swift-vnet and mgmt-solo 'fixes' Shell steps are handled in the
reland PR Azure#5889, which already edits both mgmt pipeline files.

Validated with 'templatize pipeline validate' against topology-dev-ci.yaml.

AROSLSRE-1380
@raelga raelga changed the title fix(swift-vnet): reland #5834 with shellIdentity on the swift-vnet Shell step (ARO-28045) fix(swift-vnet): reland #5834 with shellIdentity on mgmt Shell steps (ARO-28045) Jul 2, 2026
@raelga raelga force-pushed the rael/reland-5834-swift-vnet-shellidentity-ARO-28045 branch from cc2a49f to c734a2b Compare July 2, 2026 10:36
@raelga raelga marked this pull request as ready for review July 2, 2026 10:37
Copilot AI review requested due to automatic review settings July 2, 2026 10:37
@openshift-ci openshift-ci Bot requested review from janboll and weherdh July 2, 2026 10:37
raelga added 2 commits July 2, 2026 10:38
Relands Azure#5834 (reverted in Azure#5888) unchanged.

Creates and tags the Swift management VNet via a managed-identity ACI
launched from a synchronous `swift-vnet` Shell step, replacing the
Azure-Policy-banned deploymentScripts approach.

The follow-up commit adds the shellIdentity that was missing from the
Shell step and caused the original revert.
A Shell step's shellIdentity is validated unconditionally by sdp-pipelines
EV2RA manifest generation:

    step Management.Infra.management.swift-vnet: error validating step:
    invalid shell identity: variable must specify config ref, value, or input chain

ARO-HCP CI does not run EV2RA generation, so the omission in Azure#5834 passed
all PR checks and only broke the Global bump after merge (the reason for
the Azure#5888 revert).

Declare shellIdentity: globalMSIId on the Shell steps in both
mgmt-pipeline.yaml and mgmt-solo-pipeline.yaml (swift-vnet in both, plus
the pre-existing `fixes` step in mgmt-solo-pipeline.yaml which had the
same omission). In EV2 the outer shell runs as globalMSI (harmless: the
container still performs the create/tag); templatize's local executor
ignores shellIdentity, so dev/CI behaviour is unchanged.

AROSLSRE-1380 makes shellIdentity required in the ARO-Tools pipeline
schema so this class of bug fails at PR time.
@raelga raelga force-pushed the rael/reland-5834-swift-vnet-shellidentity-ARO-28045 branch from c734a2b to 3b3c6ed Compare July 2, 2026 10:39

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR relands the Swift management VNet create/tag flow by replacing the Azure-Policy-banned deploymentScripts approach with a synchronous Shell step (swift-vnet) that launches an ACI container group assigned globalMSI to perform the Swift stamp tag write, and it adds the missing shellIdentity wiring to satisfy EV2RA manifest validation.

Changes:

  • Introduce scripts/swift-vnet.sh to create/tag the Swift mgmt VNet by running the write inside an ACI container group assigned globalMSI.
  • Split Swift VNet RBAC role assignments into a dedicated Bicep module/template (vnet-rbac.bicep, swift-vnet-permissions.bicep) and update VNet module behavior for Swift to treat the VNet as existing.
  • Update mgmt-pipeline.yaml and mgmt-solo-pipeline.yaml to add swift-vnet-permissions + swift-vnet steps and declare shellIdentity for Shell steps that require it.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
dev-infrastructure/templates/swift-vnet-permissions.bicep New template that conditionally applies RBAC grants for the Swift VNet identity.
dev-infrastructure/configurations/swift-vnet-permissions.tmpl.bicepparam New bicepparam template wiring deploymentMsiId + enableSwift into the RBAC template.
dev-infrastructure/modules/network/vnet-rbac.bicep New module that creates Network Contributor + Tag Contributor assignments for the Swift identity.
dev-infrastructure/modules/network/vnet.bicep Removes deployment-script-based Swift path; for Swift, treats VNet as existing (created/tagged out-of-band).
dev-infrastructure/scripts/swift-vnet.sh New script to launch ACI under globalMSI, poll for completion, stream logs, and clean up.
dev-infrastructure/mgmt-pipeline.yaml Adds swift-vnet-permissions + swift-vnet steps and fixes EV2RA validation via shellIdentity.
dev-infrastructure/mgmt-solo-pipeline.yaml Adds subscription lookup + Swift VNet steps and adds missing shellIdentity on fixes.
dev-infrastructure/templates/mgmt-cluster.bicep Stops passing deploymentMsiId into the VNet module now that it no longer accepts it.
dev-infrastructure/templates/svc-cluster.bicep Stops passing deploymentMsiId into the VNet module for non-Swift svc deployments.
dev-infrastructure/templates/opstool-cluster.bicep Stops passing deploymentMsiId into the VNet module for non-Swift opstool deployments.

Comment thread dev-infrastructure/mgmt-pipeline.yaml
Comment thread dev-infrastructure/mgmt-solo-pipeline.yaml
Copilot AI review requested due to automatic review settings July 2, 2026 10:43
@geoberle

geoberle commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

/lgtm

@openshift-ci

openshift-ci Bot commented Jul 2, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: geoberle, raelga

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated no new comments.

@raelga

raelga commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator Author

/test e2e-parallel

@openshift-merge-bot openshift-merge-bot Bot merged commit a26b05d into Azure:main Jul 2, 2026
17 checks passed
openshift-merge-bot Bot pushed a commit that referenced this pull request Jul 2, 2026
…ns (ARO-28100)

The swift-vnet step creates an ACI (az container create) in the
management subscription, but Microsoft.ContainerInstance was not in
mgmt.subscription.providers, so a fresh management subscription with
Swift enabled could fail az container create with
MissingSubscriptionRegistration.

Add Microsoft.ContainerInstance to the mgmt providers list so the
rpRegistration step registers it before swift-vnet runs, making
fresh-subscription rollouts deterministic. Rendered dev configs
regenerated via 'make -C config/ materialize'.

Existing int/stg/prod mgmt subscriptions already have the provider
registered (#5834's e2e ran the container), so this is a fast-follow
hardening, not a fix for the current reland (#5889).

Flagged by Copilot review on #5889.
raelga added a commit to raelga/ARO-HCP that referenced this pull request Jul 3, 2026
…SLSRE-1380)

Bumps the github.com/Azure/ARO-Tools modules from 2277df76598b (2026-06-17)
to 4612291d5420 (2026-07-03), which makes shellIdentity a required field on
shellStepBase in the pipeline schema (Azure/ARO-Tools#262).

This lands the enforcement so that a Shell step missing shellIdentity now
fails ARO-HCP 'make validate-config-pipelines' at PR time, instead of only
breaking the sdp-pipelines EV2 bump after merge (the regression that hit
ARO-28045 / Azure#5834).

All in-repo pipelines validate cleanly against the stricter schema; the known
offenders were already fixed in Azure#5889 and Azure#5890.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants