chore(grafana): move workspace lookup to pipeline shell step (ARO-28040)#5837
raelga wants to merge 5 commits into
Pull request overview
This PR removes the ARM deploymentScripts-based Grafana Azure Monitor workspace integration lookup/preservation flow and replaces it with a global pipeline Shell step that patches the Managed Grafana resource after the global infra deployment. This is intended to avoid deployment scripts that require shared-key-backed storage access, which is blocked by policy.
Changes:
- Remove the integration-lookup Bicep module (and its PowerShell payload) and stop supplying grafanaIntegrations via the Grafana ARM resource.
- Add a pipeline Shell step plus a Bash script to read and PATCH existing Azure Monitor workspace integrations.
- Add an additional role assignment for the Grafana manager identity to support the new flow.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| dev-infrastructure/templates/global-infra.bicep | Removes the integration lookup module and stops passing integrations into Grafana deployment. |
| dev-infrastructure/scripts/grafana-integrations.sh | New Bash script to query and PATCH Grafana workspace integrations. |
| dev-infrastructure/modules/grafana/integration-lookup.bicep | Deletes the ARM deploymentScripts lookup module. |
| dev-infrastructure/modules/grafana/instance.bicep | Removes grafanaIntegrations from the Grafana resource and adds a new role assignment for the manager identity. |
| dev-infrastructure/modules/grafana/grafanaIntegrationsLookup.ps1 | Deletes the PowerShell payload used by the deployment script. |
| dev-infrastructure/global-pipeline.yaml | Adds a global Shell step to run grafana-integrations.sh after global infra deploys. |
| dev-infrastructure/global-pipeline-stg.yaml | Mirrors the same Shell step addition for the STG-global pipeline. |
1173390 to 9487b34
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: raelga.
Redesign: grafanactl-based reconcile (supersedes preserve/PATCH)
Thanks @copilot, all the inline comments target the previous
Instead, a new Validation so far: No further action or commit needed from the reviewer on the removed
9a79563 to 29b4a83
The ARO-28044 mitigation (Azure#5838) commented out the grafana rollout and reader steps to unblock pipelines. This change supersedes that mitigation with a redesigned, deploymentScript-free grafana reconcile, so re-enable grafana first by reverting the mitigation in full.
Introduce a grafanactl reconcile command that owns managed grafana create/update and datasource wiring with dynamic Azure Monitor Workspace discovery, replacing the grafana deploymentScript-based integration lookup. Zone redundancy mirrors determineZoneRedundancy() in common.bicep: Auto and Enabled only enable zone redundancy when the target region exposes availability zones.
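The zone redundancy rule in the commit message above can be sketched as a small pure function. This is only an illustration: the type name, the constant names, the "Disabled" mode, and the zones slice are assumptions made for the example and are not taken from grafanactl or common.bicep.

```go
package main

import "fmt"

// ZoneRedundancyMode is a hypothetical stand-in for the configured setting.
type ZoneRedundancyMode string

const (
	ModeAuto     ZoneRedundancyMode = "Auto"
	ModeEnabled  ZoneRedundancyMode = "Enabled"
	ModeDisabled ZoneRedundancyMode = "Disabled" // assumed third mode
)

// zoneRedundancyEnabled reports whether zone redundancy should be requested.
// Auto and Enabled only turn it on when the region exposes availability
// zones; any other mode leaves it off.
func zoneRedundancyEnabled(mode ZoneRedundancyMode, zones []string) bool {
	switch mode {
	case ModeAuto, ModeEnabled:
		return len(zones) > 0
	default:
		return false
	}
}

func main() {
	fmt.Println(zoneRedundancyEnabled(ModeAuto, []string{"1", "2", "3"}))
	fmt.Println(zoneRedundancyEnabled(ModeEnabled, nil))
	fmt.Println(zoneRedundancyEnabled(ModeDisabled, []string{"1"}))
}
```

Keeping the decision in a function with no Azure dependencies makes it straightforward to unit test against the bicep behavior it is meant to mirror.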
…ntScript (ARO-28040) Replace the grafana integration deploymentScript lookup with a grafanactl reconcile pipeline step and dedicated grafana-rbac bicep template. Removes integration-lookup.bicep and grafanaIntegrationsLookup.ps1, slims instance.bicep / global-infra.bicep, and gates the Front Door role assignment on azureFrontDoorManage && the profile name being set.
29b4a83 to 68243ac
Address PR review feedback:
- The global pipeline grafana step invoked ./grafanactl, but the binary is gitignored and not committed, so the step would fail with file-not-found. Prepend 'make build' (prod + stg) so the binary is built before invocation.
- managedGrafana() always set Tags to a non-nil empty map; an empty tags map on the ARM PUT clears existing tags on the Managed Grafana resource. Leave Tags nil unless a cross-tenant security group value is provided.
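The nil-versus-empty tags distinction from the second item can be illustrated with a standalone sketch. Everything here is hypothetical: buildTags, putBody, the tag key, and the location value are invented for the example, and putBody only imitates a serializer that emits "tags" whenever the map is non-nil. It is not the Azure SDK's marshaling code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildTags returns nil when no security group value is provided, so the
// request carries no "tags" field at all. The tag key is a placeholder.
func buildTags(securityGroup string) map[string]*string {
	if securityGroup == "" {
		return nil
	}
	return map[string]*string{"exampleSecurityGroup": &securityGroup}
}

// putBody imitates a serializer that writes "tags" for any non-nil map,
// including an empty one (assumed behavior for this illustration).
func putBody(tags map[string]*string) string {
	body := map[string]any{"location": "example-region"}
	if tags != nil {
		body["tags"] = tags
	}
	out, err := json.Marshal(body)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(putBody(buildTags("")))          // no "tags" key in the body
	fmt.Println(putBody(map[string]*string{}))   // "tags":{} is sent
	fmt.Println(putBody(buildTags("example-sg"))) // one tag is sent
}
```

The second call shows the shape of the problem described in the commit: an empty but non-nil map still produces a "tags" field, which a PUT would treat as the full desired tag set.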
Addressed the latest Copilot review in 99b78fc:
Two earlier Copilot comments are already satisfied in the current revision: the AFD condition mismatch in Validation: No further action needed from Copilot on these; resolved.
grafanaCmd := &cobra.Command{
	Use:   "grafana",
	Short: "Reconcile an Azure Managed Grafana instance",
	Long:  "Create or update an Azure Managed Grafana instance, discover all succeeded Azure Monitor Workspaces, and reconcile Azure Monitor Workspace integrations.",
	RunE: func(cmd *cobra.Command, args []string) error {
		return opts.Run(cmd.Context())
	},
}
Good catch — fixed in 0e96817. Added Args: cobra.NoArgs to the reconcile grafana subcommand, so unexpected positional arguments are now rejected instead of silently ignored. No further action or commit needed from you — this is resolved.
logger.Info("Creating or updating Managed Grafana", "workspace-count", len(workspaceIDs))
poller, err := o.grafanaClient.BeginCreate(ctx, o.ResourceGroup, o.GrafanaName, o.managedGrafana(workspaceIDs), nil)
if err != nil {
	return fmt.Errorf("failed to start Managed Grafana create/update: %w", err)
Valid edge case — fixed in 0e96817. Context: on main the prior PowerShell deployment script captured the existing Grafana integrations and re-applied them, so it self-healed; this command instead derives the set from a live NewListBySubscriptionPager query, which could transiently return zero. Run now guards that case: when discovery returns zero workspaces it does a GET on the Grafana, and if the instance already exists with integrations it logs a warning and skips the update to avoid wiping them. A genuine first-time create (Grafana does not exist yet, returns 404) still proceeds with an empty set, so bootstrap is unaffected. No new flag was needed. No further action or commit needed from you — this is resolved.
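The guard described in this reply boils down to a three-input decision. The sketch below is a simplified model of that decision, not the actual Run implementation: the action type, the decide function, and its parameters are hypothetical, and the GET against the Grafana resource is reduced to two plain inputs.

```go
package main

import "fmt"

// action is a hypothetical result type for the reconcile decision.
type action string

const (
	actionUpdate action = "update" // proceed with create/update
	actionSkip   action = "skip"   // log a warning and leave Grafana untouched
)

// decide models the guard: skip only when discovery found nothing but the
// existing instance already has integrations that an update would wipe.
// grafanaExists is false when the GET returns 404 (first-time create).
func decide(discovered int, grafanaExists bool, existingIntegrations int) action {
	if discovered == 0 && grafanaExists && existingIntegrations > 0 {
		return actionSkip
	}
	return actionUpdate
}

func main() {
	fmt.Println(decide(0, true, 2))  // transient empty listing: skip
	fmt.Println(decide(0, false, 0)) // bootstrap, Grafana not found: update
	fmt.Println(decide(3, true, 2))  // normal reconcile: update
}
```

Because the existing state only matters when discovery returns zero workspaces, the extra GET is needed only on that path, which matches the reply's note that no new flag was required.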
…tion wipe Add cobra.NoArgs to the reconcile grafana subcommand so unexpected positional arguments are rejected instead of silently ignored. Guard the create/update: if Azure Monitor Workspace discovery returns zero workspaces (e.g. a transient permission/listing glitch) and the Grafana already has integrations, skip the update to avoid wiping them. A genuine first-time create (Grafana does not exist yet) still proceeds.
We need to add this to ARO-Tools and integrate it into the pipeline. Closing this for now; we will iterate again and integrate your work with the other PRs.
ARO-28040
What
Redesigns Grafana reconciliation so grafanactl owns the Azure Managed Grafana create/update flow end-to-end.
The new Grafana command discovers all succeeded Azure Monitor Workspaces (Prometheus), creates or updates the Managed Grafana resource with azureMonitorWorkspaceIntegrations, and reconciles Grafana datasources in the same pipeline step. Bicep no longer creates the Grafana instance or emits AMW lookup outputs, while Grafana RBAC remains declarative in bicep through resource lookups and naming conventions.
This supersedes and reverts the temporary Grafana no-op mitigation in #5838 / ARO-28044: Grafana deployment behavior is re-enabled through grafanactl instead of bicep plus region-pipeline datasource steps.
Why
AME Azure Policy enforces allowSharedKeyAccess=false, which blocks ARM deploymentScripts because they require shared-key-backed storage. The old deploymentScript worked around a bicep limitation: AMW discovery had to produce a complete integration list for the Grafana ARM PUT.
Moving Grafana create/update into grafanactl removes that bicep output contract entirely. grafanactl can look up AMWs directly, apply azureMonitorWorkspaceIntegrations, and reconcile datasources without passing discovered workspace IDs back into bicep.
Testing
Planned for this redesign:
- cd tooling/grafanactl && PATH=$HOME/sdk/go1.25.7/bin:$PATH GO=$HOME/sdk/go1.25.7/bin/go go build ./...
- PATH=$HOME/sdk/go1.25.7/bin:$PATH GO=$HOME/sdk/go1.25.7/bin/go go test ./...
- cd config && PATH=$HOME/sdk/go1.25.7/bin:$PATH GO=$HOME/sdk/go1.25.7/bin/go make materialize
- PATH=$HOME/sdk/go1.25.7/bin:$PATH GO=$HOME/sdk/go1.25.7/bin/go make validate-config-pipelines
- PATH=$HOME/sdk/go1.25.7/bin:$PATH GO=$HOME/sdk/go1.25.7/bin/go make yamlfmt
- /home/rael/bin/bicep build ...
- /home/rael/bin/bicep lint ...
- git diff --exit-code
Special notes for your reviewer
cc @jboll as owner.
Key review point: role management intentionally stays in bicep. Only Grafana resource create/update, AMW integration discovery, and datasource reconciliation move into grafanactl so the AMW lookup no longer needs to become a bicep output.
PR Checklist
If E2E tests are included:
demonstrate that the test is able to detect a defect/error and fail with a proper error message and logs that communicate the nature of the problem.
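For reference, the discovery step described under What (collecting only succeeded Azure Monitor Workspaces before building the integration list) might be sketched as follows. The workspace struct and succeededWorkspaceIDs are hypothetical simplifications, not the SDK's types or grafanactl's code, and sorting the IDs is an added assumption to keep the resulting list deterministic between runs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// workspace is a simplified, hypothetical view of an Azure Monitor Workspace.
type workspace struct {
	ID                string
	ProvisioningState string
}

// succeededWorkspaceIDs keeps only workspaces whose provisioning state is
// "Succeeded" and returns their resource IDs in sorted order.
func succeededWorkspaceIDs(all []workspace) []string {
	ids := make([]string, 0, len(all))
	for _, w := range all {
		if strings.EqualFold(w.ProvisioningState, "Succeeded") {
			ids = append(ids, w.ID)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	all := []workspace{
		{ID: "/subscriptions/example/workspaces/b", ProvisioningState: "Succeeded"},
		{ID: "/subscriptions/example/workspaces/c", ProvisioningState: "Deleting"},
		{ID: "/subscriptions/example/workspaces/a", ProvisioningState: "Succeeded"},
	}
	for _, id := range succeededWorkspaceIDs(all) {
		fmt.Println(id)
	}
}
```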