Skip to content

fix(config): set grafanaZoneRedundantMode for public clouds (AROSLSRE-1379)#5886

Closed
raelga wants to merge 1 commit into
Azure:mainfrom
raelga:rael/fix-grafana-zoneredundancy-public
Closed

fix(config): set grafanaZoneRedundantMode for public clouds (AROSLSRE-1379)#5886
raelga wants to merge 1 commit into
Azure:mainfrom
raelga:rael/fix-grafana-zoneredundancy-public

Conversation

@raelga

@raelga raelga commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

AROSLSRE-1379

What

Adds grafanaZoneRedundantMode: Disabled to the base defaults.monitoring block in config/config.yaml, alongside the existing grafanaName.

Why

The GrafanaManage step in dev-infrastructure/global-pipeline.yaml renders zoneRedundancy: "{{ .monitoring.grafanaZoneRedundantMode }}". monitoring.grafanaZoneRedundantMode is a required config field, but only the dev-cloud block defined it — the base defaults never did. For public clouds (int/stg/prod) the value was absent, rendered to an empty string, and failed EV2 pipeline-schema validation downstream in sdp-pipelines:

at '/resourceGroups/0/steps/1/zoneRedundancy': value must be one of 'Enabled', 'Disabled'

This surfaced in sdp-pipelines build 170669006. The step was re-enabled in 448930dc8 ("Re-enable grafana with refactoring"); while grafana pipeline steps were disabled the missing base value was dormant. ARO-HCP CI only validates dev/pers rendered configs (which have the value via the dev override), so the gap only appears in the public int/stg/prod EV2 generation.

Disabled matches the dev override, Azure Managed Grafana's default, and the existing grandfathered instances. Zone redundancy is immutable at creation, so a mismatched value would risk a reconcile conflict.

Testing

Config-only change; validated by rendering and materialization:

  • Rendered the msft int config locally (templatize configuration render ... --cloud public --environment int) and confirmed monitoring.grafanaZoneRedundantMode: Disabled now resolves (was previously absent).
  • make -C config materialize produces no change to dev/pers rendered outputs (dev already overrides to Disabled), confirming the base default is consistent.
  • Confirmed the failing pipeline step (global-pipeline.yaml line 48) now resolves zoneRedundancy: Disabled instead of an empty string.

Special notes for your reviewer

  • monitoring.grafanaMajorVersion is also empty for public clouds but is optional in the schema, so it does not block validation and is out of scope here.
  • No zone-redundancy policy change (e.g. HA Enabled for prod) is made — that can be a follow-up if desired.

PR Checklist

  • PR is scoped to a single task (no mixed concerns)
  • Title follows Conventional Commits format
  • Summary explains the "Why" behind the change
  • Linked to relevant ticket/issue
  • Screenshots included (if graph/UI/metrics changes)
  • Self-reviewed the diff
  • CI/CD checks are passing (ignore Tide)
  • Draft PR used for WIP (if applicable)
  • Commit history is clean (rebased/squashed)
  • Tricky code blocks are commented
  • Specific reviewers tagged
  • All comment threads resolved before merge

…-1379)

The GrafanaManage pipeline step renders
zoneRedundancy: "{{ .monitoring.grafanaZoneRedundantMode }}", but the base
defaults.monitoring block never defined grafanaZoneRedundantMode (only the
dev-cloud block did). For public clouds (int/stg/prod) the required field was
absent, rendered to an empty string, and failed EV2 pipeline-schema validation:

  at '/resourceGroups/0/steps/1/zoneRedundancy': value must be one of
  'Enabled', 'Disabled'

Add grafanaZoneRedundantMode: Disabled to the base defaults. Disabled matches
the dev override, Azure Managed Grafana's default, and existing grandfathered
instances (zone redundancy is immutable at creation).
Copilot AI review requested due to automatic review settings July 2, 2026 07:57
@openshift-ci

openshift-ci Bot commented Jul 2, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: raelga

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes EV2 pipeline-schema validation for public-cloud environments by ensuring the required monitoring.grafanaZoneRedundantMode config value is always present in the base defaults, so GrafanaManage.zoneRedundancy does not render as an empty string.

Changes:

  • Add defaults.monitoring.grafanaZoneRedundantMode: Disabled to config/config.yaml, with explanatory comments.

@raelga

raelga commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator Author

Closing — wrong diagnosis. This is not a missing-default problem.

The EV2 manifests generate precompile substitutes a dunder sentinel for every config leaf and then validates the pipeline against pipeline.schema.v1.json (strict):

generate.go:118  dunderCfg := EV2Mapping(system, NewDunderPlaceholders())   // value -> __key__
generate.go:327  NewPipelineFromFile(path, dunderCfg)
pipeline.go:67   PreprocessContent(...)   // {{ .monitoring.grafanaZoneRedundantMode }} -> __monitoring.grafanaZoneRedundantMode__
pipeline.go:72   ValidatePipelineSchema() // strict; zoneRedundancy enum = Enabled/Disabled

zoneRedundancy is a strict enum, so the sentinel __monitoring.grafanaZoneRedundantMode__ fails validation regardless of whether the config provides a value. Adding grafanaZoneRedundantMode: Disabled does not help.

Real root cause: PR #5843 moved Grafana provisioning to a GrafanaManage pipeline step field sourced from a {{ }} config template, which the EV2 pipeline-schema precompile cannot validate. Fixed by reverting #5843 in #5887 (restores Bicep-based provisioning, where the same values are passed as simple-field-access bicepparams — the supported pattern per ARO-27904 / #5863).

Superseded by #5887.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants