13 changes: 3 additions & 10 deletions pkg/types/defaults/machinepools.go
@@ -40,18 +40,11 @@ func SetMachinePoolDefaults(p *types.MachinePool, platform *types.Platform, fgat
         }
     }

-    // Set management to ClusterAPI if the appropriate feature gate is enabled and management is unspecified
-    if p.Management == "" {
-        if p.Name == types.MachinePoolControlPlaneRoleName && fgates.Enabled(features.FeatureGateClusterAPIControlPlaneInstall) {
-            p.Management = types.ClusterAPI
-        }
-        if p.Name == types.MachinePoolComputeRoleName && fgates.Enabled(features.FeatureGateClusterAPIComputeInstall) {
-            p.Management = types.ClusterAPI
-        }
-    }
-
     switch platform.Name() {
     case aws.Name:
+        if p.Management == "" && p.Name == types.MachinePoolComputeRoleName && fgates.Enabled(features.FeatureGateClusterAPIMachineManagementAWS) {
+            p.Management = types.ClusterAPI
+        }
         if p.Platform.AWS == nil && platform.AWS.DefaultMachinePlatform != nil {
             p.Platform.AWS = &aws.MachinePool{}
         }
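The net effect of this hunk can be modelled in isolation. The following is a minimal sketch, not the installer's code: `MachinePool`, `setManagementDefault`, and the plain `map[string]bool` standing in for the enabled feature gates are simplified, hypothetical stand-ins, and the role-name string values are assumptions. Only the gate name and the role/platform conditions are taken from the diff.

```go
package main

import "fmt"

// MachinePool is a simplified, hypothetical stand-in for types.MachinePool.
type MachinePool struct {
	Name       string
	Management string // empty means "unspecified"
}

const (
	computeRole      = "worker" // assumed value of types.MachinePoolComputeRoleName
	controlPlaneRole = "master" // assumed value of types.MachinePoolControlPlaneRoleName
	clusterAPI       = "ClusterAPI"
	awsGate          = "ClusterAPIMachineManagementAWS"
)

// setManagementDefault mirrors the condition added in the diff: only an AWS
// compute pool with unspecified management is defaulted to ClusterAPI, and
// only when the AWS-specific gate is enabled.
func setManagementDefault(p *MachinePool, platform string, enabled map[string]bool) {
	if platform != "aws" {
		return
	}
	if p.Management == "" && p.Name == computeRole && enabled[awsGate] {
		p.Management = clusterAPI
	}
}

func main() {
	gates := map[string]bool{awsGate: true}
	pools := []struct {
		platform string
		pool     *MachinePool
	}{
		{"aws", &MachinePool{Name: computeRole}},
		{"aws", &MachinePool{Name: controlPlaneRole}},
		{"gcp", &MachinePool{Name: computeRole}},
		{"aws", &MachinePool{Name: computeRole, Management: "MachineAPI"}},
	}
	for _, tc := range pools {
		setManagementDefault(tc.pool, tc.platform, gates)
		fmt.Printf("%s/%s -> %q\n", tc.platform, tc.pool.Name, tc.pool.Management)
	}
}
```

Control plane pools and non-AWS platforms are left untouched in this model, which matches the removal of the two platform-independent gate checks.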
53 changes: 25 additions & 28 deletions pkg/types/defaults/machinepools_test.go
@@ -7,7 +7,8 @@ import (

     configv1 "github.com/openshift/api/config/v1"
     "github.com/openshift/installer/pkg/types"
-    "github.com/openshift/installer/pkg/types/gcp"
+    "github.com/openshift/installer/pkg/types/aws"
+    gcp "github.com/openshift/installer/pkg/types/gcp"
 )

 func defaultMachinePoolWithReplicaCount(name string, replicaCount int) *types.MachinePool {
@@ -144,61 +145,57 @@ func TestSetMahcinePoolDefaults(t *testing.T) {
 }

 func TestSetMachinePoolDefaultsWithFeatureGates(t *testing.T) {
+    awsPlatform := &types.Platform{AWS: &aws.Platform{}}
+
     cases := []struct {
         name string
         pool *types.MachinePool
         platform *types.Platform
-        featureSet configv1.FeatureSet
+        featureGates []string
Comment on lines -151 to +154

Contributor

can we just stick with featureset?

Contributor Author

I don't think so: it's the wrong abstraction. We'd have to change all the tests every time we promoted anything.

Contributor

OK, this looks fine to me, although you might need the CustomNoUpgrade (CNU) feature set enabled?

Member

But we lose coverage for scenarios where users set featureSet to Default, TechPreviewNoUpgrade, or DevPreviewNoUpgrade in the install-config, right?

Contributor Author

Not if we're testing the resulting feature gates.

I haven't checked how installer loads feature gates from a feature set, but in the running cluster you have to read them from the FeatureGate object because they can change at runtime. You cannot trust the openshift/api vendored in your binary. I believe the CVO image writes them, which (IIRC) would make the CVO image the canonical source of the contents of a feature set. Does installer read them from the same place?

Member

Eh... 😅 I believe the installer reads the current state of feature gates from the vendored openshift/api. Every time there is a gate-related change, we re-vendor o/api.
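The trade-off in this thread can be illustrated with a small sketch. It assumes gate overrides are written as `Name=True` / `Name=False` strings, as in this PR's test cases; `parseGates` is a hypothetical helper, not the installer's `EnabledFeatureGates` implementation. A test that pins gates explicitly never consults the contents of a feature set, so it should keep passing when a gate is promoted from one feature set to another.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGates turns "Name=True" / "Name=False" entries into a lookup table.
// Hypothetical helper for illustration only.
func parseGates(entries []string) (map[string]bool, error) {
	enabled := make(map[string]bool, len(entries))
	for _, e := range entries {
		name, value, ok := strings.Cut(e, "=")
		if !ok || name == "" {
			return nil, fmt.Errorf("malformed feature gate %q, want Name=True|False", e)
		}
		on, err := strconv.ParseBool(value)
		if err != nil {
			return nil, fmt.Errorf("feature gate %q: %w", e, err)
		}
		enabled[name] = on
	}
	return enabled, nil
}

func main() {
	gates, err := parseGates([]string{"ClusterAPIMachineManagementAWS=True"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The behavior under test depends only on this lookup, not on which
	// feature set currently happens to include the gate.
	fmt.Println(gates["ClusterAPIMachineManagementAWS"])
}
```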

         expectedManagement types.MachineManagementAPI
     }{
         {
-            name: "control plane with DevPreviewNoUpgrade feature set",
-            pool: &types.MachinePool{Name: types.MachinePoolControlPlaneRoleName},
-            platform: &types.Platform{},
-            featureSet: configv1.DevPreviewNoUpgrade,
+            name: "AWS compute sets ClusterAPI management when ClusterAPIMachineManagementAWS enabled",
+            pool: &types.MachinePool{Name: types.MachinePoolComputeRoleName},
+            platform: awsPlatform,
+            featureGates: []string{"ClusterAPIMachineManagementAWS=True"},
             expectedManagement: types.ClusterAPI,
         },
         {
-            name: "control plane with default feature set",
-            pool: &types.MachinePool{Name: types.MachinePoolControlPlaneRoleName},
-            platform: &types.Platform{},
-            featureSet: configv1.Default,
+            name: "AWS compute management is unchanged when ClusterAPIMachineManagementAWS disabled",
+            pool: &types.MachinePool{Name: types.MachinePoolComputeRoleName},
+            platform: awsPlatform,
+            featureGates: []string{"ClusterAPIMachineManagementAWS=False"},
             expectedManagement: "",
         },
         {
-            name: "compute with DevPreviewNoUpgrade feature set",
-            pool: &types.MachinePool{Name: types.MachinePoolComputeRoleName},
-            platform: &types.Platform{},
-            featureSet: configv1.DevPreviewNoUpgrade,
-            expectedManagement: types.ClusterAPI,
+            name: "AWS control plane is unaffected by ClusterAPIMachineManagementAWS",
+            pool: &types.MachinePool{Name: types.MachinePoolControlPlaneRoleName},
+            platform: awsPlatform,
+            featureGates: []string{"ClusterAPIMachineManagementAWS=True"},
+            expectedManagement: "",
         },
         {
-            name: "compute with default feature set",
+            name: "non-AWS compute is unaffected by ClusterAPIMachineManagementAWS",
             pool: &types.MachinePool{Name: types.MachinePoolComputeRoleName},
             platform: &types.Platform{},

Member

nit: let's put a non-nil platform (e.g. GCP) to test "non-AWS" case?

-            featureSet: configv1.Default,
+            featureGates: []string{"ClusterAPIMachineManagementAWS=True"},
             expectedManagement: "",
         },
         {
-            name: "control plane with management already set",
-            pool: &types.MachinePool{Name: types.MachinePoolControlPlaneRoleName, Management: types.MachineAPI},
-            platform: &types.Platform{},
-            featureSet: configv1.DevPreviewNoUpgrade,
-            expectedManagement: types.MachineAPI,
-        },
-        {
-            name: "compute with management already set",
+            name: "AWS compute management is not overwritten when already set",
             pool: &types.MachinePool{Name: types.MachinePoolComputeRoleName, Management: types.MachineAPI},
-            platform: &types.Platform{},
-            featureSet: configv1.DevPreviewNoUpgrade,
+            platform: awsPlatform,
+            featureGates: []string{"ClusterAPIMachineManagementAWS=True"},
             expectedManagement: types.MachineAPI,
         },
     }

     for _, tc := range cases {
         t.Run(tc.name, func(t *testing.T) {
             config := &types.InstallConfig{
-                FeatureSet: tc.featureSet,
+                FeatureSet: configv1.CustomNoUpgrade,
+                FeatureGates: tc.featureGates,
             }
             SetMachinePoolDefaults(tc.pool, tc.platform, config.EnabledFeatureGates())
             assert.Equal(t, tc.expectedManagement, tc.pool.Management, "unexpected management API")
9 changes: 3 additions & 6 deletions pkg/types/validation/featuregates.go
@@ -42,12 +42,9 @@ func validateMachinePoolFeatureGates(c *types.InstallConfig) []featuregates.Gate
             Field: field.NewPath("osImageStream"),
         },
         {
-            FeatureGateName: features.FeatureGateClusterAPIControlPlaneInstall,

Member

We should also remove the 2 unit tests, following this change:

name: "Control Plane CAPI machine management is allowed with DevPreviewNoUpgrade Feature Set",

name: "Control Plane CAPI machine management is not allowed with Default Feature Set",

-            Condition: c.ControlPlane != nil && c.ControlPlane.Management == types.ClusterAPI,

Member

With this check removed, the controlPlane.management field can be set to ClusterAPI freely, but doing so is a no-op.

We should add a validation to block it for the moment, WDYT?

case types.MachinePoolEdgeRoleName:
allErrs = append(allErrs, validateComputeEdge(platform, p.Name, poolFldPath, poolFldPath)...)
if p.Management == types.ClusterAPI {
allErrs = append(allErrs, field.Invalid(poolFldPath.Child("management"), p.Management, "edge compute pools cannot be managed by Cluster API"))
}

-            Field: field.NewPath("controlPlane", "management"),
-        },
-        {
-            FeatureGateName: features.FeatureGateClusterAPIComputeInstall,
+            // Note that this should use a platform-specific feature gate, but
+            // there is no way to express that here.
+            FeatureGateName: features.FeatureGateClusterAPIMachineManagement,
Comment on lines +45 to +47


🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

Use the AWS-specific gate in validation, not the generic machine-management gate.

pkg/types/validation/installconfig.go consumes this mapping purely by FeatureGateName, so switching this entry to FeatureGateClusterAPIMachineManagement broadens which configs are admitted/rejected. That no longer matches the AWS-only behavior described for this change and can make compute.management validation diverge from the actual platform support. If the current helper cannot express platform-specific gating, that limitation needs fixing instead of weakening the contract here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/types/validation/featuregates.go` around lines 45 - 47, The validation
mapping is using the generic machine-management feature gate instead of the
AWS-specific one, which broadens install config acceptance in `installconfig.go`
through `FeatureGateName`. Update the `featuregates.go` entry to reference the
AWS-specific gate used for `compute.management`, and keep the platform-specific
behavior aligned with the AWS-only contract rather than falling back to
`FeatureGateClusterAPIMachineManagement`. If the helper cannot represent
platform-specific gating, fix that abstraction instead of changing this mapping.

Contributor

@mdbooth mdbooth Jun 29, 2026

Contributor Author

I was trying to work out how to express this on Friday and didn't quite get there. I wanted to express:

  • On all platforms except [the current set which support this], this field may not be set
  • On [the current set which support this], the field is guarded by [platform-specific feature gate]

The best I had was a top-level if platform.AWS != nil && platform.Azure != nil && platform.GCP != nil && platform.OpenStack != nil ... <reject>, and separate entries in each of their per-platform validations. Or perhaps just pre-emptive entries in every per-platform validation with hardcoded reject on unsupported platforms. The latter felt a bit cleaner. Preference, or a better idea?

Member

How about this idea: tthvo@28105d1 🤔?

The commit added a validateMachineManagement where you can define the supported platform for management: ClusterAPI. To add a new platform, follow 2 steps:

  • Add the platform name as a new case block (pkg/types/validation/installconfig.go)
  • Guard the management: ClusterAPI behind feature gate (pkg/types/{platform}/validation/featuregates.go)

This would achieve your 2 requirements above, WDYT?
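The two steps above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in the referenced commit: the function name `validateMachineManagement` comes from the comment, but its signature, the string-typed platform and management values, the `map[string]bool` gate lookup, and the error messages are all hypothetical. The per-platform gate check is also inlined here for brevity rather than living in a per-platform `featuregates.go`.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	clusterAPI = "ClusterAPI"
	awsGate    = "ClusterAPIMachineManagementAWS"
)

// validateMachineManagement rejects management: ClusterAPI on platforms that
// have no case block, and guards it behind a platform-specific gate on
// platforms that do. Hypothetical signature, for illustration only.
func validateMachineManagement(platform, management string, enabled map[string]bool) error {
	if management != clusterAPI {
		return nil // nothing to validate for other management values
	}
	switch platform {
	case "aws":
		// Supported platform: the field is allowed only with the AWS gate.
		if !enabled[awsGate] {
			return errors.New("management: ClusterAPI requires feature gate " + awsGate)
		}
		return nil
	default:
		// Unsupported platform: the field may not be set at all.
		return fmt.Errorf("management: ClusterAPI is not supported on platform %q", platform)
	}
}

func main() {
	gates := map[string]bool{awsGate: false}
	fmt.Println(validateMachineManagement("aws", clusterAPI, gates))
	fmt.Println(validateMachineManagement("gcp", clusterAPI, gates))
	fmt.Println(validateMachineManagement("gcp", "MachineAPI", gates))
}
```

Supporting another platform in this model means adding one `case` with its own gate, which keeps the "reject everywhere else" rule in a single `default` branch.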

             Condition: len(c.Compute) > 0 && c.Compute[0].Management == types.ClusterAPI,
             Field: field.NewPath("compute", "management"),
Comment on lines 48 to 49


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Point the validation error at the indexed compute pool.

compute is a list in install-config, so field.NewPath("compute", "management") does not map to the structure users edit. If this check is for the first pool, the field path should be compute[0].management (or the matched index if you generalize the scan), otherwise the error is harder to act on. As per path instructions, "Verify error field paths match the YAML/JSON structure users provide."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/types/validation/featuregates.go` around lines 48 - 49, The validation
error path in the feature gate check is pointing at the list field instead of
the actual compute pool entry, so update the field path used in the feature
validation logic to target the indexed pool. In the validation code around the
compute management check, adjust the `field.NewPath(...)` call so it references
`compute[0].management` (or the matching index if the logic is generalized),
keeping the check in sync with the `c.Compute` slice access.

Source: Path instructions
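A stdlib-only sketch of the generalized scan mentioned above, assuming a simplified, hypothetical `MachinePool` type and helper name. It reports one path per offending pool in the `compute[i].management` form that matches the install-config list structure. With the apimachinery `field` package, the equivalent path would be built as `field.NewPath("compute").Index(i).Child("management")`.

```go
package main

import "fmt"

// MachinePool is a simplified, hypothetical stand-in for types.MachinePool.
type MachinePool struct {
	Name       string
	Management string
}

// clusterAPIManagementPaths returns the indexed field path of every compute
// pool that sets management to ClusterAPI, instead of checking only the
// first pool.
func clusterAPIManagementPaths(compute []MachinePool) []string {
	var paths []string
	for i, p := range compute {
		if p.Management == "ClusterAPI" {
			paths = append(paths, fmt.Sprintf("compute[%d].management", i))
		}
	}
	return paths
}

func main() {
	pools := []MachinePool{
		{Name: "worker", Management: "MachineAPI"},
		{Name: "edge", Management: "ClusterAPI"},
	}
	for _, path := range clusterAPIManagementPaths(pools) {
		fmt.Println(path)
	}
}
```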

         },