OCPSTRAT-1679: enhancement: StorageClass KMS Key for AWS Hosted Control Planes#2039
OCPSTRAT-1679: enhancement: StorageClass KMS Key for AWS Hosted Control Planes#2039devguyio wants to merge 1 commit into
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ahmed Abdalla <aabdelre@redhat.com>
|
@devguyio: This pull request references OCPSTRAT-1679 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the feature to target either version "5.0." or "openshift-5.0.", but it targets "openshift-5.1" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@devguyio: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
|
||
| ## Motivation | ||
|
|
||
| ROSA classic clusters allow customers to configure KMS encryption for volumes created |
There was a problem hiding this comment.
can we elaborate how rosa classic leverage this semantic, and how rosa hcp is planning to do the same, e.g. is there any efforts we can link to expose this in ocm?
| cluster (day-2 operation) so that new PVCs use an updated or rotated KMS key without | ||
| requiring cluster recreation. | ||
|
|
||
| - As a self-managed HyperShift operator on AWS, I want to specify a KMS key for the |
There was a problem hiding this comment.
"hypershift" is a dev cli, "hcp" is product one. Let's remove any ambiguity so we get agents straight
|
|
||
| ### Goals | ||
|
|
||
| - Add `storageKMSKeyARN` to `AWSPlatformSpec` (shared by `HostedCluster` and |
There was a problem hiding this comment.
I think storageKMSKeyARN should belong to parent api struct that can potentially grow cohesively
| approvers: | ||
| - "@csrwng" | ||
| - "@enxebre" | ||
| api-approvers: |
There was a problem hiding this comment.
please let's include here @JoelSpeed @everettraven so we diversify knowledge
| title: storageclass-kms-key | ||
| authors: | ||
| - "@devguyio" | ||
| reviewers: |
There was a problem hiding this comment.
please let's include here managed services and storage representatives so we diversify knowledge
cc @joshbranham, @jsafrane
| reads this value and configures the default StorageClass with the KMS key ID. | ||
| 5. New EBS volumes provisioned via the StorageClass carry the KMS encryption. | ||
|
|
||
| When the field is cleared, the HCCO clears `DriverConfig.AWS.KMSKeyARN`, the CSO reverts |
There was a problem hiding this comment.
There needs to be a way to opt-out from reconciliation so in cluster changes UX is preserved, is this how you achieve that?
|
|
||
| 1. The HC controller mirrors the field to `HostedControlPlane.spec.platform.aws.storageKMSKeyARN`, | ||
| following the existing mirroring pattern for other `AWSPlatformSpec` fields. | ||
| 2. The HCCO validates the key by assuming the `StorageARN` role (already provisioned for |
There was a problem hiding this comment.
is 2 something that the storage does today? If not, should that be an epic for storage operator?
| the HC controller bubbles it to `HostedCluster` status. | ||
| 4. The HCCO storage reconciliation writes `ClusterCSIDriver.spec.driverConfig.aws.kmsKeyARN` | ||
| to the guest cluster via its guest cluster client. The cluster-storage-operator (CSO) | ||
| reads this value and configures the default StorageClass with the KMS key ID. |
There was a problem hiding this comment.
does introduces a race for stateful workloads using the default class? can we outline the risks of that?
| - **Per-StorageClass KMS key granularity.** This enhancement targets only the default | ||
| StorageClass managed by the cluster-storage-operator. The CSI operator's | ||
| `ClusterCSIDriver.DriverConfig` applies to the driver globally; per-StorageClass |
There was a problem hiding this comment.
Users can still create their own StorageClasses with their own keys, at least on standalone clusters. The enhancement affects only the default StorageClasses.
|
|
||
| ## Proposal | ||
|
|
||
| When a customer sets `spec.platform.aws.storageKMSKeyARN` on a `HostedCluster`, the |
There was a problem hiding this comment.
So far the pattern for operator configs is to place them under hc.spec.operatorConfiguration. Would this not be better under something like: hc.spec.operatorConfiguration.csiDrivers.ebs = {blah} ?
ebs may not be the only csi driver for the aws platform (afaik for aws there is at least efs and s3 as well)
There was a problem hiding this comment.
Echoing Cesar's (and Alberto's points above), this should probably have additional nesting instead of just a subfield in spec.platform.aws.
On the point of
nesting adds indirection
with no benefit. A nested struct can be introduced in a follow-on enhancement if
additional storage fields are needed.
below, this is not really true. APIs are hard to modify after creation especially in OCP, as we retain support for a major release's lifecycle, so we should consider future API additions in the design.
It's probably overkill but the direct equivalent would probably be something like hc.spec.operatorConfiguration.csiDriverConfig.aws.kmsKeyARN
|
|
||
| ## Summary | ||
|
|
||
| This enhancement adds an optional `storageKMSKeyARN` field to the AWS platform spec |
There was a problem hiding this comment.
will existing iam policies support encryption?
yuqi-zhang
left a comment
There was a problem hiding this comment.
Adding some comments from the openshift/api perspective
|
|
||
| ## Proposal | ||
|
|
||
| When a customer sets `spec.platform.aws.storageKMSKeyARN` on a `HostedCluster`, the |
There was a problem hiding this comment.
Echoing Cesar's (and Alberto's points above), this should probably have additional nesting instead of just a subfield in spec.platform.aws.
On the point of
nesting adds indirection
with no benefit. A nested struct can be introduced in a follow-on enhancement if
additional storage fields are needed.
below, this is not really true. APIs are hard to modify after creation especially in OCP, as we retain support for a major release's lifecycle, so we should consider future API additions in the design.
It's probably overkill but the direct equivalent would probably be something like hc.spec.operatorConfiguration.csiDriverConfig.aws.kmsKeyARN
| StorageKMSKeyARN string `json:"storageKMSKeyARN,omitempty"` | ||
| ``` | ||
|
|
||
| The field is validated with a CEL `XValidation` rule accepting both KMS key ARNs and |
There was a problem hiding this comment.
Let's add these following points directly to the validation above, i.e..
// +optional
// +kubebuilder:validation:MaxLength=2048
// +kubebuilder:validation:XValidation:rule="...",message="..."
StorageKMSKeyARN string `json:"storageKMSKeyARN,omitempty"`
| alias ARNs: | ||
|
|
||
| ```text | ||
| ^arn:(aws|aws-cn|aws-us-gov):kms:[a-z0-9-]+:[0-9]{12}:(key|alias)/[a-zA-Z0-9:/_-]+$ |
There was a problem hiding this comment.
I assume this is the hypershift equivalent of https://github.com/openshift/api/blob/master/operator/v1/types_csi_cluster_driver.go#L176, perhaps we can see if matching the validation and godocs there makes sense
|
|
||
| **Day-2 (key removal):** | ||
|
|
||
| 1. The administrator sets `storageKMSKeyARN` to empty string. |
There was a problem hiding this comment.
Technically conflicts with the field being optional and omitempty. The right user behaviour is deleting the field. Empty string should not be allowed since there's no difference between that and an unset field.
| // Updating this field does not re-encrypt existing volumes; only newly | ||
| // created PVCs use the new key. | ||
| // +optional | ||
| StorageKMSKeyARN string `json:"storageKMSKeyARN,omitempty"` |
There was a problem hiding this comment.
Not sure how far we got with featuregating in hypershift, but this might be a good candidate for one.
| 2. The HC controller updates the `HostedControlPlane` field; the HCCO re-validates and | ||
| updates `ClusterCSIDriver`. | ||
| 3. The CSO updates the default StorageClass. There is a brief period between CSO | ||
| reconciliation cycles where the StorageClass references neither the old nor the new |
There was a problem hiding this comment.
A bit confused on this point, how would it be neither? I thought it would just be using the old key until reconciliation, and there's no "gap" period.
| include the `alias/` ARN resource type and the broader alias name character set | ||
| (`:`, `/`, `_` are valid in alias names). | ||
|
|
||
| #### New condition type: `ValidAWSStorageKMSConfig` |
There was a problem hiding this comment.
I'm not familiar with general hypershift status/conditions, but just thinking from a user perspective, let's say I updated my key, I don't think there's a easy way to see if/when that gets applied just by looking at the conditions, since it doesn't actually tell me what the last success was.
Would there be an equivalent status field we could populate for the last successful key reconciliation?
Add enhancement proposal for customer-controlled default encryption of
guest-cluster EBS volumes using a specified KMS key in HyperShift AWS
hosted control planes.
This closes a parity gap between ROSA classic and ROSA HCP for storage
encryption by adding
storageKMSKeyARNtoAWSPlatformSpec, propagatingit through the HC controller → HCCO →
ClusterCSIDriverchain, andexposing a
--storage-volumes-kms-keyCLI flag.Tracking: OCPSTRAT-1679
Related: openshift/enhancements#1163
cc @csrwng @muraee @celebdor @enxebre