Commit 00d4866

Merge pull request #5898 from yongruilin/dv-kep-shadow
KEP-5073: Introduce DeclarativeValidationBeta and deprecate Takeover gate
2 parents 47348c1 + 9048615 commit 00d4866

1 file changed

Lines changed: 45 additions & 30 deletions

keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md

@@ -17,8 +17,9 @@
1717
- [New Validations Vs Migrating Validations](#new-validations-vs-migrating-validations)
1818
- [New Validation Tests](#new-validation-tests)
1919
- [Ensuring Validation Equivalence With Testing](#ensuring-validation-equivalence-with-testing)
20-
- [Introduce Feature Gates: <code>DeclarativeValidation</code> &amp; <code>DeclarativeValidationTakeover</code>](#introduce-feature-gates-declarativevalidation--declarativevalidationtakeover)
21-
- [<code>DeclarativeValidation</code> &amp; <code>DeclarativeValidationTakeover</code> Will Target Beta From The Beginning](#declarativevalidation--declarativevalidationtakeover-will-target-beta-from-the-beginning)
20+
- [Introduce Feature Gates: <code>DeclarativeValidation</code>, <code>DeclarativeValidationTakeover</code>, &amp; <code>DeclarativeValidationBeta</code>](#introduce-feature-gates-declarativevalidation-declarativevalidationtakeover--declarativevalidationbeta)
21+
- [<code>DeclarativeValidation</code> &amp; <code>DeclarativeValidationBeta</code> Will Target Beta From The Beginning](#declarativevalidation--declarativevalidationbeta-will-target-beta-from-the-beginning)
22+
- [Execution &amp; Authority Logic](#execution--authority-logic)
2223
- [Feature Gate Graduation Criteria](#feature-gate-graduation-criteria)
2324
- [<code>DeclarativeValidation</code> Feature Gate Beta to GA Graduation Criteria](#declarativevalidation-feature-gate-beta-to-ga-graduation-criteria)
2425
- [Linter](#linter)
@@ -236,18 +237,18 @@ The strategic goal of Declarative Validation (DV) is to **meaningfully reduce th
236237

237238
### The "Implicit Shadow" Reality
238239

239-
Today, the system is in a "hybrid" state. Any field using standard +k8s: tags (without the +k8s:declarativeValidationNative marker) is **implicitly shadowed** because Takeover defaults to false. Mismatches are recorded, but errors are suppressed to prevent duplication.
240+
Today, the system is in a "hybrid" state. Any field using standard +k8s: tags (without the +k8s:declarativeValidationNative marker) is **implicitly shadowed** because `DeclarativeValidationBeta` defaults to false for legacy fields (or they remain in implicit mode). Mismatches are recorded, but errors are suppressed to prevent duplication.
240241

241242
### Problem: Global vs. Local Control
242243

243-
The Takeover gate is a global "all-or-nothing" switch. Graduation is blocked because we cannot force every migrated field in the cluster to become "Authoritative" simultaneously.
244+
The `DeclarativeValidationBeta` gate is a global "all-or-nothing" switch for rules in the Beta stage. Graduation is blocked because we cannot force every migrated field in the cluster to become "Authoritative" simultaneously without a granular lifecycle.
244245

245246
### Solution: Lifecycle Tags
246247

247248
We adopt a standard Alpha/Beta/GA lifecycle for validation rules, controlled by tag prefixes:
248249

249250
* **+k8s:alpha**: Shadow mode (Metrics only).
250-
* **+k8s:beta**: Enforced by default, but disable-able via the global DeclarativeValidationTakeover gate.
251+
* **+k8s:beta**: Enforced by default, but disable-able via the global `DeclarativeValidationBeta` gate.
251252
* **no prefix tag**: Permanently enforced.
252253

253254
Declarative validation will benefit Kubernetes maintainers:
@@ -308,8 +309,8 @@ Please feel free to try out the [prototype](https://github.com/jpbetz/kubernetes
308309
* Introduce new validation tests, test framework and migration test utilities
309310
* No field can go through migration without a robust test for that field and maintainer review scrutiny proving that it is validated correctly both before and after the change.
310311
* Create migration test pattern and utilities which support testing equivalence between hand-written validation and declarative validation (de-risks migration problems)
311-
* Introduce featuregate: `DeclarativeValidation` and `DeclarativeValidationTakeover`
312-
* Combined allow for safety mechanism in case a mistake is made so that we can safely compare validation errors but have the handwritten validations still be authoritative along the request path. Additionally users can turn off Declarative Validation and get back to a healthy validation state if necessary. (de-risks migration problems)
312+
* Introduce featuregate: `DeclarativeValidation` and `DeclarativeValidationBeta`
313+
* Combined, these gates provide a safety mechanism in case a mistake is made: validation errors can be compared safely while the handwritten validations remain authoritative along the request path. Additionally, users can turn off Declarative Validation, or disable newly enforced rules via the Beta gate, to get back to a healthy validation state if necessary. (de-risks migration problems)
313314
* Introduce runtime verification testing which emits:
314315
* `declarative_validation_mismatch_total` metric allowing for tests and users to identify any mismatching validation logic between hand-written and declarative validations.
315316
* `declarative_validation_panic_total` metric which counts the number of panics (recovered) that occur in declarative validation code as an extra precaution.
@@ -359,20 +360,34 @@ For testing the migration and ensuring that the validation is identical across c
359360

360361
Verifying that a field/type that is migrated is appropriately tested with proper changes to validation_test.go, equivalence testing, etc. will be human-driven enforced in PR review for the related community migration PR.
361362

362-
Additionally, to aid in ensuring that the validation is identical across current hand-written validation and declarative validations, we will create a runtime check controlled by the `DeclarativeValidation` and `DeclarativeValidationTakeover` feature gates. When `DeclarativeValidation` is enabled, both hand-written and declarative validation will be run. Any mismatches will be logged and a `declarative_validation_mismatch_total` metric will be incremented. The `DeclarativeValidationTakeover` gate controls which result (imperative or declarative) is returned to the user.
363-
### Introduce Feature Gates: `DeclarativeValidation` & `DeclarativeValidationTakeover`
363+
Additionally, to aid in ensuring that the validation is identical across current hand-written validation and declarative validations, we will create a runtime check controlled by the `DeclarativeValidation` and `DeclarativeValidationBeta` feature gates. When `DeclarativeValidation` is enabled, both hand-written and declarative validation will be run. Any mismatches will be logged and a `declarative_validation_mismatch_total` metric will be incremented. The `DeclarativeValidationBeta` gate controls which result (imperative or declarative) is returned to the user for Beta-stage rules.
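
The runtime check described above can be sketched roughly as follows. This is a minimal illustration, not the apiserver implementation: `Resolve`, its parameters, the use of plain strings for validation errors, and the order-insensitive comparison are all assumptions made for the example. In the real system a mismatch would also increment `declarative_validation_mismatch_total` and be logged.

```go
package main

import (
	"fmt"
	"sort"
)

// sameErrors reports whether two error lists hold the same messages.
// Ignoring order is an assumption of this sketch.
func sameErrors(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

// Resolve is a hypothetical helper. It takes the errors produced by the
// hand-written (imperative) and declarative validators, reports whether
// they disagree, and returns the set that is authoritative for the caller.
func Resolve(imperative, declarative []string, declarativeAuthoritative bool) ([]string, bool) {
	mismatch := !sameErrors(imperative, declarative)
	if declarativeAuthoritative {
		return declarative, mismatch
	}
	return imperative, mismatch
}

func main() {
	imperative := []string{"spec.replicas: must be >= 1"}
	declarative := []string{"spec.replicas: must be greater than or equal to 1"}

	errs, mismatch := Resolve(imperative, declarative, false)
	if mismatch {
		// Stand-in for the log line and metric increment.
		fmt.Println("mismatch between imperative and declarative validation")
	}
	fmt.Println("returned to user:", errs)
}
```

Returning the mismatch flag instead of mutating a global counter keeps the sketch easy to test; a real implementation would wire this into the metrics registry.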
364+
### Introduce Feature Gates: `DeclarativeValidation`, `DeclarativeValidationTakeover`, & `DeclarativeValidationBeta`
364365

365-
Two feature gates were introduced in v1.33 to manage the rollout:
366+
Three feature gates are involved in the rollout, reflecting the transition to the lifecycle model:
366367

367-
* **`DeclarativeValidation`**: This gate controls whether declarative validation is *enabled* for a given resource or field. When enabled, both imperative (hand-written) and declarative validation will run. The results will be compared, and any mismatches will be logged and reported via metrics (see `DeclarativeValidationTakeover` below). The imperative validation result will be returned to the user. When disabled, only imperative validation runs.
368+
* **`DeclarativeValidation`**: This gate controls whether declarative validation is *enabled* for a given resource or field. When enabled, both imperative (hand-written) and declarative validation will run. The results will be compared, and any mismatches will be logged and reported via metrics (see `DeclarativeValidationBeta` below). The imperative validation result will be returned to the user. When disabled, only imperative validation runs.
368369

369-
* **`DeclarativeValidationTakeover`**: The DeclarativeValidationTakeover feature gate is retained as the Global Safety Switch for Beta-stage validation rules. It allows cluster admins to disable "newly enforced" validations if regressions are found, forcing them back to Shadow mode (handwritten fallback). When `DeclarativeValidationTakeover` is enabled (default for Beta), Beta tags are Enforced. When disabled, Beta tags are Shadowed. `DeclarativeValidationTakeover` has *no effect* if `DeclarativeValidation` is disabled.
370+
* **`DeclarativeValidationTakeover`**: Deprecated in v1.36. Previously determined whether declarative validation results were authoritative. As `DeclarativeValidationTakeover` does not change user-visible behavior (it is internal to validation mechanics), we do not intend to honor it in v1.36. Users of an emulation version will be allowed to set the gate (e.g., when upgrading a cluster that had it set), but it will have no impact on implementation. This prevents upgrade failures (e.g., "gate not recognized") while immediately moving behavior to the new lifecycle model.
370371

371-
#### `DeclarativeValidation` & `DeclarativeValidationTakeover` Will Target Beta From The Beginning
372+
* **`DeclarativeValidationBeta`**: Introduced in v1.36. This feature gate acts as the Global Safety Switch for Beta-stage validation rules. It allows cluster admins to disable "newly enforced" validations if regressions are found, forcing them back to Shadow mode. When `DeclarativeValidationBeta` is enabled (default for Beta), Beta tags are Enforced. When disabled, Beta tags are Shadowed. `DeclarativeValidationBeta` has *no effect* if `DeclarativeValidation` is disabled.
373+
374+
#### `DeclarativeValidation` & `DeclarativeValidationBeta` Will Target Beta From The Beginning
372375

373376
Declarative Validation will target the Beta stage from the beginning (vs Alpha). Additionally, `DeclarativeValidation` is targeting Beta with `default:true`. This is because Declarative Validation is not new functionality, but an alternative implementation of validation, and users should not be able to perceive any changes when swapping hand-written validation with identical declarative validation. The feature gate, `DeclarativeValidation`, exists as a safety mechanism in case a mistake is made so that users can turn it off and get back to safety. There is prior art for this rationale where other feature gates did not target Alpha as they were not related to new functionality (changing underlying behavior, bugfix, etc.). An example of this is the current feature gate `AllowParsingUserUIDFromCertAuth`, which was introduced in Beta as `default:true` as it is not a net new feature but fixes a current issue ([PR](https://github.com/kubernetes/kubernetes/pull/127897), [feature gate](https://github.com/kubernetes/kubernetes/blob/master/pkg/features/versioned_kube_features.go#L228-L230)).
374377

375-
`DeclarativeValidationTakeover` will default to `false` initially in Beta. This way during the initial rollout we can "soak" and verify that the errors produced for a replaced validation rule (handwritten -> declarative) are identical. Over time the goal is to flip `DeclarativeValidationTakeover` to be default `true` such that for fields where declarative validation rules exist, they are used as the authoritative validation rule.
378+
`DeclarativeValidationBeta` will default to `true` initially. This is because we want to enable the beta validations by default, but provide a safety switch to disable them if necessary.
379+
380+
#### Execution & Authority Logic
381+
382+
Execution and authority are determined by two feature gates (`DeclarativeValidation`, `DeclarativeValidationBeta`) and the resource strategy (`WithDeclarativeEnforcement()`).
383+
384+
1. **Execution (Is it running?)**:
385+
    1. DV runs if `DeclarativeValidation` is Enabled OR the resource uses `WithDeclarativeEnforcement()`.
386+
    2. Note: Using the strategy ensures New APIs run even if the main gate is disabled.
387+
2. **Enforcement**:
388+
    1. **Standard Tags:** Always Enforced (Bypasses Beta Gate).
389+
    2. **Beta Tags:** Enforced if `DeclarativeValidationBeta` is Enabled. Otherwise Shadowed.
390+
    3. **Alpha Tags:** Always Shadowed.
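
The execution and enforcement rules listed above can be read as a small decision function. The sketch below follows that list literally; the `Config`, `Stage`, `Mode`, and `Decide` names are hypothetical and do not come from the Kubernetes code base. It deliberately ignores the "implicitly shadowed" case described elsewhere in this KEP for standard tags on resources that have not adopted `WithDeclarativeEnforcement()`.

```go
package main

import "fmt"

// Stage is the lifecycle stage implied by a rule's tag prefix.
type Stage int

const (
	Alpha    Stage = iota // rule carries the +k8s:alpha prefix
	Beta                  // rule carries the +k8s:beta prefix
	Standard              // rule has no lifecycle prefix
)

// Mode describes what happens to a declarative rule's result.
type Mode string

const (
	NotRun   Mode = "not run"
	Shadowed Mode = "shadowed"
	Enforced Mode = "enforced"
)

// Config holds the inputs named in the KEP text (illustrative field names).
type Config struct {
	DeclarativeValidation     bool // DeclarativeValidation feature gate
	DeclarativeValidationBeta bool // DeclarativeValidationBeta feature gate
	DeclarativeEnforcement    bool // resource strategy uses WithDeclarativeEnforcement()
}

// Decide applies the execution rule first, then the enforcement rule.
func Decide(c Config, s Stage) Mode {
	// Execution: DV runs if the gate is on OR the strategy opts in.
	if !c.DeclarativeValidation && !c.DeclarativeEnforcement {
		return NotRun
	}
	switch s {
	case Standard:
		return Enforced // bypasses the Beta gate
	case Beta:
		if c.DeclarativeValidationBeta {
			return Enforced
		}
		return Shadowed
	default:
		return Shadowed // Alpha rules are metrics-only
	}
}

func main() {
	c := Config{DeclarativeValidation: true, DeclarativeValidationBeta: false}
	for _, s := range []Stage{Alpha, Beta, Standard} {
		fmt.Printf("stage=%d mode=%s\n", s, Decide(c, s))
	}
}
```

Note that `DeclarativeValidationTakeover` does not appear as an input at all, which matches the statement that the deprecated gate can still be set but has no effect.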
376391

377392
#### Feature Gate Graduation Criteria
378393

@@ -432,7 +447,7 @@ Our goal is to standardize on the Explicit Strategy and Lifecycle mechanism. The
432447

433448
#### Case: New APIs
434449

435-
- **Action: **Adopt `WithDeclarativeEnforcement()`. Use Standard tags.
450+
- **Action:** Adopt `WithDeclarativeEnforcement()`. Use Standard tags.
436451
- **Result:** Enforced.
437452

438453
#### Case: Legacy APIs
@@ -443,22 +458,22 @@ Our goal is to standardize on the Explicit Strategy and Lifecycle mechanism. The
443458
#### Case: Adding Validation to Existing Fields (Implicit)
444459

445460
- **Action:** Use standard tags (`+k8s:minimum=1`).
446-
- **Result: **Implicitly shadowed.
461+
- **Result:** Implicitly shadowed.
447462

448463
### Phase 2: v1.37 - v1.38 (Transition to Beta)
449464

450465
- **Action:** Legacy APIs adopt `WithDeclarativeEnforcement()`.
451466
- **Tag Updates:**
452-
- **Verified fields (Legacy): **Convert standard tags to `+k8s:beta`.
467+
- **Verified fields (Legacy):** Convert standard tags to `+k8s:beta`.
453468
- **Reasoning:** These fields have soaked. We are ready to enforce, but want the safety switch.
454469
- **New Rules:** Use `+k8s:alpha`.
455470
- **Runtime Effect:**
456-
- Takeover ON: Beta tags are Enforced.
457-
- Takeover OFF: Beta tags are Shadowed.
471+
- `DeclarativeValidationBeta` ON: Beta tags are Enforced.
472+
- `DeclarativeValidationBeta` OFF: Beta tags are Shadowed.
458473

459474
### Phase 3: v1.39+ (Gate Removal & GA)
460475

461-
- **Trigger:** `DeclarativeValidation` feature gate is removed. `DeclarativeValidationTakeover` gate remains as the Beta toggle.
476+
- **Trigger:** `DeclarativeValidation` feature gate is removed. `DeclarativeValidationBeta` gate remains as the Beta toggle.
462477
- Start promotion to GA
463478
- Action: Remove `+k8s:beta` prefix and delete handwritten code.
464479
- Result: Permanently Enforced.
@@ -711,7 +726,7 @@ Requests are received as the versioned type, so it should be feasible to avoid e
711726
* Test fixture
712727
* Linter
713728
* Documentation generator
714-
* Feature gates - `DeclarativeValidation`& `DeclarativeValidationTakeover`
729+
* Feature gates - `DeclarativeValidation`, `DeclarativeValidationTakeover` (deprecated), & `DeclarativeValidationBeta`
715730
* Metrics - `declarative_validation_mismatch_total` & `declarative_validation_panic_total`
716731
* Testing
717732
* Equivalency tests (verifyVersionedValidationEquivalence in prototype)
@@ -975,7 +990,7 @@ We should be able to start the migration when:
975990
* Add/extend validators to enable further progress into non-trivial cases
976991
3. Using Schemas for Validation (Joint Effort):
977992
* Core Team:
978-
* Enable validation through generated schemas for migrated resources (controlled by DeclarativeValidation feature gate).
993+
* Enable validation through generated schemas for migrated resources (controlled by `DeclarativeValidation` and `DeclarativeValidationBeta` feature gates).
979994
* Implement logic to populate default values from schemas.
980995
* Community:
981996
* Run E2E tests with declarative validation enabled.
@@ -1735,13 +1750,13 @@ When the `DeclarativeValidation` feature gate is enabled, both imperative and de
17351750
If the errors do not match, a 'declarative_validation_mismatch_total' metric will be incremented and information
17361751
about the mismatch will be written to the apiserver's logs.
17371752
1738-
The `DeclarativeValidationTakeover` feature gate controls *which* set of validation errors (imperative or declarative) are returned to the user. When `DeclarativeValidationTakeover` is true, the declarative errors are returned; otherwise, the imperative errors are returned.
1753+
The `DeclarativeValidationBeta` feature gate controls *which* set of validation errors (imperative or declarative) are returned to the user. When `DeclarativeValidationBeta` is true, the declarative errors are returned; otherwise, the imperative errors are returned.
17391754
17401755
This can then be used to minimize risk when rolling out Declarative Validation in production, by following these steps:
1741-
- Enable `DeclarativeValidation` (with `DeclarativeValidationTakeover` *disabled*).
1756+
- Enable `DeclarativeValidation` (with `DeclarativeValidationBeta` *disabled*).
17421757
- Soak for a desired duration across some number of clusters.
17431758
- Check the metrics to ensure no mismatches have been found.
1744-
- Enable `DeclarativeValidationTakeover`.
1759+
- Enable `DeclarativeValidationBeta`.
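
For the "check the metrics" step in the rollout list above, one possible approach is to sum the mismatch counter from a scrape of the apiserver's metrics endpoint. The sketch below parses Prometheus text exposition format by hand; `SumCounter` is a hypothetical helper, the sample text is invented for illustration, and the exported metric name may carry a component prefix in a real cluster, so the name passed in is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// SumCounter adds up every sample of the named metric found in text
// written in the Prometheus exposition format.
func SumCounter(exposition, name string) (float64, error) {
	var total float64
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		// The sample name ends at the first '{' (labels) or space.
		end := strings.IndexAny(line, "{ ")
		if end < 0 || line[:end] != name {
			continue
		}
		rest := line[end:]
		if rest[0] == '{' {
			closing := strings.LastIndex(rest, "}")
			if closing < 0 {
				return 0, fmt.Errorf("unterminated labels: %q", line)
			}
			rest = rest[closing+1:]
		}
		fields := strings.Fields(rest) // value, then optional timestamp
		if len(fields) == 0 {
			return 0, fmt.Errorf("missing value: %q", line)
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return 0, err
		}
		total += v
	}
	return total, sc.Err()
}

// sample is made-up scrape output used only to exercise the parser.
const sample = `# TYPE declarative_validation_mismatch_total counter
declarative_validation_mismatch_total{resource="a"} 0
declarative_validation_panic_total 0
`

func main() {
	total, err := SumCounter(sample, "declarative_validation_mismatch_total")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if total == 0 {
		fmt.Println("no mismatches recorded; safe to consider enabling the Beta gate")
	} else {
		fmt.Println("mismatches recorded:", total)
	}
}
```

In practice a monitoring system would usually answer this question with a query over the soak period; the hand-rolled parser is only meant to show what is being checked.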
17451760
##### Integration tests
17461761
17471762
###### Migration Equivalency Tests
@@ -1887,7 +1902,7 @@ N/A. This change does not affect any communications going out of the apiserver.
18871902
1. Feature gate (also fill in values in `kep.yaml`)
18881903
* Feature gate name: DeclarativeValidation
18891904
* Components depending on the feature gate: kube-apiserver
1890-
* Feature gate name: DeclarativeValidationTakeover
1905+
* Feature gate name: DeclarativeValidationBeta
18911906
* Components depending on the feature gate: kube-apiserver
18921907
2. Other
18931908
* Describe the mechanism:
@@ -1899,7 +1914,7 @@ N/A. This change does not affect any communications going out of the apiserver.
18991914
* `DeclarativeValidation`
19001915
* Beta: Enables running both imperative and declarative validation. Mismatches are logged and reported via metrics. Imperative validation errors are returned to users.
19011916
* GA: Enables running both imperative and declarative validation. Mismatches are logged and reported via metrics. Imperative validation errors are returned to users.
1902-
* `DeclarativeValidationTakeover`
1917+
* `DeclarativeValidationBeta`
19031918
* Beta: When `DeclarativeValidation` is also enabled, returns declarative validation errors to users. Has no effect if `DeclarativeValidation` is disabled.
19041919
* GA: When `DeclarativeValidation` is also enabled, returns declarative validation errors to users. Has no effect if `DeclarativeValidation` is disabled.
19051920
@@ -2053,8 +2068,8 @@ If the API server is failing to meet SLOs (latency, validation error-rate, etc.)
20532068
* If the logs show repeated mismatches or errors for certain resource types, compare the declarative validation tags in `types.go` with the original hand-written logic to identify gaps or typos
20542069
* ^ Be sure to submit this information when filing an issue (see step 5)
20552070
4. **Compare Feature Gate Settings**
2056-
* Verify whether `DeclarativeValidation` is enabled for all API servers in an HA environment. Partial enablement can sometimes lead to inconsistent behavior or unexpected rejections.
2057-
* Temporarily disabling `DeclarativeValidation` can help isolate if new validation logic is the root cause. Bear in mind that rolling back may block updates on objects that were only valid under declarative validation rules if there is a bug related to this, so review “Can the feature be disabled once it has been enabled?” in this KEP in this case.
2071+
* Verify whether `DeclarativeValidation` and `DeclarativeValidationBeta` are enabled for all API servers in an HA environment. Partial enablement can sometimes lead to inconsistent behavior or unexpected rejections.
2072+
* Temporarily disabling `DeclarativeValidation` or `DeclarativeValidationBeta` can help isolate whether new validation logic is the root cause. Bear in mind that rolling back may block updates on objects that were only valid under declarative validation rules if a bug caused this; in that case, review “Can the feature be disabled once it has been enabled?” in this KEP.
20582073
5. **File or Triage Issues**
20592074
* If you confirm that Declarative Validation logic is producing incorrect results or performance regressions, open a GitHub issue in the kubernetes/kubernetes repository. Include:
20602075
* The exact failing resource object or field that triggers errors.
@@ -2067,7 +2082,7 @@ If the API server is failing to meet SLOs (latency, validation error-rate, etc.)
20672082
- v1.33: Initial Beta implementation of `DeclarativeValidation` and `DeclarativeValidationTakeover` gates.
20682083
- v1.34: Stability metrics collection began.
20692084
- v1.35: Dual implementation requirement enforced, tag/feature stability codified, validation library implemented.
2070-
- v1.36: Introduction of the Validation Lifecycle mechanism and Explicit Strategy. (Current)
2085+
- v1.36: Introduction of the Validation Lifecycle mechanism and Explicit Strategy. Introduction of `DeclarativeValidationBeta` and deprecation of `DeclarativeValidationTakeover`. (Current)
20712086
20722087
## Drawbacks
20732088
