keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md
Lines changed: 45 additions & 30 deletions
Original file line number
Diff line number
Diff line change
@@ -17,8 +17,9 @@
17
17
- [New Validations Vs Migrating Validations](#new-validations-vs-migrating-validations)
18
18
- [New Validation Tests](#new-validation-tests)
19
19
- [Ensuring Validation Equivalence With Testing](#ensuring-validation-equivalence-with-testing)
20
-
- [Introduce Feature Gates: <code>DeclarativeValidation</code> & <code>DeclarativeValidationTakeover</code>](#introduce-feature-gates-declarativevalidation--declarativevalidationtakeover)
21
-
- [<code>DeclarativeValidation</code> & <code>DeclarativeValidationTakeover</code> Will Target Beta From The Beginning](#declarativevalidation--declarativevalidationtakeover-will-target-beta-from-the-beginning)
20
+
- [Introduce Feature Gates: <code>DeclarativeValidation</code>, <code>DeclarativeValidationTakeover</code>, & <code>DeclarativeValidationBeta</code>](#introduce-feature-gates-declarativevalidation-declarativevalidationtakeover--declarativevalidationbeta)
21
+
- [<code>DeclarativeValidation</code> & <code>DeclarativeValidationBeta</code> Will Target Beta From The Beginning](#declarativevalidation--declarativevalidationbeta-will-target-beta-from-the-beginning)
22
+
- [Execution & Authority Logic](#execution--authority-logic)
- [<code>DeclarativeValidation</code> Feature Gate Beta to GA Graduation Criteria](#declarativevalidation-feature-gate-beta-to-ga-graduation-criteria)
24
25
- [Linter](#linter)
@@ -236,18 +237,18 @@ The strategic goal of Declarative Validation (DV) is to **meaningfully reduce th
236
237
237
238
### The "Implicit Shadow" Reality
238
239
239
-
Today, the system is in a "hybrid" state. Any field using standard +k8s: tags (without the +k8s:declarativeValidationNative marker) is **implicitly shadowed** because Takeover defaults to false. Mismatches are recorded, but errors are suppressed to prevent duplication.
240
+
Today, the system is in a "hybrid" state. Any field using standard +k8s: tags (without the +k8s:declarativeValidationNative marker) is **implicitly shadowed** because `DeclarativeValidationBeta` defaults to false for legacy fields (or they remain in implicit mode). Mismatches are recorded, but errors are suppressed to prevent duplication.
240
241
241
242
### Problem: Global vs. Local Control
242
243
243
-
The Takeover gate is a global "all-or-nothing" switch. Graduation is blocked because we cannot force every migrated field in the cluster to become "Authoritative" simultaneously.
244
+
The `DeclarativeValidationBeta` gate is a global "all-or-nothing" switch for rules in the Beta stage. Graduation is blocked because we cannot force every migrated field in the cluster to become "Authoritative" simultaneously without a granular lifecycle.
244
245
245
246
### Solution: Lifecycle Tags
246
247
247
248
We adopt a standard Alpha/Beta/GA lifecycle for validation rules, controlled by tag prefixes:
248
249
249
250
* **+k8s:alpha**: Shadow mode (metrics only).
250
-
* **+k8s:beta**: Enforced by default, but disable-able via the global DeclarativeValidationTakeover gate.
251
+
* **+k8s:beta**: Enforced by default, but can be disabled via the global `DeclarativeValidationBeta` gate.
251
252
* **no prefix tag**: Permanently enforced.
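The lifecycle above can be read as a small decision table. The following Go sketch is illustrative only: the `Stage`, `Mode`, and `ModeFor` names are hypothetical and are not taken from the KEP or from Kubernetes code, and the sketch assumes declarative validation is already running for the field (it does not model the `DeclarativeValidation` gate).

```go
package main

import "fmt"

// Stage is a hypothetical model of a validation rule's lifecycle stage.
type Stage int

const (
	Alpha Stage = iota // tag prefixed with +k8s:alpha
	Beta               // tag prefixed with +k8s:beta
	GA                 // tag with no lifecycle prefix
)

// Mode describes how a rule's result is treated.
type Mode string

const (
	Shadow   Mode = "shadow"   // mismatches are recorded as metrics; errors are not returned
	Enforced Mode = "enforced" // declarative errors are returned to the user
)

// ModeFor maps a stage and the state of the DeclarativeValidationBeta gate
// to a mode, following the three bullets in the text.
func ModeFor(stage Stage, betaGateEnabled bool) Mode {
	switch stage {
	case Alpha:
		return Shadow
	case Beta:
		if betaGateEnabled {
			return Enforced
		}
		return Shadow
	default: // GA: no prefix, permanently enforced
		return Enforced
	}
}

func main() {
	for _, gate := range []bool{true, false} {
		fmt.Printf("beta gate=%v: alpha=%s beta=%s ga=%s\n",
			gate, ModeFor(Alpha, gate), ModeFor(Beta, gate), ModeFor(GA, gate))
	}
}
```

Only the Beta stage reacts to the gate in this model, which is what makes the gate usable as a safety switch without affecting Alpha or GA rules.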
252
253
253
254
Declarative validation will benefit Kubernetes maintainers:
@@ -308,8 +309,8 @@ Please feel free to try out the [prototype](https://github.com/jpbetz/kubernetes
308
309
* Introduce new validation tests, test framework and migration test utilities
309
310
* No field can go through migration without a robust test for the field in question and maintainer review scrutiny, which together prove that it is validated correctly both before and after the change.
310
311
* Create migration test pattern and utilities which support testing equivalence between hand-written validation and declarative validation (de-risks migration problems)
311
-
* Introduce featuregate: `DeclarativeValidation` and `DeclarativeValidationTakeover`
312
-
* Combined allow for safety mechanism in case a mistake is made so that we can safely compare validation errors but have the handwritten validations still be authoritative along the request path. Additionally users can turn off Declarative Validation and get back to a healthy validation state if necessary. (de-risks migration problems)
312
+
* Introduce feature gates: `DeclarativeValidation` and `DeclarativeValidationBeta`
313
+
* Combined, these gates provide a safety mechanism in case a mistake is made: we can safely compare validation errors while the handwritten validations remain authoritative along the request path. Additionally, users can turn off Declarative Validation, or disable newly enforced rules via the Beta gate, and get back to a healthy validation state if necessary. (de-risks migration problems)
313
314
* Introduce runtime verification testing which emits:
314
315
* `declarative_validation_mismatch_total` metric allowing for tests and users to identify any mismatching validation logic between hand-written and declarative validations.
315
316
* `declarative_validation_panic_total` metric which counts the number of panics (recovered) that occur in declarative validation code as an extra precaution.
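As an illustration of the "recovered panic" precaution, the sketch below wraps a validation call in a deferred `recover` and counts each recovery. It is a minimal sketch, not the apiserver implementation: `SafeValidate` and `panicTotal` are hypothetical names, and a plain atomic counter stands in for the real `declarative_validation_panic_total` metric.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// panicTotal is a hypothetical stand-in for the
// declarative_validation_panic_total metric.
var panicTotal atomic.Int64

// SafeValidate runs validate and, if it panics, recovers, increments the
// counter, and reports the panic as an ordinary error instead of crashing.
func SafeValidate(validate func() []string) (errs []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			panicTotal.Add(1)
			errs = nil
			err = fmt.Errorf("declarative validation panicked: %v", r)
		}
	}()
	return validate(), nil
}

func main() {
	_, err := SafeValidate(func() []string {
		panic("unexpected nil field") // simulated bug in a declarative validator
	})
	fmt.Println("error:", err)
	fmt.Println("recovered panics:", panicTotal.Load())
}
```

Because the panic is converted into an error, the caller can fall back to the hand-written validation result while the counter makes the failure visible.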
@@ -359,20 +360,34 @@ For testing the migration and ensuring that the validation is identical across c
359
360
360
361
Verifying that a migrated field/type is appropriately tested, with proper changes to validation_test.go, equivalence testing, etc., will be enforced by human reviewers in PR review for the related community migration PR.
361
362
362
-
Additionally, to aid in ensuring that the validation is identical across current hand-written validation and declarative validations, we will create a runtime check controlled by the `DeclarativeValidation` and `DeclarativeValidationTakeover` feature gates. When `DeclarativeValidation` is enabled, both hand-written and declarative validation will be run. Any mismatches will be logged and a `declarative_validation_mismatch_total` metric will be incremented. The `DeclarativeValidationTakeover` gate controls which result (imperative or declarative) is returned to the user.
Additionally, to aid in ensuring that the validation is identical across current hand-written validation and declarative validations, we will create a runtime check controlled by the `DeclarativeValidation` and `DeclarativeValidationBeta` feature gates. When `DeclarativeValidation` is enabled, both hand-written and declarative validation will be run. Any mismatches will be logged and a `declarative_validation_mismatch_total` metric will be incremented. The `DeclarativeValidationBeta` gate controls which result (imperative or declarative) is returned to the user for Beta-stage rules.
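The runtime check described above can be sketched as follows. This is a simplified model under stated assumptions: errors are plain strings compared as unordered sets, `mismatchTotal` is a hypothetical stand-in for `declarative_validation_mismatch_total`, and the `declarativeAuthoritative` parameter abstracts over whichever gate decides which result is returned. The real comparison in the apiserver may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// mismatchTotal is a hypothetical stand-in for the
// declarative_validation_mismatch_total metric.
var mismatchTotal int

// sameErrors reports whether two error lists contain the same messages,
// ignoring order. Inputs are copied so callers' slices are not reordered.
func sameErrors(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

// Validate takes the results of both validators, records any mismatch,
// and returns the authoritative result.
func Validate(imperative, declarative []string, declarativeAuthoritative bool) []string {
	if !sameErrors(imperative, declarative) {
		mismatchTotal++
		fmt.Printf("validation mismatch: imperative=%q declarative=%q\n", imperative, declarative)
	}
	if declarativeAuthoritative {
		return declarative
	}
	return imperative
}

func main() {
	imp := []string{"spec.replicas: must be >= 0"}
	dec := []string{"spec.replicas: must be >= 0", "spec.name: required"}
	fmt.Println("returned:", Validate(imp, dec, false))
	fmt.Println("mismatches:", mismatchTotal)
}
```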
Two feature gates were introduced in v1.33 to manage the rollout:
366
+
Three feature gates are involved in the rollout, reflecting the transition to the lifecycle model:
366
367
367
-
* **`DeclarativeValidation`**: This gate controls whether declarative validation is *enabled* for a given resource or field. When enabled, both imperative (hand-written) and declarative validation will run. The results will be compared, and any mismatches will be logged and reported via metrics (see `DeclarativeValidationTakeover` below). The imperative validation result will be returned to the user. When disabled, only imperative validation runs.
368
+
* **`DeclarativeValidation`**: This gate controls whether declarative validation is *enabled* for a given resource or field. When enabled, both imperative (hand-written) and declarative validation will run. The results will be compared, and any mismatches will be logged and reported via metrics (see `DeclarativeValidationBeta` below). The imperative validation result will be returned to the user. When disabled, only imperative validation runs.
368
369
369
-
* **`DeclarativeValidationTakeover`**: The DeclarativeValidationTakeover feature gate is retained as the Global Safety Switch for Beta-stage validation rules. It allows cluster admins to disable "newly enforced" validations if regressions are found, forcing them back to Shadow mode (handwritten fallback). When `DeclarativeValidationTakeover` is enabled (default for Beta), Beta tags are Enforced. When disabled, Beta tags are Shadowed. `DeclarativeValidationTakeover` has *no effect* if `DeclarativeValidation` is disabled.
370
+
* **`DeclarativeValidationTakeover`**: Deprecated in v1.36. Previously determined whether declarative validation results were authoritative. As `DeclarativeValidationTakeover` does not change user-visible behavior (it is internal to validation mechanics), we do not intend to honor it in v1.36. Users of emulation version will still be allowed to set the gate (e.g., when upgrading a cluster that had it set), but it will have no effect on the implementation. This prevents upgrade failures (e.g., "gate not recognized") while immediately moving behavior to the new lifecycle model.
370
371
371
-
#### `DeclarativeValidation` & `DeclarativeValidationTakeover` Will Target Beta From The Beginning
372
+
* **`DeclarativeValidationBeta`**: Introduced in v1.36. This feature gate acts as the Global Safety Switch for Beta-stage validation rules. It allows cluster admins to disable "newly enforced" validations if regressions are found, forcing them back to Shadow mode. When `DeclarativeValidationBeta` is enabled (default for Beta), Beta tags are Enforced. When disabled, Beta tags are Shadowed. `DeclarativeValidationBeta` has *no effect* if `DeclarativeValidation` is disabled.
373
+
374
+
#### `DeclarativeValidation` & `DeclarativeValidationBeta` Will Target Beta From The Beginning
372
375
373
376
Declarative Validation will target the Beta stage from the beginning (vs Alpha). Additionally, `DeclarativeValidation` is targeting Beta with `default:true`. This is because Declarative Validation is not new functionality, but an alternative implementation of validation, and users should not be able to perceive any changes when swapping hand-written validation with identical declarative validation. The feature gate, `DeclarativeValidation`, exists as a safety mechanism in case a mistake is made so that users can turn it off and get back to safety. There is prior art for this rationale where other feature gates did not target Alpha as they were not related to new functionality (changing underlying behavior, bugfix, etc.). An example of this is the current feature gate `AllowParsingUserUIDFromCertAuth`, which was introduced in Beta as `default:true` as it is not a net new feature but fixes a current issue ([PR](https://github.com/kubernetes/kubernetes/pull/127897), [feature gate](https://github.com/kubernetes/kubernetes/blob/master/pkg/features/versioned_kube_features.go#L228-L230)).
374
377
375
-
`DeclarativeValidationTakeover` will default to `false` initially in Beta. This way during the initial rollout we can "soak" and verify that the errors produced for a replaced validation rule (handwritten -> declarative) are identical. Over time the goal is to flip `DeclarativeValidationTakeover` to be default `true` such that for fields where declarative validation rules exist, they are used as the authoritative validation rule.
378
+
`DeclarativeValidationBeta` will default to `true` initially. This is because we want to enable the beta validations by default, but provide a safety switch to disable them if necessary.
379
+
380
+
#### Execution & Authority Logic
381
+
382
+
Execution and authority are determined by two feature gates (`DeclarativeValidation`, `DeclarativeValidationBeta`) and the resource strategy (`WithDeclarativeEnforcement()`).
383
+
384
+
1. **Execution (Is it running?)**:
385
+
1. DV runs if `DeclarativeValidation` is Enabled OR the resource uses `WithDeclarativeEnforcement()`.
386
+
2. Note: Using the strategy ensures New APIs run even if the main gate is disabled.
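The execution rule above reduces to a logical OR. The Go sketch below illustrates it with a hypothetical `Strategy` type; the text names `WithDeclarativeEnforcement()`, but its real signature and receiver are not shown in this section, so the shape used here is an assumption.

```go
package main

import "fmt"

// Strategy is a hypothetical stand-in for a resource strategy.
type Strategy struct {
	declarativeEnforcement bool
}

// WithDeclarativeEnforcement mirrors the option named in the text.
// The signature is assumed for illustration.
func (s Strategy) WithDeclarativeEnforcement() Strategy {
	s.declarativeEnforcement = true
	return s
}

// ShouldRunDV reports whether declarative validation executes: either the
// DeclarativeValidation gate is enabled, or the resource opted in explicitly.
func ShouldRunDV(dvGateEnabled bool, s Strategy) bool {
	return dvGateEnabled || s.declarativeEnforcement
}

func main() {
	legacy := Strategy{}
	newAPI := Strategy{}.WithDeclarativeEnforcement()

	fmt.Println("legacy, gate off:", ShouldRunDV(false, legacy))
	fmt.Println("legacy, gate on: ", ShouldRunDV(true, legacy))
	fmt.Println("new API, gate off:", ShouldRunDV(false, newAPI))
}
```

In this model a new API that opts in through the strategy keeps running declarative validation even when the main gate is disabled, matching the note in the list above.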
* Equivalency tests (verifyVersionedValidationEquivalence in prototype)
@@ -975,7 +990,7 @@ We should be able to start the migration when:
975
990
* Add/extend validators to enable further progress into non-trivial cases
976
991
3. Using Schemas for Validation (Joint Effort):
977
992
* Core Team:
978
-
* Enable validation through generated schemas for migrated resources (controlled by DeclarativeValidation feature gate).
993
+
* Enable validation through generated schemas for migrated resources (controlled by `DeclarativeValidation` and `DeclarativeValidationBeta` feature gates).
979
994
* Implement logic to populate default values from schemas.
980
995
* Community:
981
996
* Run E2E tests with declarative validation enabled.
@@ -1735,13 +1750,13 @@ When the `DeclarativeValidation` feature gate is enabled, both imperative and de
1735
1750
If the errors do not match, a 'declarative_validation_mismatch_total' metric will be incremented and information
1736
1751
about the mismatch will be written to the apiserver's logs.
1737
1752
1738
-
The `DeclarativeValidationTakeover` feature gate controls *which* set of validation errors (imperative or declarative) are returned to the user. When `DeclarativeValidationTakeover` is true, the declarative errors are returned; otherwise, the imperative errors are returned.
1753
+
The `DeclarativeValidationBeta` feature gate controls *which* set of validation errors (imperative or declarative) are returned to the user. When `DeclarativeValidationBeta` is true, the declarative errors are returned; otherwise, the imperative errors are returned.
1739
1754
1740
1755
This can then be used to minimize risk when rolling out Declarative Validation in production, by following these steps:
* Components depending on the feature gate: kube-apiserver
1892
1907
2. Other
1893
1908
* Describe the mechanism:
@@ -1899,7 +1914,7 @@ N/A. This change does not affect any communications going out of the apiserver.
1899
1914
* `DeclarativeValidation`
1900
1915
* Beta: Enables running both imperative and declarative validation. Mismatches are logged and reported via metrics. Imperative validation errors are returned to users.
1901
1916
* GA: Enables running both imperative and declarative validation. Mismatches are logged and reported via metrics. Imperative validation errors are returned to users.
1902
-
* `DeclarativeValidationTakeover`
1917
+
* `DeclarativeValidationBeta`
1903
1918
* Beta: When `DeclarativeValidation` is also enabled, returns declarative validation errors to users. Has no effect if `DeclarativeValidation` is disabled.
1904
1919
* GA: When `DeclarativeValidation` is also enabled, returns declarative validation errors to users. Has no effect if `DeclarativeValidation` is disabled.
1905
1920
@@ -2053,8 +2068,8 @@ If the API server is failing to meet SLOs (latency, validation error-rate, etc.)
2053
2068
* If the logs show repeated mismatches or errors for certain resource types, compare the declarative validation tags in `types.go` with the original hand-written logic to identify gaps or typos
2054
2069
* ^ Be sure to submit this information when filing an issue (see step 5)
2055
2070
4. **Compare Feature Gate Settings**
2056
-
* Verify whether `DeclarativeValidation` is enabled for all API servers in an HA environment. Partial enablement can sometimes lead to inconsistent behavior or unexpected rejections.
2057
-
* Temporarily disabling `DeclarativeValidation` can help isolate if new validation logic is the root cause. Bear in mind that rolling back may block updates on objects that were only valid under declarative validation rules if there is a bug related to this, so review “Can the feature be disabled once it has been enabled?” in this KEP in this case.
2071
+
* Verify whether `DeclarativeValidation` and `DeclarativeValidationBeta` are enabled for all API servers in an HA environment. Partial enablement can sometimes lead to inconsistent behavior or unexpected rejections.
2072
+
* Temporarily disabling `DeclarativeValidation` or `DeclarativeValidationBeta` can help isolate whether new validation logic is the root cause. Bear in mind that, if a bug is involved, rolling back may block updates on objects that were only valid under declarative validation rules; in that case, review “Can the feature be disabled once it has been enabled?” in this KEP.
2058
2073
5. **File or Triage Issues**
2059
2074
* If you confirm that Declarative Validation logic is producing incorrect results or performance regressions, open a GitHub issue in the kubernetes/kubernetes repository. Include:
2060
2075
* The exact failing resource object or field that triggers errors.
@@ -2067,7 +2082,7 @@ If the API server is failing to meet SLOs (latency, validation error-rate, etc.)
2067
2082
- v1.33: Initial Beta implementation of `DeclarativeValidation` and `DeclarativeValidationTakeover` gates.
- v1.36: Introduction of the Validation Lifecycle mechanism and Explicit Strategy. (Current)
2085
+
- v1.36: Introduction of the Validation Lifecycle mechanism and Explicit Strategy. Introduction of `DeclarativeValidationBeta` and deprecation of `DeclarativeValidationTakeover`. (Current)