chore(xtest): Shared Scenario/Instance Pydantic schema in otdf-sdk-mgr#450
Conversation
…X-3302)
Introduces otdf_sdk_mgr.schema as the canonical Pydantic v2 model layer
for the multi-instance test harness. Both otdf-sdk-mgr and otdf-local
will read scenarios.yaml / instance.yaml through these models so the
on-disk YAML format has exactly one definition.
Models:
- SourceRef, PlatformPin, KasPin, SdkPin (with mutually-exclusive
dist|source|image validation on the platform/KAS pins)
- PortsConfig, Metadata, Fixtures
- Instance (apiVersion/kind/metadata/platform/ports/kas/...)
- ScenarioSdks (encrypt + decrypt maps mirroring xtest's
--sdks-encrypt / --sdks-decrypt convention)
- Suite (pytest select + flags)
- Scenario (composes Instance + ScenarioSdks + Suite)
Includes load_scenario / load_instance / dump_instance helpers and a
`python -m otdf_sdk_mgr.schema validate <path>` CLI entry that dispatches
on `kind:` so the same command validates both Scenario and Instance YAML.
Adds pydantic + ruamel.yaml to otdf-sdk-mgr's deps and a 6-test smoke
suite covering round-trips, pin validation, encrypt/decrypt union dedup,
and unknown-field rejection.
Refs: https://virtru.atlassian.net/browse/DSPX-3302
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR introduces Pydantic-based schema validation for OpenTDF Scenario and Instance YAML configurations, SDK-to-pytest translation utilities, comprehensive tests, and documentation updates. Dependencies are added, data models enforce cross-field validation rules, YAML loaders/serializers are implemented, SDK translation logic matches scenario pins against install records, and extensive tests validate all behavior including error handling. ChangesSchema Validation and SDK Translation for OpenTDF Test Scenarios
Sequence DiagramsequenceDiagram
participant User
participant schema.py
participant installed.json
participant pytest
User->>schema.py: load_scenario(path)
schema.py->>schema.py: load YAML & validate via Pydantic
schema.py-->>User: Scenario object
User->>schema.py: scenario_to_pytest_sdks(scenario, installed_json_path)
schema.py->>installed.json: read install records
schema.py->>schema.py: match SDK pins by role & version
schema.py-->>User: {encrypt: [sdk@dist], decrypt: [sdk@dist]}
User->>schema.py: _main(['validate', path])
schema.py->>schema.py: load & validate YAML by kind
schema.py-->>User: ok or invalid (exit code)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a new schema module for OpenTDF scenarios and instances, utilizing Pydantic models to ensure a canonical YAML definition across different tools. The changes include adding pydantic and ruamel.yaml as dependencies, implementing load/dump logic, and providing comprehensive smoke tests. The review feedback suggests several improvements to the implementation, specifically regarding the enforcement of UTF-8 encoding for file operations, the removal of redundant YAML configuration settings, and the use of more specific exception handling for validation and parsing errors.
There was a problem hiding this comment.
Pull request overview
First PR of a five-part stack introducing a multi-instance test harness for OpenTDF. This change adds only the shared Pydantic v2 schema in otdf-sdk-mgr (no consumers yet), plus a small CLI validator and unit tests. It establishes the on-disk shape for scenarios.yaml / instance.yaml so downstream PRs in the stack (otdf-local, xtest/conftest.py, Claude plugin) can import a single canonical definition.
Changes:
- Adds
otdf_sdk_mgr.schemawith strict (extra="forbid") v2 models:Scenario,Instance,PlatformPin,KasPin,SdkPin,ScenarioSdks,Suite, plus helpersload_scenario/load_instance/dump_instanceand apython -m otdf_sdk_mgr.schema validate <path>entrypoint. - Adds
pydantic>=2.6.0andruamel.yaml>=0.18.0to project dependencies (with correspondinguv.lockentries). - Adds 6 schema unit tests covering scenario round-trip, pin "exactly one source" invariant, KasPin features pass-through, SDK union, instance dump/load, and unknown-field rejection.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| otdf-sdk-mgr/pyproject.toml | Adds pydantic and ruamel.yaml runtime dependencies. |
| otdf-sdk-mgr/uv.lock | Auto-generated lockfile updates for the new dependencies (pydantic, pydantic-core, annotated-types, typing-inspection, ruamel-yaml). |
| otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py | New module with Pydantic models, YAML load/dump helpers, and a validate CLI entry point. |
| otdf-sdk-mgr/tests/test_schema.py | Smoke tests for the new schema (round-trip, pin invariants, union, extra-forbid rejection). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… format xtest's --sdks / --sdks-encrypt / --sdks-decrypt accept whitespace- separated `sdk@version` tokens after #446 (e.g. `go@v0.24.0`, `go@main`, `go@*`). The version segment must match an actual directory under `xtest/sdk/<lang>/dist/`. Scenario version fields can be aliases (`lts`, `tip`) that only resolve to a concrete dist name once `otdf-sdk-mgr install scenario` runs, so we can't translate scenarios → pytest args from the scenario YAML alone. Adds two helpers so the scenario→pytest bridge has one canonical implementation: installed_json_for(scenario_path): The conventional sibling file `otdf-sdk-mgr install scenario` writes. `xtest/scenarios/x.yaml` → `xtest/scenarios/x.installed.json`. scenario_to_pytest_sdks(scenario, installed_json_path) -> dict: Returns `{"encrypt": ["go@v0.24.0", ...], "decrypt": [...]}`, reading the dist directory names recorded in installed.json. Raises FileNotFoundError with a `run install scenario first` hint when the record is missing (aliases can't be passed verbatim to xtest, so a clean error beats a confusing pytest failure). Raises ValueError when the scenario references an SDK the install record doesn't cover. Both `otdf-local scenario run` and `xtest/conftest.py`'s `--scenario`-default path will switch to this helper in the following PRs so they no longer drop the version when forwarding to pytest. Refs: https://virtru.atlassian.net/browse/DSPX-3302 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
X-Test Failure Report |
| `dist` references a built binary at `xtest/platform/dist/<dist>/service` | ||
| produced by `otdf-sdk-mgr install platform:<version>`. `source.ref` is a | ||
| git ref to build from on demand. `image` is reserved for forward-compat | ||
| once container images are published; rejected at run time today. |
This is speculative work and can wait for later
|
X-Test Failure Report✅ js@main-v0.15.0 |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
otdf-sdk-mgr/tests/test_schema.py (1)
70-77: 💤 Low valueTest name is slightly misleading.
test_platform_pin_requires_exactly_one_sourcereads as "must have exactly onesource", but the cases actually assert thedist | sourcemutual-exclusion rule. Consider renaming totest_platform_pin_requires_exactly_one_of_dist_or_sourceto match the validator's contract and error message.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@otdf-sdk-mgr/tests/test_schema.py` around lines 70 - 77, Rename the test function test_platform_pin_requires_exactly_one_source to test_platform_pin_requires_exactly_one_of_dist_or_source so its name reflects the mutual-exclusion validation enforced by PlatformPin (the dist | source rule); update the test function declaration and any references to it in tests (the body using PlatformPin and SourceRef(ref="main") remains unchanged) so the validator behavior and error messaging match the new descriptive name.otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py (4)
286-293: 💤 Low valueMinor: error message formatting when
entry.sourceis set.The ternary
' source ' + entry.source if entry.source else ''concatenates a raw source string with no surrounding quotes, whilesdk/versionare quoted. Reading the message out loud yields e.g.... version '0.7.8' source platform, but ...— slightly inconsistent and easy to misparse if the source ever contains spaces or punctuation. Not a defect; just a readability nit.📝 Suggested tweak
- raise ValueError( - f"Scenario references {role} SDK '{entry.sdk}' version '{entry.version}'" - f"{' source ' + entry.source if entry.source else ''}, but {p} has no matching " - "install record for it. Re-run `otdf-sdk-mgr install scenario`." - ) + source_clause = f" source '{entry.source}'" if entry.source else "" + raise ValueError( + f"Scenario references {role} SDK '{entry.sdk}' version " + f"'{entry.version}'{source_clause}, but {p} has no matching " + "install record for it. Re-run `otdf-sdk-mgr install scenario`." + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py` around lines 286 - 293, The error message in the block that validates install_entry (inside the function that computes dist_name and returns f"{entry.sdk}@{dist_name}") concatenates entry.source without quotes, making the message inconsistent with quoted sdk/version; update the error string construction to wrap entry.source in single quotes (or use repr(entry.source)) when present so the message becomes e.g. "version '0.7.8' source 'platform', but ..."; adjust the ternary that currently produces "' source ' + entry.source if entry.source else ''" to include the quoted form and keep the rest of the message unchanged.
112-113: 💤 Low value
apiVersionliteral duplicated alongsideAPI_VERSIONconstant.
Literal["opentdf.io/v1alpha1"]is repeated in bothInstance.apiVersionandScenario.apiVersion, while the runtime default uses theAPI_VERSIONconstant. The type annotation cannot reference the constant (Pydantic/PythonLiteralrequires a literal value), but you can centralize the type with a module-level alias so future version bumps only touch one place:ApiVersionLiteral = Literal["opentdf.io/v1alpha1"]then use
apiVersion: ApiVersionLiteral = API_VERSIONin both models. Strictly cosmetic.Also applies to: 183-184
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py` around lines 112 - 113, Create a module-level alias for the Literal API version and use it for both models: define ApiVersionLiteral = Literal["opentdf.io/v1alpha1"] once, then change the apiVersion annotations in Instance.apiVersion and Scenario.apiVersion to use ApiVersionLiteral and keep the runtime default API_VERSION as the value; update both occurrences so future version bumps only require changing the alias.
301-337: ⚡ Quick winCLI
_mainhas no direct test coverage.
_main()handles four distinct error branches (read/OSError, YAMLError, top-level-not-mapping ValueError, unknown kind, ValidationError) plus the success path, none of which are exercised bytest_schema.py. A couple of small tests calling_main(["validate", str(path)])and asserting the return code would lock in the contract — particularly the exit-1 vs exit-2 distinction and the kind dispatch.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py` around lines 301 - 337, Add unit tests that call the CLI entry function _main(argv) directly to exercise its branches: create fixture files to trigger OSError (use a non-existent path or mock _load_yaml_mapping to raise OSError), invalid YAML (mock _load_yaml_mapping to raise YAMLError), top-level-not-mapping (mock _load_yaml_mapping to raise ValueError), unknown kind (return a mapping without kind or with kind != "Scenario"/"Instance"), ValidationError (return a mapping with kind "Scenario" or "Instance" but invalid fields and assert model.model_validate raises ValidationError or mock Scenario/Instance.model_validate to raise), and the success path (valid mapping for Scenario/Instance); for each test call _main(["validate", str(path)]) and assert the correct integer return codes (1 for errors, 2 for usage) and expected stderr messages when applicable, referencing _load_yaml_mapping, _main, Scenario, Instance, and ValidationError to locate targets to mock or supply test data.
46-71: 💤 Low valueConsider extracting the shared
exactly one of dist|sourcevalidator.
PlatformPin._exactly_one(46-53) andKasPin._exactly_one(64-71) are byte-for-byte identical except for the class name in the error message. A tiny shared mixin or module-level helper would keep the constraint definition in one place if/when a third pin type is added.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py` around lines 46 - 71, Extract the duplicated validator into a shared mixin or helper and have both models use it: create a mixin (e.g., ExactlyOneOfDistSource) that defines a single `@model_validator`(mode="after") method (use a generic/self return type) which checks getattr(self, "dist") and getattr(self, "source") and raises ValueError with a message using type(self).__name__ to include the model name; then have PlatformPin and KasPin inherit from that mixin (remove their _exactly_one implementations) so the single validator enforces the "exactly one of dist|source" rule for both classes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py`:
- Around line 122-146: The docstring for ScenarioSdks contradicts the
_dedupe_per_role validator: the class currently raises a ValueError on exact
duplicate (see _dedupe_per_role and
test_scenario_sdks_rejects_exact_duplicate_within_role) but the docstring says
selections are "de-duplicated"; update the ScenarioSdks docstring to state that
selections are validated and duplicates within each role (by sdk, version,
source) are rejected with a ValidationError (or ValueError) rather than silently
removed, so the documentation matches the behavior of _dedupe_per_role.
---
Nitpick comments:
In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py`:
- Around line 286-293: The error message in the block that validates
install_entry (inside the function that computes dist_name and returns
f"{entry.sdk}@{dist_name}") concatenates entry.source without quotes, making the
message inconsistent with quoted sdk/version; update the error string
construction to wrap entry.source in single quotes (or use repr(entry.source))
when present so the message becomes e.g. "version '0.7.8' source 'platform', but
..."; adjust the ternary that currently produces "' source ' + entry.source if
entry.source else ''" to include the quoted form and keep the rest of the
message unchanged.
- Around line 112-113: Create a module-level alias for the Literal API version
and use it for both models: define ApiVersionLiteral =
Literal["opentdf.io/v1alpha1"] once, then change the apiVersion annotations in
Instance.apiVersion and Scenario.apiVersion to use ApiVersionLiteral and keep
the runtime default API_VERSION as the value; update both occurrences so future
version bumps only require changing the alias.
- Around line 301-337: Add unit tests that call the CLI entry function
_main(argv) directly to exercise its branches: create fixture files to trigger
OSError (use a non-existent path or mock _load_yaml_mapping to raise OSError),
invalid YAML (mock _load_yaml_mapping to raise YAMLError), top-level-not-mapping
(mock _load_yaml_mapping to raise ValueError), unknown kind (return a mapping
without kind or with kind != "Scenario"/"Instance"), ValidationError (return a
mapping with kind "Scenario" or "Instance" but invalid fields and assert
model.model_validate raises ValidationError or mock
Scenario/Instance.model_validate to raise), and the success path (valid mapping
for Scenario/Instance); for each test call _main(["validate", str(path)]) and
assert the correct integer return codes (1 for errors, 2 for usage) and expected
stderr messages when applicable, referencing _load_yaml_mapping, _main,
Scenario, Instance, and ValidationError to locate targets to mock or supply test
data.
- Around line 46-71: Extract the duplicated validator into a shared mixin or
helper and have both models use it: create a mixin (e.g.,
ExactlyOneOfDistSource) that defines a single `@model_validator`(mode="after")
method (use a generic/self return type) which checks getattr(self, "dist") and
getattr(self, "source") and raises ValueError with a message using
type(self).__name__ to include the model name; then have PlatformPin and KasPin
inherit from that mixin (remove their _exactly_one implementations) so the
single validator enforces the "exactly one of dist|source" rule for both
classes.
In `@otdf-sdk-mgr/tests/test_schema.py`:
- Around line 70-77: Rename the test function
test_platform_pin_requires_exactly_one_source to
test_platform_pin_requires_exactly_one_of_dist_or_source so its name reflects
the mutual-exclusion validation enforced by PlatformPin (the dist | source
rule); update the test function declaration and any references to it in tests
(the body using PlatformPin and SourceRef(ref="main") remains unchanged) so the
validator behavior and error messaging match the new descriptive name.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c6fcc044-1125-47f7-b09f-281968be1b59
⛔ Files ignored due to path filters (1)
otdf-sdk-mgr/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
otdf-sdk-mgr/pyproject.tomlotdf-sdk-mgr/src/otdf_sdk_mgr/schema.pyotdf-sdk-mgr/tests/test_schema.pyxtest/conftest.py
| class ScenarioSdks(_StrictModel): | ||
| """Encrypt/decrypt split mirrors xtest's --sdks-encrypt/--sdks-decrypt. | ||
|
|
||
| Selections are ordered to preserve the eventual argv order, and are | ||
| de-duplicated within each role by (sdk, version, source). | ||
| """ | ||
|
|
||
| encrypt: list[ScenarioSdk] = Field(default_factory=list) | ||
| decrypt: list[ScenarioSdk] = Field(default_factory=list) | ||
|
|
||
| @model_validator(mode="after") | ||
| def _dedupe_per_role(self) -> ScenarioSdks: | ||
| for role in ("encrypt", "decrypt"): | ||
| seen: set[tuple[SdkName, str, str | None]] = set() | ||
| duplicates = [] | ||
| for entry in getattr(self, role): | ||
| key = entry.install_key() | ||
| if key in seen: | ||
| duplicates.append(key) | ||
| seen.add(key) | ||
| if duplicates: | ||
| raise ValueError( | ||
| f"ScenarioSdks.{role} contains duplicate sdk/version entries: {duplicates}" | ||
| ) | ||
| return self |
There was a problem hiding this comment.
Docstring contradicts the validator's behavior.
The class docstring says selections are "de-duplicated within each role by (sdk, version, source)", but _dedupe_per_role actually rejects duplicates with a ValidationError (see test_scenario_sdks_rejects_exact_duplicate_within_role). Either rename the validator and update the docstring to match the reject-on-duplicate behavior, or change the validator to silently dedupe like union() does. The current wording will mislead consumers writing scenarios.
📝 Suggested docstring fix (keeping reject-on-duplicate semantics)
class ScenarioSdks(_StrictModel):
"""Encrypt/decrypt split mirrors xtest's --sdks-encrypt/--sdks-decrypt.
Selections are ordered to preserve the eventual argv order, and
- de-duplicated within each role by (sdk, version, source).
+ duplicate (sdk, version, source) tuples within a single role are
+ rejected. Use `union()` to obtain a deduped encrypt+decrypt sequence.
"""🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@otdf-sdk-mgr/src/otdf_sdk_mgr/schema.py` around lines 122 - 146, The
docstring for ScenarioSdks contradicts the _dedupe_per_role validator: the class
currently raises a ValueError on exact duplicate (see _dedupe_per_role and
test_scenario_sdks_rejects_exact_duplicate_within_role) but the docstring says
selections are "de-duplicated"; update the ScenarioSdks docstring to state that
selections are validated and duplicates within each role (by sdk, version,
source) are rejected with a ValidationError (or ValueError) rather than silently
removed, so the documentation matches the behavior of _dedupe_per_role.



Summary
First PR in a five-part stack that introduces a multi-instance test harness and a Claude plugin for OpenTDF bug reproduction. This PR adds only the shared Pydantic schema in
otdf-sdk-mgr— no consumers yet.otdf_sdk_mgr.schemawith v2 models:Scenario,Instance,PlatformPin,KasPin,SdkPin,ScenarioSdks,Suite, etc.ScenarioSdks.encrypt/.decryptmirror xtest's existing--sdks-encrypt/--sdks-decryptconvention so a→b-only scenarios are first-class.python -m otdf_sdk_mgr.schema validate <path>validates either a Scenario or an Instance file based on itskind:.pydantic+ruamel.yamltootdf-sdk-mgr/pyproject.toml.Stack
install scenarioinotdf-sdk-mgr(builds on this)otdf-localmulti-instance refactor + new CLI subcommandsxtest/conftest.pyintegration (--scenario,--instance).claude/skills/, settings, plugin manifest)Test plan
cd otdf-sdk-mgr && uv run pytest tests/test_schema.py— all 6 passuv run python -m otdf_sdk_mgr.schema validate <path>accepts a valid scenarios.yaml and rejects unknown fieldsJira: https://virtru.atlassian.net/browse/DSPX-3302
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
nanotoztdf-ecwrap.Dependencies