187 changes: 187 additions & 0 deletions design/TESTPROVIDER_ASSESSMENT.md
@@ -0,0 +1,187 @@
# Test Provider — suitability as a general library, and what's missing

**Status:** assessment / proposal (2026-06). Evaluates the `test/proto/`
prototypes (see [`../test/proto/PROVIDER.md`](../test/proto/PROVIDER.md),
[`RUNNING.md`](../test/proto/RUNNING.md),
[`AGENTS.md`](../test/proto/AGENTS.md)) against the bar of being a *general
library for test-spec provision* — something a port's test suite, or a coding
agent, depends on to consume the shared corpus. Nothing here is implemented.

> **Verdict.** As a prototype proving the shape: suitable — the model and the
> 22-language parity are right. As a drop-in general library today: not yet.
> Gaps §3.1–§3.4 are correctness/usability blockers, not polish. The cleanest
> path is to push the *invoke-mapping* and *null-mode* into the test-spec model
> (the aontu schema strand, [`TESTSPEC_MODEL.md`](./TESTSPEC_MODEL.md)), which
> turns the provider from "data + an external cheat-sheet" into a genuinely
> self-contained library; only then is packaging worth the spend.


## 1. What the prototype is

A data-access library, ported to all 22 languages and verified to emit the same
normalized view of `build/test/test.json` — **1325 entries** (value 1181,
absent 84, error 59, match 1). It loads the corpus, classifies
`struct.<fn>.<group>.set[]` into normalized `Entry` records (tagged `Input`,
tagged `Expect`, provenance), and ships pure comparison helpers
(`equal`/`equalStrict`/`structMatch`/`errorMatches`/`matchval`). It is **not** a
runner: it never calls the function under test and never asserts.


## 2. What is already library-grade

* **Uniform model across 22 languages, proven identical.** Cross-language
parity is the expensive part and it is done and run-verified.
* **Dependency-free**, with an order-preserving JSON reader (stdlib or
hand-rolled) in every port.
* **The fiddly logic is centralized.** `structMatch` (regex / `__UNDEF__` /
`__EXISTS__` / partial deep match) and the `equal` vs `equalStrict` null
semantics are written once per port instead of re-derived in each test file.
* **Good authoring ergonomics.** Tagged input/expect + provenance
(`function/group/index/id`) make per-case tests and failure messages clean.


## 3. Gaps that block general use (prioritized)

### 3.1 The invoke-mapping lives *outside* the library ⟵ #1
The provider hands you `entry.input.in = {store, path}` but does not tell you
that this maps to `getpath(store, path)`. That knowledge is prose in `AGENTS.md` §3, not data.
Every consumer must hand-maintain that table — exactly the drift this repo
exists to prevent. **Highest-value gap.** (Proposal: §4.1.)

### 3.2 The `null:false` mode is not in the corpus
Whether a case is compared with `equal` or `equalStrict` is set by the test
author per `runset` call, i.e. **per (function, group)**. For example,
`validate.basic` is strict but `validate.child` is not; `transform.format` is
strict but its sibling groups are not; `minor.clone`, every `sentinels` group,
`walk.depth`, … are strict throughout. The provider cannot currently tell a
caller the right comparison mode, which is a silent correctness hazard.
(Proposal: §4.1.)

### 3.3 The `args`/`ctx` input paths are effectively untested
All 1325 iterated entries are `kind:in`. The only `args`/`ctx`/`DEF.client`
data lives under `primary.check`, which `functions()` deliberately skips. So
those provider branches are written but never exercised, and the entire
client-integration spec is unreachable through the provider. (Proposal: §4.2.)

### 3.4 No clone-on-read
`raw` and `input.in` are returned by reference. The real runner *clones*
`entry.in` before each call precisely so a subject can't mutate shared
corpus/fixtures; a test that mutates input here would corrupt later cases.
(Proposal: §4.3.)

### 3.5 The helpers are a reimplementation, not the port's own semantics
The runner matches using each port's *own* `struct.walk`/`getpath`/`stringify`;
the provider ships generic equivalents. Self-contained, but they can diverge
from a port's real semantics on edge cases (array indexing, special keys,
stringify formatting). Fine as a data utility, riskier as the assertion
authority — see the ownership decision in §4.6.

### 3.6 Corpus discovery is hardcoded
The default path assumes the in-repo `build/test/test.json` layout. A library
shipped inside a port's package will not know where the consumer's corpus is
(the runner already hints at a `.sdk/test/test.json` alternative for sdkgen
projects). Needs explicit/configurable resolution.

### 3.7 Thin API, and the provider itself is untested
No `byId()`, no filtering (by `doc`, by `client`), no access to the `primary`
namespace or to fixtures except via `raw()`. The provider's own logic is only
*smoke*-tested (counts) — the helpers have no conformance suite, and there is no
cross-port parity check (the analogue of `tools/check_parity.py`) keeping the 22
APIs and behaviours in sync.

### 3.8 Packaging
Loose single files under `test/proto/`, not consumable packages, with per-port
run quirks (swift `main.swift`, clojure ns/path depth, scala classpath) and
cosmetic inconsistencies (sorted vs insertion-order kind printing in smokes).


## 4. What else is needed

### 4.1 Encode invoke-mapping + null-mode in the model (resolves §3.1, §3.2)
This is where the provider converges with the aontu schema work
([`TESTSPEC_MODEL.md`](./TESTSPEC_MODEL.md)). Add **emitted** descriptors the
provider can read. Sketch:

```jsonic
# per function: how an entry's input maps onto the call
struct: getpath: api: { args: ['in.store', 'in.path'] }
struct: merge: api: { args: ['in'] } # in is the whole arg
struct: select: api: { args: ['in.obj', 'in.query'] }

# per group: comparison mode (default true = equal; false = equalStrict)
struct: validate: basic: nullmode: false
struct: transform: format: nullmode: false
```

* `args` is a list of **dotted path-expressions** resolved against the entry
(`in.store`, `in`, …). The provider can then build the argument vector itself,
and `AGENTS.md`'s mapping table disappears.
* **Honest limit:** not every call is pure-data-dispatchable. `filter` takes a
predicate selected by `in.check`, `walk` takes a callback, `transform.modify`
takes a modifier, `getpath.handler`/`inject`/`validate.special` take an
injection/current. These need a small set of **named resolvers** the consumer
registers once (e.g. `resolvers = { check: …, walkcb: … }`) and the model
references (`args: ['in.val', {resolver: 'check', key: 'in.check'}]`). So the
end state is *data-driven dispatch for the ~80% case + a handful of named
hooks*, not magic.
* `nullmode` is **per-group** (§3.2). Default emitted from the
`struct.&` template so only the exceptions are written.

These fields are additive — existing port runners ignore unknown keys, so
`test.json` consumers are unaffected (verify with the zero-diff guard from
`TESTSPEC_MODEL.md` §5, treating the new keys as intended additions).
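
A minimal sketch of how a provider might turn such descriptors into an argument vector. Everything here is hypothetical: `ArgSpec`, `resolvePath`, `buildArgs` and the resolver registry are illustrative names, not part of the prototype, and the sketch assumes paths resolve against an object that has `in` at its top level.

```ts
// Hypothetical sketch: none of these names exist in the prototype provider.
type ArgSpec = string | { resolver: string; key: string }
type Resolver = (key: unknown) => unknown

// Walk a dotted path such as 'in.store' through nested maps.
function resolvePath(root: unknown, path: string): unknown {
  let cur: unknown = root
  for (const part of path.split('.')) {
    if (cur === null || typeof cur !== 'object') return undefined
    cur = (cur as Record<string, unknown>)[part]
  }
  return cur
}

// Build the argument vector for one entry from its function's `api.args`.
function buildArgs(
  entry: { in?: unknown },
  specs: ArgSpec[],
  resolvers: Record<string, Resolver> = {},
): unknown[] {
  return specs.map((spec) => {
    // Plain string: pure-data dispatch.
    if (typeof spec === 'string') return resolvePath(entry, spec)
    // Object form: look up a consumer-registered named hook.
    const resolver = resolvers[spec.resolver]
    if (resolver === undefined) {
      throw new Error(`no resolver registered: ${spec.resolver}`)
    }
    return resolver(resolvePath(entry, spec.key))
  })
}

// Pure-data case, mirroring the getpath descriptor.
const getpathArgs = buildArgs(
  { in: { store: { a: { b: 1 } }, path: 'a.b' } },
  ['in.store', 'in.path'],
)

// Resolver case: the predicate is chosen by `in.check` (assumed value 'even').
const isEven = (v: unknown): boolean => typeof v === 'number' && v % 2 === 0
const filterArgs = buildArgs(
  { in: { val: [1, 2, 3], check: 'even' } },
  ['in.val', { resolver: 'check', key: 'in.check' }],
  { check: (key) => (key === 'even' ? isEven : undefined) },
)
```

Throwing on an unregistered resolver, rather than passing `undefined` through, is a deliberate choice in this sketch: a missing hook then fails loudly instead of producing a wrong call.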

### 4.2 Model `primary` / `DEF` / `client` / fixtures as first-class (§3.3)
Expose `primary.check` through the provider (a `clients()` / `entries('check')`
path), surface a group's `DEF` and fixtures via typed accessors rather than
`raw()`, and add `args`/`ctx` corpus coverage so those branches are real. Ties
to the `fixtures`/`DEF` slots proposed in `TESTSPEC_MODEL.md` §4.4.

### 4.3 Clone-on-read or a documented immutability contract (§3.4)
Either deep-clone `input`/`raw` on access (matches the runner), or document that
returned values are shared-immutable and provide an explicit `clone(entry)`.
Prefer clone-on-read for `input` (the thing tests touch most).
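
As a rough illustration of the clone-on-read option, the sketch below wraps a raw `in` value so that every read returns a fresh deep copy. `ClonedEntry` is a hypothetical name, and the JSON round-trip is an assumption that holds only because the corpus is plain JSON data.

```ts
// Hypothetical wrapper; the prototype currently returns `input.in` by reference.
class ClonedEntry {
  private readonly rawIn: unknown

  constructor(rawIn: unknown) {
    this.rawIn = rawIn
  }

  // Each read deep-copies via a JSON round-trip (safe for JSON-only data).
  get in(): unknown {
    if (this.rawIn === undefined) return undefined
    return JSON.parse(JSON.stringify(this.rawIn))
  }
}

const shared = { store: { a: 1 }, path: 'a' }
const entry = new ClonedEntry(shared)

// A subject under test mutates the input it was handed.
const first = entry.in as { store: { a: number } }
first.store.a = 99

// A later read is unaffected, and so is the shared corpus object.
const second = entry.in as { store: { a: number } }
```

The cost is one copy per access, which seems acceptable at the corpus sizes quoted in §1.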

### 4.4 A provider-level conformance corpus + cross-port parity check (§3.7)
A small fixed set of `(helper, args, expected)` cases — especially for
`structMatch`/`matchval`/`equalStrict` edge cases — that every port runs, plus a
`tools/check_provider_parity.py` that asserts all 22 expose the same API and
pass that set. Without it, 22 hand-written ports *will* drift.
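
One possible shape for that fixed case set and its per-port runner is sketched below. The `HelperCase` layout and `runConformance` are hypothetical, and the two helpers are simplified stand-ins for primitives only, not the provider's real `equal`/`equalStrict`.

```ts
// Hypothetical case format: (helper, args, expected).
type HelperCase = { helper: string; args: unknown[]; expected: unknown }
type HelperTable = Record<string, (...args: any[]) => unknown>

// Run every case against a port's helper table; return failure messages.
function runConformance(helpers: HelperTable, cases: HelperCase[]): string[] {
  const failures: string[] = []
  cases.forEach((c, i) => {
    const fn = helpers[c.helper]
    if (fn === undefined) {
      failures.push(`#${i} ${c.helper}: missing helper`)
      return
    }
    const actual = fn(...c.args)
    if (JSON.stringify(actual) !== JSON.stringify(c.expected)) {
      failures.push(`#${i} ${c.helper}: expected ${JSON.stringify(c.expected)}`)
    }
  })
  return failures
}

// Stand-in helpers (primitives only), NOT the provider's implementations.
const standIns: HelperTable = {
  equal: (a: unknown, b: unknown) => (a ?? null) === (b ?? null), // absent ≡ null
  equalStrict: (a: unknown, b: unknown) => a === b, // absent distinct from null
}

// In-memory cases; a JSON file would need a marker for `undefined`.
const cases: HelperCase[] = [
  { helper: 'equal', args: [null, undefined], expected: true },
  { helper: 'equalStrict', args: [null, undefined], expected: false },
  { helper: 'equalStrict', args: [1, 1], expected: true },
]

const failures = runConformance(standIns, cases)
```

Reporting a missing helper as a failure, rather than skipping it, lets the same run double as a check that a port exposes the full API.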

### 4.5 Configurable corpus discovery (§3.6)
`load(path)` explicit, plus a documented search order (env var → walk up for
`build/test/test.json` or `.sdk/test/test.json` → error). Fixes the
clojure/depth fragility noted in `RUNNING.md` along the way.
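
The search order could look roughly like the sketch below. The environment variable name `STRUCT_TEST_CORPUS` and the function `findCorpus` are invented for illustration; `env` and `exists` are injected so the order can be exercised without touching the filesystem, and paths are assumed to be POSIX-style.

```ts
// Hypothetical sketch of: env var → walk up for known layouts → error.
const CANDIDATES = ['build/test/test.json', '.sdk/test/test.json']

function findCorpus(
  startDir: string,
  env: Record<string, string | undefined>,
  exists: (path: string) => boolean,
  envVar = 'STRUCT_TEST_CORPUS', // hypothetical variable name
): string {
  // 1. An explicit override wins.
  const fromEnv = env[envVar]
  if (fromEnv !== undefined && fromEnv !== '') return fromEnv

  // 2. Walk up from startDir to the filesystem root.
  let dir = startDir.replace(/\/+$/, '')
  for (;;) {
    for (const rel of CANDIDATES) {
      const candidate = `${dir}/${rel}`
      if (exists(candidate)) return candidate
    }
    const cut = dir.lastIndexOf('/')
    if (cut < 0) break // the root itself has been checked
    dir = dir.slice(0, cut)
  }

  // 3. Nothing found: fail loudly rather than guess.
  throw new Error(`test corpus not found from ${startDir}; set ${envVar}`)
}
```

In a Node-based port, `exists` would typically be `fs.existsSync` and `env` would be `process.env`; other ports would substitute their own equivalents.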

### 4.6 Decide who owns the assertion logic (§3.5)
Pick one and commit:
* **(a) Provider = data only.** Drop the helpers; the port's own `struct` utils
do matching. Maximally faithful, but every test reimplements comparison.
* **(b) Provider = the authority.** Keep the helpers but give them the
conformance suite from §4.4 so they are provably correct, and document that
they intentionally define corpus-match semantics independent of any port.

Recommendation: **(b)** — centralizing comparison is most of the value; just
make it earn the authority with tests.

### 4.7 Packaging (§3.8)
Per-language module/package layout, consistent smoke output, and a decision on
*home*: stay in `test/proto/` as reference, or vendor each port's provider into
that port's package so its test suite imports it directly.


## 5. Suggested order

1. **§4.1** (invoke-mapping + null-mode in the model) — unblocks the rest and is
the throughline with the schema work. Do it in the corpus model first, then
teach the canonical TS provider to read it, then propagate.
2. **§4.3** clone-on-read and **§4.5** corpus discovery — small, pure
correctness/robustness wins.
3. **§4.4** conformance corpus + parity check — lock the 22 together before
adding surface area.
4. **§4.2** primary/DEF/fixtures + args/ctx coverage.
5. **§4.6 / §4.7** ownership decision and packaging — last, once the contract is
stable.

Until §4.1–§4.4 land, treat the prototypes as a **proven core to build on**, not
a finished library: excellent for an agent writing per-case tests *with* the
`AGENTS.md` mapping table at hand, not yet a self-contained dependency.
18 changes: 18 additions & 0 deletions test/proto/.gitignore
@@ -0,0 +1,18 @@
# Build artifacts from running the prototype smokes — never commit these.
# Rust
rust/target/
# Java / JVM
*.class
*.jar
# C# / .NET
csharp/bin/
csharp/obj/
# OCaml
*.cmi
*.cmo
*.cmt
*.cmx
*.o
# Native binaries
*.out
smoke
149 changes: 149 additions & 0 deletions test/proto/AGENTS.md
@@ -0,0 +1,149 @@
# test/proto — agent guidance

You are writing **actual tests** for a `struct` port using the **test provider**
library in this directory. The provider gives you normalized cases from the
shared corpus (`build/test/test.json`); you supply the part it deliberately does
not: *how a case's input maps onto the function call, and how to assert.*

Read [`PROVIDER.md`](./PROVIDER.md) first for the data model. This file is the
how-to.

## 0. What the provider does and does not do

- **Does:** load `test.json`, enumerate functions/groups, and hand you a flat
list of normalized `Entry` records — each with a tagged `input`, a tagged
`expect`, and provenance (`function`/`group`/`index`/`id`). Plus pure
comparison helpers (`equal`, `equalStrict`, `structMatch`, `errorMatches`,
`matchval`).
- **Does NOT:** call the function under test, assert, or know the function's
parameter order. That is your job — it is function-specific (§3).

## 1. The recipe

For each function you are testing:

1. `provider.entries("<fn>")` (optionally per group).
2. For each entry, **map `entry.input` onto the call** using §3.
3. Run the call (inside try/catch when an error may be expected).
4. **Assert against `entry.expect`** by its `kind`:

```
switch (entry.expect.kind) {
  VALUE  : assert equal(entry.expect.value, result)   // or equalStrict (§4)
  ERROR  : the call must throw; assert errorMatches(entry.expect.error, message)
  MATCH  : assert structMatch(entry.expect.match, resultContext).ok
  ABSENT : assert the result is nullish (null/undefined/None/nil)
}
// co-existing match: a non-MATCH case may also carry a match pattern
if (entry.expect.match != null && entry.expect.kind != MATCH) also assert structMatch(...)
```

For `MATCH`, the runner matches against a *context object*
`{ in, args, out: result, ctx }`, not the bare result — build the same shape
before calling `structMatch` (see the `merge` cases, whose match paths start
`args.0…`).
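
The recipe above can be sketched in TypeScript as a single check function. This is a hedged illustration, not the provider's API: `Expect`, `Helpers` and `checkEntry` are hypothetical names, the helpers are passed in so the block stays self-contained, kinds are assumed to be lowercase strings as in the worked example in §5, and the context object carries only `in` and `out`.

```ts
// Hypothetical shapes mirroring the pseudocode; not imported from the provider.
type Expect = {
  kind: 'value' | 'error' | 'match' | 'absent'
  value?: unknown
  error?: unknown
  match?: unknown
}

interface Helpers {
  equal(expected: unknown, actual: unknown): boolean
  errorMatches(expected: unknown, message: string): boolean
  structMatch(pattern: unknown, subject: unknown): { ok: boolean }
}

// Returns null when the case passes, or a failure message otherwise.
function checkEntry(
  label: string,
  input: unknown,
  expect: Expect,
  call: () => unknown,
  h: Helpers,
): string | null {
  let result: unknown = undefined
  let thrown: string | null = null
  try {
    result = call()
  } catch (err) {
    thrown = err instanceof Error ? err.message : String(err)
  }

  if (expect.kind === 'error') {
    if (thrown === null) return `${label}: expected an error`
    return h.errorMatches(expect.error, thrown) ? null : `${label}: wrong error: ${thrown}`
  }
  if (thrown !== null) return `${label}: unexpected error: ${thrown}`

  if (expect.kind === 'value' && !h.equal(expect.value, result)) {
    return `${label}: value mismatch`
  }
  if (expect.kind === 'absent' && result != null) {
    return `${label}: expected a nullish result`
  }
  // Covers both MATCH cases and a match pattern co-existing with another kind.
  if (expect.match != null) {
    const context = { in: input, out: result }
    if (!h.structMatch(expect.match, context).ok) return `${label}: match failed`
  }
  return null
}
```

Returning a message instead of asserting keeps the sketch framework-agnostic; a test would wrap it in whatever `assert` its framework provides.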

## 2. Entry quick-reference

```
entry.function / .group / .index / .id / .doc / .client # provenance
entry.input = { kind: IN|ARGS|CTX, in?, args?, ctx? }
entry.expect = { kind: VALUE|ERROR|MATCH|ABSENT, value?, error?, match? }
entry.raw # original map, escape hatch
```

`entry.input.in` is usually a small map (`{path, store}`, `{data, spec}`, …) you
destructure. When `kind` is `ARGS`, spread `entry.input.args`. When `CTX`, the
function takes the context map (these are the `primary.check` client cases).

## 3. Per-function input → call mapping

Derived from the canonical TS runner. `vin = entry.input.in`. Names are the
canonical function names (case per your language).

| Function (group) | Call to make |
|-------------------------|--------------|
| `getpath` basic | `getpath(vin.store, vin.path)` |
| `getpath` relative/handler | `getpath(vin.store, vin.path, vin.current)` (handler) |
| `getpath` special | `getpath(vin.store, vin.path, vin.inj)` |
| `merge` (most groups) | `merge(vin)` — `vin` is the list of objects |
| `merge` depth | `merge(vin.val, vin.depth)` |
| `transform` (most) | `transform(vin.data, vin.spec)` |
| `transform` modify | `transform(vin.data, vin.spec, <modifier>)` |
| `validate` (most) | `validate(vin.data, vin.spec)` |
| `validate` special | `validate(vin.data, vin.spec, vin.inj)` |
| `inject` deep | `inject(vin.val, vin.store)` |
| `select` (all) | `select(vin.obj, vin.query)` |
| `walk` basic/copy | `walk(vin, <callback>)` |
| `minor.isnode/ismap/islist/iskey/isempty/isfunc/clone/keysof/items/escre/escurl/typename/typify/size` | pass `vin` (the whole `in`) directly: `fn(vin)` |
| `minor.filter` | `filter(vin.val, <check from vin.check>)` |
| `minor.flatten` | `flatten(vin.val, vin.depth)` |
| `minor.getprop` | `getprop(vin.val, vin.key, vin.alt?)` |
| `minor.getelem` | `getelem(vin.val, vin.key, vin.alt?)` |
| `minor.setprop` | `setprop(vin.parent, vin.key, vin.val)` |
| `minor.delprop` | `delprop(vin.parent, vin.key)` |
| `minor.haskey` | `haskey(vin.val, vin.key)` |
| `minor.join` | `join(vin.val, vin.sep?)` |
| `minor.slice` | `slice(vin.val, vin.start, vin.end?)` |
| `minor.pad` | `pad(vin.val, …)` |
| `minor.setpath` | `setpath(vin.store, vin.path, vin.val)` |

When unsure, open the canonical `typescript/test/utility/StructUtility.test.ts`
— each `runset(spec.<fn>.<group>, …)` line shows the exact lambda. The corpus is
the contract; that file is the reference mapping.

## 4. The `null:false` functions — use `equalStrict`

Most functions treat absent ≡ null (use `equal`). These run with the runner's
`{ null: false }` flag, where an absent/undefined result is **distinct** from
JSON null — assert them with `equalStrict`:

```
iskey, strkey, isempty, clone, jsonify, getelem, getprop, haskey, join,
typify, size, slice, pad, setpath,
walk.depth, transform.format, validate.basic, validate.invalid,
and every group under `sentinels`.
```

(This flag is a property of the function's contract, not the corpus, so it is
not encoded in `Entry`. If a port disagrees, the corpus + canonical TS win.)
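
Until the mode is available as data, a test suite might keep the list above in one lookup rather than scattering it across test files. The sketch below is hypothetical (`isStrict` and `pickComparator` are not provider functions) and rests on one loud assumption: that the bare names in the list (`iskey`, `clone`, …) are groups under `minor`, as most of them appear in the §3 table.

```ts
// Hypothetical lookup built from the list above.
// ASSUMPTION: bare names are groups under `minor`.
const STRICT_MINOR_GROUPS = new Set([
  'iskey', 'strkey', 'isempty', 'clone', 'jsonify', 'getelem', 'getprop',
  'haskey', 'join', 'typify', 'size', 'slice', 'pad', 'setpath',
])
const STRICT_GROUPS = new Set([
  'walk.depth', 'transform.format', 'validate.basic', 'validate.invalid',
])

// True when the (function, group) pair runs with `{ null: false }`.
function isStrict(fn: string, group: string): boolean {
  if (fn === 'sentinels') return true // every group under `sentinels`
  if (fn === 'minor') return STRICT_MINOR_GROUPS.has(group)
  return STRICT_GROUPS.has(`${fn}.${group}`)
}

type Comparator = (expected: unknown, actual: unknown) => boolean

// Choose between the provider's two comparison helpers for one entry.
function pickComparator(
  fn: string,
  group: string,
  equal: Comparator,
  equalStrict: Comparator,
): Comparator {
  return isStrict(fn, group) ? equalStrict : equal
}
```

A call such as `pickComparator(e.function, e.group, equal, equalStrict)` would then replace the per-test decision.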

## 5. Worked example (TypeScript, `getpath`)

```ts
import { strict as assert } from 'node:assert' // or your test framework's assert
import { TestProvider, equal, errorMatches, structMatch } from './provider'
import { getpath } from '../../../typescript/dist/StructUtility' // your port's import

const provider = TestProvider.load()

for (const e of provider.entries('getpath')) {
  // Provenance label so a failure points at exactly one case.
  const label = e.id ?? `${e.function}/${e.group}#${e.index}`
  const vin = e.input.in
  if (e.expect.kind === 'error') {
    // The call must throw, and the message must match the expected error.
    let threw = false
    try { getpath(vin.store, vin.path) } catch (err: any) {
      threw = true
      assert(errorMatches(e.expect.error!, err.message), label)
    }
    assert(threw, `${label}: expected an error`)
  } else if (e.expect.kind === 'value') {
    assert(equal(e.expect.value, getpath(vin.store, vin.path)), label)
  } else if (e.expect.kind === 'absent') {
    assert(equal(null, getpath(vin.store, vin.path)), label)
  } else if (e.expect.kind === 'match') {
    // Match against a context object, not the bare result (see §1).
    const res = getpath(vin.store, vin.path)
    assert(structMatch(e.expect.match, { in: e.raw.in, out: res }).ok, label)
  }
}
```

Use your language's own test framework for `assert` and iteration — the provider
is framework-agnostic on purpose.

## 6. Rules

- **Never edit the corpus** to make a test pass (repo-wide rule; see top-level
`AGENTS.md`). If a port disagrees with a case, the port is wrong.
- **Keep the provider a pure data utility.** Comparison helpers must stay
side-effect-free; execution and assertion belong in the test you write.
- **Provenance in failures.** Always include `entry.id` (or
`function/group#index`) in assertion messages so a failure points at one case.