Proposal: a §9 conformance corpus (valid + invalid example bundles + a runner) #62

@wey-gu

okf/tests/test_document.py unit-tests OKFDocument on inline strings, which is
great for the parser. What the repo does not yet have is a corpus of whole example
bundles that pins down the bundle-level conformance rules in SPEC §9. I would like to
contribute one, if it is welcome.

What it is. A small tests/conformance/ corpus plus a parametrized
test_conformance.py that validates each bundle with the existing OKFDocument (no
new dependencies):

  • valid/ bundles that MUST be conformant: a spec-minimal one (only type), a full
    one (recommended fields, a file-relative cross-link, # Citations, log.md,
    okf_version on the root index), a unicode/CJK one, a broken-link one (§5.3:
    consumers MUST tolerate broken links), and an extra-keys one (§4.1).
  • invalid/ bundles that MUST be flagged, each isolating one violation: missing
    type, empty type, no frontmatter, unterminated frontmatter, non-mapping
    frontmatter, and a README.md left frontmatter-less (it is not a reserved name per
    §3.1, so it reads as a malformed concept, a gotcha real producers hit).
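
As a rough sketch of how a runner could collect its cases from that layout (not the drafted runner itself): the function names, the one-directory-per-bundle layout, and treating only index.md and log.md as reserved are assumptions made for illustration. It uses only the standard library and does not call OKFDocument.

```python
from pathlib import Path


def discover_bundles(corpus_root: Path) -> list[tuple[str, Path]]:
    """Return (expectation, bundle_dir) pairs for bundles under valid/ and invalid/."""
    cases = []
    for expectation in ("valid", "invalid"):
        group_dir = corpus_root / expectation
        if not group_dir.is_dir():
            continue
        # Assumption: each immediate subdirectory is one example bundle.
        for bundle_dir in sorted(p for p in group_dir.iterdir() if p.is_dir()):
            cases.append((expectation, bundle_dir))
    return cases


def concept_files(bundle_dir: Path) -> list[Path]:
    """Markdown files in a bundle, skipping the reserved index.md and log.md."""
    # Assumption: these two names are the only reserved ones; README.md is not.
    reserved = {"index.md", "log.md"}
    return sorted(p for p in bundle_dir.rglob("*.md") if p.name not in reserved)
```

The list returned by discover_bundles could be handed to pytest.mark.parametrize so that each bundle shows up as its own test case, with the expectation string deciding whether the bundle must pass or must be flagged.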

The runner tests the spec's §9 bar (parseable frontmatter mapping + non-empty type),
which is deliberately looser than OKFDocument.validate()'s producer bar (that also
requires title/description/timestamp), so spec-minimal producers are not failed.
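
To make the §9 bar concrete, here is a minimal sketch of the check applied to frontmatter that has already been parsed into a Python object. The function name spec_bar_violations is hypothetical, and reading "non-empty" as "a string with at least one non-whitespace character" is my assumption, not wording from the spec.

```python
from collections.abc import Mapping


def spec_bar_violations(frontmatter: object) -> list[str]:
    """Return §9 violations for one concept's parsed frontmatter ([] = conformant)."""
    if not isinstance(frontmatter, Mapping):
        return ["frontmatter is not a mapping"]
    if "type" not in frontmatter:
        return ["type is missing"]
    type_value = frontmatter["type"]
    # Assumption: "non-empty" means a string with a non-whitespace character.
    if not isinstance(type_value, str) or not type_value.strip():
        return ["type is empty or not a string"]
    # Extra keys and absent title/description/timestamp are not violations here.
    return []
```

A mapping with only type passes, which is the spec-minimal case; title, description, and timestamp are intentionally not inspected.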

Why it helps. A conformance corpus is how a young spec pins down what it actually
means: it documents §9 by example, catches parser regressions, and gives anyone
writing a producer a concrete target to test against. It slots into the existing
okf/tests/ pytest setup and adds no dependencies.

It surfaces two questions, and I would align the corpus with your answers:

  1. Does a file with no frontmatter at all violate §9.1 (no block) or just §9.2
    (no type)? OKFDocument.parse treats a no-frontmatter file as an empty mapping
    rather than raising, so today the corpus catches that case via the missing-type
    check. If §9.1 is meant to require the presence of a block, the check (or
    parse) may want to distinguish "no block" from "empty block."
  2. Reserved-file shape (§6 index.md, §7 log.md) is not machine-checked yet. Worth
    a follow-up, or out of scope for §9 conformance?
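
For question 1, this is one way the "no block" and "empty block" cases could be told apart before any YAML parsing. It is a standalone sketch, not a patch to OKFDocument.parse: the names BlockStatus and frontmatter_status are hypothetical, and it assumes frontmatter is fenced by `---` lines with the opening fence on the first line of the file.

```python
from enum import Enum


class BlockStatus(Enum):
    NO_BLOCK = "no block"
    EMPTY_BLOCK = "empty block"
    UNTERMINATED = "unterminated block"
    PRESENT = "block with content"


def frontmatter_status(text: str) -> BlockStatus:
    """Classify the frontmatter block of a document without parsing its YAML."""
    lines = text.splitlines()
    # Assumption: a block opens only if the first line is '---'
    # (surrounding whitespace ignored).
    if not lines or lines[0].strip() != "---":
        return BlockStatus.NO_BLOCK
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = lines[1:index]
            has_content = any(entry.strip() for entry in body)
            return BlockStatus.PRESENT if has_content else BlockStatus.EMPTY_BLOCK
    # The opening fence was never closed.
    return BlockStatus.UNTERMINATED
```

If §9.1 is meant to require a block, NO_BLOCK could be reported as its own violation; if not, NO_BLOCK and EMPTY_BLOCK can keep falling through to the missing-type check as they do today.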

I have the corpus and runner drafted and passing against the current OKFDocument
(5 valid bundles conformant, 6 invalid flagged). Happy to open the PR and sign the
CLA if the direction is agreeable. Context: I work on Nowledge Mem, which exports OKF
bundles, so this came out of validating real output.

Thanks for publishing OKF.

Wey Gu
