content_index validation not implemented (future tranche) #2

@samjanny

The section 10 content-index flow is not implemented in this library.

When a verified manifest carries content_root, a conforming client MUST, as part of Stage 9:

  • fetch /content_index.json from the same carrier origin (per the section 09 content-index fetch rules);
  • verify SHA-256 of the response body bytes against the manifest's content_root;
  • for each content document at a path present in the index, verify its seq and body hash against the index entry before rendering.
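The hash-binding step in the list above can be sketched as follows. This is an illustration only, not the library's API: `verify_content_index` and `ContentIndexError` are hypothetical names, and the sketch assumes that `content_root` carries a lowercase hex digest after the `sha-256:` prefix and that the index is a JSON object (section 06 and section 10 define the real encodings).

```python
import hashlib
import json

PREFIX = "sha-256:"


class ContentIndexError(Exception):
    """Hypothetical error type carrying one of the section 11 codes."""

    def __init__(self, code: str, detail: str = ""):
        super().__init__(f"{code}: {detail}" if detail else code)
        self.code = code


def verify_content_index(body: bytes, content_root: str) -> dict:
    """Bind the raw response bytes to content_root, then parse the index."""
    if not content_root.startswith(PREFIX):
        # Syntax is already validated by the manifest schema; this is a guard.
        raise ValueError("content_root is not a sha-256: digest")
    # ASSUMPTION: the digest after the prefix is hex-encoded.
    expected = content_root[len(PREFIX):].lower()
    # Hash the bytes exactly as received, before any decoding or parsing.
    actual = hashlib.sha256(body).hexdigest()
    if actual != expected:
        raise ContentIndexError("E_CONTENT_INDEX_HASH_MISMATCH")
    try:
        index = json.loads(body)
    except ValueError as exc:  # covers invalid JSON and invalid UTF-8
        raise ContentIndexError("E_CONTENT_INDEX_INVALID", str(exc)) from exc
    # ASSUMPTION: the top-level value is a JSON object.
    if not isinstance(index, dict):
        raise ContentIndexError("E_CONTENT_INDEX_INVALID", "not a JSON object")
    return index
```

Hashing before parsing matters here: the digest covers the response body bytes, so any re-serialization of the parsed JSON would break the binding.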

Current state: content_root is validated for syntax only (a sha-256: digest in the manifest schema, section 06). The fetch, the hash binding, and the per-document seq / hash checks are absent. The seven section 11 codes for this flow are present in DiagnosticCode but unreachable: E_CONTENT_INDEX_FETCH_FAILED, E_CONTENT_INDEX_HASH_MISMATCH, E_CONTENT_INDEX_INVALID, E_CONTENT_SEQ_MISSING, E_CONTENT_SEQ_ROLLBACK, E_CONTENT_SEQ_UNCOMMITTED, E_CONTENT_HASH_MISMATCH.

Security impact: the content index is the defense against a K_runtime-only attacker (runtime key compromised, publisher key intact). content_root is K_publisher-signed and binds the valid (path, seq, hash) set, so a runtime-key-only attacker cannot forge it. Without the flow, a site that declares content_root is rendered with content protected only by the K_runtime signature, so such an attacker could serve rolled-back (E_CONTENT_SEQ_ROLLBACK), uncommitted (E_CONTENT_SEQ_UNCOMMITTED), or body-forged (E_CONTENT_HASH_MISMATCH) content that a content-index-validating client would reject. Section 10 treats a manifest that commits to a content index but cannot deliver a valid one as a hard security failure. Sites that do not declare content_root are unaffected.
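To make the three attack outcomes concrete, here is a minimal sketch of a per-document check. The mapping of conditions to codes is an assumption inferred from the code names and the wording above (rolled-back, uncommitted, body-forged), not taken from section 10. The index shape (`path -> {"seq", "hash"}`), the hex digest encoding, and the function name `check_document` are likewise hypothetical.

```python
import hashlib
from typing import Optional


def check_document(
    path: str, doc_seq: Optional[int], body: bytes, index: dict
) -> Optional[str]:
    """Return a diagnostic code for a content document, or None if it passes."""
    entry = index.get(path)
    if entry is None:
        # The issue text scopes the check to paths present in the index;
        # handling of other paths is outside this sketch.
        return None
    if doc_seq is None:
        return "E_CONTENT_SEQ_MISSING"
    # ASSUMPTION: a seq below the committed one is a rollback,
    # and a seq above it was never committed by the publisher key.
    if doc_seq < entry["seq"]:
        return "E_CONTENT_SEQ_ROLLBACK"
    if doc_seq > entry["seq"]:
        return "E_CONTENT_SEQ_UNCOMMITTED"
    # ASSUMPTION: entry hashes use the same sha-256:<hex> form as content_root.
    digest = "sha-256:" + hashlib.sha256(body).hexdigest()
    if digest != entry["hash"]:
        return "E_CONTENT_HASH_MISMATCH"
    return None
```

Because the index itself is bound to the K_publisher-signed `content_root`, a K_runtime-only attacker cannot alter the `seq` or `hash` values this check compares against.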

Deferred as a self-contained future tranche. Implementing it needs the multi-fetch flow (manifest -> content index -> content document) plus the transport-layer fetch obligations from section 09, and a corpus vector-schema extension to model the multi-document scenario (the current corpus is single-document). The Rust reference implementation (samjanny/entangled-api) implements the full flow and can be used where content-index enforcement is required.
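The ordering of the multi-fetch flow could look roughly like the sketch below, with the transport injected as a callable so that the section 09 fetch obligations stay out of scope. Everything here is an assumption for illustration: the function name `fetch_content_index`, the `fetch` signature, the use of `OSError` to signal a transport failure, and the hex digest encoding.

```python
import hashlib
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin


def fetch_content_index(
    manifest: dict, origin: str, fetch: Callable[[str], bytes]
) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (index_bytes, None) on success or (None, code) on failure.

    Returns (None, None) when the manifest declares no content_root,
    in which case the content-index flow does not apply.
    """
    content_root = manifest.get("content_root")
    if content_root is None:
        return None, None
    # The index lives at a fixed path on the same carrier origin.
    url = urljoin(origin, "/content_index.json")
    try:
        body = fetch(url)
    except OSError:
        # Fail closed: no index means no content documents are rendered.
        return None, "E_CONTENT_INDEX_FETCH_FAILED"
    # ASSUMPTION: content_root is "sha-256:" followed by a hex digest.
    if "sha-256:" + hashlib.sha256(body).hexdigest() != content_root:
        return None, "E_CONTENT_INDEX_HASH_MISMATCH"
    return body, None
```

In this shape, content documents would be fetched and checked only after the call returns index bytes without a code, which keeps the manifest -> content index -> content document order explicit.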

Documented in the README under "Known limitations".

Scope:

  • content index fetch + transport obligations (section 09)
  • SHA-256 hash binding against content_root
  • structural validation of the content index
  • per-document seq / hash verification at Stage 9
  • corpus vector-schema extension + vectors for each failure mode
