🔐 Security / Adversarial XML test corpus gated in CI #38

Description

@docJerem

Context

The SAML/XML parsing and signature surface is the most security-sensitive part of ex_saml. The hardening initiative in #32 centralises parsing behind SafeXml and adds a CI audit gate, but a gate only proves that new call sites cannot bypass the wrapper — it does not prove the wrapper actually withstands hostile input. Today the test suite is small (~11 test files) and has no dedicated adversarial-input corpus.

We should add a versioned corpus of malicious and edge-case XML documents, executed in CI, so that every release demonstrably resists known attack classes and regressions cannot be merged silently.

Goal

A test/fixtures/security/xml/ corpus + a corpus-driven test that feeds each fixture through the real parsing/validation path and asserts the expected fail-closed (or canonical) outcome.

Proposed scope

  • A manifest (manifest.json) listing each fixture with: input file, attack class, and expected result (:invalid_xml, :bad_signature, a specific violation atom, or a golden canonical output).
  • Initial attack classes:
    • XXE — external entity, internal entity expansion, billion-laughs / entity-expansion DoS, parameter entities, external DTD reference.
    • DOCTYPE — any document containing a DOCTYPE declaration must be rejected before deep processing.
    • Canonicalisation edge cases — inherited namespaces, mixed content, comment handling, attribute ordering — with golden .c14n outputs compared byte-for-byte against Core.Xml.C14n.
    • Encoding — UTF-8 / UTF-16 / BOM variants (regression guard for #22, "🐛 Fix / UTF-8 characters in SAMLResponse rejected by xmerl_scan").
  • A single test module iterating the manifest so adding a case = adding a fixture + a manifest row (no new test code).
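
To make the "fixture + manifest row, no new test code" idea concrete, here is a minimal sketch of what the manifest-driven module could look like. Several things in it are assumptions rather than part of this issue: the manifest key names ("file", "class", "expect"), the test file location, the use of Jason for JSON decoding, and above all `SafeXml.parse/1` with an `{:error, atom}` return shape, which is a placeholder for whatever entry point #32 actually exposes. It only covers fail-closed cases; golden .c14n comparisons would need a second branch.

```elixir
# Hypothetical file: test/security/xml_corpus_test.exs
defmodule XmlCorpusTest do
  use ExUnit.Case, async: true

  # Resolves to test/fixtures/security/xml when this file lives in test/security/.
  @corpus_dir Path.expand("../fixtures/security/xml", __DIR__)
  @manifest Path.join(@corpus_dir, "manifest.json")

  # Assumed manifest shape: a JSON array of rows such as
  #   {"file": "xxe_external_entity.xml", "class": "xxe", "expect": "invalid_xml"}
  # Jason is an assumed test dependency; any JSON decoder would do.
  @rows Jason.decode!(File.read!(@manifest))

  # One test is generated per manifest row at compile time.
  # Rows that lack any of the three keys do not match the pattern and are skipped.
  for %{"file" => file, "class" => class, "expect" => expect} <- @rows do
    @fixture Path.join(@corpus_dir, file)
    @expected String.to_atom(expect)

    test "#{class}: #{file} fails closed with #{expect}" do
      xml = File.read!(@fixture)

      # SafeXml.parse/1 and its {:error, reason} result are placeholders;
      # substitute the real parsing/validation entry point.
      assert SafeXml.parse(xml) == {:error, @expected}
    end
  end
end
```

With this shape, adding a case means dropping a file into the corpus directory and appending one row to manifest.json. Because non-matching rows are skipped silently, a real implementation would probably want to fail loudly on malformed rows so a typo in the manifest cannot disable a test.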

Why

Out of scope

  • Signature-wrapping fixtures (tracked separately).
  • Replacing xmerl with a different parser — this issue hardens the current path, it does not change the parser.

Relates to #32.
