
Regression in 5.7.0: addEntity() accepts #-prefixed names but parse() then rejects them — breaks @aws-sdk/xml-builder #824

@m8werk


Summary
In 5.7.0, the new EntityReplacer rejects entity names containing # at parse time, but addEntity() itself still accepts them silently. This breaks anyone who registered XML numeric character references via addEntity("#xD", "\r"), most prominently the official AWS SDK v3.
In 5.6.0, both addEntity("#xD", ...) and the subsequent parse() call work. In 5.7.0 and 5.7.1, addEntity() silently registers the entity and parse() then throws on every document:

[EntityReplacer] Invalid character '#' in entity name: "#xD"

Reproducer

import { XMLParser } from 'fast-xml-parser';

const parser = new XMLParser({
  processEntities: { enabled: true, maxTotalExpansions: Infinity },
  htmlEntities: true,
});
parser.addEntity("#xD", "\r"); // accepted silently
parser.parse('foo');           // throws on EVERY parse

┌─────────┬─────────────────────┬───────────────────────────────────────────────────┐
│ Version │ addEntity           │ parse                                             │
├─────────┼─────────────────────┼───────────────────────────────────────────────────┤
│ 5.6.0   │ ✅ accepts          │ ✅ parses                                         │
├─────────┼─────────────────────┼───────────────────────────────────────────────────┤
│ 5.7.0   │ ✅ accepts (silent) │ ❌ throws [EntityReplacer] Invalid character '#'… │
├─────────┼─────────────────────┼───────────────────────────────────────────────────┤
│ 5.7.1   │ ✅ accepts (silent) │ ❌ throws (same)                                  │
└─────────┴─────────────────────┴───────────────────────────────────────────────────┘
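
Until the two code paths agree, a caller can surface the mismatch at registration time instead of on the first real document. The following is a minimal sketch, not part of fast-xml-parser: addEntityChecked is a hypothetical helper, and it assumes only that the parser object exposes addEntity() and parse() as in the reproducer above, and that a probe parse triggers the same error as any other parse.

```javascript
// Hypothetical helper (not a fast-xml-parser API): register an entity, then
// run a probe parse right away so a name that parse() rejects is reported
// at registration time rather than on the first production document.
function addEntityChecked(parser, name, value, probe = '<probe/>') {
  parser.addEntity(name, value);
  try {
    parser.parse(probe);
  } catch (err) {
    // Keep the original error as the cause and add the registration context.
    throw new Error(
      `Entity "${name}" was registered but parsing now fails: ${err.message}`,
      { cause: err }
    );
  }
  return parser;
}
```

Going by the table above, this would be expected to pass on 5.6.0 and to throw immediately on 5.7.0 and 5.7.1, which moves the failure from request handling to startup.
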
Real-World Impact

This breaks @aws-sdk/xml-builder (dist-es/xml-parser.js):

// AWS SDK source
parser.addEntity("#xD", "\r");
parser.addEntity("#10", "\n");

These are registered to handle XML numeric character references (&#xD; = CR, &#10; = LF), which S3 and other AWS services may include in response payloads. The W3C XML 1.0 spec explicitly defines these as valid numeric character references.
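
To make the mapping concrete, here is a small sketch of how a name such as #xD or #10 corresponds to a character under the XML 1.0 CharRef syntax (decimal digits after #, or hex digits after a lowercase #x). decodeNumericCharRef is a hypothetical name used for illustration only, and the sketch does not enforce the spec's additional constraint that the result be a legal XML Char.

```javascript
// Decode the body of a numeric character reference ("#xD", "#10") into the
// character it denotes. Returns null when the name is not in CharRef form.
function decodeNumericCharRef(name) {
  const hex = /^#x([0-9a-fA-F]+)$/.exec(name);
  const dec = /^#([0-9]+)$/.exec(name);
  if (!hex && !dec) return null;
  const codePoint = hex ? parseInt(hex[1], 16) : parseInt(dec[1], 10);
  // Anything above U+10FFFF is outside Unicode and cannot be a character.
  if (codePoint > 0x10ffff) return null;
  return String.fromCodePoint(codePoint);
}
```

With this, decodeNumericCharRef("#xD") yields a carriage return and decodeNumericCharRef("#10") yields a line feed, the same values the AWS SDK registers by hand.
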
After upgrading to 5.7.0 or 5.7.1 in any project that ships AWS SDK v3, every S3 XML response (ListObjectsV2, HeadBucket, etc.) fails deserialization despite an HTTP 200 status, with [EntityReplacer] Invalid character '#'…. Production services crash on startup if they call S3 in their boot path.

Proposed Fix

Either:

  1. addEntity() should reject #-prefixed names at registration time (clearer signal, breaks the contract loudly), or
  2. EntityReplacer should accept #-prefixed names (preserves backwards compatibility, treats
    them as numeric character references — which is what the W3C spec calls them).

Option 2 maintains compatibility with AWS SDK and any other consumer that uses this idiom. The strict validation introduced in 5.7.0 may be over-strict per the XML 1.0 spec: #xD and #10 correspond to the valid numeric character references &#xD; and &#10;.
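
Either option comes down to one shared name check used by both addEntity() and the replacer. The sketch below shows what such a check might look like. It is hypothetical: the actual EntityReplacer validation is not shown in this issue, classifyEntityName is an invented name, and SIMPLE_NAME covers only an ASCII subset of the XML Name production.

```javascript
// Hypothetical validator, not the library's actual code.
// NUMERIC_REF follows the XML 1.0 CharRef syntax without the surrounding & and ;.
const NUMERIC_REF = /^#(?:[0-9]+|x[0-9a-fA-F]+)$/;
// ASCII-only approximation of an XML Name, for illustration.
const SIMPLE_NAME = /^[A-Za-z_:][A-Za-z0-9_:.\-]*$/;

function classifyEntityName(name) {
  if (NUMERIC_REF.test(name)) return 'numeric';
  if (SIMPLE_NAME.test(name)) return 'named';
  return 'invalid';
}
```

Under option 1, addEntity() would throw for anything other than 'named'. Under option 2, both addEntity() and the parse-time check would also let 'numeric' through. In both cases the two code paths agree, which is the property missing in 5.7.0.
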

Workaround
Pinning fast-xml-parser <5.7 in transitive position works, e.g. via a pnpm path-specific override:

"pnpm": {
  "overrides": {
    "@aws-sdk/xml-builder>fast-xml-parser": "~5.6.0"
  }
}

But this leaves users without the GHSA-gh4j-gqv2-49f6 CVE fix for the XMLBuilder side.
(Note: AWS SDK only uses the XMLParser, not the vulnerable XMLBuilder, so this workaround is
safe in that specific case.)
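
To confirm that an override actually took effect, a CI step could compare the resolved version against the releases reported here. This is a rough sketch under stated assumptions: isReportedAffected is a hypothetical helper, it flags only 5.7.0 and 5.7.1 because those are the versions this report covers, and how the resolved version string is obtained (lockfile, installed package.json) is left to the caller.

```javascript
// Hypothetical CI guard: true only for the versions this issue reports as
// affected (5.7.0 and 5.7.1). It makes no claim about any later release.
function isReportedAffected(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) throw new Error(`Unrecognized version: ${version}`);
  const [major, minor, patch] = match.slice(1).map(Number);
  return major === 5 && minor === 7 && patch <= 1;
}
```
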

Environment

  • Node.js v24.x
  • Package manager: pnpm
  • Affected: @aws-sdk/xml-builder@3.972.x (and likely all earlier 3.x versions of the AWS SDK)
  • Reproducible: linux/amd64
