Add feed plugin for parsing RSS and Atom syndication feeds #1
Replace the JSONC parser with @jsonic/feed: an RSS (0.90, 0.91, 0.92, 1.0, 2.0) and Atom (0.3, 1.0) parser built on top of @jsonic/xml. By default every dialect is normalised to an Atom-shaped result; a `format: 'native'` option preserves the source dialect's structure and `format: 'raw'` returns the underlying XmlElement tree.
- src/feed.ts: typed AtomFeed / Rss2Feed / Rss1Feed shapes, dialect detection, native parsers for each format, and best-effort RSS-to-Atom conversion mappings (guid -> id, enclosure -> link[rel], managingEditor -> authors, lastBuildDate -> updated, etc.).
- Plugin form (`Jsonic.use(Feed)` adds a `.feed(src)` method) and a standalone `parseFeed(src, options)` helper.
- test/feed.test.ts: hand-curated samples covering each dialect, the three output formats, the plugin form, and xhtml content extraction.
- test/feedparser-wellformed/: focused subset of well-formed feed samples vendored from kurtmckee/feedparser (BSD 2-Clause); upstream LICENSE preserved alongside, attribution recorded in THIRD_PARTY_NOTICES.md.
- test/feedparser.test.ts: dialect detection, no-error parse, and targeted value checks against the vendored corpus.
Removed jsonc grammar, embed script, Go module, and JSONTestSuite corpus, all of which were specific to the previous JSONC parser.
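The conversion mappings listed above (guid -> id, enclosure -> link[rel], managingEditor -> authors, lastBuildDate -> updated) can be sketched roughly as follows. This is a minimal illustration, not the code in src/feed.ts: the `RssChannelSketch`, `AtomFeedSketch`, and `rssToAtomSketch` names and their fields are hypothetical, the `rel: 'alternate'` value for plain item links is an assumption, and dates are copied through without any format conversion.

```typescript
// Hypothetical, simplified shapes; the real types in src/feed.ts may differ.
interface RssItemSketch {
  title?: string
  link?: string
  guid?: string
  enclosure?: { url: string; type?: string }
}

interface RssChannelSketch {
  title?: string
  managingEditor?: string
  lastBuildDate?: string
  items: RssItemSketch[]
}

interface AtomLinkSketch { href: string; rel?: string; type?: string }
interface AtomEntrySketch { id?: string; title?: string; links: AtomLinkSketch[] }
interface AtomFeedSketch {
  title?: string
  updated?: string
  authors: { name: string }[]
  entries: AtomEntrySketch[]
}

// Best-effort mapping of an RSS channel onto an Atom-shaped object.
function rssToAtomSketch(channel: RssChannelSketch): AtomFeedSketch {
  const feed: AtomFeedSketch = { authors: [], entries: [] }
  if (channel.title !== undefined) feed.title = channel.title
  // lastBuildDate -> updated (copied verbatim; no date reformatting here)
  if (channel.lastBuildDate !== undefined) feed.updated = channel.lastBuildDate
  // managingEditor -> authors
  if (channel.managingEditor !== undefined) {
    feed.authors.push({ name: channel.managingEditor })
  }
  for (const item of channel.items) {
    const entry: AtomEntrySketch = { links: [] }
    // guid -> id
    if (item.guid !== undefined) entry.id = item.guid
    if (item.title !== undefined) entry.title = item.title
    // Assumption: a plain <link> becomes rel="alternate"
    if (item.link !== undefined) {
      entry.links.push({ href: item.link, rel: 'alternate' })
    }
    // enclosure -> link[rel="enclosure"]
    if (item.enclosure !== undefined) {
      const link: AtomLinkSketch = { href: item.enclosure.url, rel: 'enclosure' }
      if (item.enclosure.type !== undefined) link.type = item.enclosure.type
      entry.links.push(link)
    }
    feed.entries.push(entry)
  }
  return feed
}
```

Fields absent from the source are left unset instead of being filled with defaults, which keeps the conversion best-effort and makes the result easy to compare as JSON.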
Remove parseFeed() and the j.feed() method. With the Feed plugin installed, calling the jsonic instance directly (j(src)) returns the converted feed result, matching the standard jsonic plugin pattern.
Implementation: register a `bc` (before-close) action on the `xml` root rule. After @jsonic/xml's own @xml-bc copies the parsed XmlElement onto ctx.root().node, our hook replaces it with the converted feed (Atom by default, native, or raw per options.format).
Tests and README updated to use the j(src) API throughout.
Add a Go implementation of the feed parser at go/feed.go that mirrors the TypeScript plugin. Both languages now parse RSS 0.90 / 0.91 / 0.92 / 1.0 / 2.0 and Atom 0.3 / 1.0 into typed structures and default to a normalised Atom shape; format=native preserves the source dialect's structure and format=raw returns the underlying XmlElement tree from @jsonic/xml.
Shared test fixtures live under test/specs/: each base name has a .xml input, a .detect.json (expected dialect/version), an .atom.json (expected default output), and an optional .native.json. Both the TS and Go test suites enumerate the directory and JSON-compare results to expectations, so adding a fixture covers both languages automatically. The two language test suites also run the feedparser-wellformed corpus for no-error parsing plus a shared set of targeted value checks.
Also fixes a re-entry bug in the bc hook: the xml rule's bc fires twice when there is trailing whitespace after the root element, so both implementations now skip the second invocation when r.Node has already been replaced with a converted feed.
Makefile and CI workflow updated to build and test both languages.
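The fixture layout described above (one base name with .xml, .detect.json, .atom.json, and an optional .native.json) lends itself to grouping by base name before running comparisons. The sketch below shows one way that grouping could work on a plain list of file names; `FixtureSetSketch`, `groupFixtures`, and `missingRequired` are hypothetical names, not the helpers used in the actual test suites.

```typescript
// Hypothetical record of which files exist for one fixture base name.
interface FixtureSetSketch {
  base: string
  xml: boolean
  detect: boolean
  atom: boolean
  native: boolean
}

// Longer suffixes are listed first so '.detect.json' is not mistaken for another kind.
const FIXTURE_SUFFIXES = [
  ['.detect.json', 'detect'],
  ['.atom.json', 'atom'],
  ['.native.json', 'native'],
  ['.xml', 'xml'],
] as const

// Group file names by base name, keeping only groups that have an .xml input.
function groupFixtures(files: string[]): FixtureSetSketch[] {
  const byBase = new Map<string, FixtureSetSketch>()
  for (const file of files) {
    for (const [suffix, kind] of FIXTURE_SUFFIXES) {
      if (!file.endsWith(suffix)) continue
      const base = file.slice(0, -suffix.length)
      let set = byBase.get(base)
      if (set === undefined) {
        set = { base, xml: false, detect: false, atom: false, native: false }
        byBase.set(base, set)
      }
      set[kind] = true
      break
    }
  }
  return Array.from(byBase.values())
    .filter((s) => s.xml)
    .sort((a, b) => (a.base < b.base ? -1 : a.base > b.base ? 1 : 0))
}

// Report required expectation files that are absent (.native.json is optional).
function missingRequired(set: FixtureSetSketch): string[] {
  const missing: string[] = []
  if (!set.detect) missing.push(set.base + '.detect.json')
  if (!set.atom) missing.push(set.base + '.atom.json')
  return missing
}
```

In a real test run the file list would presumably come from reading test/specs/ (for example with Node's `fs.readdirSync`), and each group would then drive one detect comparison, one atom comparison, and a native comparison only when `native` is true.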
The xml rule's bc fires once per close, including extra times when `r: xml` recurses to consume trailing whitespace after the root element. @jsonic/xml's own @xml-bc handles this idempotency by only acting when r.child.node is set (i.e. an element was just parsed in the current iteration). Mirror that idiom in our own bc hook so the guard expresses the actual condition rather than relying on the incidental fact that the previous iteration already replaced r.node. Behavior is unchanged; this is a clarity / robustness fix.
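The guard described above can be illustrated with a small stand-alone model. This is only a sketch: `RuleSketch` is a hypothetical stand-in for the parser's rule object, `makeBcHook` is an invented helper, and converting from `r.child.node` is an assumption made to keep the example short (the actual hook may read the element from a different place).

```typescript
// Hypothetical minimal stand-in for a rule with a node and a child rule.
interface RuleSketch {
  node: unknown
  child: { node: unknown }
}

// Build a before-close hook that converts only when an element was
// parsed in the current iteration (child.node is set).
function makeBcHook(convert: (xml: unknown) => unknown) {
  return (r: RuleSketch): void => {
    // Trailing-whitespace iterations have no child node: do nothing.
    if (r.child.node === undefined) return
    r.node = convert(r.child.node)
  }
}

// Simulate two close events: one after the root element, one after
// trailing whitespace where no element was parsed.
let conversions = 0
const hook = makeBcHook((xml) => {
  conversions += 1
  return { converted: xml }
})

const firstClose: RuleSketch = { node: undefined, child: { node: { name: 'feed' } } }
hook(firstClose)

const secondClose: RuleSketch = { node: firstClose.node, child: { node: undefined } }
hook(secondClose)
```

After both calls `conversions` is 1 and `secondClose.node` is still the object produced by the first call. The guard tests the condition that matters (an element was just parsed) instead of inspecting whether the node already looks converted.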
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 15530f3c83
const title = findChild(root, 'title')
if (title) feed.title = parseText(title)
Restrict Atom element lookup to Atom namespace
The Atom parser reads core fields like title by local name only (findChild(root, 'title')), so any extension element with the same local name (for example dc:title) that appears earlier will be parsed as the feed title. This causes incorrect results on mixed-namespace feeds, which are common in Atom/RSS ecosystems, and can silently corrupt normalized output. Please pass the Atom namespace when selecting core Atom elements (and apply the same rule consistently for entry-level fields).
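One possible shape for the namespace-aware lookup suggested here is sketched below. It is an assumption-laden illustration: `ElementSketch` and its `localName` / `namespace` / `children` fields are hypothetical and may not match the XmlElement type from @jsonic/xml, and `findChildNs` is an invented name. Only the Atom 1.0 namespace URI is taken from the Atom specification.

```typescript
// Hypothetical element shape; the real XmlElement may use different fields.
interface ElementSketch {
  localName: string
  namespace?: string
  text?: string
  children: ElementSketch[]
}

// Atom 1.0 namespace URI (RFC 4287).
const ATOM_NS = 'http://www.w3.org/2005/Atom'

// Find the first child with the given local name; when a namespace is
// supplied, children in other namespaces are skipped.
function findChildNs(
  parent: ElementSketch,
  localName: string,
  namespace?: string,
): ElementSketch | undefined {
  return parent.children.find(
    (c) =>
      c.localName === localName &&
      (namespace === undefined || c.namespace === namespace),
  )
}

// A feed where an extension element (dc:title) precedes the Atom title.
const mixedFeed: ElementSketch = {
  localName: 'feed',
  namespace: ATOM_NS,
  children: [
    {
      localName: 'title',
      namespace: 'http://purl.org/dc/elements/1.1/',
      text: 'Extension title',
      children: [],
    },
    { localName: 'title', namespace: ATOM_NS, text: 'Atom title', children: [] },
  ],
}

const byLocalNameOnly = findChildNs(mixedFeed, 'title')
const byAtomNamespace = findChildNs(mixedFeed, 'title', ATOM_NS)
```

Here `byLocalNameOnly` resolves to the dc:title element while `byAtomNamespace` resolves to the Atom title, which is the collision the comment describes. Atom 0.3 feeds use a different namespace URI (`http://purl.org/atom/ns#`), so a parser that handles both versions would need to pass whichever namespace matches the detected dialect.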
if t := findChild(root, "title"); t != nil {
	feed.Title = parseText(t)
Enforce namespace when extracting Go Atom core fields
The Go Atom parser also matches core tags purely by local name (findChild(root, "title")), so extension tags such as dc:title can be mistaken for Atom core fields when they appear first. In real feeds with multiple namespaces, this produces wrong metadata without an error. The field extraction for Atom feed/entry elements should be namespace-aware to avoid collisions.
This PR replaces the JSONC plugin with a new Feed plugin that parses RSS (0.90, 0.91, 0.92, 1.0, 2.0) and Atom (0.3, 1.0) syndication feeds.
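For readers unfamiliar with how these dialects differ at the document root, the following sketch shows one way the seven supported versions could be told apart. It is not the detection logic from this PR: `RootSketch`, `DetectedSketch`, and `detectDialectSketch` are hypothetical, and the input is a simplified description of the root element (local name, attributes, and declared namespace URIs). The namespace URIs themselves come from the respective format specifications.

```typescript
// Hypothetical summary of a parsed root element.
interface RootSketch {
  localName: string
  attributes: Record<string, string>
  namespaces: string[] // namespace URIs declared on the root element
}

interface DetectedSketch {
  dialect: 'rss' | 'atom'
  version: string
}

// Namespace URIs defined by the respective format specifications.
const FEED_NS = {
  atom10: 'http://www.w3.org/2005/Atom',
  atom03: 'http://purl.org/atom/ns#',
  rss10: 'http://purl.org/rss/1.0/',
  rss090: 'http://my.netscape.com/rdf/simple/0.9/',
}

function detectDialectSketch(root: RootSketch): DetectedSketch | undefined {
  const has = (ns: string): boolean => root.namespaces.indexOf(ns) !== -1

  // RSS 0.91 / 0.92 / 2.0: <rss version="...">
  if (root.localName === 'rss') {
    const version = root.attributes['version']
    return version === undefined ? undefined : { dialect: 'rss', version }
  }
  // RSS 0.90 / 1.0: <rdf:RDF> distinguished by the RSS namespace in use
  if (root.localName === 'RDF') {
    if (has(FEED_NS.rss10)) return { dialect: 'rss', version: '1.0' }
    if (has(FEED_NS.rss090)) return { dialect: 'rss', version: '0.90' }
    return undefined
  }
  // Atom 0.3 / 1.0: <feed> distinguished by namespace
  if (root.localName === 'feed') {
    if (has(FEED_NS.atom10)) return { dialect: 'atom', version: '1.0' }
    if (has(FEED_NS.atom03)) return { dialect: 'atom', version: '0.3' }
  }
  return undefined
}
```

Returning `undefined` for unrecognised roots leaves the caller free to decide whether that is an error or a reason to fall back to the raw XML tree.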
Summary
The Feed plugin is built on top of the @jsonic/xml plugin and normalizes all feed dialects to an Atom-shaped result by default. It supports three output formats:
- `atom` (default): Normalized Atom-shaped structure
- `native`: Dialect-specific structure (preserves RSS or Atom format)
- `raw`: Raw XML element tree from the XML plugin
Key Changes
- src/feed.ts: Complete feed parser with support for multiple RSS and Atom versions, including comprehensive type definitions for all feed formats
- go/feed.go: Port of the TypeScript implementation with equivalent functionality and type structures
- go/feed_test.go: Tests covering various feed formats and edge cases using the feedparser test suite
- README.md: Replaced JSONC documentation with Feed plugin documentation
- package.json, go/go.mod: Changed from JSONC to Feed plugin
- THIRD_PARTY_NOTICES.md: Updated to reference feedparser test suite instead of JSON test suite
Implementation Details
`format: 'native'` is used
https://claude.ai/code/session_01Bu1d6Wy1spSXSMPpxjmbZ5