io.github.cl-sdk.xml

A Common Lisp XML reader, writer, and custom parser.

Installation

(ql:quickload "io.github.cl-sdk.xml")

Parsing

parse-xml accepts a string and an optional :handler keyword argument.

  • Default behaviour — when no handler is given, parse-xml returns an xml-document built by the built-in dom-builder handler (fully backward-compatible).
  • SAX behaviour — when a custom handler is supplied, the parser fires events on it and returns whatever end-document returns.
(defvar *doc*
  (io.github.cl-sdk.xml:parse-xml "<?xml version=\"1.0\"?>
<!-- preamble -->
<root>
  <item id=\"1\">hello &amp; world</item>
  <!-- note -->
  <![CDATA[literal <text>]]>
  <?app instruction?>
</root>"))

SAX parsing

Provide a subclass of sax-handler and pass an instance as :handler to parse-xml. Specialize only the event methods you care about; unspecialized methods are no-ops.

(defclass my-handler (io.github.cl-sdk.xml:sax-handler) ())

(defmethod io.github.cl-sdk.xml:start-element ((h my-handler) tag attributes)
  (format t "open  ~a ~a~%" tag attributes))

(defmethod io.github.cl-sdk.xml:end-element ((h my-handler) tag)
  (format t "close ~a~%" tag))

(defmethod io.github.cl-sdk.xml:end-document ((h my-handler))
  :done)

(io.github.cl-sdk.xml:parse-xml "<root><child /></root>" :handler (make-instance 'my-handler))
;; open  root nil
;; open  child nil
;; close child
;; close root
;; => :done

SAX handler generic functions

| Generic function | When called |
| --- | --- |
| (start-document handler) | once, before any other event |
| (end-document handler) | once, after all events; return value is parse-xml's result |
| (start-element handler tag attributes) | opening / self-closing tag |
| (end-element handler tag) | closing / self-closing tag |
| (characters handler text) | character data (entity refs already expanded) |
| (comment handler data) | <!-- … --> comment |
| (processing-instruction handler target data) | <?target data?> PI |
| (cdata-section handler data) | <![CDATA[…]]> section |
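
Because end-document's return value becomes the result of parse-xml, a handler can accumulate state in its slots and hand it back at the end. The following is a minimal sketch of that pattern, not part of the library: tag-counter, tag-counter-counts, and count-tags are hypothetical names, and the code assumes tags are passed to start-element as strings and that the system has already been loaded with ql:quickload.

```lisp
;; Hypothetical handler: counts how many times each tag is opened.
;; Assumes (ql:quickload "io.github.cl-sdk.xml") has already been run.
(defclass tag-counter (io.github.cl-sdk.xml:sax-handler)
  ((counts :initform (make-hash-table :test #'equal)
           :reader tag-counter-counts)))

(defmethod io.github.cl-sdk.xml:start-element ((h tag-counter) tag attributes)
  (declare (ignore attributes))
  ;; Assumption: TAG is a string, so an EQUAL hash table can key on it.
  (incf (gethash tag (tag-counter-counts h) 0)))

(defmethod io.github.cl-sdk.xml:end-document ((h tag-counter))
  ;; Whatever is returned here becomes the value of parse-xml.
  (tag-counter-counts h))

(defun count-tags (xml-string)
  "Parse XML-STRING and return a hash table mapping tag name to open count."
  (io.github.cl-sdk.xml:parse-xml xml-string
                                  :handler (make-instance 'tag-counter)))
```

A call such as (gethash "item" (count-tags "<root><item /><item /></root>")) would then look up the count for one tag. All other events fall through to the no-op defaults.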

xml-document

The top-level result of parse-xml.

| Accessor | Returns |
| --- | --- |
| xml-document-prolog | list of xml-comment / xml-pi nodes before the root element |
| xml-document-root | the root xml-node |

(io.github.cl-sdk.xml:xml-document-prolog *doc*)
;; => (#<xml-pi "xml" …> #<xml-comment " preamble ">)

(io.github.cl-sdk.xml:xml-node-tag (io.github.cl-sdk.xml:xml-document-root *doc*))
;; => "root"

xml-node (element)

| Accessor | Returns |
| --- | --- |
| xml-node-tag | element name as a string |
| xml-node-attributes | alist of (name . value) string pairs |
| xml-node-children | list of child nodes (see node types below) |

(let* ((root (io.github.cl-sdk.xml:xml-document-root *doc*))
       (item (first (io.github.cl-sdk.xml:xml-node-children root))))
  (io.github.cl-sdk.xml:xml-node-tag item)                      ; => "item"
  (io.github.cl-sdk.xml:xml-node-attributes item)               ; => (("id" . "1"))
  (io.github.cl-sdk.xml:xml-node-children item))                ; => ("hello & world")
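
Since xml-node-attributes returns an alist of string pairs, a single attribute can be looked up with the standard assoc, provided the test compares strings. The helper below is a small sketch; attribute-value is a hypothetical name, not a function exported by the library.

```lisp
;; Hypothetical helper, not part of io.github.cl-sdk.xml.
(defun attribute-value (name attributes)
  "Return the value for NAME in ATTRIBUTES, an alist of (name . value)
string pairs, or NIL when the attribute is absent."
  ;; ASSOC defaults to EQL, which does not match distinct string objects,
  ;; so STRING= is passed explicitly.
  (cdr (assoc name attributes :test #'string=)))

(attribute-value "id" '(("id" . "1")))      ; => "1"
(attribute-value "class" '(("id" . "1")))   ; => NIL
```

In practice the second argument would be the result of xml-node-attributes on an element.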

xml-comment

Represents a <!-- … --> comment.

| Accessor | Returns |
| --- | --- |
| xml-comment-data | comment body as a string |

xml-pi (processing instruction)

Represents a <?target data?> processing instruction.

| Accessor | Returns |
| --- | --- |
| xml-pi-target | target name as a string |
| xml-pi-data | data string (may be empty) |

xml-cdata

Represents a <![CDATA[…]]> section.

| Accessor | Returns |
| --- | --- |
| xml-cdata-data | literal content as a string |

Node types inside xml-node-children

Each child of an xml-node is one of:

| Type | Produced by |
| --- | --- |
| xml-node | <child …> / <child /> |
| xml-comment | <!-- … --> |
| xml-pi | <?target data?> |
| xml-cdata | <![CDATA[…]]> |
| string | character data / entity references |

Whitespace-only character data between elements is discarded.
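
Code that walks xml-node-children has to handle all five kinds of child. One way to do that is a typecase dispatch, sketched below. It assumes that xml-node, xml-comment, xml-pi, and xml-cdata are exported from the package as type names usable with typecase (the accessor naming suggests structures or classes, but the source does not say so explicitly); describe-child is a hypothetical helper.

```lisp
;; Hypothetical helper. Assumes the four node names are exported type names
;; and that (ql:quickload "io.github.cl-sdk.xml") has already been run.
(defun describe-child (child)
  "Return a (kind value) list describing one entry of xml-node-children."
  (typecase child
    (string (list :text child))
    (io.github.cl-sdk.xml:xml-node
     (list :element (io.github.cl-sdk.xml:xml-node-tag child)))
    (io.github.cl-sdk.xml:xml-comment
     (list :comment (io.github.cl-sdk.xml:xml-comment-data child)))
    (io.github.cl-sdk.xml:xml-pi
     (list :pi (io.github.cl-sdk.xml:xml-pi-target child)))
    (io.github.cl-sdk.xml:xml-cdata
     (list :cdata (io.github.cl-sdk.xml:xml-cdata-data child)))
    ;; Anything else is unexpected given the table above.
    (t (list :unknown child))))

(defun describe-children (node)
  "Describe every direct child of NODE, in document order."
  (mapcar #'describe-child (io.github.cl-sdk.xml:xml-node-children node)))
```

Calling describe-children on the root of a parsed document gives a flat summary of its direct children; recursing into :element entries would extend it to the whole tree.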

XML 1.0 conformance

  • §2.3 Names: NameStartChar / NameChar Unicode ranges enforced
  • §2.3 / §3.3.3 Attribute values: bare < is an error; entity/character references expanded
  • §2.5 Comments: -- inside a comment body is an error
  • §2.7 CDATA sections: content is literal (markup characters not interpreted)
  • §2.8 Prolog: XML declaration and DOCTYPE handled; prolog comments/PIs preserved
  • §3.1 Attributes: duplicate attribute names are an error
  • §4.1 / §4.6 References: &amp; &lt; &gt; &quot; &apos; &#N; &#xN; expanded
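
Several of the rules above are described as errors, so callers that accept untrusted input will probably want to trap them. The sketch below wraps parse-xml in handler-case. It assumes that ill-formed input signals a condition that is a subtype of the standard error class; the source does not name the condition type, and try-parse is a hypothetical helper.

```lisp
;; Hypothetical wrapper. Assumes well-formedness violations are signalled as
;; conditions of type ERROR and that the system has already been loaded.
(defun try-parse (xml-string)
  "Return (values document nil) on success or (values nil condition) on failure."
  (handler-case (values (io.github.cl-sdk.xml:parse-xml xml-string) nil)
    (error (c) (values nil c))))

;; Duplicate attribute names (§3.1) should end up in the failure branch:
(multiple-value-bind (doc condition)
    (try-parse "<root a=\"1\" a=\"2\" />")
  (if doc
      (format t "parsed ~a~%" (io.github.cl-sdk.xml:xml-node-tag
                               (io.github.cl-sdk.xml:xml-document-root doc)))
      (format t "rejected: ~a~%" condition)))
```

If the library defines its own condition classes, catching those instead of the broad error type would avoid masking unrelated failures.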

References

io.github.cl-sdk.xml is a hand-written recursive-descent parser implemented in Common Lisp. It targets the specifications listed below.

  • Extensible Markup Language (XML) 1.0 — the core grammar and well-formedness rules that govern parsing, character data, entity references, comments, CDATA sections, processing instructions, and the document prolog.
  • XML Schema Part 1: Structures — the schema-definition language used as a reference for element and attribute declarations, content models, and type hierarchies.

License

Unlicense
