<!DOCTYPE html>
<!-- Created by GNU Texinfo 7.1, https://www.gnu.org/software/texinfo/ -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!-- This manual documents Guile version 3.0.10.
Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation, Inc.
Copyright (C) 2021 Maxime Devos
Copyright (C) 2024 Tomas Volf
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the section entitled "GNU Free
Documentation License." -->
<title>Reading and Writing XML (Guile Reference Manual)</title>
<meta name="description" content="Reading and Writing XML (Guile Reference Manual)">
<meta name="keywords" content="Reading and Writing XML (Guile Reference Manual)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content=".texi2any-real">
<meta name="viewport" content="width=device-width,initial-scale=1">
<link href="index.html" rel="start" title="Top">
<link href="Concept-Index.html" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="SXML.html" rel="up" title="SXML">
<link href="SSAX.html" rel="next" title="SSAX">
<link href="SXML-Overview.html" rel="prev" title="SXML Overview">
<style type="text/css">
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
div.example {margin-left: 3.2em}
span:hover a.copiable-link {visibility: visible}
strong.def-name {font-family: monospace; font-weight: bold; font-size: larger}
<link rel="stylesheet" type="text/css" href="https://www.gnu.org/software/gnulib/manual.css">
<body lang="en">
<div class="subsection-level-extent" id="Reading-and-Writing-XML">
<div class="nav-panel">
Next: <a href="SSAX.html" accesskey="n" rel="next">SSAX: A Functional XML Parsing Toolkit</a>, Previous: <a href="SXML-Overview.html" accesskey="p" rel="prev">SXML Overview</a>, Up: <a href="SXML.html" accesskey="u" rel="up">SXML</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
<h4 class="subsection" id="Reading-and-Writing-XML-1"><span>7.21.2 Reading and Writing XML<a class="copiable-link" href="#Reading-and-Writing-XML-1"> ¶</a></span></h4>
<p>The <code class="code">(sxml simple)</code> module presents a basic interface for parsing
XML from a port into the Scheme SXML format, and for serializing it back
to text.
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))
<dl class="first-deffn">
<dt class="deffn" id="index-xml_002d_003esxml"><span class="category-def">Scheme Procedure: </span><span><strong class="def-name">xml->sxml</strong> <var class="def-var-arguments">[string-or-port] [#:namespaces='()] [#:declare-namespaces?=#t] [#:trim-whitespace?=#f] [#:entities='()] [#:default-entity-handler=#f] [#:doctype-handler=#f]</var><a class="copiable-link" href="#index-xml_002d_003esxml"> ¶</a></span></dt>
<dd><p>Use SSAX to parse an XML document into SXML. Takes one optional
argument, <var class="var">string-or-port</var>, which defaults to the current input
port. Returns the resulting SXML document. If <var class="var">string-or-port</var> is
a port, it will be left pointing at the next available character in the
port.
<p>As is normal in SXML, XML elements parse as tagged lists. Attributes,
if any, are placed after the tag, within an <code class="code">@</code> element. The root
of the resulting SXML will be contained in a special tag, <code class="code">*TOP*</code>.
This tag will contain the root element of the XML, but also any prior
processing instructions.
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo/>")
⇒ (*TOP* (foo))
(xml->sxml "<foo>text</foo>")
⇒ (*TOP* (foo "text"))
(xml->sxml "<foo kind=\"bar\">text</foo>")
⇒ (*TOP* (foo (@ (kind "bar")) "text"))
(xml->sxml "<?xml version=\"1.0\"?><foo/>")
⇒ (*TOP* (*PI* xml "version=\"1.0\"") (foo))
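<p>Because <var class="var">string-or-port</var> may also be a port, the same parse can be
driven from any input port. The following is a minimal sketch that assumes a
string port opened with <code class="code">call-with-input-string</code>; a file port should
behave the same way.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))

;; Parse from a port instead of a string.
(call-with-input-string "<foo>text</foo>"
  (lambda (port)
    (xml->sxml port)))
⇒ (*TOP* (foo "text"))
</pre></div>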
<p>All namespaces in the XML document must be declared, via <code class="code">xmlns</code>
attributes. SXML elements built from non-default namespaces will have
their tags prefixed with their URI. Users can specify custom prefixes
for certain namespaces with the <code class="code">#:namespaces</code> keyword argument to
<code class="code">xml->sxml</code>.
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>")
⇒ (*TOP* (http://example.org/ns1:foo "text"))
(xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>"
#:namespaces '((ns1 . "http://example.org/ns1")))
⇒ (*TOP* (ns1:foo "text"))
(xml->sxml "<foo xmlns:bar=\"http://example.org/ns2\"><bar:baz/></foo>"
#:namespaces '((ns2 . "http://example.org/ns2")))
⇒ (*TOP* (foo (ns2:baz)))
<p>By default, namespaces passed to <code class="code">xml->sxml</code> are treated as if they
were declared on the root element. Passing a false
<code class="code">#:declare-namespaces?</code> argument will disable this behavior,
requiring in-document declarations of namespaces before use.
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo><ns2:baz/></foo>"
#:namespaces '((ns2 . "http://example.org/ns2")))
⇒ (*TOP* (foo (ns2:baz)))
(xml->sxml "<foo><ns2:baz/></foo>"
#:namespaces '((ns2 . "http://example.org/ns2"))
#:declare-namespaces? #f)
⇒ error: undeclared namespace: `ns2'
<p>By default, all whitespace in XML is significant. Passing a true
<code class="code">#:trim-whitespace?</code> keyword argument to <code class="code">xml->sxml</code> will trim
whitespace before, after, and between elements, treating it as
“insignificant”. Whitespace in text fragments is left alone.
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>")
⇒ (*TOP* (foo "\n" (bar " Alfie the parrot! ") "\n"))
(xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>"
#:trim-whitespace? #t)
⇒ (*TOP* (foo (bar " Alfie the parrot! ")))
<p>Parsed entities may be declared with the <code class="code">#:entities</code> keyword
argument, or handled with the <code class="code">#:default-entity-handler</code>. By
default, only the standard <code class="code">&lt;</code>, <code class="code">&gt;</code>, <code class="code">&amp;</code>,
<code class="code">&apos;</code> and <code class="code">&quot;</code> entities are defined, as well as the
<code class="code">&#<var class="var">N</var>;</code> and <code class="code">&#x<var class="var">N</var>;</code> (decimal and hexadecimal)
numeric character entities.
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo>&amp;</foo>")
⇒ (*TOP* (foo "&"))
(xml->sxml "<foo>&nbsp;</foo>")
⇒ error: undefined entity: nbsp
(xml->sxml "<foo>&#xA0;</foo>")
⇒ (*TOP* (foo "\xa0"))
(xml->sxml "<foo>&nbsp;</foo>"
#:entities '((nbsp . "\xa0")))
⇒ (*TOP* (foo "\xa0"))
(xml->sxml "<foo>&nbsp; &foo;</foo>"
           #:default-entity-handler
           (lambda (port name)
             (case name
               ((nbsp) "\xa0")
               (else
                ;; Warn about the unknown entity, then fall back to its name.
                (format (current-warning-port)
                        "~a:~a:~a: undefined entity: ~a\n"
                        (or (port-filename port) "<unknown file>")
                        (port-line port) (port-column port)
                        name)
                (symbol->string name)))))
-| <unknown file>:0:17: undefined entity: foo
⇒ (*TOP* (foo "\xa0 foo"))
<p>By default, <code class="code">xml->sxml</code> skips over the <code class="code"><!DOCTYPE></code>
declaration, if any. This behavior can be overridden with the
<code class="code">#:doctype-handler</code> argument, which should be a procedure of three
arguments: the <em class="dfn">docname</em> (a symbol), <em class="dfn">systemid</em> (a string), and
the internal doctype subset (as a string or <code class="code">#f</code> if not present).
<p>The handler should return keyword arguments as multiple values, as if it
were calling its continuation with keyword arguments. The continuation
accepts the <code class="code">#:entities</code> and <code class="code">#:namespaces</code> keyword arguments,
in the same format that <code class="code">xml->sxml</code> itself takes. These entities
and namespaces will be prepended to those given to the <code class="code">xml->sxml</code>
invocation.
<div class="example">
<pre class="example-preformatted">(define (handle-foo docname systemid internal-subset)
(case docname
(values #:entities '((greets . "<i>Hello, world!</i>"))))
(xml->sxml "<!DOCTYPE foo><p>&greets;</p>"
#:doctype-handler handle-foo)
⇒ (*TOP* (p (i "Hello, world!")))
<p>If the document has no doctype declaration, the <var class="var">doctype-handler</var> is
invoked with <code class="code">#f</code> for the three arguments.
<p>In the future, the continuation may accept other keyword arguments, for
example to validate the parsed SXML against the doctype.
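<p>The no-doctype case described above can be observed with a small recording
handler. This is only a sketch: <code class="code">seen</code> and <code class="code">recording-handler</code> are
hypothetical names chosen for illustration, not part of <code class="code">(sxml simple)</code>.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))

;; Hypothetical helper: remember the arguments of the last call.
(define seen #f)

(define (recording-handler docname systemid internal-subset)
  (set! seen (list docname systemid internal-subset))
  ;; Return no keyword arguments: add no entities or namespaces.
  (values))

(xml->sxml "<p>hi</p>" #:doctype-handler recording-handler)
⇒ (*TOP* (p "hi"))

seen
⇒ (#f #f #f)
</pre></div>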
<dl class="first-deffn">
<dt class="deffn" id="index-sxml_002d_003exml"><span class="category-def">Scheme Procedure: </span><span><strong class="def-name">sxml->xml</strong> <var class="def-var-arguments">tree [port]</var><a class="copiable-link" href="#index-sxml_002d_003exml"> ¶</a></span></dt>
<dd><p>Serialize the SXML tree <var class="var">tree</var> as XML. The output will be written to
the current output port, unless the optional argument <var class="var">port</var> is
present.
<dl class="first-deffn">
<dt class="deffn" id="index-sxml_002d_003estring"><span class="category-def">Scheme Procedure: </span><span><strong class="def-name">sxml->string</strong> <var class="def-var-arguments">sxml</var><a class="copiable-link" href="#index-sxml_002d_003estring"> ¶</a></span></dt>
<dd><p>Detag an SXML tree <var class="var">sxml</var> into a string. Does not perform any
formatting.
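<p>As a rough sketch of the two serializers, the following captures the output
of <code class="code">sxml->xml</code> in a string by passing it a string port from
<code class="code">call-with-output-string</code>, and detags a small tree with
<code class="code">sxml->string</code>. The trees are made up for illustration.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))

;; Capture the serialized XML in a string rather than printing it.
(call-with-output-string
  (lambda (port)
    (sxml->xml '(foo (@ (kind "bar")) "text") port)))
⇒ "<foo kind=\"bar\">text</foo>"

;; Only the text nodes survive; the tags are dropped.
(sxml->string '(p "Hello, " (b "world") "!"))
⇒ "Hello, world!"
</pre></div>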