<!DOCTYPE html>
<html>
<!-- Created by GNU Texinfo 7.1, https://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!-- This manual documents Guile version 3.0.10.

Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation,
Inc.

Copyright (C) 2021 Maxime Devos

Copyright (C) 2024 Tomas Volf


Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the section entitled "GNU Free
Documentation License." -->
<title>Reading and Writing XML (Guile Reference Manual)</title>

<meta name="description" content="Reading and Writing XML (Guile Reference Manual)">
<meta name="keywords" content="Reading and Writing XML (Guile Reference Manual)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content=".texi2any-real">
<meta name="viewport" content="width=device-width,initial-scale=1">

<link href="index.html" rel="start" title="Top">
<link href="Concept-Index.html" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="SXML.html" rel="up" title="SXML">
<link href="SSAX.html" rel="next" title="SSAX">
<link href="SXML-Overview.html" rel="prev" title="SXML Overview">
<style type="text/css">
<!--
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
div.example {margin-left: 3.2em}
span:hover a.copiable-link {visibility: visible}
strong.def-name {font-family: monospace; font-weight: bold; font-size: larger}
-->
</style>
<link rel="stylesheet" type="text/css" href="https://www.gnu.org/software/gnulib/manual.css">


</head>

<body lang="en">
<div class="subsection-level-extent" id="Reading-and-Writing-XML">
<div class="nav-panel">
<p>
Next: <a href="SSAX.html" accesskey="n" rel="next">SSAX: A Functional XML Parsing Toolkit</a>, Previous: <a href="SXML-Overview.html" accesskey="p" rel="prev">SXML Overview</a>, Up: <a href="SXML.html" accesskey="u" rel="up">SXML</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<h4 class="subsection" id="Reading-and-Writing-XML-1"><span>7.21.2 Reading and Writing XML<a class="copiable-link" href="#Reading-and-Writing-XML-1"> ¶</a></span></h4>

<p>The <code class="code">(sxml simple)</code> module presents a basic interface for parsing
XML from a port into the Scheme SXML format, and for serializing it back
to text.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))
</pre></div>

<dl class="first-deffn">
<dt class="deffn" id="index-xml_002d_003esxml"><span class="category-def">Scheme Procedure: </span><span><strong class="def-name">xml->sxml</strong> <var class="def-var-arguments">[string-or-port] [#:namespaces='()] [#:declare-namespaces?=#t] [#:trim-whitespace?=#f] [#:entities='()] [#:default-entity-handler=#f] [#:doctype-handler=#f]</var><a class="copiable-link" href="#index-xml_002d_003esxml"> ¶</a></span></dt>
<dd><p>Use SSAX to parse an XML document into SXML. Takes one optional
argument, <var class="var">string-or-port</var>, which defaults to the current input
port. Returns the resulting SXML document. If <var class="var">string-or-port</var> is
a port, it will be left pointing at the next available character in the
port.
</p></dd></dl>
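
<p>As a minimal sketch of the port case, the following reads a document
from a string port created with <code class="code">call-with-input-string</code>. The
sample document is made up for illustration, and the <code class="code">*TOP*</code>
wrapper in the result is described below.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))

;; xml->sxml takes the port as its optional first argument, so it
;; can be passed directly as the procedure to call with the port.
(call-with-input-string "<foo>text</foo>" xml->sxml)
;; ⇒ (*TOP* (foo "text"))
</pre></div>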

<p>As is normal in SXML, XML elements parse as tagged lists. Attributes,
if any, are placed after the tag, within an <code class="code">@</code> element. The root
of the resulting SXML is a special tag, <code class="code">*TOP*</code>, which contains
the root element of the XML document along with any processing
instructions that precede it.
</p>
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo/>")
⇒ (*TOP* (foo))
(xml->sxml "<foo>text</foo>")
⇒ (*TOP* (foo "text"))
(xml->sxml "<foo kind=\"bar\">text</foo>")
⇒ (*TOP* (foo (@ (kind "bar")) "text"))
(xml->sxml "<?xml version=\"1.0\"?><foo/>")
⇒ (*TOP* (*PI* xml "version=\"1.0\"") (foo))
</pre></div>
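
<p>Because the result is an ordinary list, it can be taken apart with the
usual list tools. The sketch below assumes the <code class="code">(ice-9 match)</code>
module, which this section does not otherwise use, and a hypothetical
helper named <code class="code">describe</code>. It also assumes that no processing
instructions precede the root element.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple) (ice-9 match))

;; Hypothetical helper: return the tag, attribute list and body of
;; the root element as a three-element list.
(define (describe sxml)
  (match sxml
    ;; Root element that carries an attribute list.
    (('*TOP* (tag ('@ attrs ...) body ...))
     (list tag attrs body))
    ;; Root element without attributes.
    (('*TOP* (tag body ...))
     (list tag '() body))))

(describe (xml->sxml "<foo kind=\"bar\">text</foo>"))
;; ⇒ (foo ((kind "bar")) ("text"))
(describe (xml->sxml "<foo>text</foo>"))
;; ⇒ (foo () ("text"))
</pre></div>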

<p>All namespaces in the XML document must be declared, via <code class="code">xmlns</code>
attributes. SXML elements built from non-default namespaces will have
their tags prefixed with their URI. Users can specify custom prefixes
for certain namespaces with the <code class="code">#:namespaces</code> keyword argument to
<code class="code">xml->sxml</code>.
</p>
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>")
⇒ (*TOP* (http://example.org/ns1:foo "text"))
(xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>"
           #:namespaces '((ns1 . "http://example.org/ns1")))
⇒ (*TOP* (ns1:foo "text"))
(xml->sxml "<foo xmlns:bar=\"http://example.org/ns2\"><bar:baz/></foo>"
           #:namespaces '((ns2 . "http://example.org/ns2")))
⇒ (*TOP* (foo (ns2:baz)))
</pre></div>

<p>By default, namespaces passed to <code class="code">xml->sxml</code> are treated as if they
were declared on the root element. Passing a false
<code class="code">#:declare-namespaces?</code> argument will disable this behavior,
requiring in-document declarations of namespaces before use.
</p>
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo><ns2:baz/></foo>"
           #:namespaces '((ns2 . "http://example.org/ns2")))
⇒ (*TOP* (foo (ns2:baz)))
(xml->sxml "<foo><ns2:baz/></foo>"
           #:namespaces '((ns2 . "http://example.org/ns2"))
           #:declare-namespaces? #f)
⇒ error: undeclared namespace: `ns2'
</pre></div>

<p>By default, all whitespace in XML is significant. Passing the
<code class="code">#:trim-whitespace?</code> keyword argument to <code class="code">xml->sxml</code> will trim
whitespace before, after, and between elements, treating it as
“insignificant”. Whitespace in text fragments is left alone.
</p>
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>")
⇒ (*TOP* (foo "\n" (bar " Alfie the parrot! ") "\n"))
(xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>"
           #:trim-whitespace? #t)
⇒ (*TOP* (foo (bar " Alfie the parrot! ")))
</pre></div>

<p>Parsed entities may be declared with the <code class="code">#:entities</code> keyword
argument, or handled with the <code class="code">#:default-entity-handler</code>. By
default, only the standard <code class="code">&lt;</code>, <code class="code">&gt;</code>, <code class="code">&amp;</code>,
<code class="code">&apos;</code> and <code class="code">&quot;</code> entities are defined, as well as the
<code class="code">&#<var class="var">N</var>;</code> and <code class="code">&#x<var class="var">N</var>;</code> (decimal and hexadecimal)
numeric character entities.
</p>
<div class="example">
<pre class="example-preformatted">(xml->sxml "<foo>&amp;</foo>")
⇒ (*TOP* (foo "&"))
(xml->sxml "<foo>&nbsp;</foo>")
⇒ error: undefined entity: nbsp
(xml->sxml "<foo>&#xA0;</foo>")
⇒ (*TOP* (foo "\xa0"))
(xml->sxml "<foo>&nbsp;</foo>"
           #:entities '((nbsp . "\xa0")))
⇒ (*TOP* (foo "\xa0"))
(xml->sxml "<foo>&nbsp; &foo;</foo>"
           #:default-entity-handler
           (lambda (port name)
             (case name
               ((nbsp) "\xa0")
               (else
                (format (current-warning-port)
                        "~a:~a:~a: undefined entity: ~a\n"
                        (or (port-filename port) "<unknown file>")
                        (port-line port) (port-column port)
                        name)
                (symbol->string name)))))
-| <unknown file>:0:17: undefined entity: foo
⇒ (*TOP* (foo "\xa0 foo"))
</pre></div>

<p>By default, <code class="code">xml->sxml</code> skips over the <code class="code"><!DOCTYPE></code>
declaration, if any. This behavior can be overridden with the
<code class="code">#:doctype-handler</code> argument, which should be a procedure of three
arguments: the <em class="dfn">docname</em> (a symbol), <em class="dfn">systemid</em> (a string), and
the internal doctype subset (as a string or <code class="code">#f</code> if not present).
</p>
<p>The handler should return keyword arguments as multiple values, as if it
were calling its continuation with keyword arguments. The continuation
accepts the <code class="code">#:entities</code> and <code class="code">#:namespaces</code> keyword arguments,
in the same format that <code class="code">xml->sxml</code> itself takes. These entities
and namespaces will be prepended to those given to the <code class="code">xml->sxml</code>
invocation.
</p>
<div class="example">
<pre class="example-preformatted">(define (handle-foo docname systemid internal-subset)
  (case docname
    ((foo)
     (values #:entities '((greets . "<i>Hello, world!</i>"))))
    (else
     (values))))

(xml->sxml "<!DOCTYPE foo><p>&greets;</p>"
           #:doctype-handler handle-foo)
⇒ (*TOP* (p (i "Hello, world!")))
</pre></div>

<p>If the document has no doctype declaration, the <var class="var">doctype-handler</var> is
invoked with <code class="code">#f</code> for the three arguments.
</p>
<p>In the future, the continuation may accept other keyword arguments, for
example to validate the parsed SXML against the doctype.
</p>
<dl class="first-deffn">
<dt class="deffn" id="index-sxml_002d_003exml"><span class="category-def">Scheme Procedure: </span><span><strong class="def-name">sxml->xml</strong> <var class="def-var-arguments">tree [port]</var><a class="copiable-link" href="#index-sxml_002d_003exml"> ¶</a></span></dt>
<dd><p>Serialize the SXML tree <var class="var">tree</var> as XML. The output will be written to
the current output port, unless the optional argument <var class="var">port</var> is
present.
</p></dd></dl>
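
<p>A short sketch of both ways to call <code class="code">sxml->xml</code> follows. The sample
trees are made up for illustration, and collecting the output with
<code class="code">call-with-output-string</code> is one possible approach rather than
something this section prescribes.
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))

;; With no port argument, the XML goes to the current output port.
(sxml->xml '(foo (@ (kind "bar")) "text"))
;; -| <foo kind="bar">text</foo>

;; With a port argument, the XML goes to that port.  Here a string
;; port collects it; note that "<" in text is written escaped.
(call-with-output-string
  (lambda (port)
    (sxml->xml '(foo "a < b") port)))
;; ⇒ "<foo>a &lt; b</foo>"
</pre></div>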

<dl class="first-deffn">
<dt class="deffn" id="index-sxml_002d_003estring"><span class="category-def">Scheme Procedure: </span><span><strong class="def-name">sxml->string</strong> <var class="def-var-arguments">sxml</var><a class="copiable-link" href="#index-sxml_002d_003estring"> ¶</a></span></dt>
<dd><p>Detag an SXML tree <var class="var">sxml</var> into a string. Does not perform any
formatting.
</p></dd></dl>
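
<p>As an illustration, with made-up sample input, detagging concatenates
the text of a tree in document order and discards the element tags:
</p>
<div class="example">
<pre class="example-preformatted">(use-modules (sxml simple))

;; Tags are dropped; the strings are joined as they appear.
(sxml->string '(p "Hello, " (b "world") "!"))
;; ⇒ "Hello, world!"

;; The same applies to a tree produced by xml->sxml.
(sxml->string (xml->sxml "<p>Hello, <b>world</b>!</p>"))
;; ⇒ "Hello, world!"
</pre></div>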

</div>
<hr>
<div class="nav-panel">
<p>
Next: <a href="SSAX.html">SSAX: A Functional XML Parsing Toolkit</a>, Previous: <a href="SXML-Overview.html">SXML Overview</a>, Up: <a href="SXML.html">SXML</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>



</body>
</html>