1
0
Fork 0
cl-sites/guile.html_node/sxml_002dmatch.html

473 lines
22 KiB
HTML
Raw Normal View History

2024-12-17 12:49:28 +01:00
<!DOCTYPE html>
<html>
<!-- Created by GNU Texinfo 7.1, https://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!-- This manual documents Guile version 3.0.10.
Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation,
Inc.
Copyright (C) 2021 Maxime Devos
Copyright (C) 2024 Tomas Volf
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the section entitled "GNU Free
Documentation License." -->
<title>sxml-match (Guile Reference Manual)</title>
<meta name="description" content="sxml-match (Guile Reference Manual)">
<meta name="keywords" content="sxml-match (Guile Reference Manual)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content=".texi2any-real">
<meta name="viewport" content="width=device-width,initial-scale=1">
<link href="index.html" rel="start" title="Top">
<link href="Concept-Index.html" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Guile-Modules.html" rel="up" title="Guile Modules">
<link href="The-Scheme-shell-_0028scsh_0029.html" rel="next" title="The Scheme shell (scsh)">
<link href="Expect.html" rel="prev" title="Expect">
<style type="text/css">
<!--
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
div.example {margin-left: 3.2em}
span:hover a.copiable-link {visibility: visible}
strong.def-name {font-family: monospace; font-weight: bold; font-size: larger}
ul.mark-bullet {list-style-type: disc}
-->
</style>
<link rel="stylesheet" type="text/css" href="https://www.gnu.org/software/gnulib/manual.css">
</head>
<body lang="en">
<div class="section-level-extent" id="sxml_002dmatch">
<div class="nav-panel">
<p>
Next: <a href="The-Scheme-shell-_0028scsh_0029.html" accesskey="n" rel="next">The Scheme shell (scsh)</a>, Previous: <a href="Expect.html" accesskey="p" rel="prev">Expect</a>, Up: <a href="Guile-Modules.html" accesskey="u" rel="up">Guile Modules</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<h3 class="section" id="sxml_002dmatch_003a-Pattern-Matching-of-SXML"><span>7.17 <code class="code">sxml-match</code>: Pattern Matching of SXML<a class="copiable-link" href="#sxml_002dmatch_003a-Pattern-Matching-of-SXML"> &para;</a></span></h3>
<a class="index-entry-id" id="index-pattern-matching-_0028SXML_0029"></a>
<a class="index-entry-id" id="index-SXML-pattern-matching"></a>
<p>The <code class="code">(sxml match)</code> module provides syntactic forms for pattern
matching of SXML trees, in a &ldquo;by example&rdquo; style reminiscent of the
pattern matching of the <code class="code">syntax-rules</code> and <code class="code">syntax-case</code> macro
systems. See <a class="xref" href="SXML.html">SXML</a>, for more information on SXML.
</p>
<p>The following example<a class="footnote" id="DOCF28" href="#FOOT28"><sup>28</sup></a> provides a brief
illustration, transforming a music album catalog language into HTML.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define (album-&gt;html x)
(sxml-match x
((album (@ (title ,t)) (catalog (num ,n) (fmt ,f)) ...)
`(ul (li ,t)
(li (b ,n) (i ,f)) ...))))
</pre></div>
<p>Three macros are provided: <code class="code">sxml-match</code>, <code class="code">sxml-match-let</code>, and
<code class="code">sxml-match-let*</code>.
</p>
<p>Compared to a standard s-expression pattern matcher (see <a class="pxref" href="Pattern-Matching.html">Pattern Matching</a>), <code class="code">sxml-match</code> provides the following benefits:
</p>
<ul class="itemize mark-bullet">
<li>matching of SXML elements does not depend on any degree of normalization of the
SXML;
</li><li>matching of SXML attributes (within an element) is under-ordered; the order of
the attributes specified within the pattern need not match the ordering with the
element being matched;
</li><li>all attributes specified in the pattern must be present in the element being
matched; in the spirit that XML is &rsquo;extensible&rsquo;, the element being matched may
include additional attributes not specified in the pattern.
</li></ul>
<p>The present module is a descendant of WebIt!, and was inspired by an
s-expression pattern matcher developed by Erik Hilsdale, Dan Friedman, and Kent
Dybvig at Indiana University.
</p>
<ul class="mini-toc">
<li><a href="#Syntax" accesskey="1">Syntax</a></li>
<li><a href="#Matching-XML-Elements" accesskey="2">Matching XML Elements</a></li>
<li><a href="#Ellipses-in-Patterns" accesskey="3">Ellipses in Patterns</a></li>
<li><a href="#Ellipses-in-Quasiquote_0027d-Output" accesskey="4">Ellipses in Quasiquote&rsquo;d Output</a></li>
<li><a href="#Matching-Nodesets" accesskey="5">Matching Nodesets</a></li>
<li><a href="#Matching-the-_0060_0060Rest_0027_0027-of-a-Nodeset" accesskey="6">Matching the &ldquo;Rest&rdquo; of a Nodeset</a></li>
<li><a href="#Matching-the-Unmatched-Attributes" accesskey="7">Matching the Unmatched Attributes</a></li>
<li><a href="#Default-Values-in-Attribute-Patterns" accesskey="8">Default Values in Attribute Patterns</a></li>
<li><a href="#Guards-in-Patterns" accesskey="9">Guards in Patterns</a></li>
<li><a href="#Catamorphisms">Catamorphisms</a></li>
<li><a href="#Named_002dCatamorphisms">Named-Catamorphisms</a></li>
<li><a href="#sxml_002dmatch_002dlet-and-sxml_002dmatch_002dlet_002a"><code class="code">sxml-match-let</code> and <code class="code">sxml-match-let*</code></a></li>
</ul>
<div class="unnumberedsubsec-level-extent" id="Syntax">
<h4 class="unnumberedsubsec"><span>Syntax<a class="copiable-link" href="#Syntax"> &para;</a></span></h4>
<p><code class="code">sxml-match</code> provides <code class="code">case</code>-like form for pattern matching of XML
nodes.
</p>
<dl class="first-deffn">
<dt class="deffn" id="index-sxml_002dmatch"><span class="category-def">Scheme Syntax: </span><span><strong class="def-name">sxml-match</strong> <var class="def-var-arguments">input-expression clause1 clause2 &hellip;</var><a class="copiable-link" href="#index-sxml_002dmatch"> &para;</a></span></dt>
<dd><p>Match <var class="var">input-expression</var>, an SXML tree, according to the given <var class="var">clause</var>s
(one or more), each consisting of a pattern and one or more expressions to be
evaluated if the pattern match succeeds. Optionally, each <var class="var">clause</var> within
<code class="code">sxml-match</code> may include a <em class="dfn">guard expression</em>.
</p></dd></dl>
<p>The pattern notation is based on that of Scheme&rsquo;s <code class="code">syntax-rules</code> and
<code class="code">syntax-case</code> macro systems. The grammar for the <code class="code">sxml-match</code> syntax
is given below:
</p>
<pre class="verbatim">match-form ::= (sxml-match input-expression
clause+)
clause ::= [node-pattern action-expression+]
| [node-pattern (guard expression*) action-expression+]
node-pattern ::= literal-pattern
| pat-var-or-cata
| element-pattern
| list-pattern
literal-pattern ::= string
| character
| number
| #t
| #f
attr-list-pattern ::= (@ attribute-pattern*)
| (@ attribute-pattern* . pat-var-or-cata)
attribute-pattern ::= (tag-symbol attr-val-pattern)
attr-val-pattern ::= literal-pattern
| pat-var-or-cata
| (pat-var-or-cata default-value-expr)
element-pattern ::= (tag-symbol attr-list-pattern?)
| (tag-symbol attr-list-pattern? nodeset-pattern)
| (tag-symbol attr-list-pattern?
nodeset-pattern? . pat-var-or-cata)
list-pattern ::= (list nodeset-pattern)
| (list nodeset-pattern? . pat-var-or-cata)
| (list)
nodeset-pattern ::= node-pattern
| node-pattern ...
| node-pattern nodeset-pattern
| node-pattern ... nodeset-pattern
pat-var-or-cata ::= (unquote var-symbol)
| (unquote [var-symbol*])
| (unquote [cata-expression -&gt; var-symbol*])
</pre>
<p>Within a list or element body pattern, ellipses may appear only once, but may be
followed by zero or more node patterns.
</p>
<p>Guard expressions cannot refer to the return values of catamorphisms.
</p>
<p>Ellipses in the output expressions must appear only in an expression context;
ellipses are not allowed in a syntactic form.
</p>
<p>The sections below illustrate specific aspects of the <code class="code">sxml-match</code> pattern
matcher.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Matching-XML-Elements">
<h4 class="unnumberedsubsec"><span>Matching XML Elements<a class="copiable-link" href="#Matching-XML-Elements"> &para;</a></span></h4>
<p>The example below illustrates the pattern matching of an XML element:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(e (@ (i 1)) 3 4 5)
((e (@ (i ,d)) ,a ,b ,c) (list d a b c))
(,otherwise #f))
</pre></div>
<p>Each clause in <code class="code">sxml-match</code> contains two parts: a pattern and one or more
expressions which are evaluated if the pattern is successfully match. The
example above matches an element <code class="code">e</code> with an attribute <code class="code">i</code> and three
children.
</p>
<p>Pattern variables must be &ldquo;unquoted&rdquo; in the pattern. The above expression
binds <var class="var">d</var> to <code class="code">1</code>, <var class="var">a</var> to <code class="code">3</code>, <var class="var">b</var> to <code class="code">4</code>, and <var class="var">c</var>
to <code class="code">5</code>.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Ellipses-in-Patterns">
<h4 class="unnumberedsubsec"><span>Ellipses in Patterns<a class="copiable-link" href="#Ellipses-in-Patterns"> &para;</a></span></h4>
<p>As in <code class="code">syntax-rules</code>, ellipses may be used to specify a repeated pattern.
Note that the pattern <code class="code">item ...</code> specifies zero-or-more matches of the
pattern <code class="code">item</code>.
</p>
<p>The use of ellipses in a pattern is illustrated in the code fragment below,
where nested ellipses are used to match the children of repeated instances of an
<code class="code">a</code> element, within an element <code class="code">d</code>.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define x '(d (a 1 2 3) (a 4 5) (a 6 7 8) (a 9 10)))
(sxml-match x
((d (a ,b ...) ...)
(list (list b ...) ...)))
</pre></div>
<p>The above expression returns a value of <code class="code">((1 2 3) (4 5) (6 7 8) (9 10))</code>.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Ellipses-in-Quasiquote_0027d-Output">
<h4 class="unnumberedsubsec"><span>Ellipses in Quasiquote&rsquo;d Output<a class="copiable-link" href="#Ellipses-in-Quasiquote_0027d-Output"> &para;</a></span></h4>
<p>Within the body of an <code class="code">sxml-match</code> form, a slightly extended version of
quasiquote is provided, which allows the use of ellipses. This is illustrated
in the example below.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(e 3 4 5 6 7)
((e ,i ... 6 7) `(&quot;start&quot; ,(list 'wrap i) ... &quot;end&quot;))
(,otherwise #f))
</pre></div>
<p>The general pattern is that <code class="code">`(something ,i ...)</code> is rewritten as
<code class="code">`(something ,@i)</code>.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Matching-Nodesets">
<h4 class="unnumberedsubsec"><span>Matching Nodesets<a class="copiable-link" href="#Matching-Nodesets"> &para;</a></span></h4>
<p>A nodeset pattern is designated by a list in the pattern, beginning the
identifier list. The example below illustrates matching a nodeset.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(&quot;i&quot; &quot;j&quot; &quot;k&quot; &quot;l&quot; &quot;m&quot;)
((list ,a ,b ,c ,d ,e)
`((p ,a) (p ,b) (p ,c) (p ,d) (p ,e))))
</pre></div>
<p>This example wraps each nodeset item in an HTML paragraph element. This example
can be rewritten and simplified through using ellipsis:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(&quot;i&quot; &quot;j&quot; &quot;k&quot; &quot;l&quot; &quot;m&quot;)
((list ,i ...)
`((p ,i) ...)))
</pre></div>
<p>This version will match nodesets of any length, and wrap each item in the
nodeset in an HTML paragraph element.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Matching-the-_0060_0060Rest_0027_0027-of-a-Nodeset">
<h4 class="unnumberedsubsec"><span>Matching the &ldquo;Rest&rdquo; of a Nodeset<a class="copiable-link" href="#Matching-the-_0060_0060Rest_0027_0027-of-a-Nodeset"> &para;</a></span></h4>
<p>Matching the &ldquo;rest&rdquo; of a nodeset is achieved by using a <code class="code">. rest)</code> pattern
at the end of an element or nodeset pattern.
</p>
<p>This is illustrated in the example below:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(e 3 (f 4 5 6) 7)
((e ,a (f . ,y) ,d)
(list a y d)))
</pre></div>
<p>The above expression returns <code class="code">(3 (4 5 6) 7)</code>.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Matching-the-Unmatched-Attributes">
<h4 class="unnumberedsubsec"><span>Matching the Unmatched Attributes<a class="copiable-link" href="#Matching-the-Unmatched-Attributes"> &para;</a></span></h4>
<p>Sometimes it is useful to bind a list of attributes present in the element being
matched, but which do not appear in the pattern. This is achieved by using a
<code class="code">. rest)</code> pattern at the end of the attribute list pattern. This is
illustrated in the example below:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(a (@ (z 1) (y 2) (x 3)) 4 5 6)
((a (@ (y ,www) . ,qqq) ,t ,u ,v)
(list www qqq t u v)))
</pre></div>
<p>The above expression matches the attribute <code class="code">y</code> and binds a list of the
remaining attributes to the variable <var class="var">qqq</var>. The result of the above
expression is <code class="code">(2 ((z 1) (x 3)) 4 5 6)</code>.
</p>
<p>This type of pattern also allows the binding of all attributes:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(a (@ (z 1) (y 2) (x 3)))
((a (@ . ,qqq))
qqq))
</pre></div>
</div>
<div class="unnumberedsubsec-level-extent" id="Default-Values-in-Attribute-Patterns">
<h4 class="unnumberedsubsec"><span>Default Values in Attribute Patterns<a class="copiable-link" href="#Default-Values-in-Attribute-Patterns"> &para;</a></span></h4>
<p>It is possible to specify a default value for an attribute which is used if the
attribute is not present in the element being matched. This is illustrated in
the following example:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(e 3 4 5)
((e (@ (z (,d 1))) ,a ,b ,c) (list d a b c)))
</pre></div>
<p>The value <code class="code">1</code> is used when the attribute <code class="code">z</code> is absent from the
element <code class="code">e</code>.
</p>
</div>
<div class="unnumberedsubsec-level-extent" id="Guards-in-Patterns">
<h4 class="unnumberedsubsec"><span>Guards in Patterns<a class="copiable-link" href="#Guards-in-Patterns"> &para;</a></span></h4>
<p>Guards may be added to a pattern clause via the <code class="code">guard</code> keyword. A guard
expression may include zero or more expressions which are evaluated only if the
pattern is matched. The body of the clause is only evaluated if the guard
expressions evaluate to <code class="code">#t</code>.
</p>
<p>The use of guard expressions is illustrated below:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match '(a 2 3)
((a ,n) (guard (number? n)) n)
((a ,m ,n) (guard (number? m) (number? n)) (+ m n)))
</pre></div>
</div>
<div class="unnumberedsubsec-level-extent" id="Catamorphisms">
<h4 class="unnumberedsubsec"><span>Catamorphisms<a class="copiable-link" href="#Catamorphisms"> &para;</a></span></h4>
<p>The example below illustrates the use of explicit recursion within an
<code class="code">sxml-match</code> form. This example implements a simple calculator for the
basic arithmetic operations, which are represented by the XML elements
<code class="code">plus</code>, <code class="code">minus</code>, <code class="code">times</code>, and <code class="code">div</code>.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define simple-eval
(lambda (x)
(sxml-match x
(,i (guard (integer? i)) i)
((plus ,x ,y) (+ (simple-eval x) (simple-eval y)))
((times ,x ,y) (* (simple-eval x) (simple-eval y)))
((minus ,x ,y) (- (simple-eval x) (simple-eval y)))
((div ,x ,y) (/ (simple-eval x) (simple-eval y)))
(,otherwise (error &quot;simple-eval: invalid expression&quot; x)))))
</pre></div>
<p>Using the catamorphism feature of <code class="code">sxml-match</code>, a more concise version of
<code class="code">simple-eval</code> can be written. The pattern <code class="code">,(x)</code> recursively invokes
the pattern matcher on the value bound in this position.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define simple-eval
(lambda (x)
(sxml-match x
(,i (guard (integer? i)) i)
((plus ,(x) ,(y)) (+ x y))
((times ,(x) ,(y)) (* x y))
((minus ,(x) ,(y)) (- x y))
((div ,(x) ,(y)) (/ x y))
(,otherwise (error &quot;simple-eval: invalid expression&quot; x)))))
</pre></div>
</div>
<div class="unnumberedsubsec-level-extent" id="Named_002dCatamorphisms">
<h4 class="unnumberedsubsec"><span>Named-Catamorphisms<a class="copiable-link" href="#Named_002dCatamorphisms"> &para;</a></span></h4>
<p>It is also possible to explicitly name the operator in the &ldquo;cata&rdquo; position.
Where <code class="code">,(id*)</code> recurs to the top of the current <code class="code">sxml-match</code>,
<code class="code">,(cata -&gt; id*)</code> recurs to <code class="code">cata</code>. <code class="code">cata</code> must evaluate to a
procedure which takes one argument, and returns as many values as there are
identifiers following <code class="code">-&gt;</code>.
</p>
<p>Named catamorphism patterns allow processing to be split into multiple, mutually
recursive procedures. This is illustrated in the example below: a
transformation that formats a &ldquo;TV Guide&rdquo; into HTML.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define (tv-guide-&gt;html g)
(define (cast-list cl)
(sxml-match cl
((CastList (CastMember (Character (Name ,ch)) (Actor (Name ,a))) ...)
`(div (ul (li ,ch &quot;: &quot; ,a) ...)))))
(define (prog p)
(sxml-match p
((Program (Start ,start-time) (Duration ,dur) (Series ,series-title)
(Description ,desc ...))
`(div (p ,start-time
(br) ,series-title
(br) ,desc ...)))
((Program (Start ,start-time) (Duration ,dur) (Series ,series-title)
(Description ,desc ...)
,(cast-list -&gt; cl))
`(div (p ,start-time
(br) ,series-title
(br) ,desc ...)
,cl))))
(sxml-match g
((TVGuide (@ (start ,start-date)
(end ,end-date))
(Channel (Name ,nm) ,(prog -&gt; p) ...) ...)
`(html (head (title &quot;TV Guide&quot;))
(body (h1 &quot;TV Guide&quot;)
(div (h2 ,nm) ,p ...) ...)))))
</pre></div>
</div>
<div class="unnumberedsubsec-level-extent" id="sxml_002dmatch_002dlet-and-sxml_002dmatch_002dlet_002a">
<h4 class="unnumberedsubsec"><span><code class="code">sxml-match-let</code> and <code class="code">sxml-match-let*</code><a class="copiable-link" href="#sxml_002dmatch_002dlet-and-sxml_002dmatch_002dlet_002a"> &para;</a></span></h4>
<dl class="first-deffn">
<dt class="deffn" id="index-sxml_002dmatch_002dlet"><span class="category-def">Scheme Syntax: </span><span><strong class="def-name">sxml-match-let</strong> <var class="def-var-arguments">((pat expr) ...) expression0 expression ...</var><a class="copiable-link" href="#index-sxml_002dmatch_002dlet"> &para;</a></span></dt>
<dt class="deffnx def-cmd-deffn" id="index-sxml_002dmatch_002dlet_002a"><span class="category-def">Scheme Syntax: </span><span><strong class="def-name">sxml-match-let*</strong> <var class="def-var-arguments">((pat expr) ...) expression0 expression ...</var><a class="copiable-link" href="#index-sxml_002dmatch_002dlet_002a"> &para;</a></span></dt>
<dd><p>These forms generalize the <code class="code">let</code> and <code class="code">let*</code> forms of Scheme to allow
an XML pattern in the binding position, rather than a simple variable.
</p></dd></dl>
<p>For example, the expression below:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(sxml-match-let (((a ,i ,j) '(a 1 2)))
(+ i j))
</pre></div>
<p>binds the variables <var class="var">i</var> and <var class="var">j</var> to <code class="code">1</code> and <code class="code">2</code> in the XML
value given.
</p>
</div>
</div>
<div class="footnotes-segment">
<hr>
<h4 class="footnotes-heading">Footnotes</h4>
<h5 class="footnote-body-heading"><a id="FOOT28" href="#DOCF28">(28)</a></h5>
<p>This example is taken from a paper by
Krishnamurthi et al. Their paper was the first to show the usefulness of the
<code class="code">syntax-rules</code> style of pattern matching for transformation of XML, though
the language described, XT3D, is an XML language.</p>
</div>
<hr>
<div class="nav-panel">
<p>
Next: <a href="The-Scheme-shell-_0028scsh_0029.html">The Scheme shell (scsh)</a>, Previous: <a href="Expect.html">Expect</a>, Up: <a href="Guile-Modules.html">Guile Modules</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>