<!DOCTYPE html>
<html>
<!-- Created by GNU Texinfo 7.1, https://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!-- This manual documents Guile version 3.0.10.

Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation,
Inc.

Copyright (C) 2021 Maxime Devos

Copyright (C) 2024 Tomas Volf


Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the section entitled "GNU Free
Documentation License." -->
<title>PEG Internals (Guile Reference Manual)</title>

<meta name="description" content="PEG Internals (Guile Reference Manual)">
<meta name="keywords" content="PEG Internals (Guile Reference Manual)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content=".texi2any-real">
<meta name="viewport" content="width=device-width,initial-scale=1">

<link href="index.html" rel="start" title="Top">
<link href="Concept-Index.html" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="PEG-Parsing.html" rel="up" title="PEG Parsing">
<link href="PEG-Tutorial.html" rel="prev" title="PEG Tutorial">
<style type="text/css">
<!--
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
div.example {margin-left: 3.2em}
span:hover a.copiable-link {visibility: visible}
-->
</style>
<link rel="stylesheet" type="text/css" href="https://www.gnu.org/software/gnulib/manual.css">


</head>

<body lang="en">
<div class="subsection-level-extent" id="PEG-Internals">
<div class="nav-panel">
<p>
Previous: <a href="PEG-Tutorial.html" accesskey="p" rel="prev">PEG Tutorial</a>, Up: <a href="PEG-Parsing.html" accesskey="u" rel="up">PEG Parsing</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<h4 class="subsection" id="PEG-Internals-1"><span>6.15.4 PEG Internals<a class="copiable-link" href="#PEG-Internals-1"> ¶</a></span></h4>

<p>A PEG parser takes a string as input and attempts to parse it as a given
nonterminal. The key idea of the PEG implementation is that every
nonterminal is just a function that takes a string as an argument and
attempts to parse that string as its nonterminal. The functions always
start from the beginning, but a parse is considered successful even if
there is material left over at the end.
</p>
<p>This makes it easy to model different PEG parsing operations. For
instance, consider the PEG grammar <code class="code">"ab"</code>, which could also be
written <code class="code">(and "a" "b")</code>. It matches the string “ab”. Here’s how
that might be implemented in the PEG style:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define (match-and-a-b str)
  (match-a str)
  (match-b str))
</pre></div>

<p>As you can see, the use of functions provides an easy way to model
sequencing. In a similar way, one could model <code class="code">(or a b)</code> with
something like the following:
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define (match-or-a-b str)
  (or (match-a str) (match-b str)))
</pre></div>

<p>Here the semantics of a PEG <code class="code">or</code> expression map naturally onto
Scheme’s <code class="code">or</code> operator. This function will attempt to run
<code class="code">(match-a str)</code>, and return its result if it succeeds. Otherwise it
will run <code class="code">(match-b str)</code>.
</p>
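<p>Of course, the code above wouldn’t quite work: <code class="code">match-b</code> has no way
of knowing where <code class="code">match-a</code> stopped. We need some way for the
parsing functions to communicate. The actual interface used is described
below.
</p>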
<h4 class="subsubheading" id="Parsing-Function-Interface"><span>Parsing Function Interface<a class="copiable-link" href="#Parsing-Function-Interface"> ¶</a></span></h4>

<p>A parsing function takes three arguments: a string, the length of that
string, and the position in that string it should start parsing at. In
effect, the parsing functions pass around substrings in pieces: the
first argument is a buffer of characters, and the other two give a
range within that buffer that the parsing function should look at.
</p>
<p>Parsing functions return either <code class="code">#f</code>, if they failed to match their
nonterminal, or a list whose first element must be an integer
representing the final position in the string they matched and whose cdr
can be any other data the function wishes to return, or <code class="code">'()</code> if it
doesn’t have any more data.
</p>
<p>The one caveat is that if the extra data it returns is a list, any
adjacent strings in that list will be concatenated by <code class="code">match-pattern</code>. For
instance, if a parsing function returns <code class="code">(13 ("a" "b" "c"))</code>,
<code class="code">match-pattern</code> will take <code class="code">(13 ("abc"))</code> as its value.
</p>
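<p>The merging step can be pictured with a small helper. This is only an
illustrative sketch: <code class="code">merge-adjacent-strings</code> is a hypothetical name,
it handles a flat list only, and it is not the code Guile actually uses.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">;; Illustration only: concatenate runs of adjacent strings in a flat list.
(define (merge-adjacent-strings lst)
  (let loop ((rest lst) (acc '()))
    (cond ((null? rest) (reverse acc))
          ;; Current element and the previously kept element are both
          ;; strings: fold the current one into the previous one.
          ((and (string? (car rest))
                (pair? acc)
                (string? (car acc)))
           (loop (cdr rest)
                 (cons (string-append (car acc) (car rest)) (cdr acc))))
          ;; Otherwise keep the element as it is.
          (else (loop (cdr rest) (cons (car rest) acc))))))
</pre></div>

<p>Under these assumptions, <code class="code">(merge-adjacent-strings '("a" "b" "c"))</code>
evaluates to <code class="code">("abc")</code>, while non-string elements are left in place
and separate the runs of strings around them.
</p>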
<p>For example, here is a function to match “ab” using the actual
interface.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(define (match-a-b str len pos)
  (and (&lt;= (+ pos 2) len)
       (string= str "ab" pos (+ pos 2))
       (list (+ pos 2) '()))) ; we return no extra information
</pre></div>

<p>The above function can be used to match a string by running
<code class="code">(match-pattern match-a-b "ab")</code>.
</p>
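<p>As a rough sketch of how sequencing and ordered choice look under this
interface, the two simplified examples from the start of this section
could be rewritten as follows. The helpers <code class="code">match-a</code> and
<code class="code">match-b</code> are hypothetical single-character matchers written for
this illustration; they are not part of Guile. For brevity, the sequencing
function returns only the result of its last step and discards any data
from the first.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">;; Hypothetical helpers: match one literal character at POS.
(define (match-a str len pos)
  (and (&lt; pos len)
       (char=? (string-ref str pos) #\a)
       (list (+ pos 1) '())))

(define (match-b str len pos)
  (and (&lt; pos len)
       (char=? (string-ref str pos) #\b)
       (list (+ pos 1) '())))

;; Sequencing: run match-b from the position where match-a stopped.
(define (match-and-a-b str len pos)
  (let ((res-a (match-a str len pos)))
    (and res-a
         (match-b str len (car res-a)))))

;; Ordered choice: both alternatives start from the same position.
(define (match-or-a-b str len pos)
  (or (match-a str len pos)
      (match-b str len pos)))
</pre></div>

<p>With these definitions, <code class="code">(match-and-a-b "ab" 2 0)</code> returns
<code class="code">(2 ())</code>, <code class="code">(match-and-a-b "ba" 2 0)</code> returns <code class="code">#f</code>, and
<code class="code">(match-or-a-b "b" 1 0)</code> returns <code class="code">(1 ())</code>.
</p>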
<h4 class="subsubheading" id="Code-Generators-and-Extensible-Syntax"><span>Code Generators and Extensible Syntax<a class="copiable-link" href="#Code-Generators-and-Extensible-Syntax"> ¶</a></span></h4>

<p>PEG expressions, such as those in a <code class="code">define-peg-pattern</code> form, are
interpreted internally in two steps.
</p>
<p>First, any string PEG is expanded into an s-expression PEG by the code
in the <code class="code">(ice-9 peg string-peg)</code> module.
</p>
<p>Then, the s-expression PEG that results is compiled into a parsing
function by the <code class="code">(ice-9 peg codegen)</code> module. In particular, the
function <code class="code">compile-peg-pattern</code> is called on the s-expression. It then
decides what to do based on the form it is passed.
</p>
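<p>The two entry points can be compared side by side. The sketch below
assumes the <code class="code">(ice-9 peg)</code> module; the nonterminal names
<code class="code">abString</code> and <code class="code">abSexp</code> are chosen for this example only.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(use-modules (ice-9 peg))

;; String PEG: expanded into an s-expression PEG first, then compiled.
(define-peg-string-patterns "abString &lt;- 'a' 'b'")

;; S-expression PEG: passed to the code generator directly.
(define-peg-pattern abSexp body (and "a" "b"))

(peg:tree (match-pattern abString "ab"))
(peg:tree (match-pattern abSexp "ab"))
</pre></div>

<p>Since both patterns use the <code class="code">body</code> capture type, the two calls
are expected to produce the same result.
</p>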
<p>The PEG syntax can be extended by giving <code class="code">compile-peg-pattern</code> more
options for what to do with its forms. Each extension is associated with
a symbol, for instance <code class="code">my-parsing-form</code>, and its code generating
function will be called on all PEG expressions of the form
</p><div class="example lisp">
<pre class="lisp-preformatted">(my-parsing-form ...)
</pre></div>

<p>The code generating function should take two arguments. The first will
be a syntax object containing a list with all of the arguments to the form
(but not the form’s name), and the second will be the
<code class="code">capture-type</code> argument that is passed to <code class="code">define-peg-pattern</code>.
</p>
<p>New functions can be registered by calling <code class="code">(add-peg-compiler!
symbol function)</code>, where <code class="code">symbol</code> is the symbol that will indicate
a form of this type and <code class="code">function</code> is the code generating function
described above. The function <code class="code">add-peg-compiler!</code> is exported from
the <code class="code">(ice-9 peg codegen)</code> module.
</p>
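<p>As a hedged sketch of what a registration might look like, the
following defines a hypothetical <code class="code">twice</code> form that matches its
argument two times in a row. The names <code class="code">twice</code>, <code class="code">cg-twice</code>
and <code class="code">two-as</code> are invented for this example. The sketch also assumes
that <code class="code">compile-peg-pattern</code> is exported from <code class="code">(ice-9 peg codegen)</code>
and accepts a pattern syntax object followed by the capture type; check
this against the installed Guile version before relying on it.
</p>
<div class="example lisp">
<pre class="lisp-preformatted">(use-modules (ice-9 peg)
             (ice-9 peg codegen))

;; Hypothetical code generator for a (twice PAT) form.
;; ARGS is a syntax object holding the form's arguments, here (PAT);
;; ACCUM is the capture type.
(define (cg-twice args accum)
  (syntax-case args ()
    ((pat)
     ;; Assumption: delegate to the existing compiler by rewriting
     ;; (twice PAT) as (and PAT PAT).
     (compile-peg-pattern #'(and pat pat) accum))))

(add-peg-compiler! 'twice cg-twice)

;; Assumed usage, once the compiler above has been registered:
(define-peg-pattern two-as body (twice "a"))
</pre></div>

<p>Because PEG patterns are compiled when <code class="code">define-peg-pattern</code> is
expanded, the registration has to be in effect at that point. At the REPL
this follows from evaluating the forms in order; in a compiled file the
definition and registration may need to be wrapped in <code class="code">eval-when</code>.
</p>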
</div>
<hr>
<div class="nav-panel">
<p>
Previous: <a href="PEG-Tutorial.html">PEG Tutorial</a>, Up: <a href="PEG-Parsing.html">PEG Parsing</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>



</body>
</html>