<!DOCTYPE html>
<html lang='en'><head><meta charset='utf-8' /><meta name='pinterest' content='nopin' /><link href='../../../../static/css/style.css' rel='stylesheet' type='text/css' /><link href='../../../../static/css/print.css' rel='stylesheet' type='text/css' media='print' /><title>Fun with Macros: Do-File / Steve Losh</title></head><body><header><a id='logo' href='https://stevelosh.com/'>Steve Losh</a><nav><a href='../../../index.html'>Blog</a> - <a href='https://stevelosh.com/projects/'>Projects</a> - <a href='https://stevelosh.com/photography/'>Photography</a> - <a href='https://stevelosh.com/links/'>Links</a> - <a href='https://stevelosh.com/rss.xml'>Feed</a></nav></header><hr class='main-separator' /><main id='page-blog-entry'><article><h1><a href='index.html'>Fun with Macros: Do-File</a></h1><p class='date'>Posted on April 19th, 2022.</p><p>It's been a while, but it's time to take a look at another fun little Common
Lisp macro with some interesting things inside it: <code>do-file</code>.</p>
<ol class="table-of-contents"><li><a href="index.html#s1-usage">Usage</a></li><li><a href="index.html#s2-implementation">Implementation</a><ol><li><a href="index.html#s3-let-over-defmacro">Let Over Defmacro</a></li><li><a href="index.html#s4-rest-and-key">&amp;rest and &amp;key</a></li><li><a href="index.html#s5-macros-using-macros">Macros Using Macros</a></li><li><a href="index.html#s6-don-t-loop">Don't Loop</a></li><li><a href="index.html#s7-repetition-allergies">Repetition Allergies</a></li></ol></li><li><a href="index.html#s8-result">Result</a></li></ol>
<h2 id="s1-usage"><a href="index.html#s1-usage">Usage</a></h2>
<p>The macro we'll be taking a look at today is called <code>do-file</code>. It's used to
open a file and iterate over the contents using a reader function, saving you
some tedious boilerplate.</p>
<p>First let's look at some examples of how you could use it. Processing each
line of a file is the default:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do-file <span class="paren2">(<span class="code">line <span class="string">&quot;foo.txt&quot;</span></span>)</span>
<span class="paren2">(<span class="code">unless <span class="paren3">(<span class="code">string= <span class="string">&quot;&quot;</span> line</span>)</span>
<span class="paren3">(<span class="code">write-line <span class="paren4">(<span class="code">string-upcase line</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
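<p>For a sense of the boilerplate being saved, here's a rough sketch of what
the example above might look like written by hand. The function name
<code>print-upcased-lines</code> is made up for illustration, and the sketch
uses <code>nil</code> as the end-of-file marker, which is only safe because
<code>read-line</code> never returns <code>nil</code> for a real line:</p>
<pre><code>(defun print-upcased-lines (path)
  ;; with-open-file closes the stream for us, even on a non-local exit.
  (with-open-file (stream path :direction :input)
    ;; The init and step forms are identical: read the next line, or
    ;; get nil back once we hit the end of the file.
    (do ((line (read-line stream nil nil)
               (read-line stream nil nil)))
        ((null line))
      (unless (string= &quot;&quot; line)
        (write-line (string-upcase line))))))</code></pre>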
<p>Using a different reader function and <a href="../../../2018/05/fun-with-macros-gathering/index.html">another
macro</a> to gather data from inside the
iteration:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">gathering
<span class="paren2">(<span class="code">do-file <span class="paren3">(<span class="code">n <span class="string">&quot;numbers.txt&quot;</span> <span class="keyword">:reader</span> #'read-integer</span>)</span>
<span class="paren3">(<span class="code">when <span class="paren4">(<span class="code">primep n</span>)</span>
<span class="paren4">(<span class="code">gather n</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>Passing along options to the underlying <code>open</code>, and returning early:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do-file <span class="paren2">(<span class="code">form <span class="string">&quot;foo.lisp&quot;</span> <span class="keyword">:reader</span> #'read <span class="keyword">:external-format</span> <span class="keyword">:EBCDIC-US</span></span>)</span>
<span class="paren2">(<span class="code">when <span class="paren3">(<span class="code">eq form <span class="keyword">:stop</span></span>)</span>
<span class="paren3">(<span class="code">return <span class="keyword">:stopped-early</span></span>)</span></span>)</span>
<span class="paren2">(<span class="code">print form</span>)</span></span>)</span></span></code></pre>
<p>All of these could of course be done in other ways. You could have a separate
function that reads the file into a sequence and then pass that to <code>mapcar</code> or
something else, but it can be wasteful to cons up the entire list if you're only
going to process items and don't need to retain them (or if you're going to stop
early).</p>
<p>You could also write a <code>mapc-file</code> that takes a function instead of making this
a macro, but sometimes it's nice to not have to wrap things in a thunk. It's
probably worth having that function as an additional tool in the toolbox though!</p>
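<p>A minimal sketch of what such a function could look like is below. The
argument order and the <code>:reader</code> keyword are assumptions made for
this example, not something taken from an existing library:</p>
<pre><code>(defun mapc-file (function path &amp;key (reader #'read-line))
  &quot;Call FUNCTION on each value READER reads from the file at PATH.&quot;
  (let ((eof (gensym &quot;EOF&quot;)))   ; fresh sentinel, unique to this call
    (with-open-file (stream path :direction :input)
      (do ((value (funcall reader stream nil eof)
                  (funcall reader stream nil eof)))
          ((eq value eof))
        (funcall function value)))))

;; Usage (the body now has to be wrapped in a lambda):
;; (mapc-file (lambda (line) (write-line (string-upcase line))) &quot;foo.txt&quot;)</code></pre>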
<h2 id="s2-implementation"><a href="index.html#s2-implementation">Implementation</a></h2>
<p>Here's the full implementation of the macro:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eof <span class="paren4">(<span class="code">gensym <span class="string">&quot;EOF&quot;</span></span>)</span></span>)</span></span>)</span>
<span class="paren2">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">symbol path
&amp;rest open-options
&amp;key <span class="paren5">(<span class="code">reader '#'read-line</span>)</span> &amp;allow-other-keys</span>)</span>
&amp;body body</span>)</span>
<span class="string">&quot;Iterate over the contents of the file at `path` using `reader`.
During iteration, `symbol` will be set to successive values read from the
file by `reader`.
`reader` can be any function that conforms to the usual reading interface,
i.e. anything that can handle `(read-foo stream eof-error-p eof-value)`.
Any keyword arguments other than `:reader` will be passed along to `open`.
If `nil` is used for one of the `:if-…` options to `open` and this results
in `open` returning `nil`, no iteration will take place.
An implicit block named `nil` surrounds the iteration, so `return` can be
used to terminate early.
Returns `nil`.
Examples:
(do-file (line </span><span class="string">\&quot;</span><span class="string">foo.txt</span><span class="string">\&quot;</span><span class="string">)
(print line))
(do-file (form </span><span class="string">\&quot;</span><span class="string">foo.lisp</span><span class="string">\&quot;</span><span class="string"> :reader #'read :external-format :EBCDIC-US)
(when (eq form :stop)
(return :stopped-early))
(print form))
(do-file (line </span><span class="string">\&quot;</span><span class="string">does-not-exist.txt</span><span class="string">\&quot;</span><span class="string"> :if-does-not-exist nil)
(this-will-not-be-executed))
&quot;</span>
<span class="paren3">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren4">(<span class="code"><span class="paren5">(<span class="code">open-options <span class="paren6">(<span class="code">alexandria:remove-from-plist open-options <span class="keyword">:reader</span></span>)</span></span>)</span></span>)</span>
<span class="paren4">(<span class="code"><i><span class="symbol">alexandria:with-gensyms</span></i> <span class="paren5">(<span class="code">stream</span>)</span>
<span class="paren5">(<span class="code">alexandria:once-only <span class="paren6">(<span class="code">path reader</span>)</span>
`<span class="paren6">(<span class="code">when-let <span class="paren1">(<span class="code"><span class="paren2">(<span class="code">,stream <span class="paren3">(<span class="code">open ,path <span class="keyword">:direction</span> <span class="keyword">:input</span> ,@open-options</span>)</span></span>)</span></span>)</span>
<span class="paren1">(<span class="code"><i><span class="symbol">unwind-protect</span></i>
<span class="paren2">(<span class="code">do <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">,symbol
<span class="paren5">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span>
<span class="paren5">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span></span>)</span></span>)</span>
<span class="paren3">(<span class="code"><span class="paren4">(<span class="code">eq ,symbol ',eof</span>)</span></span>)</span>
,@body</span>)</span>
<span class="paren2">(<span class="code">close ,stream</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>There are a few interesting things to talk about here.</p>
<h3 id="s3-let-over-defmacro"><a href="index.html#s3-let-over-defmacro">Let Over Defmacro</a></h3>
<p>The very first line is unusual: instead of the <code>defmacro</code> being the top level
form, we wrap it in a <code>let</code> to generate one single unique EOF sentinel object:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eof <span class="paren4">(<span class="code">gensym <span class="string">&quot;EOF&quot;</span></span>)</span></span>)</span></span>)</span>
<span class="paren2">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren3">(<span class="code"></span>)</span>
</span>)</span></span>)</span></span></code></pre>
<p>We could put the <code>let</code> inside the macro, but then we'd be generating a separate
EOF object for every use of the macro, which is wasteful.</p>
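<p>To see why a fresh uninterned symbol makes a safer sentinel than something
like <code>nil</code>, here's a small sketch. <code>read-all-forms</code> is
a made-up helper for this illustration and isn't part of <code>do-file</code>:</p>
<pre><code>(defun read-all-forms (string eof)
  &quot;Collect the forms READ finds in STRING, stopping when it returns EOF.&quot;
  (with-input-from-string (stream string)
    (loop :for form = (read stream nil eof)
          :until (eq form eof)
          :collect form)))

;; With nil as the sentinel, a literal nil in the input looks like the end:
(read-all-forms &quot;1 nil 2&quot; nil)            ; =&gt; (1)

;; Nothing READ returns can be EQ to a freshly made uninterned symbol:
(read-all-forms &quot;1 nil 2&quot; (gensym &quot;EOF&quot;)) ; =&gt; (1 NIL 2)</code></pre>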
<h3 id="s4-rest-and-key"><a href="index.html#s4-rest-and-key">&amp;rest and &amp;key</a></h3>
<p>Note how the argument list of the macro takes both <code>&amp;rest</code> and <code>&amp;key</code> arguments, and uses
<code>&amp;allow-other-keys</code> to let the macro take arbitrary keyword arguments:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">symbol path
&amp;rest open-options
&amp;key <span class="paren4">(<span class="code">reader '#'read-line</span>)</span> &amp;allow-other-keys</span>)</span>
&amp;body body</span>)</span>
<span class="paren2">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">open-options <span class="paren5">(<span class="code">alexandria:remove-from-plist open-options <span class="keyword">:reader</span></span>)</span></span>)</span></span>)</span>
<span class="paren3">(<span class="code">when-let <span class="paren4">(<span class="code"><span class="paren5">(<span class="code">,stream <span class="paren6">(<span class="code">open ,path <span class="keyword">:direction</span> <span class="keyword">:input</span> ,@open-options</span>)</span></span>)</span></span>)</span>
</span>)</span></span>)</span></span>)</span></span></code></pre>
<p>We pass along any keyword arguments we get (aside from the special <code>:reader</code>
argument for this macro) to <code>open</code>. Using <code>&amp;allow-other-keys</code> means we don't
need to hardcode all the possible options to <code>open</code>, and also allows for
additional implementation-specific options to be passed to <code>open</code> if the user
wants.</p>
<p>We could have omitted the keyword arguments entirely, taken the arguments as
a raw <code>&amp;rest</code>, and pulled out <code>:reader</code> ourselves with <code>getf</code>. But doing it
this way means we don't have to fiddle around doing that, and it also
provides slightly nicer documentation in an editor when it shows the macro's
argument list in the status bar. We'll also get a nicer error if we
accidentally pass an odd number of keyword arguments.</p>
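<p>The same <code>&amp;rest</code> plus <code>&amp;key</code> trick works in
ordinary functions, which makes it easy to poke at in a REPL. In this sketch
both <code>plist-without</code> (a simplified stand-in for
<code>alexandria:remove-from-plist</code> that handles a single key) and
<code>split-options</code> are hypothetical names invented for the example:</p>
<pre><code>(defun plist-without (plist key)
  &quot;Return a copy of PLIST with every KEY/value pair left out.&quot;
  (loop :for (k v) :on plist :by #'cddr
        :unless (eq k key)
          :append (list k v)))

(defun split-options (&amp;rest options &amp;key (reader 'read-line) &amp;allow-other-keys)
  ;; OPTIONS holds the whole plist, :reader included, so strip it back out.
  (values reader (plist-without options :reader)))

(split-options :reader 'read :external-format :utf-8)
;; =&gt; READ, (:EXTERNAL-FORMAT :UTF-8)</code></pre>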
<p>One more thing before we move on: note the extra level of quoting for the
<code>(reader '#'read-line)</code> default value. It's important to remember that this is
a <em>macro</em>, and so when someone writes <code>(do-file (… :reader #'foo) …)</code> the macro
isn't getting the <em>function</em> <code>foo</code> because it's not evaluated yet, it's getting
the <em>list</em> <code>(function foo)</code>. But the default value is <em>evaluated</em> when the
argument is missing, so we need the extra layer of quoting to make sure the
result makes sense and matches what we'd be getting normally.</p>
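<p>A tiny throwaway macro makes the difference visible. <code>reader-form</code>
is a name invented for this sketch; all it does is hand back, as data, whatever
form it received for <code>:reader</code>:</p>
<pre><code>(defmacro reader-form (&amp;key (reader '#'read-line))
  ;; Quote the form we were given so it comes back unevaluated.
  `(quote ,reader))

(reader-form :reader #'read) ; =&gt; (FUNCTION READ)
(reader-form)                ; =&gt; (FUNCTION READ-LINE)</code></pre>
<p>Both calls produce a list, so the rest of the macro can treat the supplied
and the defaulted cases identically.</p>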
<h3 id="s5-macros-using-macros"><a href="index.html#s5-macros-using-macros">Macros Using Macros</a></h3>
<p>We use <code>with-gensyms</code> and <code>once-only</code> from Alexandria to maintain good hygiene
in the macro. We also use <a href="../../../2018/07/fun-with-macros-if-let/index.html"><code>when-let</code></a>
to avoid some more boilerplate:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren2">(<span class="code"></span>)</span>
<span class="paren2">(<span class="code"><i><span class="symbol">alexandria:with-gensyms</span></i> <span class="paren3">(<span class="code">stream</span>)</span>
<span class="paren3">(<span class="code">alexandria:once-only <span class="paren4">(<span class="code">path reader</span>)</span>
`<span class="paren4">(<span class="code">when-let <span class="paren5">(<span class="code"><span class="paren6">(<span class="code">,stream <span class="paren1">(<span class="code">open ,path <span class="keyword">:direction</span> <span class="keyword">:input</span> ,@open-options</span>)</span></span>)</span></span>)</span>
<span class="paren5">(<span class="code"><i><span class="symbol">unwind-protect</span></i>
<span class="paren6">(<span class="code">do …</span>)</span>
<span class="paren6">(<span class="code">close ,stream</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
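<p>If you haven't used <code>once-only</code> before, this sketch shows the
problem it guards against, with the fix written out by hand rather than with
Alexandria. The macro names are made up for the example. It matters for
<code>do-file</code> because <code>reader</code> appears twice in the
expansion:</p>
<pre><code>(defmacro twice-naive (form)
  ;; FORM is spliced in twice, so it gets evaluated twice.
  `(list ,form ,form))

(defmacro twice-once (form)
  ;; Roughly what once-only arranges: evaluate FORM a single time, bind
  ;; the result to a fresh uninterned symbol, and use that symbol instead.
  (let ((value (gensym &quot;VALUE&quot;)))
    `(let ((,value ,form))
       (list ,value ,value))))

(let ((calls 0))
  (twice-naive (incf calls))) ; =&gt; (1 2)

(let ((calls 0))
  (twice-once (incf calls)))  ; =&gt; (1 1)</code></pre>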
<h3 id="s6-don-t-loop"><a href="index.html#s6-don-t-loop">Don't Loop</a></h3>
<p>Finally we get to the meat of the macro:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">,symbol
<span class="paren4">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span>
<span class="paren4">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span></span>)</span></span>)</span>
<span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eq ,symbol ',eof</span>)</span></span>)</span>
,@body</span>)</span></span></code></pre>
<p>Unfortunately we need to use the tedious <code>do</code> instead of <code>loop</code> here to avoid an
annoying bug: if we expanded into a <code>loop</code> call, and the user is calling this
from their <em>own</em> loop, and they use <code>(loop-finish)</code> in the body code, then it
would finish <em>our</em> loop instead of <em>their</em> loop, which would be very confusing.</p>
<p>Imagine the user wrote this very contrived example:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> find-the-cat <span class="paren2">(<span class="code">&amp;rest paths</span>)</span>
<span class="paren2">(<span class="code"><i><span class="symbol">loop</span></i>
<span class="keyword">:with</span> result = nil
<span class="keyword">:for</span> <span class="paren3">(<span class="code">path . remaining</span>)</span> <span class="keyword">:on</span> paths
<span class="keyword">:for</span> i <span class="keyword">:from</span> 1
<span class="keyword">:do</span> <span class="paren3">(<span class="code">do-file <span class="paren4">(<span class="code">line path</span>)</span>
<span class="paren4">(<span class="code">when <span class="paren5">(<span class="code">string= line <span class="string">&quot;meow&quot;</span></span>)</span>
<span class="paren5">(<span class="code">setf result path</span>)</span>
<span class="paren5">(<span class="code">loop-finish</span>)</span></span>)</span></span>)</span> <span class="comment">;; This should obviously go to the finally below.
</span> <span class="keyword">:finally</span>
<span class="paren3">(<span class="code">when result
<span class="paren4">(<span class="code">format t <span class="string">&quot;Found cat after searching ~D files (did not search ~D other~:P).&quot;</span>
i <span class="paren5">(<span class="code">length remaining</span>)</span></span>)</span>
<span class="paren4">(<span class="code">return result</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>If <code>do-file</code> expanded into a <code>loop</code> form, then the <code>(loop-finish)</code> would only
terminate <em>that</em> loop.</p>
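<p>You can check this behaviour without any files at all. In the sketch below
the inner <code>loop</code> and the inner <code>do</code> are stand-ins for the
two possible expansions of <code>do-file</code>, and the function names are
invented for the example:</p>
<pre><code>(defun outer-iterations-with-inner-loop ()
  (let ((count 0))
    (loop :repeat 3
          :do (incf count)
              ;; Stand-in for a do-file that expanded into LOOP.
              (loop :repeat 5
                    :do (loop-finish)))  ; finishes only the inner loop
    count))

(defun outer-iterations-with-inner-do ()
  (let ((count 0))
    (loop :repeat 3
          :do (incf count)
              ;; Stand-in for the real expansion, which uses DO.
              (do ((i 0 (1+ i)))
                  ((= i 5))
                (loop-finish)))          ; finishes the outer loop
    count))

(outer-iterations-with-inner-loop) ; =&gt; 3
(outer-iterations-with-inner-do)   ; =&gt; 1</code></pre>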
<p>The same issue kind of applies with the implicit block named <code>nil</code> around <code>do</code>.
But this is much less surprising for a macro named <code>do-…</code>, and we've documented
it in the docstring, so that's probably okay.</p>
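<p>Here's a small sketch of that capture, again with a plain <code>do</code>
standing in for a <code>do-file</code> call. <code>return-demo</code> is
a hypothetical name:</p>
<pre><code>(defun return-demo ()
  (let ((visited '()))
    (dolist (x '(a b c))
      (push x visited)
      ;; Stand-in for a do-file call inside the user's own iteration.
      (do ((i 0 (1+ i)))
          ((= i 5))
        (return)))   ; leaves the DO, not the surrounding DOLIST
    (reverse visited)))

(return-demo) ; =&gt; (A B C)</code></pre>
<p>If the user really does want to leave the outer iteration from inside the
body, wrapping it in a named <code>block</code> and using
<code>return-from</code> sidesteps the problem.</p>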
<h3 id="s7-repetition-allergies"><a href="index.html#s7-repetition-allergies">Repetition Allergies</a></h3>
<p>Using <code>do</code> here is a little annoying because the init form and the step form are
exactly the same. If you're allergic to repeating yourself you could use <code>#n=</code>
and <code>#n#</code> reader macros to get around it:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">,symbol #1=<span class="paren4">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span> #1#</span>)</span></span>)</span>
<span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eq ,symbol ',eof</span>)</span></span>)</span>
,@body</span>)</span></span></code></pre>
<p>I find this more confusing than helpful, but to each their own.</p>
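<p>If the syntax is unfamiliar: <code>#1=</code> labels the object that
follows it, and <code>#1#</code> makes the reader drop in that very same
object again. A quick sketch to confirm it (<code>label-demo</code> is a
made-up name):</p>
<pre><code>(defun label-demo ()
  (let ((shared '(#1=(funcall reader stream) #1#))
        (fresh (list (list 'funcall 'reader 'stream)
                     (list 'funcall 'reader 'stream))))
    ;; The labelled elements are one object; the fresh lists only look alike.
    (list (eq (first shared) (second shared))
          (eq (first fresh) (second fresh)))))

(label-demo) ; =&gt; (T NIL)</code></pre>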
<h2 id="s8-result"><a href="index.html#s8-result">Result</a></h2>
<p>We've got a nice little macro for easily iterating over files piece by piece.
It can take any reader function that conforms to the usual <code>(read-foo stream
eof-error-p eof-value)</code> interface, which means we can write our own reader
functions that will compose nicely with the macro.</p>
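<p>As an example, here's one way a reader like the <code>read-integer</code>
used earlier could be written. This definition is a guess for illustration and
assumes one integer per line. <code>sum-integers</code> is a made-up helper
that drives it with the same kind of <code>do</code> loop the macro expands
into:</p>
<pre><code>(defun read-integer (stream &amp;optional (eof-error-p t) eof-value)
  &quot;Read one line from STREAM and parse it as an integer.&quot;
  (let ((line (read-line stream eof-error-p eof-value)))
    (if (eq line eof-value)
        eof-value              ; pass the caller's sentinel straight through
        (parse-integer line))))

(defun sum-integers (string)
  (let ((eof (gensym &quot;EOF&quot;))
        (sum 0))
    (with-input-from-string (stream string)
      (do ((n (read-integer stream nil eof)
              (read-integer stream nil eof)))
          ((eq n eof) sum)
        (incf sum n)))))

(sum-integers (format nil &quot;1~%2~%3~%&quot;)) ; =&gt; 6</code></pre>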
<p>We'll end with an exercise for the reader: figure out how to support
declarations correctly. For example:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do-file <span class="paren2">(<span class="code">n <span class="string">&quot;numbers.txt&quot;</span> <span class="keyword">:reader</span> #'read-fixnum</span>)</span>
<span class="paren2">(<span class="code">declare <span class="paren3">(<span class="code">type fixnum n</span>)</span></span>)</span>
<span class="paren2">(<span class="code">when <span class="paren3">(<span class="code">primep n</span>)</span>
<span class="paren3">(<span class="code">collect <span class="paren4">(<span class="code">* n n</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>Hint: you'll need to deal with the sentinel value a bit differently so it
doesn't contaminate the type of the bound variable.</p>
</article></main><hr class='main-separator' /><footer><nav><a href='https://github.com/sjl/'>GitHub</a><a href='https://twitter.com/stevelosh/'>Twitter</a><a href='https://instagram.com/thirtytwobirds/'>Instagram</a><a href='https://hg.stevelosh.com/.plan/'>.plan</a></nav></footer></body></html>