<!DOCTYPE html>
<html lang='en'><head><meta charset='utf-8' /><meta name='pinterest' content='nopin' /><link href='../../../../static/css/style.css' rel='stylesheet' type='text/css' /><link href='../../../../static/css/print.css' rel='stylesheet' type='text/css' media='print' /><title>Fun with Macros: Do-File / Steve Losh</title></head><body><header><a id='logo' href='https://stevelosh.com/'>Steve Losh</a><nav><a href='../../../index.html'>Blog</a> - <a href='https://stevelosh.com/projects/'>Projects</a> - <a href='https://stevelosh.com/photography/'>Photography</a> - <a href='https://stevelosh.com/links/'>Links</a> - <a href='https://stevelosh.com/rss.xml'>Feed</a></nav></header><hr class='main-separator' /><main id='page-blog-entry'><article><h1><a href='index.html'>Fun with Macros: Do-File</a></h1><p class='date'>Posted on April 19th, 2022.</p><p>It's been a while, but it's time to take a look at another fun little Common
Lisp macro with some interesting things inside it: <code>do-file</code>.</p>
<ol class="table-of-contents"><li><a href="index.html#s1-usage">Usage</a></li><li><a href="index.html#s2-implementation">Implementation</a><ol><li><a href="index.html#s3-let-over-defmacro">Let Over Defmacro</a></li><li><a href="index.html#s4-rest-and-key">&rest and &key</a></li><li><a href="index.html#s5-macros-using-macros">Macros Using Macros</a></li><li><a href="index.html#s6-don-t-loop">Don't Loop</a></li><li><a href="index.html#s7-repetition-allergies">Repetition Allergies</a></li></ol></li><li><a href="index.html#s8-result">Result</a></li></ol>
<h2 id="s1-usage"><a href="index.html#s1-usage">Usage</a></h2>
<p>The macro we'll be taking a look at today is called <code>do-file</code>. It's used to
open a file and iterate over the contents using a reader function, saving you
some tedious boilerplate.</p>
<p>First let's look at some examples of how you could use it. Processing each
line of a file is the default:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do-file <span class="paren2">(<span class="code">line <span class="string">"foo.txt"</span></span>)</span>
  <span class="paren2">(<span class="code">unless <span class="paren3">(<span class="code">string= <span class="string">""</span> line</span>)</span>
    <span class="paren3">(<span class="code">write-line <span class="paren4">(<span class="code">string-upcase line</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>Using a different reader function and <a href="../../../2018/05/fun-with-macros-gathering/index.html">another
macro</a> to gather data from inside the
iteration:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">gathering
  <span class="paren2">(<span class="code">do-file <span class="paren3">(<span class="code">n <span class="string">"numbers.txt"</span> <span class="keyword">:reader</span> #'read-integer</span>)</span>
    <span class="paren3">(<span class="code">when <span class="paren4">(<span class="code">primep n</span>)</span>
      <span class="paren4">(<span class="code">gather n</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>Passing along options to the underlying <code>open</code>, and returning early:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do-file <span class="paren2">(<span class="code">form <span class="string">"foo.lisp"</span> <span class="keyword">:reader</span> #'read <span class="keyword">:external-format</span> <span class="keyword">:EBCDIC-US</span></span>)</span>
  <span class="paren2">(<span class="code">when <span class="paren3">(<span class="code">eq form <span class="keyword">:stop</span></span>)</span>
    <span class="paren3">(<span class="code">return <span class="keyword">:stopped-early</span></span>)</span></span>)</span>
  <span class="paren2">(<span class="code">print form</span>)</span></span>)</span></span></code></pre>
<p>All of these could of course be done in other ways. You could have a separate
function that reads the file into a sequence and then pass that to <code>mapcar</code> or
something else, but it can be wasteful to cons up the entire list if you're only
going to process items and don't need to retain them (or if you're going to stop
early).</p>
<p>You could also write a <code>mapc-file</code> that takes a function instead of making this
a macro, but sometimes it's nice to not have to wrap things in a thunk. It's
probably worth having that function as an additional tool in the toolbox though!</p>
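<p>For the curious, here's a rough sketch of what such a <code>mapc-file</code> might look
like. This is an illustration and not code from any library: the argument order
is an assumption, and the <code>:reader</code> key is filtered out by hand so the sketch
doesn't depend on Alexandria.</p>

<pre><code>;; Hypothetical function-based counterpart to do-file.
(let ((eof (gensym "EOF")))
  (defun mapc-file (function path &rest open-options
                    &key (reader #'read-line) &allow-other-keys)
    "Call FUNCTION on each item READER reads from the file at PATH.  Returns NIL."
    ;; Drop :reader (it is ours); everything else is handed to OPEN.
    (let* ((open-options (loop :for (key value) :on open-options :by #'cddr
                               :unless (eq key :reader)
                                 :append (list key value)))
           (stream (apply #'open path :direction :input open-options)))
      ;; OPEN can return NIL, e.g. with :if-does-not-exist nil.
      (when stream
        (unwind-protect
             (do ((item (funcall reader stream nil eof)
                        (funcall reader stream nil eof)))
                 ((eq item eof))
               (funcall function item))
          (close stream))))))</code></pre>

<p>You'd call it with something like <code>(mapc-file (lambda (line) (write-line (string-upcase line))) "foo.txt")</code>.</p>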
<h2 id="s2-implementation"><a href="index.html#s2-implementation">Implementation</a></h2>
<p>Here's the full implementation of the macro:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eof <span class="paren4">(<span class="code">gensym <span class="string">"EOF"</span></span>)</span></span>)</span></span>)</span>
  <span class="paren2">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">symbol path
                      &rest open-options
                      &key <span class="paren5">(<span class="code">reader '#'read-line</span>)</span> &allow-other-keys</span>)</span>
                     &body body</span>)</span>
    <span class="string">"Iterate over the contents of the file at `path` using `reader`.

    During iteration, `symbol` will be set to successive values read from the
    file by `reader`.

    `reader` can be any function that conforms to the usual reading interface,
    i.e. anything that can handle `(read-foo stream eof-error-p eof-value)`.

    Any keyword arguments other than `:reader` will be passed along to `open`.

    If `nil` is used for one of the `:if-…` options to `open` and this results
    in `open` returning `nil`, no iteration will take place.

    An implicit block named `nil` surrounds the iteration, so `return` can be
    used to terminate early.

    Returns `nil`.

    Examples:

      (do-file (line </span><span class="string">\"</span><span class="string">foo.txt</span><span class="string">\"</span><span class="string">)
        (print line))

      (do-file (form </span><span class="string">\"</span><span class="string">foo.lisp</span><span class="string">\"</span><span class="string"> :reader #'read :external-format :EBCDIC-US)
        (when (eq form :stop)
          (return :stopped-early))
        (print form))

      (do-file (line </span><span class="string">\"</span><span class="string">does-not-exist.txt</span><span class="string">\"</span><span class="string"> :if-does-not-exist nil)
        (this-will-not-be-executed))

    "</span>
    <span class="paren3">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren4">(<span class="code"><span class="paren5">(<span class="code">open-options <span class="paren6">(<span class="code">alexandria:remove-from-plist open-options <span class="keyword">:reader</span></span>)</span></span>)</span></span>)</span>
      <span class="paren4">(<span class="code"><i><span class="symbol">alexandria:with-gensyms</span></i> <span class="paren5">(<span class="code">stream</span>)</span>
        <span class="paren5">(<span class="code">alexandria:once-only <span class="paren6">(<span class="code">path reader</span>)</span>
          `<span class="paren6">(<span class="code">when-let <span class="paren1">(<span class="code"><span class="paren2">(<span class="code">,stream <span class="paren3">(<span class="code">open ,path <span class="keyword">:direction</span> <span class="keyword">:input</span> ,@open-options</span>)</span></span>)</span></span>)</span>
             <span class="paren1">(<span class="code"><i><span class="symbol">unwind-protect</span></i>
                 <span class="paren2">(<span class="code">do <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">,symbol
                       <span class="paren5">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span>
                       <span class="paren5">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span></span>)</span></span>)</span>
                     <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">eq ,symbol ',eof</span>)</span></span>)</span>
                   ,@body</span>)</span>
               <span class="paren2">(<span class="code">close ,stream</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>There are a few interesting things to talk about here.</p>
<h3 id="s3-let-over-defmacro"><a href="index.html#s3-let-over-defmacro">Let Over Defmacro</a></h3>
<p>The very first line is unusual: instead of the <code>defmacro</code> being the top level
form, we wrap it in a <code>let</code> to generate one single unique EOF sentinel object:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eof <span class="paren4">(<span class="code">gensym <span class="string">"EOF"</span></span>)</span></span>)</span></span>)</span>
  <span class="paren2">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren3">(<span class="code">…</span>)</span>
    …</span>)</span></span>)</span></span></code></pre>
<p>We could put the <code>let</code> inside the macro, but then we'd be generating a separate
EOF object for every use of the macro, which is wasteful.</p>
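<p>To see the difference, here's a tiny sketch with two toy macros (both names
are made up for this illustration). It's meant to be evaluated form by form at
the REPL or loaded from source, since a <code>defmacro</code> wrapped in a <code>let</code> is not
a top level form as far as the file compiler is concerned.</p>

<pre><code>;; One sentinel, created once when the LET form is evaluated.
(let ((sentinel (gensym "EOF")))
  (defmacro shared-sentinel ()
    `',sentinel))

;; A new sentinel, created every time the macro is expanded.
(defmacro fresh-sentinel ()
  (let ((sentinel (gensym "EOF")))
    `',sentinel))

(eq (shared-sentinel) (shared-sentinel)) ; T: both expansions quote the same symbol
(eq (fresh-sentinel) (fresh-sentinel))   ; NIL: each expansion made its own symbol</code></pre>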
<h3 id="s4-rest-and-key"><a href="index.html#s4-rest-and-key">&rest and &key</a></h3>
<p>Note how the argument list of the macro takes both <code>&rest</code> and <code>&key</code> arguments, and uses
<code>&allow-other-keys</code> to let the macro take arbitrary keyword arguments:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">symbol path
                    &rest open-options
                    &key <span class="paren4">(<span class="code">reader '#'read-line</span>)</span> &allow-other-keys</span>)</span>
                   &body body</span>)</span>
  <span class="paren2">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">open-options <span class="paren5">(<span class="code">alexandria:remove-from-plist open-options <span class="keyword">:reader</span></span>)</span></span>)</span></span>)</span>
    …
    <span class="paren3">(<span class="code">when-let <span class="paren4">(<span class="code"><span class="paren5">(<span class="code">,stream <span class="paren6">(<span class="code">open ,path <span class="keyword">:direction</span> <span class="keyword">:input</span> ,@open-options</span>)</span></span>)</span></span>)</span>
      …</span>)</span></span>)</span></span>)</span></span></code></pre>
<p>We pass along any keyword arguments we get (aside from the special <code>:reader</code>
argument for this macro) to <code>open</code>. Using <code>&allow-other-keys</code> means we don't
need to hardcode all the possible options to <code>open</code>, and also allows for
additional implementation-specific options to be passed to <code>open</code> if the user
wants.</p>
<p>We could have omitted the keyword arguments entirely, taken the arguments as
a raw <code>&rest</code>, and pulled out <code>:reader</code> ourselves with <code>getf</code>. But doing it
this way means we don't have to fiddle around doing that, and it can also
provide slightly nicer documentation in an editor when it shows the macro's
argument list in the status bar. We'll also get a nicer error if we
accidentally pass an odd number of keyword arguments.</p>
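<p>As a small side-by-side sketch of the two styles, here are two toy macros
(hypothetical names, not part of <code>do-file</code>) that simply return what they
received. With well-formed arguments they behave the same; the difference is
that the first one lets the lambda list do the parsing.</p>

<pre><code>;; &rest and &key together: OPTIONS holds the whole plist, READER is picked out for us.
(defmacro options-with-key ((&rest options
                             &key (reader '#'read-line) &allow-other-keys))
  `(list ',reader ',options))

;; Raw &rest: we dig :reader out of the plist by hand.
(defmacro options-with-getf ((&rest options))
  `(list ',(getf options :reader '#'read-line) ',options))

;; Both of these return a list of the reader form and the full plist:
(options-with-key (:reader #'read :external-format :utf-8))
(options-with-getf (:reader #'read :external-format :utf-8))</code></pre>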
<p>One more thing before we move on: note the extra level of quoting for the
<code>(reader '#'read-line)</code> default value. It's important to remember that this is
a <em>macro</em>, and so when someone writes <code>(do-file (… :reader #'foo) …)</code> the macro
isn't getting the <em>function</em> <code>foo</code> because it's not evaluated yet; it's getting
the <em>list</em> <code>(function foo)</code>. But the default value is <em>evaluated</em> when the
argument is missing, so we need the extra layer of quoting to make sure the
result makes sense and matches what we'd be getting normally.</p>
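<p>If the quoting is hard to picture, this little sketch might help. The two
macros are invented for the demonstration and just hand back whatever ended up
in <code>reader</code>, so we can inspect it.</p>

<pre><code>;; Default quoted once more, like do-file does.
(defmacro show-reader ((&key (reader '#'read-line)))
  `(quote ,reader))

;; Default NOT quoted: the default form is evaluated at macroexpansion time.
(defmacro show-reader-unquoted ((&key (reader #'read-line)))
  `(quote ,reader))

(show-reader (:reader #'read))       ; the list (FUNCTION READ)
(show-reader ())                     ; the list (FUNCTION READ-LINE)
(show-reader-unquoted ())            ; an actual function object
(consp (show-reader ()))             ; T
(functionp (show-reader-unquoted ())) ; T</code></pre>

<p>With the extra quote, the macro sees the same kind of thing (a list) whether
or not the caller supplied <code>:reader</code>.</p>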
<h3 id="s5-macros-using-macros"><a href="index.html#s5-macros-using-macros">Macros Using Macros</a></h3>
<p>We use <code>with-gensyms</code> and <code>once-only</code> from Alexandria to maintain good hygiene
in the macro. We also use <a href="../../../2018/07/fun-with-macros-if-let/index.html"><code>when-let</code></a>
to avoid some more boilerplate:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defmacro</span></i> do-file <span class="paren2">(<span class="code">…</span>)</span>
  <span class="paren2">(<span class="code"><i><span class="symbol">alexandria:with-gensyms</span></i> <span class="paren3">(<span class="code">stream</span>)</span>
    <span class="paren3">(<span class="code">alexandria:once-only <span class="paren4">(<span class="code">path reader</span>)</span>
      `<span class="paren4">(<span class="code">when-let <span class="paren5">(<span class="code"><span class="paren6">(<span class="code">,stream <span class="paren1">(<span class="code">open ,path <span class="keyword">:direction</span> <span class="keyword">:input</span> ,@open-options</span>)</span></span>)</span></span>)</span>
         <span class="paren5">(<span class="code"><i><span class="symbol">unwind-protect</span></i>
             <span class="paren6">(<span class="code">do …</span>)</span>
           <span class="paren6">(<span class="code">close ,stream</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<h3 id="s6-don-t-loop"><a href="index.html#s6-don-t-loop">Don't Loop</a></h3>
<p>Finally we get to the meat of the macro:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">,symbol
      <span class="paren4">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span>
      <span class="paren4">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span></span>)</span></span>)</span>
    <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eq ,symbol ',eof</span>)</span></span>)</span>
  ,@body</span>)</span></span></code></pre>
<p>Unfortunately we need to use the tedious <code>do</code> instead of <code>loop</code> here to avoid an
annoying bug: if we expanded into a <code>loop</code> call, and the user is calling this
from their <em>own</em> loop, and they use <code>(loop-finish)</code> in the body code, then it
would finish <em>our</em> loop instead of <em>their</em> loop, which would be very confusing.</p>
<p>Imagine the user wrote this very contrived example:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> find-the-cat <span class="paren2">(<span class="code">&rest paths</span>)</span>
  <span class="paren2">(<span class="code"><i><span class="symbol">loop</span></i>
    <span class="keyword">:with</span> result = nil
    <span class="keyword">:for</span> <span class="paren3">(<span class="code">path . remaining</span>)</span> <span class="keyword">:on</span> paths
    <span class="keyword">:for</span> i <span class="keyword">:from</span> 1
    <span class="keyword">:do</span> <span class="paren3">(<span class="code">do-file <span class="paren4">(<span class="code">line path</span>)</span>
          <span class="paren4">(<span class="code">when <span class="paren5">(<span class="code">string= line <span class="string">"meow"</span></span>)</span>
            <span class="paren5">(<span class="code">setf result path</span>)</span>
            <span class="paren5">(<span class="code">loop-finish</span>)</span></span>)</span></span>)</span> <span class="comment">;; This should obviously go to the finally below.
   </span> <span class="keyword">:finally</span>
    <span class="paren3">(<span class="code">when result
      <span class="paren4">(<span class="code">format t <span class="string">"Found cat after searching ~D files (did not search ~D other~:P)."</span>
              i <span class="paren5">(<span class="code">length remaining</span>)</span></span>)</span>
      <span class="paren4">(<span class="code">return result</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>If <code>do-file</code> expanded into a <code>loop</code> form, then the <code>(loop-finish)</code> would only
terminate <em>that</em> loop.</p>
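<p>Here's a self-contained sketch of the problem that doesn't need any files.
The two iteration macros are toys invented for this demonstration: one expands
into <code>loop</code>, the other into <code>do</code>. The user's code is identical in both
functions, but <code>(loop-finish)</code> ends up meaning different things.</p>

<pre><code>;; Toy macro that expands into LOOP (the problematic choice).
(defmacro each-via-loop ((var list) &body body)
  `(loop :for ,var :in ,list :do (progn ,@body)))

;; Toy macro that expands into DO, so it doesn't rebind LOOP-FINISH.
(defmacro each-via-do ((var list) &body body)
  (let ((tail (gensym "TAIL")))
    `(do ((,tail ,list (rest ,tail)))
         ((endp ,tail))
       (let ((,var (first ,tail)))
         ,@body))))

(defun rows-visited-with-loop ()
  (loop :with visited = 0
        :for row :in '((1 2) (3 4) (5 6))
        :do (incf visited)
            (each-via-loop (x row)
              (when (= x 3) (loop-finish))) ; only finishes the hidden inner loop
        :finally (return visited)))

(defun rows-visited-with-do ()
  (loop :with visited = 0
        :for row :in '((1 2) (3 4) (5 6))
        :do (incf visited)
            (each-via-do (x row)
              (when (= x 3) (loop-finish))) ; finishes the user's own loop
        :finally (return visited)))

(rows-visited-with-loop) ; 3: the outer loop kept going after the 3 was found
(rows-visited-with-do)   ; 2: the outer loop stopped at the second row</code></pre>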
<p>The same issue kind of applies with the implicit block named <code>nil</code> around <code>do</code>.
But this is much less surprising for a macro named <code>do-…</code>, and we've documented
it in the docstring, so that's probably okay.</p>
<h3 id="s7-repetition-allergies"><a href="index.html#s7-repetition-allergies">Repetition Allergies</a></h3>
<p>Using <code>do</code> here is a little annoying because the init form and the step form are
exactly the same. If you're allergic to repeating yourself you could use <code>#n=</code>
and <code>#n#</code> reader macros to get around it:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">,symbol #1=<span class="paren4">(<span class="code">funcall ,reader ,stream nil ',eof</span>)</span> #1#</span>)</span></span>)</span>
    <span class="paren2">(<span class="code"><span class="paren3">(<span class="code">eq ,symbol ',eof</span>)</span></span>)</span>
  ,@body</span>)</span></span></code></pre>
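<p>If you haven't run into <code>#n=</code> and <code>#n#</code> before: the reader labels the
object after <code>#1=</code> and then drops that very same object in wherever <code>#1#</code>
appears. A quick way to convince yourself, using a quoted list and a made-up
variable name:</p>

<pre><code>;; *FORMS* is a two-element list whose elements are the same object.
(defparameter *forms* '(#1=(funcall reader stream nil eof) #1#))

(equal (first *forms*) '(funcall reader stream nil eof)) ; T
(eq (first *forms*) (second *forms*))                    ; T: one list, referenced twice</code></pre>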
<p>I find this more confusing than helpful, but to each their own.</p>
<h2 id="s8-result"><a href="index.html#s8-result">Result</a></h2>
<p>We've got a nice little macro for easily iterating over files piece by piece.
It can take any reader function that conforms to the usual <code>(read-foo stream
eof-error-p eof-value)</code> interface, which means we can write our own reader
functions that will compose nicely with the macro.</p>
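<p>As an example of such a reader, here's a minimal sketch of one that reads
a line and parses it as an integer. The name <code>read-integer-line</code> is made up
for this illustration, and it assumes every line of the input holds exactly one
integer (a blank line would make <code>parse-integer</code> signal an error).</p>

<pre><code>;; Hypothetical reader function following the (stream eof-error-p eof-value) convention.
(defun read-integer-line (stream &optional (eof-error-p t) eof-value)
  "Read one line from STREAM and parse it as an integer."
  (let ((line (read-line stream eof-error-p eof-value)))
    ;; READ-LINE hands back EOF-VALUE itself at end of file, so pass it through untouched.
    (if (eq line eof-value)
        eof-value
        (parse-integer line))))</code></pre>

<p>Because it passes the EOF value through unchanged, something like
<code>(do-file (n "numbers.txt" :reader #'read-integer-line) …)</code> should work with
the macro as written.</p>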
<p>We'll end with an exercise for the reader: figure out how to support
declarations correctly. For example:</p>
<pre><code><span class="code"><span class="paren1">(<span class="code">do-file <span class="paren2">(<span class="code">n <span class="string">"numbers.txt"</span> <span class="keyword">:reader</span> #'read-fixnum</span>)</span>
  <span class="paren2">(<span class="code">declare <span class="paren3">(<span class="code">type fixnum n</span>)</span></span>)</span>
  <span class="paren2">(<span class="code">when <span class="paren3">(<span class="code">primep n</span>)</span>
    <span class="paren3">(<span class="code">collect <span class="paren4">(<span class="code">* n n</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
<p>Hint: you'll need to deal with the sentinel value a bit differently so it
doesn't contaminate the type of the bound variable.</p>
</article></main><hr class='main-separator' /><footer><nav><a href='https://github.com/sjl/'>GitHub</a> ・ <a href='https://twitter.com/stevelosh/'>Twitter</a> ・ <a href='https://instagram.com/thirtytwobirds/'>Instagram</a> ・ <a href='https://hg.stevelosh.com/.plan/'>.plan</a></nav></footer></body></html>