emacs.d/clones/lisp/gigamonkeys.com/book/macros-defining-your-own.html

558 lines
No EOL
42 KiB
HTML

<HTML><HEAD><TITLE>Macros: Defining Your Own</TITLE><LINK REL="stylesheet" TYPE="text/css" HREF="style.css"/></HEAD><BODY><DIV CLASS="copyright">Copyright &copy; 2003-2005, Peter Seibel</DIV><H1>8. Macros: Defining Your Own</H1><P>Now it's time to start writing your own macros. The standard macros I
covered in the previous chapter hint at some of the things you can do
with macros, but that's just the beginning. Common Lisp doesn't
support macros so every Lisp programmer can create their own variants
of standard control constructs any more than C supports functions so
every C programmer can write trivial variants of the functions in the
C standard library. Macros are part of the language to allow you to
create abstractions on top of the core language and standard library
that move you closer toward being able to directly express the things
you want to express.</P><P>Perhaps the biggest barrier to a proper understanding of macros is,
ironically, that they're so well integrated into the language. In
many ways they seem like just a funny kind of function--they're
written in Lisp, they take arguments and return results, and they
allow you to abstract away distracting details. Yet despite these
many similarities, macros operate at a different level than functions
and create a totally different kind of abstraction.</P><P>Once you understand the difference between macros and functions, the
tight integration of macros in the language will be a huge benefit.
But in the meantime, it's a frequent source of confusion for new
Lispers. The following story, while not true in a historical or
technical sense, tries to alleviate the confusion by giving you a way
to think about how macros work. </P><A NAME="the-story-of-mac-a-just-so-story"><H2>The Story of Mac: A Just-So Story</H2></A><P>Once upon a time, long ago, there was a company of Lisp programmers.
It was so long ago, in fact, that Lisp had no macros. Anything that
couldn't be defined with a function or done with a special operator
had to be written in full every time, which was rather a drag.
Unfortunately, the programmers in this company--though
brilliant--were also quite lazy. Often in the middle of their
programs--when the tedium of writing a bunch of code got to be too
much--they would instead write a note describing the code they needed
to write at that place in the program. Even more unfortunately,
because they were lazy, the programmers also hated to go back and
actually write the code described by the notes. Soon the company had
a big stack of programs that nobody could run because they were full
of notes about code that still needed to be written.</P><P>In desperation, the big bosses hired a junior programmer, Mac, whose
job was to find the notes, write the required code, and insert it
into the program in place of the notes. Mac never ran the
programs--they weren't done yet, of course, so he couldn't. But even
if they had been completed, Mac wouldn't have known what inputs to
feed them. So he just wrote his code based on the contents of the
notes and sent it back to the original programmer.</P><P>With Mac's help, all the programs were soon completed, and the
company made a ton of money selling them--so much money that the
company could double the size of its programming staff. But for some
reason no one thought to hire anyone to help Mac; soon he was single-
handedly assisting several dozen programmers. To avoid spending all
his time searching for notes in source code, Mac made a small
modification to the compiler the programmers used. Thereafter,
whenever the compiler hit a note, it would e-mail him the note and
wait for him to e-mail back the replacement code. Unfortunately, even
with this change, Mac had a hard time keeping up with the
programmers. He worked as carefully as he could, but sometimes--
especially when the notes weren't clear--he would make mistakes.</P><P>The programmers noticed, however, that the more precisely they wrote
their notes, the more likely it was that Mac would send back correct
code. One day, one of the programmers, having a hard time describing
in words the code he wanted, included in one of his notes a Lisp
program that would generate the code he wanted. That was fine by Mac;
he just ran the program and sent the result to the compiler.</P><P>The next innovation came when a programmer put a note at the top of
one of his programs containing a function definition and a comment
that said, &quot;Mac, don't write any code here, but keep this function
for later; I'm going to use it in some of my other notes.&quot; Other
notes in the same program said things such as, &quot;Mac, replace this
note with the result of running that other function with the symbols
<CODE>x</CODE> and <CODE>y</CODE> as arguments.&quot;</P><P>This technique caught on so quickly that within a few days, most
programs contained dozens of notes defining functions that were only
used by code in other notes. To make it easy for Mac to pick out the
notes containing only definitions that didn't require any immediate
response, the programmers tagged them with the standard preface:
&quot;Definition for Mac, Read Only.&quot; This--as the programmers were still
quite lazy--was quickly shortened to &quot;DEF. MAC. R/O&quot; and then
&quot;DEFMACRO.&quot;</P><P>Pretty soon, there was no actual English left in the notes for Mac.
All he did all day was read and respond to e-mails from the compiler
containing DEFMACRO notes and calls to the functions defined in the
DEFMACROs. Since the Lisp programs in the notes did all the real
work, keeping up with the e-mails was no problem. Mac suddenly had a
lot of time on his hands and would sit in his office daydreaming
about white-sand beaches, clear blue ocean water, and drinks with
little paper umbrellas in them.</P><P>Several months later the programmers realized nobody had seen Mac for
quite some time. When they went to his office, they found a thin
layer of dust over everything, a desk littered with travel brochures
for various tropical locations, and the computer off. But the
compiler still worked--how could it be? It turned out Mac had made
one last change to the compiler: instead of e-mailing notes to Mac,
the compiler now saved the functions defined by DEFMACRO notes and
ran them when called for by the other notes. The programmers decided
there was no reason to tell the big bosses Mac wasn't coming to the
office anymore. So to this day, Mac draws a salary and from time to
time sends the programmers a postcard from one tropical locale or
another.</P><A NAME="macro-expansion-time-vs-runtime"><H2>Macro Expansion Time vs. Runtime</H2></A><P>The key to understanding macros is to be quite clear about the
distinction between the code that generates code (macros) and the
code that eventually makes up the program (everything else). When you
write macros, you're writing programs that will be used by the
compiler to generate the code that will then be compiled. Only after
all the macros have been fully expanded and the resulting code
compiled can the program actually be run. The time when macros run is
called <I>macro expansion time</I>; this is distinct from <I>runtime</I>,
when regular code, including the code generated by macros, runs.</P><P>It's important to keep this distinction firmly in mind because code
running at macro expansion time runs in a very different environment
than code running at runtime. Namely, at macro expansion time,
there's no way to access the data that will exist at runtime. Like
Mac, who couldn't run the programs he was working on because he
didn't know what the correct inputs were, code running at macro
expansion time can deal only with the data that's inherent in the
source code. For instance, suppose the following source code appears
somewhere in a program:</P><PRE>(defun foo (x)
(when (&gt; x 10) (print 'big)))</PRE><P>Normally you'd think of <CODE>x</CODE> as a variable that will hold the
argument passed in a call to <CODE>foo</CODE>. But at macro expansion time,
such as when the compiler is running the <CODE><B>WHEN</B></CODE> macro, the only
data available is the source code. Since the program isn't running
yet, there's no call to <CODE>foo</CODE> and thus no value associated with
<CODE>x</CODE>. Instead, the values the compiler passes to <CODE><B>WHEN</B></CODE> are
the Lisp lists representing the source code, namely, <CODE>(&gt; x 10)</CODE>
and <CODE>(print 'big)</CODE>. Suppose that <CODE><B>WHEN</B></CODE> is defined, as you
saw in the previous chapter, with something like the following macro:</P><PRE>(defmacro when (condition &amp;rest body)
`(if ,condition (progn ,@body)))</PRE><P>When the code in <CODE>foo</CODE> is compiled, the <CODE><B>WHEN</B></CODE> macro will be
run with those two forms as arguments. The parameter <CODE>condition</CODE>
will be bound to the form <CODE>(&gt; x 10)</CODE>, and the form <CODE>(print
'big)</CODE> will be collected into a list that will become the value of
the <CODE><B>&amp;rest</B></CODE> <CODE>body</CODE> parameter. The backquote expression will
then generate this code:</P><PRE>(if (&gt; x 10) (progn (print 'big)))</PRE><P>by interpolating in the value of <CODE>condition</CODE> and splicing the
value of <CODE>body</CODE> into the <CODE><B>PROGN</B></CODE>.</P><P>When Lisp is interpreted, rather than compiled, the distinction
between macro expansion time and runtime is less clear because
they're temporally intertwined. Also, the language standard doesn't
specify exactly how an interpreter must handle macros--it could
expand all the macros in the form being interpreted and then
interpret the resulting code, or it could start right in on
interpreting the form and expand macros when it hits them. In either
case, macros are always passed the unevaluated Lisp objects
representing the subforms of the macro form, and the job of the macro
is still to produce code that will do something rather than to do
anything directly. </P><A NAME="defmacro"><H2>DEFMACRO</H2></A><P>As you saw in Chapter 3, macros really are defined with <CODE><B>DEFMACRO</B></CODE>
forms, though it stands--of course--for DEFine MACRO, not Definition
for Mac. The basic skeleton of a <CODE><B>DEFMACRO</B></CODE> is quite similar to
the skeleton of a <CODE><B>DEFUN</B></CODE>.</P><PRE>(defmacro <I>name</I> (<I>parameter</I>*)
&quot;Optional documentation string.&quot;
<I>body-form</I>*)</PRE><P>Like a function, a macro consists of a name, a parameter list, an
optional documentation string, and a body of Lisp
expressions.<SUP>1</SUP>
However, as I just discussed, the job of a macro isn't to do anything
directly--its job is to generate code that will later do what you
want.</P><P>Macros can use the full power of Lisp to generate their expansion,
which means in this chapter I can only scratch the surface of what
you can do with macros. I can, however, describe a general process
for writing macros that works for all macros from the simplest to the
most complex. </P><P>The job of a macro is to translate a macro form--in other words, a
Lisp form whose first element is the name of the macro--into code
that does a particular thing. Sometimes you write a macro starting
with the code you'd like to be able to write, that is, with an
example macro form. Other times you decide to write a macro after
you've written the same pattern of code several times and realize you
can make your code clearer by abstracting the pattern.</P><P>Regardless of which end you start from, you need to figure out the
other end before you can start writing a macro: you need to know both
where you're coming from and where you're going before you can hope
to write code to do it automatically. Thus, the first step of writing
a macro is to write at least one example of a call to the macro and
the code into which that call should expand.</P><P>Once you have an example call and the desired expansion, you're ready
for the second step: writing the actual macro code. For simple macros
this will be a trivial matter of writing a backquoted template with
the macro parameters plugged into the right places. Complex macros
will be significant programs in their own right, complete with helper
functions and data structures.</P><P>After you've written code to translate the example call to the
appropriate expansion, you need to make sure the abstraction the
macro provides doesn't &quot;leak&quot; details of its implementation. Leaky
macro abstractions will work fine for certain arguments but not
others or will interact with code in the calling environment in
undesirable ways. As it turns out, macros can leak in a small handful
of ways, all of which are easily avoided as long as you know to check
for them. I'll discuss how in the section &quot;Plugging the Leaks.&quot;</P><P>To sum up, the steps to writing a macro are as follows:</P><P><OL>
<LI>Write a sample call to the macro and the code it should expand
into, or vice versa.</LI>
<LI>Write code that generates the handwritten expansion from the
arguments in the sample call.</LI>
<LI>Make sure the macro abstraction doesn't &quot;leak.&quot;</LI>
</OL></P><A NAME="a-sample-macro-do-primes"><H2>A Sample Macro: do-primes</H2></A><P>To see how this three-step process works, you'll write a macro
<CODE>do-primes</CODE> that provides a looping construct similar to
<CODE><B>DOTIMES</B></CODE> and <CODE><B>DOLIST</B></CODE> except that instead of iterating over
integers or elements of a list, it iterates over successive prime
numbers. This isn't meant to be an example of a particularly useful
macro--it's just a vehicle for demonstrating the process.</P><P>First, you'll need two utility functions, one to test whether a given
number is prime and another that returns the next prime number
greater or equal to its argument. In both cases you can use a simple,
but inefficient, brute-force approach.</P><PRE>(defun primep (number)
(when (&gt; number 1)
(loop for fac from 2 to (isqrt number) never (zerop (mod number fac)))))
(defun next-prime (number)
(loop for n from number when (primep n) return n))</PRE><P>Now you can write the macro. Following the procedure outlined
previously, you need at least one example of a call to the macro and
the code into which it should expand. Suppose you start with the idea
that you want to be able to write this: </P><PRE>(do-primes (p 0 19)
(format t &quot;~d &quot; p))</PRE><P>to express a loop that executes the body once each for each prime
number greater or equal to 0 and less than or equal to 19, with the
variable <CODE>p</CODE> holding the prime number. It makes sense to model
this macro on the form of the standard <CODE><B>DOLIST</B></CODE> and <CODE><B>DOTIMES</B></CODE>
macros; macros that follow the pattern of existing macros are easier
to understand and use than macros that introduce gratuitously novel
syntax.</P><P>Without the <CODE>do-primes</CODE> macro, you could write such a loop with
<CODE><B>DO</B></CODE> (and the two utility functions defined previously) like this:</P><PRE>(do ((p (next-prime 0) (next-prime (1+ p))))
((&gt; p 19))
(format t &quot;~d &quot; p))</PRE><P>Now you're ready to start writing the macro code that will translate
from the former to the latter. </P><A NAME="macro-parameters"><H2>Macro Parameters</H2></A><P>Since the arguments passed to a macro are Lisp objects representing
the source code of the macro call, the first step in any macro is to
extract whatever parts of those objects are needed to compute the
expansion. For macros that simply interpolate their arguments
directly into a template, this step is trivial: simply defining the
right parameters to hold the different arguments is sufficient.</P><P>But this approach, it seems, will not suffice for <CODE>do-primes</CODE>.
The first argument to the <CODE>do-primes</CODE> call is a list containing
the name of the loop variable, <CODE>p</CODE>; the lower bound, <CODE>0</CODE>;
and the upper bound, <CODE>19</CODE>. But if you look at the expansion, the
list as a whole doesn't appear in the expansion; the three element
are split up and put in different places.</P><P>You could define <CODE>do-primes</CODE> with two parameters, one to hold
the list and a <CODE><B>&amp;rest</B></CODE> parameter to hold the body forms, and then
take apart the list by hand, something like this:</P><PRE>(defmacro do-primes (var-and-range &amp;rest body)
(let ((var (first var-and-range))
(start (second var-and-range))
(end (third var-and-range)))
`(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
((&gt; ,var ,end))
,@body)))</PRE><P>In a moment I'll explain how the body generates the correct
expansion; for now you can just note that the variables <CODE>var</CODE>,
<CODE>start</CODE>, and <CODE>end</CODE> each hold a value, extracted from
<CODE>var-and-range</CODE>, that's then interpolated into the backquote
expression that generates <CODE>do-primes</CODE>'s expansion.</P><P>However, you don't need to take apart <CODE>var-and-range</CODE> &quot;by hand&quot;
because macro parameter lists are what are called <I>destructuring</I>
parameter lists. Destructuring, as the name suggests, involves taking
apart a structure--in this case the list structure of the forms
passed to a macro. </P><P>Within a destructuring parameter list, a simple parameter name can be
replaced with a nested parameter list. The parameters in the nested
parameter list will take their values from the elements of the
expression that would have been bound to the parameter the list
replaced. For instance, you can replace <CODE>var-and-range</CODE> with a
list <CODE>(var start end)</CODE>, and the three elements of the list will
automatically be destructured into those three parameters.</P><P>Another special feature of macro parameter lists is that you can use
<CODE><B>&amp;body</B></CODE> as a synonym for <CODE><B>&amp;rest</B></CODE>. Semantically <CODE><B>&amp;body</B></CODE> and
<CODE><B>&amp;rest</B></CODE> are equivalent, but many development environments will use
the presence of a <CODE><B>&amp;body</B></CODE> parameter to modify how they indent uses
of the macro--typically <CODE><B>&amp;body</B></CODE> parameters are used to hold a
list of forms that make up the body of the macro.</P><P>So you can streamline the definition of <CODE>do-primes</CODE> and give a
hint to both human readers and your development tools about its
intended use by defining it like this:</P><PRE>(defmacro do-primes ((var start end) &amp;body body)
`(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
((&gt; ,var ,end))
,@body))</PRE><P>In addition to being more concise, destructuring parameter lists also
give you automatic error checking--with <CODE>do-primes</CODE> defined this
way, Lisp will be able to detect a call whose first argument isn't a
three-element list and will give you a meaningful error message just
as if you had called a function with too few or too many arguments.
Also, in development environments such as SLIME that indicate what
arguments are expected as soon as you type the name of a function or
macro, if you use a destructuring parameter list, the environment
will be able to tell you more specifically the syntax of the macro
call. With the original definition, SLIME would tell you
<CODE>do-primes</CODE> is called like this: </P><PRE>(do-primes var-and-range &amp;rest body)</PRE><P>But with the new definition, it can tell you that a call should look
like this:</P><PRE>(do-primes (var start end) &amp;body body)</PRE><P>Destructuring parameter lists can contain <CODE><B>&amp;optional</B></CODE>, <CODE><B>&amp;key</B></CODE>,
and <CODE><B>&amp;rest</B></CODE> parameters and can contain nested destructuring lists.
However, you don't need any of those options to write
<CODE>do-primes</CODE>. </P><A NAME="generating-the-expansion"><H2>Generating the Expansion</H2></A><P>Because <CODE>do-primes</CODE> is a fairly simple macro, after you've
destructured the arguments, all that's left is to interpolate them
into a template to get the expansion.</P><P>For simple macros like <CODE>do-primes</CODE>, the special backquote syntax
is perfect. To review, a backquoted expression is similar to a quoted
expression except you can &quot;unquote&quot; particular subexpressions by
preceding them with a comma, possibly followed by an at (@) sign.
Without an at sign, the comma causes the value of the subexpression
to be included as is. With an at sign, the value--which must be a
list--is &quot;spliced&quot; into the enclosing list.</P><P>Another useful way to think about the backquote syntax is as a
particularly concise way of writing code that generates lists. This
way of thinking about it has the benefit of being pretty much exactly
what's happening under the covers--when the reader reads a backquoted
expression, it translates it into code that generates the appropriate
list structure. For instance, <CODE>`(,a b)</CODE> might be read as
<CODE>(list a 'b)</CODE>. The language standard doesn't specify exactly
what code the reader must produce as long as it generates the right
list structure.</P><P>Table 8-1 shows some examples of backquoted expressions along with
equivalent list-building code and the result you'd get if you
evaluated either the backquoted expression or the equivalent
code.<SUP>2</SUP> </P><P><DIV CLASS="table-caption">Table 8-1. Backquote Examples</DIV></P><TABLE CLASS="book-table"><TR><TD>Backquote Syntax</TD><TD>Equivalent List-Building Code</TD><TD>Result</TD></TR><TR><TD><CODE>`(a (+ 1 2) c)</CODE></TD><TD><CODE>(list 'a '(+ 1 2) 'c)</CODE></TD><TD><CODE>(a (+ 1 2) c)</CODE></TD></TR><TR><TD><CODE>`(a ,(+ 1 2) c)</CODE></TD><TD><CODE>(list 'a (+ 1 2) 'c)</CODE></TD><TD><CODE>(a 3 c)</CODE></TD></TR><TR><TD><CODE>`(a (list 1 2) c)</CODE></TD><TD><CODE>(list 'a '(list 1 2) 'c)</CODE></TD><TD><CODE>(a (list 1 2) c)</CODE></TD></TR><TR><TD><CODE>`(a ,(list 1 2) c)</CODE></TD><TD><CODE>(list 'a (list 1 2) 'c)</CODE></TD><TD><CODE>(a (1 2) c)</CODE></TD></TR><TR><TD><CODE>`(a ,@(list 1 2) c)</CODE></TD><TD><CODE>(append (list 'a) (list 1 2) (list 'c))</CODE></TD><TD><CODE>(a 1 2 c)</CODE></TD></TR></TABLE><P>It's important to note that backquote is just a convenience. But it's
a big convenience. To appreciate how big, compare the backquoted
version of <CODE>do-primes</CODE> to the following version, which uses
explicit list-building code: </P><PRE>(defmacro do-primes-a ((var start end) &amp;body body)
(append '(do)
(list (list (list var
(list 'next-prime start)
(list 'next-prime (list '1+ var)))))
(list (list (list '&gt; var end)))
body))</PRE><P>As you'll see in a moment, the current implementation of
<CODE>do-primes</CODE> doesn't handle certain edge cases correctly. But
first you should verify that it at least works for the original
example. You can test it in two ways. You can test it indirectly by
simply using it--presumably, if the resulting behavior is correct,
the expansion is correct. For instance, you can type the original
example's use of <CODE>do-primes</CODE> to the REPL and see that it indeed
prints the right series of prime numbers.</P><PRE>CL-USER&gt; (do-primes (p 0 19) (format t &quot;~d &quot; p))
2 3 5 7 11 13 17 19
NIL</PRE><P>Or you can check the macro directly by looking at the expansion of a
particular call. The function <CODE><B>MACROEXPAND-1</B></CODE> takes any Lisp
expression as an argument and returns the result of doing one level
of macro expansion.<SUP>3</SUP> Because <CODE><B>MACROEXPAND-1</B></CODE> is a
function, to pass it a literal macro form you must quote it. You can
use it to see the expansion of the previous call.<SUP>4</SUP> </P><PRE>CL-USER&gt; (macroexpand-1 '(do-primes (p 0 19) (format t &quot;~d &quot; p)))
(DO ((P (NEXT-PRIME 0) (NEXT-PRIME (1+ P))))
((&gt; P 19))
(FORMAT T &quot;~d &quot; P))
T</PRE><P>Or, more conveniently, in SLIME you can check a macro's expansion by
placing the cursor on the opening parenthesis of a macro form in your
source code and typing <CODE>C-c RET</CODE> to invoke the Emacs function
<CODE>slime-macroexpand-1</CODE>, which will pass the macro form to
<CODE><B>MACROEXPAND-1</B></CODE> and &quot;pretty print&quot; the result in a temporary
buffer.</P><P>However you get to it, you can see that the result of macro expansion
is the same as the original handwritten expansion, so it seems that
<CODE>do-primes</CODE> works. </P><A NAME="plugging-the-leaks"><H2>Plugging the Leaks</H2></A><P>In his essay &quot;The Law of Leaky Abstractions,&quot; Joel Spolsky coined the
term <I>leaky abstraction</I> to describe an abstraction that &quot;leaks&quot;
details it's supposed to be abstracting away. Since writing a macro
is a way of creating an abstraction, you need to make sure your
macros don't leak needlessly.<SUP>5</SUP></P><P>As it turns out, a macro can leak details of its inner workings in
three ways. Luckily, it's pretty easy to tell whether a given macro
suffers from any of those leaks and to fix them.</P><P>The current definition suffers from one of the three possible macro
leaks: namely, it evaluates the <CODE>end</CODE> subform too many times.
Suppose you were to call <CODE>do-primes</CODE> with, instead of a literal
number such as <CODE>19</CODE>, an expression such as <CODE>(random 100)</CODE>
in the <CODE>end</CODE> position. </P><PRE>(do-primes (p 0 (random 100))
(format t &quot;~d &quot; p))</PRE><P>Presumably the intent here is to loop over the primes from zero to
whatever random number is returned by <CODE>(random 100)</CODE>. However,
this isn't what the current implementation does, as
<CODE><B>MACROEXPAND-1</B></CODE> shows.</P><PRE>CL-USER&gt; (macroexpand-1 '(do-primes (p 0 (random 100)) (format t &quot;~d &quot; p)))
(DO ((P (NEXT-PRIME 0) (NEXT-PRIME (1+ P))))
((&gt; P (RANDOM 100)))
(FORMAT T &quot;~d &quot; P))
T</PRE><P>When this expansion code is run, <CODE><B>RANDOM</B></CODE> will be called each time
the end test for the loop is evaluated. Thus, instead of looping
until <CODE>p</CODE> is greater than an initially chosen random number,
this loop will iterate until it happens to draw a random number less
than or equal to the current value of <CODE>p</CODE>. While the total
number of iterations will still be random, it will be drawn from a
much different distribution than the uniform distribution <CODE><B>RANDOM</B></CODE>
returns.</P><P>This is a leak in the abstraction because, to use the macro correctly,
the caller needs to be aware that the <CODE>end</CODE> form is going to be
evaluated more than once. One way to plug this leak would be to simply
define this as the behavior of <CODE>do-primes</CODE>. But that's not very
satisfactory--you should try to observe the Principle of Least
Astonishment when implementing macros. And programmers will typically
expect the forms they pass to macros to be evaluated no more times
than absolutely necessary.<SUP>6</SUP> Furthermore, since <CODE>do-primes</CODE> is built
on the model of the standard macros, <CODE><B>DOTIMES</B></CODE> and <CODE><B>DOLIST</B></CODE>,
neither of which causes any of the forms except those in the body to
be evaluated more than once, most programmers will expect
<CODE>do-primes</CODE> to behave similarly.</P><P>You can fix the multiple evaluation easily enough; you just need to
generate code that evaluates <CODE>end</CODE> once and saves the value in a
variable to be used later. Recall that in a <CODE><B>DO</B></CODE> loop, variables
defined with an initialization form and no step form don't change
from iteration to iteration. So you can fix the multiple evaluation
problem with this definition:</P><PRE>(defmacro do-primes ((var start end) &amp;body body)
`(do ((ending-value ,end)
(,var (next-prime ,start) (next-prime (1+ ,var))))
((&gt; ,var ending-value))
,@body))</PRE><P>Unfortunately, this fix introduces two new leaks to the macro
abstraction.</P><P>One new leak is similar to the multiple-evaluation leak you just
fixed. Because the initialization forms for variables in a <CODE><B>DO</B></CODE>
loop are evaluated in the order the variables are defined, when the
macro expansion is evaluated, the expression passed as <CODE>end</CODE>
will be evaluated before the expression passed as <CODE>start</CODE>,
opposite to the order they appear in the macro call. This leak
doesn't cause any problems when <CODE>start</CODE> and <CODE>end</CODE> are
literal values like 0 and 19. But when they're forms that can have
side effects, evaluating them out of order can once again run afoul
of the Principle of Least Astonishment. </P><P>This leak is trivially plugged by swapping the order of the two
variable definitions.</P><PRE>(defmacro do-primes ((var start end) &amp;body body)
`(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
(ending-value ,end))
((&gt; ,var ending-value))
,@body))</PRE><P>The last leak you need to plug was created by using the variable name
<CODE>ending-value</CODE>. The problem is that the name, which ought to be
a purely internal detail of the macro implementation, can end up
interacting with code passed to the macro or in the context where the
macro is called. The following seemingly innocent call to
<CODE>do-primes</CODE> doesn't work correctly because of this leak:</P><PRE>(do-primes (ending-value 0 10)
(print ending-value))</PRE><P>Neither does this one:</P><PRE>(let ((ending-value 0))
(do-primes (p 0 10)
(incf ending-value p))
ending-value)</PRE><P>Again, <CODE><B>MACROEXPAND-1</B></CODE> can show you the problem. The first call
expands to this: </P><PRE>(do ((ending-value (next-prime 0) (next-prime (1+ ending-value)))
(ending-value 10))
((&gt; ending-value ending-value))
(print ending-value))</PRE><P>Some Lisps may reject this code because <CODE>ending-value</CODE> is used
twice as a variable name in the same <CODE><B>DO</B></CODE> loop. If not rejected
outright, the code will loop forever since <CODE>ending-value</CODE> will
never be greater than itself.</P><P>The second problem call expands to the following:</P><PRE>(let ((ending-value 0))
(do ((p (next-prime 0) (next-prime (1+ p)))
(ending-value 10))
((&gt; p ending-value))
(incf ending-value p))
ending-value)</PRE><P>In this case the generated code is perfectly legal, but the behavior
isn't at all what you want. Because the binding of
<CODE>ending-value</CODE> established by the <CODE><B>LET</B></CODE> outside the loop is
shadowed by the variable with the same name inside the <CODE><B>DO</B></CODE>, the
form <CODE>(incf ending-value p)</CODE> increments the loop variable
<CODE>ending-value</CODE> instead of the outer variable with the same name,
creating another infinite loop.<SUP>7</SUP></P><P>Clearly, what you need to patch this leak is a symbol that will never
be used outside the code generated by the macro. You could try using
a really unlikely name, but that's no guarantee. You could also
protect yourself to some extent by using packages, as described in
Chapter 21. But there's a better solution. </P><P>The function <CODE><B>GENSYM</B></CODE> returns a unique symbol each time it's
called. This is a symbol that has never been read by the Lisp reader
and never will be because it isn't interned in any package. Thus,
instead of using a literal name like <CODE>ending-value</CODE>, you can
generate a new symbol each time <CODE>do-primes</CODE> is expanded.</P><PRE>(defmacro do-primes ((var start end) &amp;body body)
(let ((ending-value-name (gensym)))
`(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
(,ending-value-name ,end))
((&gt; ,var ,ending-value-name))
,@body)))</PRE><P>Note that the code that calls <CODE><B>GENSYM</B></CODE> isn't part of the
expansion; it runs as part of the macro expander and thus creates a
new symbol each time the macro is expanded. This may seem a bit
strange at first--<CODE>ending-value-name</CODE> is a variable whose value
is the name of another variable. But really it's no different from
the parameter <CODE>var</CODE> whose value is the name of a variable--the
difference is the value of <CODE>var</CODE> was created by the reader when
the macro form was read, and the value of <CODE>ending-value-name</CODE> is
generated programmatically when the macro code runs.</P><P>With this definition the two previously problematic forms expand into
code that works the way you want. The first form: </P><PRE>(do-primes (ending-value 0 10)
(print ending-value))</PRE><P>expands into the following:</P><PRE>(do ((ending-value (next-prime 0) (next-prime (1+ ending-value)))
(#:g2141 10))
((&gt; ending-value #:g2141))
(print ending-value))</PRE><P>Now the variable used to hold the ending value is the gensymed
symbol, <CODE>#:g2141</CODE>. The name of the symbol, <CODE>G2141</CODE>, was
generated by <CODE><B>GENSYM</B></CODE> but isn't significant; the thing that
matters is the object identity of the symbol. Gensymed symbols are
printed in the normal syntax for uninterned symbols, with a leading
<CODE>#:</CODE>.</P><P>The other previously problematic form:</P><PRE>(let ((ending-value 0))
(do-primes (p 0 10)
(incf ending-value p))
ending-value)</PRE><P>looks like this if you replace the <CODE>do-primes</CODE> form with its
expansion:</P><PRE>(let ((ending-value 0))
(do ((p (next-prime 0) (next-prime (1+ p)))
(#:g2140 10))
((&gt; p #:g2140))
(incf ending-value p))
ending-value)</PRE><P>Again, there's no leak since the <CODE>ending-value</CODE> variable bound
by the <CODE><B>LET</B></CODE> surrounding the <CODE>do-primes</CODE> loop is no longer
shadowed by any variables introduced in the expanded code.</P><P>Not all literal names used in a macro expansion will necessarily
cause a problem--as you get more experience with the various binding
forms, you'll be able to determine whether a given name is being used
in a position that could cause a leak in a macro abstraction. But
there's no real downside to using a gensymed name just to be safe.</P><P>With that fix, you've plugged all the leaks in the implementation of
<CODE>do-primes</CODE>. Once you've gotten a bit of macro-writing
experience under your belt, you'll learn to write macros with these
kinds of leaks preplugged. It's actually fairly simple if you follow
these rules of thumb: </P><UL><LI>Unless there's a particular reason to do otherwise, include any
subforms in the expansion in positions that will be evaluated in the
same order as the subforms appear in the macro call.</LI><LI>Unless there's a particular reason to do otherwise, make sure
subforms are evaluated only once by creating a variable in the
expansion to hold the value of evaluating the argument form and then
using that variable anywhere else the value is needed in the
expansion.</LI><LI>Use <CODE><B>GENSYM</B></CODE> at macro expansion time to create variable names
used in the expansion.</LI></UL><A NAME="macro-writing-macros"><H2>Macro-Writing Macros</H2></A><P>Of course, there's no reason you should be able to take advantage of
macros only when writing functions. The job of macros is to abstract
away common syntactic patterns, and certain patterns come up again
and again in writing macros that can also benefit from being
abstracted away.</P><P>In fact, you've already seen one such pattern--many macros will, like
the last version of <CODE>do-primes</CODE>, start with a <CODE><B>LET</B></CODE> that
introduces a few variables holding gensymed symbols to be used in the
macro's expansion. Since this is such a common pattern, why not
abstract it away with its own macro?</P><P>In this section you'll write a macro, <CODE>with-gensyms</CODE>, that does
just that. In other words, you'll write a macro-writing macro: a
macro that generates code that generates code. While complex
macro-writing macros can be a bit confusing until you get used to
keeping the various levels of code clear in your mind,
<CODE>with-gensyms</CODE> is fairly straightforward and will serve as a
useful but not too strenuous mental limbering exercise.</P><P>You want to be able to write something like this: </P><PRE>(defmacro do-primes ((var start end) &amp;body body)
(with-gensyms (ending-value-name)
`(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
(,ending-value-name ,end))
((&gt; ,var ,ending-value-name))
,@body)))</PRE><P>and have it be equivalent to the previous version of
<CODE>do-primes</CODE>. In other words, the <CODE>with-gensyms</CODE> needs to
expand into a <CODE><B>LET</B></CODE> that binds each named variable,
<CODE>ending-value-name</CODE> in this case, to a gensymed symbol. That's
easy enough to write with a simple backquote template.</P><PRE>(defmacro with-gensyms ((&amp;rest names) &amp;body body)
`(let ,(loop for n in names collect `(,n (gensym)))
,@body))</PRE><P>Note how you can use a comma to interpolate the value of the
<CODE><B>LOOP</B></CODE> expression. The loop generates a list of binding forms
where each binding form consists of a list containing one of the
names given to <CODE>with-gensyms</CODE> and the literal code
<CODE>(gensym)</CODE>. You can test what code the <CODE><B>LOOP</B></CODE> expression
would generate at the REPL by replacing <CODE>names</CODE> with a list of
symbols. </P><PRE>CL-USER&gt; (loop for n in '(a b c) collect `(,n (gensym)))
((A (GENSYM)) (B (GENSYM)) (C (GENSYM)))</PRE><P>After the list of binding forms, the body argument to
<CODE>with-gensyms</CODE> is spliced in as the body of the <CODE><B>LET</B></CODE>. Thus,
in the code you wrap in a <CODE>with-gensyms</CODE> you can refer to any of
the variables named in the list of variables passed to
<CODE>with-gensyms</CODE>.</P><P>If you macro-expand the <CODE>with-gensyms</CODE> form in the new
definition of <CODE>do-primes</CODE>, you should see something like this:</P><PRE>(let ((ending-value-name (gensym)))
`(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
(,ending-value-name ,end))
((&gt; ,var ,ending-value-name))
,@body))</PRE><P>Looks good. While this macro is fairly trivial, it's important to
keep clear about when the different macros are expanded: when you
compile the <CODE><B>DEFMACRO</B></CODE> of <CODE>do-primes</CODE>, the
<CODE>with-gensyms</CODE> form is expanded into the code just shown and
compiled. Thus, the compiled version of <CODE>do-primes</CODE> is just the
same as if you had written the outer <CODE><B>LET</B></CODE> by hand. When you
compile a function that uses <CODE>do-primes</CODE>, the code <I>generated</I>
by <CODE>with-gensyms</CODE> runs generating the <CODE>do-primes</CODE>
expansion, but <CODE>with-gensyms</CODE> itself isn't needed to compile a
<CODE>do-primes</CODE> form since it has already been expanded, back when
<CODE>do-primes</CODE> was compiled. </P><DIV CLASS="sidebarhead">Another classic macro-writing MACRO: ONCE-ONLY</DIV><DIV CLASS="sidebar"><P>Another classic macro-writing macro is <CODE>once-only</CODE>,
which is used to generate code that evaluates certain macro arguments
once only and in a particular order. Using <CODE>once-only</CODE>, you
could write <CODE>do-primes</CODE> almost as simply as the original leaky
version, like this:</P><PRE>(defmacro do-primes ((var start end) &amp;body body)
(once-only (start end)
`(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
((&gt; ,var ,end))
,@body)))</PRE><P>However, the implementation of <CODE>once-only</CODE> is a bit too involved
for a blow-by-blow explanation, as it relies on multiple levels of
backquoting and unquoting. If you really want to sharpen your macro
chops, you can try to figure out how it works. It looks like this:</P><PRE>(defmacro once-only ((&amp;rest names) &amp;body body)
(let ((gensyms (loop for n in names collect (gensym))))
`(let (,@(loop for g in gensyms collect `(,g (gensym))))
`(let (,,@(loop for g in gensyms for n in names collect ``(,,g ,,n)))
,(let (,@(loop for n in names for g in gensyms collect `(,n ,g)))
,@body)))))</PRE></DIV><A NAME="beyond-simple-macros"><H2>Beyond Simple Macros</H2></A><P>I could, of course, say a lot more about macros. All the macros
you've seen so far have been fairly simple examples that save you a
bit of typing but don't provide radical new ways of expressing
things. In upcoming chapters you'll see examples of macros that allow
you to express things in ways that would be virtually impossible
without macros. You'll start in the very next chapter, in which
you'll build a simple but effective unit test framework.
</P><HR/><DIV CLASS="notes"><P><SUP>1</SUP>As with functions, macros can also contain
declarations, but you don't need to worry about those for now.</P><P><SUP>2</SUP><CODE><B>APPEND</B></CODE>, which I haven't discussed yet, is a function
that takes any number of list arguments and returns the result of
splicing them together into a single list.</P><P><SUP>3</SUP>Another function, <CODE><B>MACROEXPAND</B></CODE>, keeps
expanding the result as long as the first element of the resulting
expansion is the name of the macro. However, this will often show you
a much lower-level view of what the code is doing than you want,
since basic control constructs such as <CODE><B>DO</B></CODE> are also implemented
as macros. In other words, while it can be educational to see what
your macro ultimately expands into, it isn't a very useful view into
what your own macros are doing.</P><P><SUP>4</SUP>If the macro
expansion is shown all on one line, it's probably because the
variable <CODE><B>*PRINT-PRETTY*</B></CODE> is <CODE><B>NIL</B></CODE>. If it is, evaluating
<CODE>(setf *print-pretty* t)</CODE> should make the macro expansion easier
to read.</P><P><SUP>5</SUP>This is from <I>Joel on Software</I>
by Joel Spolsky, also available at
<CODE>http://www.joelonsoftware.com/
articles/LeakyAbstractions.html</CODE>. Spolsky's point in the essay is
that all abstractions leak to some extent; that is, there are no
perfect abstractions. But that doesn't mean you should tolerate leaks
you can easily plug.</P><P><SUP>6</SUP>Of course, certain forms are supposed
to be evaluated more than once, such as the forms in the body of a
<CODE>do-primes</CODE> loop.</P><P><SUP>7</SUP>It may not be obvious that this
loop is necessarily infinite given the nonuniform occurrences of
prime numbers. The starting point for a proof that it is in fact
infinite is Bertrand's postulate, which says for any <I>n</I> &gt; 1, there
exists a prime <I>p</I>, <I>n</I> &lt; <I>p</I> &lt; <I>2n</I>. From there you can
prove that for any prime number, P less than the sum of the preceding
prime numbers, the next prime, P', is also smaller than the original
sum plus P.</P></DIV></BODY></HTML>