<HTML><HEAD><TITLE>Macros: Defining Your Own</TITLE><LINK REL="stylesheet" TYPE="text/css" HREF="style.css"/></HEAD><BODY><DIV CLASS="copyright">Copyright © 2003-2005, Peter Seibel</DIV><H1>8. Macros: Defining Your Own</H1><P>Now it's time to start writing your own macros. The standard macros I
covered in the previous chapter hint at some of the things you can do
with macros, but that's just the beginning. Common Lisp doesn't
support macros so every Lisp programmer can create their own variants
of standard control constructs any more than C supports functions so
every C programmer can write trivial variants of the functions in the
C standard library. Macros are part of the language to allow you to
create abstractions on top of the core language and standard library
that move you closer toward being able to directly express the things
you want to express.</P><P>Perhaps the biggest barrier to a proper understanding of macros is,
ironically, that they're so well integrated into the language. In
many ways they seem like just a funny kind of function--they're
written in Lisp, they take arguments and return results, and they
allow you to abstract away distracting details. Yet despite these
many similarities, macros operate at a different level than functions
and create a totally different kind of abstraction.</P><P>Once you understand the difference between macros and functions, the
tight integration of macros in the language will be a huge benefit.
But in the meantime, it's a frequent source of confusion for new
Lispers. The following story, while not true in a historical or
technical sense, tries to alleviate the confusion by giving you a way
to think about how macros work. </P><A NAME="the-story-of-mac-a-just-so-story"><H2>The Story of Mac: A Just-So Story</H2></A><P>Once upon a time, long ago, there was a company of Lisp programmers.
It was so long ago, in fact, that Lisp had no macros. Anything that
couldn't be defined with a function or done with a special operator
had to be written in full every time, which was rather a drag.
Unfortunately, the programmers in this company--though
brilliant--were also quite lazy. Often in the middle of their
programs--when the tedium of writing a bunch of code got to be too
much--they would instead write a note describing the code they needed
to write at that place in the program. Even more unfortunately,
because they were lazy, the programmers also hated to go back and
actually write the code described by the notes. Soon the company had
a big stack of programs that nobody could run because they were full
of notes about code that still needed to be written.</P><P>In desperation, the big bosses hired a junior programmer, Mac, whose
job was to find the notes, write the required code, and insert it
into the program in place of the notes. Mac never ran the
programs--they weren't done yet, of course, so he couldn't. But even
if they had been completed, Mac wouldn't have known what inputs to
feed them. So he just wrote his code based on the contents of the
notes and sent it back to the original programmer.</P><P>With Mac's help, all the programs were soon completed, and the
company made a ton of money selling them--so much money that the
company could double the size of its programming staff. But for some
reason no one thought to hire anyone to help Mac; soon he was
single-handedly assisting several dozen programmers. To avoid spending all
his time searching for notes in source code, Mac made a small
modification to the compiler the programmers used. Thereafter,
whenever the compiler hit a note, it would e-mail him the note and
wait for him to e-mail back the replacement code. Unfortunately, even
with this change, Mac had a hard time keeping up with the
programmers. He worked as carefully as he could, but
sometimes--especially when the notes weren't clear--he would make mistakes.</P><P>The programmers noticed, however, that the more precisely they wrote
their notes, the more likely it was that Mac would send back correct
code. One day, one of the programmers, having a hard time describing
in words the code he wanted, included in one of his notes a Lisp
program that would generate the code he wanted. That was fine by Mac;
he just ran the program and sent the result to the compiler.</P><P>The next innovation came when a programmer put a note at the top of
one of his programs containing a function definition and a comment
that said, "Mac, don't write any code here, but keep this function
for later; I'm going to use it in some of my other notes." Other
notes in the same program said things such as, "Mac, replace this
note with the result of running that other function with the symbols
<CODE>x</CODE> and <CODE>y</CODE> as arguments."</P><P>This technique caught on so quickly that within a few days, most
programs contained dozens of notes defining functions that were only
used by code in other notes. To make it easy for Mac to pick out the
notes containing only definitions that didn't require any immediate
response, the programmers tagged them with the standard preface:
"Definition for Mac, Read Only." This--as the programmers were still
quite lazy--was quickly shortened to "DEF. MAC. R/O" and then
"DEFMACRO."</P><P>Pretty soon, there was no actual English left in the notes for Mac.
All he did all day was read and respond to e-mails from the compiler
containing DEFMACRO notes and calls to the functions defined in the
DEFMACROs. Since the Lisp programs in the notes did all the real
work, keeping up with the e-mails was no problem. Mac suddenly had a
lot of time on his hands and would sit in his office daydreaming
about white-sand beaches, clear blue ocean water, and drinks with
little paper umbrellas in them.</P><P>Several months later the programmers realized nobody had seen Mac for
quite some time. When they went to his office, they found a thin
layer of dust over everything, a desk littered with travel brochures
for various tropical locations, and the computer off. But the
compiler still worked--how could it be? It turned out Mac had made
one last change to the compiler: instead of e-mailing notes to Mac,
the compiler now saved the functions defined by DEFMACRO notes and
ran them when called for by the other notes. The programmers decided
there was no reason to tell the big bosses Mac wasn't coming to the
office anymore. So to this day, Mac draws a salary and from time to
time sends the programmers a postcard from one tropical locale or
another.</P><A NAME="macro-expansion-time-vs-runtime"><H2>Macro Expansion Time vs. Runtime</H2></A><P>The key to understanding macros is to be quite clear about the
distinction between the code that generates code (macros) and the
code that eventually makes up the program (everything else). When you
write macros, you're writing programs that will be used by the
compiler to generate the code that will then be compiled. Only after
all the macros have been fully expanded and the resulting code
compiled can the program actually be run. The time when macros run is
called <I>macro expansion time</I>; this is distinct from <I>runtime</I>,
when regular code, including the code generated by macros, runs.</P><P>It's important to keep this distinction firmly in mind because code
running at macro expansion time runs in a very different environment
than code running at runtime. Namely, at macro expansion time,
there's no way to access the data that will exist at runtime. Like
Mac, who couldn't run the programs he was working on because he
didn't know what the correct inputs were, code running at macro
expansion time can deal only with the data that's inherent in the
source code. For instance, suppose the following source code appears
somewhere in a program:</P><PRE>(defun foo (x)
  (when (> x 10) (print 'big)))</PRE><P>Normally you'd think of <CODE>x</CODE> as a variable that will hold the
argument passed in a call to <CODE>foo</CODE>. But at macro expansion time,
such as when the compiler is running the <CODE><B>WHEN</B></CODE> macro, the only
data available is the source code. Since the program isn't running
yet, there's no call to <CODE>foo</CODE> and thus no value associated with
<CODE>x</CODE>. Instead, the values the compiler passes to <CODE><B>WHEN</B></CODE> are
the Lisp lists representing the source code, namely, <CODE>(> x 10)</CODE>
and <CODE>(print 'big)</CODE>. Suppose that <CODE><B>WHEN</B></CODE> is defined, as you
saw in the previous chapter, with something like the following macro:</P><PRE>(defmacro when (condition &rest body)
  `(if ,condition (progn ,@body)))</PRE><P>When the code in <CODE>foo</CODE> is compiled, the <CODE><B>WHEN</B></CODE> macro will be
run with those two forms as arguments. The parameter <CODE>condition</CODE>
will be bound to the form <CODE>(> x 10)</CODE>, and the form
<CODE>(print 'big)</CODE> will be collected into a list that will become the value of
the <CODE><B>&rest</B></CODE> <CODE>body</CODE> parameter. The backquote expression will
then generate this code:</P><PRE>(if (> x 10) (progn (print 'big)))</PRE><P>by interpolating in the value of <CODE>condition</CODE> and splicing the
value of <CODE>body</CODE> into the <CODE><B>PROGN</B></CODE>.</P><P>When Lisp is interpreted, rather than compiled, the distinction
between macro expansion time and runtime is less clear because
they're temporally intertwined. Also, the language standard doesn't
specify exactly how an interpreter must handle macros--it could
expand all the macros in the form being interpreted and then
interpret the resulting code, or it could start right in on
interpreting the form and expand macros when it hits them. In either
case, macros are always passed the unevaluated Lisp objects
representing the subforms of the macro form, and the job of the macro
is still to produce code that will do something rather than to do
anything directly. </P><A NAME="defmacro"><H2>DEFMACRO</H2></A><P>As you saw in Chapter 3, macros really are defined with <CODE><B>DEFMACRO</B></CODE>
forms, though it stands--of course--for DEFine MACRO, not Definition
for Mac. The basic skeleton of a <CODE><B>DEFMACRO</B></CODE> is quite similar to
the skeleton of a <CODE><B>DEFUN</B></CODE>.</P><PRE>(defmacro <I>name</I> (<I>parameter</I>*)
  "Optional documentation string."
  <I>body-form</I>*)</PRE><P>Like a function, a macro consists of a name, a parameter list, an
optional documentation string, and a body of Lisp
expressions.<SUP>1</SUP>
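</P><P>As a minimal sketch of that skeleton, the following macro fills in all four parts. The name <CODE>my-unless</CODE> is made up for illustration (the standard <CODE><B>UNLESS</B></CODE> already does this job), and whether the documentation string is retained is up to the implementation. <CODE><B>MACROEXPAND-1</B></CODE>, discussed later in this chapter, shows the code the macro produces.</P><PRE>;; Hypothetical example: MY-UNLESS is not a standard macro.
(defmacro my-unless (condition &rest body)
  "Evaluate BODY only when CONDITION is false."
  `(if (not ,condition) (progn ,@body)))

;; Ask for the documentation string stored under the macro's name.
(documentation 'my-unless 'function)

;; Look at the code generated for one call; the result is the list
;; (if (not (> x 10)) (progn (print 'small))).
(macroexpand-1 '(my-unless (> x 10) (print 'small)))</PRE><P>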
However, as I just discussed, the job of a macro isn't to do anything
directly--its job is to generate code that will later do what you
want.</P><P>Macros can use the full power of Lisp to generate their expansion,
which means in this chapter I can only scratch the surface of what
you can do with macros. I can, however, describe a general process
for writing macros that works for all macros from the simplest to the
most complex. </P><P>The job of a macro is to translate a macro form--in other words, a
Lisp form whose first element is the name of the macro--into code
that does a particular thing. Sometimes you write a macro starting
with the code you'd like to be able to write, that is, with an
example macro form. Other times you decide to write a macro after
you've written the same pattern of code several times and realize you
can make your code clearer by abstracting the pattern.</P><P>Regardless of which end you start from, you need to figure out the
other end before you can start writing a macro: you need to know both
where you're coming from and where you're going before you can hope
to write code to do it automatically. Thus, the first step of writing
a macro is to write at least one example of a call to the macro and
the code into which that call should expand.</P><P>Once you have an example call and the desired expansion, you're ready
for the second step: writing the actual macro code. For simple macros
this will be a trivial matter of writing a backquoted template with
the macro parameters plugged into the right places. Complex macros
will be significant programs in their own right, complete with helper
functions and data structures.</P><P>After you've written code to translate the example call to the
appropriate expansion, you need to make sure the abstraction the
macro provides doesn't "leak" details of its implementation. Leaky
macro abstractions will work fine for certain arguments but not
others or will interact with code in the calling environment in
undesirable ways. As it turns out, macros can leak in a small handful
of ways, all of which are easily avoided as long as you know to check
for them. I'll discuss how in the section "Plugging the Leaks."</P><P>To sum up, the steps to writing a macro are as follows:</P><P><OL>

<LI>Write a sample call to the macro and the code it should expand
into, or vice versa.</LI>

<LI>Write code that generates the handwritten expansion from the
arguments in the sample call.</LI>

<LI>Make sure the macro abstraction doesn't "leak."</LI>

</OL></P><A NAME="a-sample-macro-do-primes"><H2>A Sample Macro: do-primes</H2></A><P>To see how this three-step process works, you'll write a macro
<CODE>do-primes</CODE> that provides a looping construct similar to
<CODE><B>DOTIMES</B></CODE> and <CODE><B>DOLIST</B></CODE> except that instead of iterating over
integers or elements of a list, it iterates over successive prime
numbers. This isn't meant to be an example of a particularly useful
macro--it's just a vehicle for demonstrating the process.</P><P>First, you'll need two utility functions, one to test whether a given
number is prime and another that returns the next prime number
greater than or equal to its argument. In both cases you can use a simple,
but inefficient, brute-force approach.</P><PRE>(defun primep (number)
  (when (> number 1)
    (loop for fac from 2 to (isqrt number) never (zerop (mod number fac)))))

(defun next-prime (number)
  (loop for n from number when (primep n) return n))</PRE><P>Now you can write the macro. Following the procedure outlined
previously, you need at least one example of a call to the macro and
the code into which it should expand. Suppose you start with the idea
that you want to be able to write this: </P><PRE>(do-primes (p 0 19)
  (format t "~d " p))</PRE><P>to express a loop that executes the body once for each prime
number greater than or equal to 0 and less than or equal to 19, with the
variable <CODE>p</CODE> holding the prime number. It makes sense to model
this macro on the form of the standard <CODE><B>DOLIST</B></CODE> and <CODE><B>DOTIMES</B></CODE>
macros; macros that follow the pattern of existing macros are easier
to understand and use than macros that introduce gratuitously novel
syntax.</P><P>Without the <CODE>do-primes</CODE> macro, you could write such a loop with
<CODE><B>DO</B></CODE> (and the two utility functions defined previously) like this:</P><PRE>(do ((p (next-prime 0) (next-prime (1+ p))))
    ((> p 19))
  (format t "~d " p))</PRE><P>Now you're ready to start writing the macro code that will translate
from the former to the latter. </P><A NAME="macro-parameters"><H2>Macro Parameters</H2></A><P>Since the arguments passed to a macro are Lisp objects representing
the source code of the macro call, the first step in any macro is to
extract whatever parts of those objects are needed to compute the
expansion. For macros that simply interpolate their arguments
directly into a template, this step is trivial: simply defining the
right parameters to hold the different arguments is sufficient.</P><P>But this approach, it seems, will not suffice for <CODE>do-primes</CODE>.
The first argument to the <CODE>do-primes</CODE> call is a list containing
the name of the loop variable, <CODE>p</CODE>; the lower bound, <CODE>0</CODE>;
and the upper bound, <CODE>19</CODE>. But if you look at the expansion, the
list as a whole doesn't appear in the expansion; the three elements
are split up and put in different places.</P><P>You could define <CODE>do-primes</CODE> with two parameters, one to hold
the list and a <CODE><B>&rest</B></CODE> parameter to hold the body forms, and then
take apart the list by hand, something like this:</P><PRE>(defmacro do-primes (var-and-range &rest body)
  (let ((var (first var-and-range))
        (start (second var-and-range))
        (end (third var-and-range)))
    `(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
         ((> ,var ,end))
       ,@body)))</PRE><P>In a moment I'll explain how the body generates the correct
expansion; for now you can just note that the variables <CODE>var</CODE>,
<CODE>start</CODE>, and <CODE>end</CODE> each hold a value, extracted from
<CODE>var-and-range</CODE>, that's then interpolated into the backquote
expression that generates <CODE>do-primes</CODE>'s expansion.</P><P>However, you don't need to take apart <CODE>var-and-range</CODE> "by hand"
because macro parameter lists are what are called <I>destructuring</I>
parameter lists. Destructuring, as the name suggests, involves taking
apart a structure--in this case the list structure of the forms
passed to a macro. </P><P>Within a destructuring parameter list, a simple parameter name can be
replaced with a nested parameter list. The parameters in the nested
parameter list will take their values from the elements of the
expression that would have been bound to the parameter the list
replaced. For instance, you can replace <CODE>var-and-range</CODE> with a
list <CODE>(var start end)</CODE>, and the three elements of the list will
automatically be destructured into those three parameters.</P><P>Another special feature of macro parameter lists is that you can use
<CODE><B>&body</B></CODE> as a synonym for <CODE><B>&rest</B></CODE>. Semantically <CODE><B>&body</B></CODE> and
<CODE><B>&rest</B></CODE> are equivalent, but many development environments will use
the presence of a <CODE><B>&body</B></CODE> parameter to modify how they indent uses
of the macro--typically <CODE><B>&body</B></CODE> parameters are used to hold a
list of forms that make up the body of the macro.</P><P>So you can streamline the definition of <CODE>do-primes</CODE> and give a
hint to both human readers and your development tools about its
intended use by defining it like this:</P><PRE>(defmacro do-primes ((var start end) &body body)
  `(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
       ((> ,var ,end))
     ,@body))</PRE><P>In addition to being more concise, destructuring parameter lists also
give you automatic error checking--with <CODE>do-primes</CODE> defined this
way, Lisp will be able to detect a call whose first argument isn't a
three-element list and will give you a meaningful error message just
as if you had called a function with too few or too many arguments.
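</P><P>To see that checking in action, you can expand a deliberately malformed call. This is a sketch rather than a guarantee: the definition is repeated so the snippet stands alone, and because the exact condition type and message differ between implementations, the handler simply catches any <CODE><B>ERROR</B></CODE>.</P><PRE>;; The destructuring definition from above, repeated for self-containment.
(defmacro do-primes ((var start end) &body body)
  `(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
       ((> ,var ,end))
     ,@body))

;; The first argument has two elements, so it can't be matched against
;; (var start end); the problem is caught when the call is expanded.
(handler-case (macroexpand-1 '(do-primes (p 0) (format t "~d " p)))
  (error (condition)
    (format t "Malformed call rejected: ~a~%" condition)
    :rejected))</PRE><P>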
Also, in development environments such as SLIME that indicate what
arguments are expected as soon as you type the name of a function or
macro, if you use a destructuring parameter list, the environment
will be able to tell you more specifically the syntax of the macro
call. With the original definition, SLIME would tell you
<CODE>do-primes</CODE> is called like this: </P><PRE>(do-primes var-and-range &rest body)</PRE><P>But with the new definition, it can tell you that a call should look
like this:</P><PRE>(do-primes (var start end) &body body)</PRE><P>Destructuring parameter lists can contain <CODE><B>&optional</B></CODE>, <CODE><B>&key</B></CODE>,
and <CODE><B>&rest</B></CODE> parameters and can contain nested destructuring lists.
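</P><P>As a sketch of the first of those options, the hypothetical macro below makes the upper bound optional. The name <CODE>do-primes-from</CODE> and the default of 100 are invented for this illustration and aren't part of the <CODE>do-primes</CODE> development in this chapter.</P><PRE>;; Hypothetical variant: END may be omitted, in which case it defaults to 100.
(defmacro do-primes-from ((var start &optional (end 100)) &body body)
  `(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
       ((> ,var ,end))
     ,@body))

;; With END supplied, the expansion tests (> p 19).
(macroexpand-1 '(do-primes-from (p 0 19) (print p)))

;; With END omitted, the default is interpolated: (> p 100).
(macroexpand-1 '(do-primes-from (p 90) (print p)))</PRE><P>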
However, you don't need any of those options to write
<CODE>do-primes</CODE>. </P><A NAME="generating-the-expansion"><H2>Generating the Expansion</H2></A><P>Because <CODE>do-primes</CODE> is a fairly simple macro, after you've
destructured the arguments, all that's left is to interpolate them
into a template to get the expansion.</P><P>For simple macros like <CODE>do-primes</CODE>, the special backquote syntax
is perfect. To review, a backquoted expression is similar to a quoted
expression except you can "unquote" particular subexpressions by
preceding them with a comma, possibly followed by an at (@) sign.
Without an at sign, the comma causes the value of the subexpression
to be included as is. With an at sign, the value--which must be a
list--is "spliced" into the enclosing list.</P><P>Another useful way to think about the backquote syntax is as a
particularly concise way of writing code that generates lists. This
way of thinking about it has the benefit of being pretty much exactly
what's happening under the covers--when the reader reads a backquoted
expression, it translates it into code that generates the appropriate
list structure. For instance, <CODE>`(,a b)</CODE> might be read as
<CODE>(list a 'b)</CODE>. The language standard doesn't specify exactly
what code the reader must produce as long as it generates the right
list structure.</P><P>Table 8-1 shows some examples of backquoted expressions along with
equivalent list-building code and the result you'd get if you
evaluated either the backquoted expression or the equivalent
code.<SUP>2</SUP> </P><P><DIV CLASS="table-caption">Table 8-1. Backquote Examples</DIV></P><TABLE CLASS="book-table"><TR><TD>Backquote Syntax</TD><TD>Equivalent List-Building Code</TD><TD>Result</TD></TR><TR><TD><CODE>`(a (+ 1 2) c)</CODE></TD><TD><CODE>(list 'a '(+ 1 2) 'c)</CODE></TD><TD><CODE>(a (+ 1 2) c)</CODE></TD></TR><TR><TD><CODE>`(a ,(+ 1 2) c)</CODE></TD><TD><CODE>(list 'a (+ 1 2) 'c)</CODE></TD><TD><CODE>(a 3 c)</CODE></TD></TR><TR><TD><CODE>`(a (list 1 2) c)</CODE></TD><TD><CODE>(list 'a '(list 1 2) 'c)</CODE></TD><TD><CODE>(a (list 1 2) c)</CODE></TD></TR><TR><TD><CODE>`(a ,(list 1 2) c)</CODE></TD><TD><CODE>(list 'a (list 1 2) 'c)</CODE></TD><TD><CODE>(a (1 2) c)</CODE></TD></TR><TR><TD><CODE>`(a ,@(list 1 2) c)</CODE></TD><TD><CODE>(append (list 'a) (list 1 2) (list 'c))</CODE></TD><TD><CODE>(a 1 2 c)</CODE></TD></TR></TABLE><P>It's important to note that backquote is just a convenience. But it's
a big convenience. To appreciate how big, compare the backquoted
version of <CODE>do-primes</CODE> to the following version, which uses
explicit list-building code: </P><PRE>(defmacro do-primes-a ((var start end) &body body)
  (append '(do)
          (list (list (list var
                            (list 'next-prime start)
                            (list 'next-prime (list '1+ var)))))
          (list (list (list '> var end)))
          body))</PRE><P>As you'll see in a moment, the current implementation of
<CODE>do-primes</CODE> doesn't handle certain edge cases correctly. But
first you should verify that it at least works for the original
example. You can test it in two ways. You can test it indirectly by
simply using it--presumably, if the resulting behavior is correct,
the expansion is correct. For instance, you can type the original
example's use of <CODE>do-primes</CODE> to the REPL and see that it indeed
prints the right series of prime numbers.</P><PRE>CL-USER> (do-primes (p 0 19) (format t "~d " p))
2 3 5 7 11 13 17 19
NIL</PRE><P>Or you can check the macro directly by looking at the expansion of a
particular call. The function <CODE><B>MACROEXPAND-1</B></CODE> takes any Lisp
expression as an argument and returns the result of doing one level
of macro expansion.<SUP>3</SUP> Because <CODE><B>MACROEXPAND-1</B></CODE> is a
function, to pass it a literal macro form you must quote it. You can
use it to see the expansion of the previous call.<SUP>4</SUP> </P><PRE>CL-USER> (macroexpand-1 '(do-primes (p 0 19) (format t "~d " p)))
(DO ((P (NEXT-PRIME 0) (NEXT-PRIME (1+ P))))
    ((> P 19))
  (FORMAT T "~d " P))
T</PRE><P>Or, more conveniently, in SLIME you can check a macro's expansion by
placing the cursor on the opening parenthesis of a macro form in your
source code and typing <CODE>C-c RET</CODE> to invoke the Emacs function
<CODE>slime-macroexpand-1</CODE>, which will pass the macro form to
<CODE><B>MACROEXPAND-1</B></CODE> and "pretty print" the result in a temporary
buffer.</P><P>However you get to it, you can see that the result of macro expansion
is the same as the original handwritten expansion, so it seems that
<CODE>do-primes</CODE> works. </P><A NAME="plugging-the-leaks"><H2>Plugging the Leaks</H2></A><P>In his essay "The Law of Leaky Abstractions," Joel Spolsky coined the
term <I>leaky abstraction</I> to describe an abstraction that "leaks"
details it's supposed to be abstracting away. Since writing a macro
is a way of creating an abstraction, you need to make sure your
macros don't leak needlessly.<SUP>5</SUP></P><P>As it turns out, a macro can leak details of its inner workings in
three ways. Luckily, it's pretty easy to tell whether a given macro
suffers from any of those leaks and to fix them.</P><P>The current definition suffers from one of the three possible macro
leaks: namely, it evaluates the <CODE>end</CODE> subform too many times.
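</P><P>One way to observe the extra evaluations directly is to put a counter in the <CODE>end</CODE> position. The sketch below repeats the utility functions and the current macro definition so it can run on its own; the names <CODE>do-primes-leaky</CODE> and <CODE>end-evaluation-count</CODE> are made up for this illustration.</P><PRE>(defun primep (number)
  (when (> number 1)
    (loop for fac from 2 to (isqrt number) never (zerop (mod number fac)))))

(defun next-prime (number)
  (loop for n from number when (primep n) return n))

;; Same body as the current do-primes, under a hypothetical name.
(defmacro do-primes-leaky ((var start end) &body body)
  `(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
       ((> ,var ,end))
     ,@body))

(defun end-evaluation-count ()
  "Count how often the END form runs while looping over the primes up to 19."
  (let ((evaluations 0))
    ;; The END form bumps the counter every time it's evaluated.
    (do-primes-leaky (p 0 (progn (incf evaluations) 19))
      (format t "~d " p))
    evaluations))

;; Prints the eight primes up to 19 and returns 9: one evaluation per
;; end test, including the final test that stops the loop at 23.
(end-evaluation-count)</PRE><P>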
Suppose you were to call <CODE>do-primes</CODE> with, instead of a literal
number such as <CODE>19</CODE>, an expression such as <CODE>(random 100)</CODE>
in the <CODE>end</CODE> position. </P><PRE>(do-primes (p 0 (random 100))
  (format t "~d " p))</PRE><P>Presumably the intent here is to loop over the primes from zero to
whatever random number is returned by <CODE>(random 100)</CODE>. However,
this isn't what the current implementation does, as
<CODE><B>MACROEXPAND-1</B></CODE> shows.</P><PRE>CL-USER> (macroexpand-1 '(do-primes (p 0 (random 100)) (format t "~d " p)))
(DO ((P (NEXT-PRIME 0) (NEXT-PRIME (1+ P))))
    ((> P (RANDOM 100)))
  (FORMAT T "~d " P))
T</PRE><P>When this expansion code is run, <CODE><B>RANDOM</B></CODE> will be called each time
the end test for the loop is evaluated. Thus, instead of looping
until <CODE>p</CODE> is greater than an initially chosen random number,
this loop will iterate until it happens to draw a random number less
than the current value of <CODE>p</CODE>. While the total
number of iterations will still be random, it will be drawn from a
much different distribution than the uniform distribution <CODE><B>RANDOM</B></CODE>
returns.</P><P>This is a leak in the abstraction because, to use the macro correctly,
the caller needs to be aware that the <CODE>end</CODE> form is going to be
evaluated more than once. One way to plug this leak would be to simply
define this as the behavior of <CODE>do-primes</CODE>. But that's not very
satisfactory--you should try to observe the Principle of Least
Astonishment when implementing macros. And programmers will typically
expect the forms they pass to macros to be evaluated no more times
than absolutely necessary.<SUP>6</SUP> Furthermore, since <CODE>do-primes</CODE> is built
on the model of the standard macros, <CODE><B>DOTIMES</B></CODE> and <CODE><B>DOLIST</B></CODE>,
neither of which causes any of the forms except those in the body to
be evaluated more than once, most programmers will expect
<CODE>do-primes</CODE> to behave similarly.</P><P>You can fix the multiple evaluation easily enough; you just need to
generate code that evaluates <CODE>end</CODE> once and saves the value in a
variable to be used later. Recall that in a <CODE><B>DO</B></CODE> loop, variables
defined with an initialization form and no step form don't change
from iteration to iteration. So you can fix the multiple evaluation
problem with this definition:</P><PRE>(defmacro do-primes ((var start end) &body body)
  `(do ((ending-value ,end)
        (,var (next-prime ,start) (next-prime (1+ ,var))))
       ((> ,var ending-value))
     ,@body))</PRE><P>Unfortunately, this fix introduces two new leaks to the macro
abstraction.</P><P>One new leak is similar to the multiple-evaluation leak you just
fixed. Because the initialization forms for variables in a <CODE><B>DO</B></CODE>
loop are evaluated in the order the variables are defined, when the
macro expansion is evaluated, the expression passed as <CODE>end</CODE>
will be evaluated before the expression passed as <CODE>start</CODE>,
opposite to the order they appear in the macro call. This leak
doesn't cause any problems when <CODE>start</CODE> and <CODE>end</CODE> are
literal values like 0 and 19. But when they're forms that can have
side effects, evaluating them out of order can once again run afoul
of the Principle of Least Astonishment. </P><P>This leak is trivially plugged by swapping the order of the two
variable definitions.</P><PRE>(defmacro do-primes ((var start end) &body body)
  `(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
        (ending-value ,end))
       ((> ,var ending-value))
     ,@body))</PRE><P>The last leak you need to plug was created by using the variable name
<CODE>ending-value</CODE>. The problem is that the name, which ought to be
a purely internal detail of the macro implementation, can end up
interacting with code passed to the macro or in the context where the
macro is called. The following seemingly innocent call to
<CODE>do-primes</CODE> doesn't work correctly because of this leak:</P><PRE>(do-primes (ending-value 0 10)
|
||
|
(print ending-value))</PRE><P>Neither does this one:</P><PRE>(let ((ending-value 0))
|
||
|
(do-primes (p 0 10)
|
||
|
(incf ending-value p))
|
||
|
ending-value)</PRE><P>Again, <CODE><B>MACROEXPAND-1</B></CODE> can show you the problem. The first call
|
||
|
expands to this: </P><PRE>(do ((ending-value (next-prime 0) (next-prime (1+ ending-value)))
     (ending-value 10))
    ((> ending-value ending-value))
  (print ending-value))</PRE><P>Some Lisps may reject this code because <CODE>ending-value</CODE> is used
twice as a variable name in the same <CODE><B>DO</B></CODE> loop. If not rejected
outright, the code will loop forever since <CODE>ending-value</CODE> will
never be greater than itself.</P><P>The second problem call expands to the following:</P><PRE>(let ((ending-value 0))
  (do ((p (next-prime 0) (next-prime (1+ p)))
       (ending-value 10))
      ((> p ending-value))
    (incf ending-value p))
  ending-value)</PRE><P>In this case the generated code is perfectly legal, but the behavior
isn't at all what you want. Because the binding of
<CODE>ending-value</CODE> established by the <CODE><B>LET</B></CODE> outside the loop is
shadowed by the variable with the same name inside the <CODE><B>DO</B></CODE>, the
form <CODE>(incf ending-value p)</CODE> increments the loop variable
<CODE>ending-value</CODE> instead of the outer variable with the same name,
creating another infinite loop.<SUP>7</SUP></P><P>Clearly, what you need to patch this leak is a symbol that will never
be used outside the code generated by the macro. You could try using
a really unlikely name, but that's no guarantee. You could also
protect yourself to some extent by using packages, as described in
Chapter 21. But there's a better solution. </P><P>The function <CODE><B>GENSYM</B></CODE> returns a unique symbol each time it's
called. This is a symbol that has never been read by the Lisp reader
and never will be because it isn't interned in any package. Thus,
instead of using a literal name like <CODE>ending-value</CODE>, you can
generate a new symbol each time <CODE>do-primes</CODE> is expanded.</P><PRE>(defmacro do-primes ((var start end) &body body)
  (let ((ending-value-name (gensym)))
    `(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
          (,ending-value-name ,end))
         ((> ,var ,ending-value-name))
       ,@body)))</PRE><P>Note that the code that calls <CODE><B>GENSYM</B></CODE> isn't part of the
expansion; it runs as part of the macro expander and thus creates a
new symbol each time the macro is expanded. This may seem a bit
strange at first--<CODE>ending-value-name</CODE> is a variable whose value
is the name of another variable. But really it's no different from
the parameter <CODE>var</CODE> whose value is the name of a variable--the
difference is that the value of <CODE>var</CODE> was created by the reader when
the macro form was read, and the value of <CODE>ending-value-name</CODE> is
generated programmatically when the macro code runs.</P><P>With this definition the two previously problematic forms expand into
code that works the way you want. The first form: </P><PRE>(do-primes (ending-value 0 10)
  (print ending-value))</PRE><P>expands into the following:</P><PRE>(do ((ending-value (next-prime 0) (next-prime (1+ ending-value)))
     (#:g2141 10))
    ((> ending-value #:g2141))
  (print ending-value))</PRE><P>Now the variable used to hold the ending value is the gensymed
symbol, <CODE>#:g2141</CODE>. The name of the symbol, <CODE>G2141</CODE>, was
generated by <CODE><B>GENSYM</B></CODE> but isn't significant; the thing that
matters is the object identity of the symbol. Gensymed symbols are
printed in the normal syntax for uninterned symbols, with a leading
<CODE>#:</CODE>.</P><P>The other previously problematic form:</P><PRE>(let ((ending-value 0))
  (do-primes (p 0 10)
    (incf ending-value p))
  ending-value)</PRE><P>looks like this if you replace the <CODE>do-primes</CODE> form with its
expansion:</P><PRE>(let ((ending-value 0))
  (do ((p (next-prime 0) (next-prime (1+ p)))
       (#:g2140 10))
      ((> p #:g2140))
    (incf ending-value p))
  ending-value)</PRE><P>Again, there's no leak since the <CODE>ending-value</CODE> variable bound
by the <CODE><B>LET</B></CODE> surrounding the <CODE>do-primes</CODE> loop is no longer
shadowed by any variables introduced in the expanded code.</P><P>Not all literal names used in a macro expansion will necessarily
cause a problem--as you get more experience with the various binding
forms, you'll be able to determine whether a given name is being used
in a position that could cause a leak in a macro abstraction. But
there's no real downside to using a gensymed name just to be safe.</P><P>With that fix, you've plugged all the leaks in the implementation of
<CODE>do-primes</CODE>. Once you've gotten a bit of macro-writing
experience under your belt, you'll learn to write macros with these
kinds of leaks preplugged. It's actually fairly simple if you follow
these rules of thumb: </P><UL><LI>Unless there's a particular reason to do otherwise, include any
subforms in the expansion in positions that will be evaluated in the
same order as the subforms appear in the macro call.</LI><LI>Unless there's a particular reason to do otherwise, make sure
subforms are evaluated only once by creating a variable in the
expansion to hold the value of evaluating the argument form and then
using that variable anywhere else the value is needed in the
expansion.</LI><LI>Use <CODE><B>GENSYM</B></CODE> at macro expansion time to create variable names
used in the expansion.</LI></UL><A NAME="macro-writing-macros"><H2>Macro-Writing Macros</H2></A><P>Of course, there's no reason you should be able to take advantage of
macros only when writing functions. The job of macros is to abstract
away common syntactic patterns, and certain patterns come up again
and again in writing macros that can also benefit from being
abstracted away.</P><P>In fact, you've already seen one such pattern--many macros will, like
the last version of <CODE>do-primes</CODE>, start with a <CODE><B>LET</B></CODE> that
introduces a few variables holding gensymed symbols to be used in the
macro's expansion. Since this is such a common pattern, why not
abstract it away with its own macro?</P><P>In this section you'll write a macro, <CODE>with-gensyms</CODE>, that does
just that. In other words, you'll write a macro-writing macro: a
macro that generates code that generates code. While complex
macro-writing macros can be a bit confusing until you get used to
keeping the various levels of code clear in your mind,
<CODE>with-gensyms</CODE> is fairly straightforward and will serve as a
useful but not too strenuous mental limbering exercise.</P><P>You want to be able to write something like this: </P><PRE>(defmacro do-primes ((var start end) &body body)
  (with-gensyms (ending-value-name)
    `(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
          (,ending-value-name ,end))
         ((> ,var ,ending-value-name))
       ,@body)))</PRE><P>and have it be equivalent to the previous version of
<CODE>do-primes</CODE>. In other words, the <CODE>with-gensyms</CODE> form needs to
expand into a <CODE><B>LET</B></CODE> that binds each named variable,
<CODE>ending-value-name</CODE> in this case, to a gensymed symbol. That's
easy enough to write with a simple backquote template.</P><PRE>(defmacro with-gensyms ((&rest names) &body body)
  `(let ,(loop for n in names collect `(,n (gensym)))
     ,@body))</PRE><P>Note how you can use a comma to interpolate the value of the
<CODE><B>LOOP</B></CODE> expression. The loop generates a list of binding forms
where each binding form consists of a list containing one of the
names given to <CODE>with-gensyms</CODE> and the literal code
<CODE>(gensym)</CODE>. You can test what code the <CODE><B>LOOP</B></CODE> expression
would generate at the REPL by replacing <CODE>names</CODE> with a list of
symbols. </P><PRE>CL-USER> (loop for n in '(a b c) collect `(,n (gensym)))
((A (GENSYM)) (B (GENSYM)) (C (GENSYM)))</PRE><P>After the list of binding forms, the body argument to
<CODE>with-gensyms</CODE> is spliced in as the body of the <CODE><B>LET</B></CODE>. Thus,
in the code you wrap in a <CODE>with-gensyms</CODE> you can refer to any of
the variables named in the list of variables passed to
<CODE>with-gensyms</CODE>.</P><P>If you macro-expand the <CODE>with-gensyms</CODE> form in the new
definition of <CODE>do-primes</CODE>, you should see something like this:</P><PRE>(let ((ending-value-name (gensym)))
  `(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
        (,ending-value-name ,end))
       ((> ,var ,ending-value-name))
     ,@body))</PRE><P>Looks good. While this macro is fairly trivial, it's important to
be clear about when the different macros are expanded: when you
compile the <CODE><B>DEFMACRO</B></CODE> of <CODE>do-primes</CODE>, the
<CODE>with-gensyms</CODE> form is expanded into the code just shown and
compiled. Thus, the compiled version of <CODE>do-primes</CODE> is just the
same as if you had written the outer <CODE><B>LET</B></CODE> by hand. When you
compile a function that uses <CODE>do-primes</CODE>, the code <I>generated</I>
by <CODE>with-gensyms</CODE> runs, generating the <CODE>do-primes</CODE>
expansion, but <CODE>with-gensyms</CODE> itself isn't needed to compile a
<CODE>do-primes</CODE> form since it has already been expanded, back when
<CODE>do-primes</CODE> was compiled. </P><DIV CLASS="sidebarhead">Another classic macro-writing MACRO: ONCE-ONLY</DIV><DIV CLASS="sidebar"><P>Another classic macro-writing macro is <CODE>once-only</CODE>,
which is used to generate code that evaluates certain macro arguments
once only and in a particular order. Using <CODE>once-only</CODE>, you
could write <CODE>do-primes</CODE> almost as simply as the original leaky
version, like this:</P><PRE>(defmacro do-primes ((var start end) &body body)
  (once-only (start end)
    `(do ((,var (next-prime ,start) (next-prime (1+ ,var))))
         ((> ,var ,end))
       ,@body)))</PRE><P>However, the implementation of <CODE>once-only</CODE> is a bit too involved
for a blow-by-blow explanation, as it relies on multiple levels of
backquoting and unquoting. If you really want to sharpen your macro
chops, you can try to figure out how it works. It looks like this:</P><PRE>(defmacro once-only ((&rest names) &body body)
  (let ((gensyms (loop for n in names collect (gensym))))
    `(let (,@(loop for g in gensyms collect `(,g (gensym))))
      `(let (,,@(loop for g in gensyms for n in names collect ``(,,g ,,n)))
        ,(let (,@(loop for n in names for g in gensyms collect `(,n ,g)))
           ,@body)))))</PRE></DIV><A NAME="beyond-simple-macros"><H2>Beyond Simple Macros</H2></A><P>I could, of course, say a lot more about macros. All the macros
you've seen so far have been fairly simple examples that save you a
bit of typing but don't provide radical new ways of expressing
things. In upcoming chapters you'll see examples of macros that allow
you to express things in ways that would be virtually impossible
without macros. You'll start in the very next chapter, in which
you'll build a simple but effective unit test framework.
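</P><P>Before moving on, it may help to see the pieces of this chapter
working together. What follows is only a sketch: <CODE>primep</CODE> and
<CODE>next-prime</CODE> are simple stand-ins for the helpers that
<CODE>do-primes</CODE> relies on, and the function names
<CODE>sum-primes-to-10</CODE> and <CODE>evaluation-order</CODE> are
hypothetical, made up for this illustration. The two functions exercise
the final <CODE>do-primes</CODE> with the call that used to go wrong and
with start and end forms that have side effects.</P><PRE>;; Stand-ins for the helpers do-primes depends on (assumed definitions).
(defun primep (number)
  (when (> number 1)
    (loop for fac from 2 to (isqrt number) never (zerop (mod number fac)))))

(defun next-prime (number)
  (loop for n from number when (primep n) return n))

;; The macro-writing macro from this chapter.
(defmacro with-gensyms ((&rest names) &body body)
  `(let ,(loop for n in names collect `(,n (gensym)))
     ,@body))

;; The final, leak-free do-primes.
(defmacro do-primes ((var start end) &body body)
  (with-gensyms (ending-value-name)
    `(do ((,var (next-prime ,start) (next-prime (1+ ,var)))
          (,ending-value-name ,end))
         ((> ,var ,ending-value-name))
       ,@body)))

;; Hypothetical helper: the call that used to shadow the outer
;; ending-value now accumulates the primes into it.
(defun sum-primes-to-10 ()
  (let ((ending-value 0))
    (do-primes (p 0 10)
      (incf ending-value p))
    ending-value))

;; Hypothetical helper: records the order in which the start and
;; end forms are evaluated, and how many times.
(defun evaluation-order ()
  (let ((order '()))
    (do-primes (p (progn (push :start order) 0)
                  (progn (push :end order) 10))
      p)
    (reverse order)))</PRE><P>Assuming these definitions, <CODE>(sum-primes-to-10)</CODE> should
return 17, the sum of 2, 3, 5, and 7, and <CODE>(evaluation-order)</CODE>
should return <CODE>(:START :END)</CODE>, showing that each form is
evaluated once and in the order it appears in the macro call.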
</P><HR/><DIV CLASS="notes"><P><SUP>1</SUP>As with functions, macros can also contain
declarations, but you don't need to worry about those for now.</P><P><SUP>2</SUP><CODE><B>APPEND</B></CODE>, which I haven't discussed yet, is a function
that takes any number of list arguments and returns the result of
splicing them together into a single list.</P><P><SUP>3</SUP>Another function, <CODE><B>MACROEXPAND</B></CODE>, keeps
expanding the result as long as the first element of the resulting
expansion is the name of a macro. However, this will often show you
a much lower-level view of what the code is doing than you want,
since basic control constructs such as <CODE><B>DO</B></CODE> are also implemented
as macros. In other words, while it can be educational to see what
your macro ultimately expands into, it isn't a very useful view into
what your own macros are doing.</P><P><SUP>4</SUP>If the macro
expansion is shown all on one line, it's probably because the
variable <CODE><B>*PRINT-PRETTY*</B></CODE> is <CODE><B>NIL</B></CODE>. If it is, evaluating
<CODE>(setf *print-pretty* t)</CODE> should make the macro expansion easier
to read.</P><P><SUP>5</SUP>This is from <I>Joel on Software</I>
by Joel Spolsky, also available at
<CODE>http://www.joelonsoftware.com/articles/LeakyAbstractions.html</CODE>. Spolsky's point in the essay is
that all abstractions leak to some extent; that is, there are no
perfect abstractions. But that doesn't mean you should tolerate leaks
you can easily plug.</P><P><SUP>6</SUP>Of course, certain forms are supposed
to be evaluated more than once, such as the forms in the body of a
<CODE>do-primes</CODE> loop.</P><P><SUP>7</SUP>It may not be obvious that this
loop is necessarily infinite given the nonuniform occurrences of
prime numbers. The starting point for a proof that it is in fact
infinite is Bertrand's postulate, which says for any <I>n</I> > 1, there
exists a prime <I>p</I> such that <I>n</I> < <I>p</I> < <I>2n</I>. From there you can
prove that for any prime number P that is less than the sum of the preceding
prime numbers, the next prime, P', is also smaller than the original
sum plus P.</P></DIV></BODY></HTML>