<HTML><HEAD><TITLE>Programming in the Large: Packages and Symbols</TITLE><LINK REL="stylesheet" TYPE="text/css" HREF="style.css"/></HEAD><BODY><DIV CLASS="copyright">Copyright &copy; 2003-2005, Peter Seibel</DIV><H1>21. Programming in the Large: Packages and Symbols</H1><P>In Chapter 4 I discussed how the Lisp reader translates textual names
into objects to be passed to the evaluator, representing them with a
kind of object called a <I>symbol</I>. It turns out that having a
built-in data type specifically for representing names is quite handy
for a lot of kinds of programming.<SUP>1</SUP>
That, however, isn't the topic of this chapter. In this chapter I'll
discuss one of the more immediate and practical aspects of dealing
with names: how to avoid name conflicts between independently
developed pieces of code.</P><P>Suppose, for instance, you're writing a program and decide to use a
third-party library. You don't want to have to know the name of every
function, variable, class, or macro used in the internals of that
library in order to avoid conflicts between those names and the names
you use in your program. You'd like for most of the names in the
library and the names in your program to be considered distinct even
if they happen to have the same textual representation. At the same
time, you'd like certain names defined in the library to be readily
accessible--the names that make up its public API, which you'll want
to use in your program.</P><P>In Common Lisp, this namespace problem boils down to a question of
controlling how the reader translates textual names into symbols: if
you want two occurrences of the same name to be considered the same
by the evaluator, you need to make sure the reader uses the same
symbol to represent each name. Conversely, if you want two names to
be considered distinct, even if they happen to have the same textual
name, you need the reader to create different symbols to represent
each name.</P><A NAME="how-the-reader-uses-packages"><H2>How the Reader Uses Packages</H2></A><P>In Chapter 4 I discussed briefly how the Lisp reader translates names
into symbols, but I glossed over most of the details--now it's time
to take a closer look at what actually happens.</P><P>I'll start by describing the syntax of names understood by the reader
and how that syntax relates to packages. For the moment you can think
of a package as a table that maps strings to symbols. As you'll see
in the next section, the actual mapping is slightly more flexible
than a simple lookup table but not in ways that matter much to the
reader. Each package also has a name, which can be used to find the
package using the function <CODE><B>FIND-PACKAGE</B></CODE>.</P><P>The two key functions that the reader uses to access the
name-to-symbol mappings in a package are <CODE><B>FIND-SYMBOL</B></CODE> and
<CODE><B>INTERN</B></CODE>. Both these functions take a string and, optionally, a
package. If not supplied, the package argument defaults to the value
of the global variable <CODE><B>*PACKAGE*</B></CODE>, also called the <I>current
package</I>.</P><P><CODE><B>FIND-SYMBOL</B></CODE> looks in the package for a symbol with the given
string for a name and returns it, or <CODE><B>NIL</B></CODE> if no symbol is found.
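</P><P>As a minimal sketch of that lookup, you can call <CODE><B>FIND-SYMBOL</B></CODE>
directly. The name <CODE>NO-SUCH-NAME-XYZZY</CODE> is made up for illustration
and is assumed not to exist in <CODE>COMMON-LISP</CODE>. Only the primary return
value is shown; <CODE><B>FIND-SYMBOL</B></CODE> also returns a second value
describing how the symbol was found.</P><PRE>(find-symbol &quot;CAR&quot; &quot;COMMON-LISP&quot;)                ==&gt; CAR
(find-symbol &quot;NO-SUCH-NAME-XYZZY&quot; &quot;COMMON-LISP&quot;) ==&gt; NIL</PRE><P>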
<CODE><B>INTERN</B></CODE> will also return an existing symbol; otherwise it creates
a new symbol with the string as its name and adds it to the package. </P><P>Most names you use are <I>unqualified</I>, names that contain no colons.
When the reader reads such a name, it translates it to a symbol by
converting any unescaped letters to uppercase and passing the
resulting string to <CODE><B>INTERN</B></CODE>. Thus, each time the reader reads the
same name in the same package, it'll get the same symbol object. This
is important because the evaluator uses the object identity of
symbols to determine which function, variable, or other program
element a given symbol refers to. Thus, an expression such as
<CODE>(hello-world)</CODE> calls a particular <CODE>hello-world</CODE>
function because the reader returns the same symbol when it reads the
function call as it did when it read the <CODE><B>DEFUN</B></CODE> form
that defined the function.</P><P>A name containing either a single colon or a double colon is a
package-qualified name. When the reader reads a package-qualified
name, it splits the name on the colon(s) and uses the first part as
the name of a package and the second part as the name of the symbol.
The reader looks up the appropriate package and uses it to translate
the symbol name to a symbol object.</P><P>A name containing only a single colon must refer to an <I>external</I>
symbol--one the package <I>exports</I> for public use. If the named
package doesn't contain a symbol with a given name, or if it does but
it hasn't been exported, the reader signals an error. A double-colon
name can refer to any symbol from the named package, though it's
usually a bad idea--the set of exported symbols defines a package's
public interface, and if you don't respect the package author's
decision about what names to make public and which ones to keep
private, you're asking for trouble down the road. On the other hand,
sometimes a package author will neglect to export a symbol that
really ought to be public. In that case, a double-colon name lets you
get work done without having to wait for the next version of the
package to be released. </P><P>Two other bits of symbol syntax the reader understands are those for
keyword symbols and uninterned symbols. Keyword symbols are written
with names starting with a colon. Such symbols are interned in the
package named <CODE><B>KEYWORD</B></CODE> and automatically exported. Additionally,
when the reader interns a symbol in the <CODE><B>KEYWORD</B></CODE> package, it also defines
a constant variable with the symbol as both its name and value. This
is why you can use keywords in argument lists without quoting
them--when they appear in a value position, they evaluate to
themselves. Thus:</P><PRE>(eql ':foo :foo) ==&gt; T</PRE><P>The names of keyword symbols, like all symbols, are converted to all
uppercase by the reader before they're interned. The name doesn't
include the leading colon.</P><PRE>(symbol-name :foo) ==&gt; &quot;FOO&quot;</PRE><P>Uninterned symbols are written with a leading <CODE>#:</CODE>. These names
(minus the <CODE>#:</CODE>) are converted to uppercase as normal and then
translated into symbols, but the symbols aren't interned in any
package; each time the reader reads a <CODE>#:</CODE> name, it creates a
new symbol. Thus:</P><PRE>(eql '#:foo '#:foo) ==&gt; NIL</PRE><P>You'll rarely, if ever, write this syntax yourself, but will
sometimes see it when you print an s-expression containing symbols
returned by the function <CODE><B>GENSYM</B></CODE>.</P><PRE>(gensym) ==&gt; #:G3128</PRE><A NAME="a-bit-of-package-and-symbol-vocabulary"><H2>A Bit of Package and Symbol Vocabulary</H2></A><P>As I mentioned previously, the mapping from names to symbols
implemented by a package is slightly more flexible than a simple
lookup table. At its core, every package contains a name-to-symbol
lookup table, but a symbol can be made accessible via an unqualified
name in a given package in other ways. To talk sensibly about these
other mechanisms, you'll need a little bit of vocabulary.</P><P>To start with, all the symbols that can be found in a given package
using <CODE><B>FIND-SYMBOL</B></CODE> are said to be <I>accessible</I> in that package.
In other words, the accessible symbols in a package are those that
can be referred to with unqualified names when the package is
current.</P><P>A symbol can be accessible in two ways. The first is for the
package's name-to-symbol table to contain an entry for the symbol, in
which case the symbol is said to be <I>present</I> in the package. When
the reader interns a new symbol in a package, it's added to the
package's name-to-symbol table. The package in which a symbol is
first interned is called the symbol's <I>home package</I>.</P><P>The other way a symbol can be accessible in a package is if the
package <I>inherits</I> it. A package inherits symbols from other
packages by <I>using</I> the other packages. Only <I>external</I> symbols
in the used packages are inherited. A symbol is made external in a
package by <I>exporting</I> it. In addition to causing it to be
inherited by using packages, exporting a symbol also--as you saw in
the previous section--makes it possible to refer to the symbol using
a single-colon qualified name. </P><P>To keep the mappings from names to symbols deterministic, the package
system allows only one symbol to be accessible in a given package for
each name. That is, a package can't have a present symbol and an
inherited symbol with the same name or inherit two different symbols,
from different packages, with the same name. However, you can resolve
conflicts by making one of the accessible symbols a <I>shadowing</I>
symbol, which makes the other symbols of the same name inaccessible.
In addition to its name-to-symbol table, each package maintains a
list of shadowing symbols.</P><P>An existing symbol can be <I>imported</I> into another package by adding
it to the package's name-to-symbol table. Thus, the same symbol can
be present in multiple packages. Sometimes you'll import symbols
simply because you want them to be accessible in the importing
package without using their home package. Other times you'll import a
symbol because only present symbols can be exported or be shadowing
symbols. For instance, if a package needs to use two packages that
have external symbols of the same name, one of the symbols must be
imported into the using package in order to be added to its shadowing
list and make the other symbol inaccessible. </P><P>Finally, a present symbol can be <I>uninterned</I> from a package, which
causes it to be removed from the name-to-symbol table and, if it's a
shadowing symbol, from the shadowing list. You might unintern a
symbol from a package to resolve a conflict between the symbol and an
external symbol from a package you want to use. A symbol that isn't
present in any package is called an <I>uninterned</I> symbol, can no
longer be read by the reader, and will be printed using the
<CODE>#:foo</CODE> syntax.</P><A NAME="three-standard-packages"><H2>Three Standard Packages</H2></A><P>In the next section I'll show you how to define your own packages,
including how to make one package use another and how to export,
shadow, and import symbols. But first let's look at a few packages
you've been using already. When you first start Lisp, the value of
<CODE><B>*PACKAGE*</B></CODE> is typically the <CODE>COMMON-LISP-USER</CODE> package, also
known as <CODE>CL-USER</CODE>.<SUP>2</SUP>
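</P><P>As a small illustration of package names and nicknames, you can ask
<CODE><B>FIND-PACKAGE</B></CODE> for both spellings and check that they name the
same package object. This is just a sketch to try at the REPL.</P><PRE>(eq (find-package &quot;CL-USER&quot;) (find-package &quot;COMMON-LISP-USER&quot;)) ==&gt; T
(package-name (find-package &quot;CL-USER&quot;)) ==&gt; &quot;COMMON-LISP-USER&quot;</PRE><P>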
<CODE>CL-USER</CODE> uses the package <CODE>COMMON-LISP</CODE>, which exports all
the names defined by the language standard. Thus, when you type an
expression at the REPL, all the names of standard functions, macros,
variables, and so on, will be translated to the symbols exported from
<CODE>COMMON-LISP</CODE>, and all other names will be interned in the
<CODE>COMMON-LISP-USER</CODE> package. For example, the name <CODE><B>*PACKAGE*</B></CODE>
is exported from <CODE>COMMON-LISP</CODE>--if you want to see the value of
<CODE><B>*PACKAGE*</B></CODE>, you can type this: </P><PRE>CL-USER&gt; *package*
#&lt;The COMMON-LISP-USER package&gt;</PRE><P>because <CODE>COMMON-LISP-USER</CODE> uses <CODE>COMMON-LISP</CODE>. Or you can
use a package-qualified name.</P><PRE>CL-USER&gt; common-lisp:*package*
#&lt;The COMMON-LISP-USER package&gt;</PRE><P>You can even use <CODE>COMMON-LISP</CODE>'s nickname, <CODE>CL</CODE>.</P><PRE>CL-USER&gt; cl:*package*
#&lt;The COMMON-LISP-USER package&gt;</PRE><P>But <CODE>*X*</CODE> isn't a symbol in <CODE>COMMON-LISP</CODE>, so if you type
this:</P><PRE>CL-USER&gt; (defvar *x* 10)
*X*</PRE><P>the reader reads <CODE><B>DEFVAR</B></CODE> as the symbol from the
<CODE>COMMON-LISP</CODE> package and <CODE>*X*</CODE> as a symbol in
<CODE>COMMON-LISP-USER</CODE>.</P><P>The REPL can't start in the <CODE>COMMON-LISP</CODE> package because you're
not allowed to intern new symbols in it; <CODE>COMMON-LISP-USER</CODE>
serves as a &quot;scratch&quot; package where you can create your own names
while still having easy access to all the symbols in
<CODE>COMMON-LISP</CODE>.<SUP>3</SUP> Typically, all packages you'll define will also use
<CODE>COMMON-LISP</CODE>, so you don't have to write things like this:</P><PRE>(cl:defun add-two (x) (cl:+ x 2)) ; ADD-TWO is just a placeholder name</PRE><P>The third standard package is the <CODE><B>KEYWORD</B></CODE> package, the package
the Lisp reader uses to intern names starting with a colon. Thus, you
can also refer to any keyword symbol with an explicit package
qualification of <CODE>keyword</CODE> like this: </P><PRE>CL-USER&gt; :a
:A
CL-USER&gt; keyword:a
:A
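;; Sketch: ask for the home package of :a (it should be KEYWORD).
CL-USER&gt; (package-name (symbol-package :a))
&quot;KEYWORD&quot;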
CL-USER&gt; (eql :a keyword:a)
T</PRE><A NAME="defining-your-own-packages"><H2>Defining Your Own Packages</H2></A><P>Working in <CODE>COMMON-LISP-USER</CODE> is fine for experiments at the
REPL, but once you start writing actual programs you'll want to
define new packages so different programs loaded into the same Lisp
environment don't stomp on each other's names. And when you write
libraries that you intend to use in different contexts, you'll want
to define separate packages and then export the symbols that make up
the libraries' public APIs.</P><P>However, before you start defining packages, it's important to
understand one thing about what packages do <I>not</I> do. Packages
don't provide direct control over who can call what function or
access what variable. They provide you with basic control over
namespaces by controlling how the reader translates textual names
into symbol objects, but it isn't until later, in the evaluator, that
the symbol is interpreted as the name of a function or variable or
whatever else. Thus, it doesn't make sense to talk about exporting a
function or a variable from a package. You can export symbols to make
certain names easier to refer to, but the package system doesn't
allow you to restrict how those names are used.<SUP>4</SUP></P><P>With that in mind, you can start looking at how to define packages
and tie them together. You define new packages with the macro
<CODE><B>DEFPACKAGE</B></CODE>, which allows you to not only create the package but
to specify what packages it uses, what symbols it exports, and what
symbols it imports from other packages and to resolve conflicts by
creating shadowing symbols.<SUP>5</SUP> </P><P>I'll describe the various options in terms of how you might use
packages while writing a program that organizes e-mail messages into
a searchable database. The program is purely hypothetical, as are the
libraries I'll refer to--the point is to look at how the packages
used in such a program might be structured.</P><P>The first package you'd need is one to provide a namespace for the
application--you want to be able to name your functions, variables,
and so on, without having to worry about name collisions with
unrelated code. So you'd define a new package with <CODE><B>DEFPACKAGE</B></CODE>.</P><P>If the application is simple enough to be written with no libraries
beyond the facilities provided by the language itself, you could
define a simple package like this: </P><PRE>(defpackage :com.gigamonkeys.email-db
  (:use :common-lisp))</PRE><P>This defines a package, named <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE>, that
inherits all the symbols exported by the <CODE>COMMON-LISP</CODE>
package.<SUP>6</SUP></P><P>You actually have several choices of how to represent the names of
packages and, as you'll see, the names of symbols in a
<CODE><B>DEFPACKAGE</B></CODE>. Packages and symbols are named with strings. However,
in a <CODE><B>DEFPACKAGE</B></CODE> form, you can specify the names of packages and
symbols with <I>string designators</I>. A string designator is either a
string, which designates itself; a symbol, which designates its name;
or a character, which designates a one-character string containing
just the character. Using keyword symbols, as in the previous
<CODE><B>DEFPACKAGE</B></CODE>, is a common style that allows you to write the names
in lowercase--the reader will convert the names to uppercase for you.
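</P><P>To see what each kind of string designator stands for, you can pass one to
the standard function <CODE><B>STRING</B></CODE>, which returns the string a
designator denotes. This is only a sketch for experimenting at the REPL;
<CODE><B>DEFPACKAGE</B></CODE> isn't required to call <CODE><B>STRING</B></CODE>
itself, though it interprets designators the same way.</P><PRE>(string :com.gigamonkeys.email-db) ==&gt; &quot;COM.GIGAMONKEYS.EMAIL-DB&quot;
(string &quot;COMMON-LISP&quot;)             ==&gt; &quot;COMMON-LISP&quot;
(string #\a)                       ==&gt; &quot;a&quot;</PRE><P>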
You could also write the <CODE><B>DEFPACKAGE</B></CODE> with strings, but then you
have to write them in all uppercase: the case conversion performed by
the reader means the true names of most symbols and packages are in
fact uppercase.<SUP>7</SUP></P><PRE>(defpackage &quot;COM.GIGAMONKEYS.EMAIL-DB&quot;
  (:use &quot;COMMON-LISP&quot;))</PRE><P>You could also use nonkeyword symbols--the names in <CODE><B>DEFPACKAGE</B></CODE>
aren't evaluated--but then the very act of reading the
<CODE><B>DEFPACKAGE</B></CODE> form would cause those symbols to be interned in the
current package, which at the very least will pollute that namespace
and may also cause problems later if you try to use the
package.<SUP>8</SUP></P><P>To read code in this package, you need to make it the current package
with the <CODE><B>IN-PACKAGE</B></CODE> macro:</P><PRE>(in-package :com.gigamonkeys.email-db)</PRE><P>If you type this expression at the REPL, it will change the value of
<CODE><B>*PACKAGE*</B></CODE>, affecting how the REPL reads subsequent expressions,
until you change it with another call to <CODE><B>IN-PACKAGE</B></CODE>. Similarly,
if you include an <CODE><B>IN-PACKAGE</B></CODE> in a file that's loaded with
<CODE><B>LOAD</B></CODE> or compiled with <CODE><B>COMPILE-FILE</B></CODE>, it will change the
package, affecting the way subsequent expressions in the file are
read.<SUP>9</SUP></P><P>With the current package set to the <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE>
package, other than names inherited from the <CODE>COMMON-LISP</CODE>
package, you can use any name you want for whatever purpose you want.
Thus, you could define a new <CODE>hello-world</CODE> function that could
coexist with the <CODE>hello-world</CODE> function previously defined in
<CODE>COMMON-LISP-USER</CODE>. Here's the behavior of the existing
function: </P><PRE>CL-USER&gt; (hello-world)
hello, world
NIL</PRE><P>Now you can switch to the new package using <CODE><B>IN-PACKAGE</B></CODE>.<SUP>10</SUP> Notice how the prompt changes--the exact
form is determined by the development environment, but in SLIME the
default prompt consists of an abbreviated version of the package
name. </P><PRE>CL-USER&gt; (in-package :com.gigamonkeys.email-db)
#&lt;The COM.GIGAMONKEYS.EMAIL-DB package&gt;
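;; Sketch: confirm that the current package really changed.
EMAIL-DB&gt; (package-name *package*)
&quot;COM.GIGAMONKEYS.EMAIL-DB&quot;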
EMAIL-DB&gt; </PRE><P>You can define a new <CODE>hello-world</CODE> in this package:</P><PRE>EMAIL-DB&gt; (defun hello-world () (format t &quot;hello from EMAIL-DB package~%&quot;))
HELLO-WORLD</PRE><P>And test it, like this:</P><PRE>EMAIL-DB&gt; (hello-world)
hello from EMAIL-DB package
NIL</PRE><P>Now switch back to <CODE>CL-USER</CODE>.</P><PRE>EMAIL-DB&gt; (in-package :cl-user)
#&lt;The COMMON-LISP-USER package&gt;
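;; Sketch: the new function isn't exported, so from CL-USER it is
;; reachable only through a double-colon qualified name.
CL-USER&gt; (com.gigamonkeys.email-db::hello-world)
hello from EMAIL-DB package
NIL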
CL-USER&gt; </PRE><P>And the old function is undisturbed.</P><PRE>CL-USER&gt; (hello-world)
hello, world
NIL</PRE><A NAME="packaging-reusable-libraries"><H2>Packaging Reusable Libraries</H2></A><P>While working on the e-mail database, you might write several
functions related to storing and retrieving text that don't have
anything in particular to do with e-mail. You might realize that
those functions could be useful in other programs and decide to
repackage them as a library. You should define a new package, but
this time you'll export certain names to make them available to other
packages.</P><PRE>(defpackage :com.gigamonkeys.text-db
  (:use :common-lisp)
  (:export :open-db
           :save
           :store))</PRE><P>Again, you use the <CODE>COMMON-LISP</CODE> package, because you'll need
access to standard functions within <CODE>COM.GIGAMONKEYS.TEXT-DB</CODE>.
The <CODE>:export</CODE> clause specifies names that will be external in
<CODE>COM.GIGAMONKEYS.TEXT-DB</CODE> and thus accessible in packages that
<CODE>:use</CODE> it. Therefore, after you've defined this package, you can
change the definition of the main application package to the
following: </P><PRE>(defpackage :com.gigamonkeys.email-db
  (:use :common-lisp :com.gigamonkeys.text-db))</PRE><P>Now code written in <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE> can use
unqualified names to refer to the exported symbols from both
<CODE>COMMON-LISP</CODE> and <CODE>COM.GIGAMONKEYS.TEXT-DB</CODE>. All other names
will continue to be interned directly in the
<CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE> package.</P><A NAME="importing-individual-names"><H2>Importing Individual Names</H2></A><P>Now suppose you find a third-party library of functions for
manipulating e-mail messages. The names used in the library's API are
exported from the package <CODE>COM.ACME.EMAIL</CODE>, so you could
<CODE>:use</CODE> that package to get easy access to those names. But
suppose you need to use only one function from this library, and
other exported symbols conflict with names you already use (or plan
to use) in your own code.<SUP>11</SUP> In this case, you can import the one
symbol you need with an <CODE>:import-from</CODE> clause in the
<CODE><B>DEFPACKAGE</B></CODE>. For instance, if the name of the function you want
to use is <CODE>parse-email-address</CODE>, you can change the
<CODE><B>DEFPACKAGE</B></CODE> to this: </P><PRE>(defpackage :com.gigamonkeys.email-db
  (:use :common-lisp :com.gigamonkeys.text-db)
  (:import-from :com.acme.email :parse-email-address))</PRE><P>Now anywhere the name <CODE>parse-email-address</CODE> appears in code read
in the <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE> package, it will be read as
the symbol from <CODE>COM.ACME.EMAIL</CODE>. If you need to import more
than one symbol from a single package, you can include multiple names
after the package name in a single <CODE>:import-from</CODE> clause. A
<CODE><B>DEFPACKAGE</B></CODE> can also include multiple <CODE>:import-from</CODE> clauses
in order to import symbols from different packages.</P><P>Occasionally you'll run into the opposite situation--a package may
export a bunch of names you want to use and a few you don't. Rather
than listing all the symbols you <I>do</I> want to use in an
<CODE>:import-from</CODE> clause, you can instead <CODE>:use</CODE> the package
and then list the names you <I>don't</I> want to inherit in a
<CODE>:shadow</CODE> clause. For instance, suppose the <CODE>COM.ACME.TEXT</CODE>
package exports a bunch of names of functions and classes used in
text processing. Further suppose that most of these functions and
classes are ones you'll want to use in your code, but one of the
names, <CODE>build-index</CODE>, conflicts with a name you've already used.
You can make the <CODE>build-index</CODE> from <CODE>COM.ACME.TEXT</CODE>
inaccessible by shadowing it. </P><PRE>(defpackage :com.gigamonkeys.email-db
  (:use
   :common-lisp
   :com.gigamonkeys.text-db
   :com.acme.text)
  (:import-from :com.acme.email :parse-email-address)
  (:shadow :build-index))</PRE><P>The <CODE>:shadow</CODE> clause causes a new symbol named
<CODE>BUILD-INDEX</CODE> to be created and added directly to
<CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE>'s name-to-symbol map. Now if the
reader reads the name <CODE>BUILD-INDEX</CODE>, it will translate it to the
symbol in <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE>'s map, rather than the one
that would otherwise be inherited from <CODE>COM.ACME.TEXT</CODE>. The new
symbol is also added to a <I>shadowing symbols list</I> that's part of
the <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE> package, so if you later use
another package that also exports a <CODE>BUILD-INDEX</CODE> symbol, the
package system will know there's no conflict--that you want the
symbol from <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE> to be used rather than
any other symbols with the same name inherited from other packages.</P><P>A similar situation can arise if you want to use two packages that
export the same name. In this case the reader won't know which
inherited name to use when it reads the textual name. In such
situations you must resolve the ambiguity by shadowing the
conflicting names. If you don't need to use the name from either
package, you could shadow the name with a <CODE>:shadow</CODE> clause,
creating a new symbol with the same name in your package. But if you
actually want to use one of the inherited symbols, then you need to
resolve the ambiguity with a <CODE>:shadowing-import-from</CODE> clause.
Like an <CODE>:import-from</CODE> clause, a <CODE>:shadowing-import-from</CODE>
clause consists of a package name followed by the names to import
from that package. For instance, if <CODE>COM.ACME.TEXT</CODE> exports a
name <CODE>SAVE</CODE> that conflicts with the name exported from
<CODE>COM.GIGAMONKEYS.TEXT-DB</CODE>, you could resolve the ambiguity with
the following <CODE><B>DEFPACKAGE</B></CODE>: </P><PRE>(defpackage :com.gigamonkeys.email-db
  (:use
   :common-lisp
   :com.gigamonkeys.text-db
   :com.acme.text)
  (:import-from :com.acme.email :parse-email-address)
  (:shadow :build-index)
  (:shadowing-import-from :com.gigamonkeys.text-db :save))</PRE><A NAME="packaging-mechanics"><H2>Packaging Mechanics</H2></A><P>That covers the basics of how to use packages to manage namespaces in
several common situations. However, another level of how to use
packages is worth discussing--the raw mechanics of how to organize
code that uses different packages. In this section I'll discuss a few
rules of thumb about how to organize code--where to put your
<CODE><B>DEFPACKAGE</B></CODE> forms relative to the code that uses your packages
via <CODE><B>IN-PACKAGE</B></CODE>.</P><P>Because packages are used by the reader, a package must be defined
before you can <CODE><B>LOAD</B></CODE> or <CODE><B>COMPILE-FILE</B></CODE> a file that contains an
<CODE><B>IN-PACKAGE</B></CODE> expression switching to that package. Packages also
must be defined before other <CODE><B>DEFPACKAGE</B></CODE> forms can refer to them.
For instance, if you're going to <CODE>:use</CODE>
<CODE>COM.GIGAMONKEYS.TEXT-DB</CODE> in <CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE>,
then <CODE>COM.GIGAMONKEYS.TEXT-DB</CODE>'s <CODE><B>DEFPACKAGE</B></CODE> must be
evaluated before the <CODE><B>DEFPACKAGE</B></CODE> of
<CODE>COM.GIGAMONKEYS.EMAIL-DB</CODE>.</P><P>The best first step toward making sure packages exist when they need
to is to put all your <CODE><B>DEFPACKAGE</B></CODE>s in files separate from the
code that needs to be read in those packages. Some folks like to
create a <CODE>foo-package.lisp</CODE> file for each individual package,
and others create a single <CODE>packages.lisp</CODE> that contains all the
<CODE><B>DEFPACKAGE</B></CODE> forms for a group of related packages. Either
approach is reasonable, though the one-file-per-package approach also
requires that you arrange to load the individual files in the right
order according to the interpackage dependencies.</P><P>Either way, once all the <CODE><B>DEFPACKAGE</B></CODE> forms have been separated
from the code that will be read in the packages they define, you can
arrange to <CODE><B>LOAD</B></CODE> the files containing the <CODE><B>DEFPACKAGE</B></CODE>s before
you compile or load any of the other files. For simple programs you
can do this by hand: simply <CODE><B>LOAD</B></CODE> the file or files containing
the <CODE><B>DEFPACKAGE</B></CODE> forms, possibly compiling them with
<CODE><B>COMPILE-FILE</B></CODE> first. Then <CODE><B>LOAD</B></CODE> the files that use those
packages, again optionally compiling them first with
<CODE><B>COMPILE-FILE</B></CODE>. Note, however, that the packages don't exist until
you <CODE><B>LOAD</B></CODE> the package definitions, either the source or the files
produced by <CODE><B>COMPILE-FILE</B></CODE>. Thus, if you're compiling everything,
you must still <CODE><B>LOAD</B></CODE> all the package definitions before you can
<CODE><B>COMPILE-FILE</B></CODE> any files to be read in the packages. </P><P>Doing these steps by hand will get tedious after a while. For simple
programs you can automate the steps by writing a file,
<CODE>load.lisp</CODE>, that contains the appropriate <CODE><B>LOAD</B></CODE> and
<CODE><B>COMPILE-FILE</B></CODE> calls in the right order. Then you can just
<CODE><B>LOAD</B></CODE> that file. For more complex programs you'll want to use a
<I>system definition</I> facility to manage loading and compiling files
in the right order.<SUP>12</SUP></P><P>The other key rule of thumb is that each file should contain exactly
one <CODE><B>IN-PACKAGE</B></CODE> form, and it should be the first form in the file
other than comments. Files containing <CODE><B>DEFPACKAGE</B></CODE> forms should
start with <CODE>(in-package &quot;COMMON-LISP-USER&quot;)</CODE>, and all other
files should contain an <CODE><B>IN-PACKAGE</B></CODE> of one of your packages.</P><P>If you violate this rule and switch packages in the middle of a file,
you'll confuse human readers who don't notice the second
<CODE><B>IN-PACKAGE</B></CODE>. Also, many Lisp development environments,
particularly Emacs-based ones such as SLIME, look for an
<CODE><B>IN-PACKAGE</B></CODE> to determine the package they should use when
communicating with Common Lisp. Multiple <CODE><B>IN-PACKAGE</B></CODE> forms per
file may confuse these tools as well.</P><P>On the other hand, it's fine to have multiple files read in the same
package, each with an identical <CODE><B>IN-PACKAGE</B></CODE> form. It's just a
matter of how you like to organize your code.</P><P>The other bit of packaging mechanics has to do with how to name
packages. Package names live in a flat namespace--package names are
just strings, and different packages must have textually distinct
names. Thus, you have to consider the possibility of conflicts
between package names. If you're using only packages you developed
yourself, then you can probably get away with using short names for
your packages. But if you're planning to use third-party libraries or
to publish your code for use by other programmers, then you need to
follow a naming convention that will minimize the possibility of name
collisions between different packages. Many Lispers these days are
adopting Java-style names, like the ones used in this chapter,
consisting of a reversed Internet domain name followed by a dot and a
descriptive string.</P><A NAME="package-gotchas"><H2>Package Gotchas</H2></A><P>Once you're familiar with packages, you won't spend a bunch of time
thinking about them. There's just not that much to them. However, a
couple of gotchas that bite most new Lisp programmers make the
package system seem more complicated and unfriendly than it really
is.</P><P>The number-one gotcha arises most commonly when playing around at the
REPL. You'll be looking at some library that defines certain
interesting functions. You'll try to call one of the functions like
this:</P><PRE>CL-USER&gt; (foo)</PRE><P>and get dropped into the debugger with this error: </P><PRE>attempt to call `FOO' which is an undefined function.
[Condition of type UNDEFINED-FUNCTION]
Restarts:
0: [TRY-AGAIN] Try calling FOO again.
1: [RETURN-VALUE] Return a value instead of calling FOO.
2: [USE-VALUE] Try calling a function other than FOO.
3: [STORE-VALUE] Setf the symbol-function of FOO and call it again.
4: [ABORT] Abort handling SLIME request.
5: [ABORT] Abort entirely from this (lisp) process.</PRE><P>Ah, of course--you forgot to use the library's package. So you quit
the debugger and try to <CODE><B>USE-PACKAGE</B></CODE> the library's package in
order to get access to the name <CODE>FOO</CODE> so you can call the
function.</P><PRE>CL-USER&gt; (use-package :foolib)</PRE><P>But that drops you back into the debugger with this error message:</P><PRE>Using package `FOOLIB' results in name conflicts for these symbols: FOO
[Condition of type PACKAGE-ERROR]
Restarts:
0: [CONTINUE] Unintern the conflicting symbols from the `COMMON-LISP-USER' package.
1: [ABORT] Abort handling SLIME request.
2: [ABORT] Abort entirely from this (lisp) process.</PRE><P>Huh? The problem is that the first time you called <CODE>foo</CODE>, the reader
read the name <CODE>foo</CODE> and interned it in <CODE>CL-USER</CODE> before the
evaluator got hold of it and discovered that this newly interned
symbol isn't the name of a function. This new symbol then conflicts
with the symbol of the same name exported from the <CODE>FOOLIB</CODE>
package. If you had remembered to <CODE><B>USE-PACKAGE</B></CODE> <CODE>FOOLIB</CODE>
before you tried to call <CODE>foo</CODE>, the reader would have read
<CODE>foo</CODE> as the inherited symbol and not interned a <CODE>foo</CODE>
symbol in <CODE>CL-USER</CODE>.</P><P>However, all isn't lost, because the first restart offered by the
debugger will patch things up in just the right way: it will unintern
the <CODE>foo</CODE> symbol from <CODE>COMMON-LISP-USER</CODE>, putting the
<CODE>CL-USER</CODE> package back to the state it was in before you called
<CODE>foo</CODE>, so that the <CODE><B>USE-PACKAGE</B></CODE> can proceed and the
inherited <CODE>foo</CODE> becomes available in <CODE>CL-USER</CODE>.</P><P>This kind of problem can also occur when loading and compiling files.
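</P><P>Whether it happens at the REPL or in a file, the sequence of events is the same, and you can reproduce it without the debugger. What follows is only a minimal sketch: the packages <CODE>DEMO-LIB</CODE> and <CODE>DEMO-APP</CODE> and the helper <CODE>foo-status</CODE> are hypothetical names invented for illustration, and a call to <CODE><B>INTERN</B></CODE> stands in for the reader. It uninterns the stray symbol by hand before calling <CODE><B>USE-PACKAGE</B></CODE>, which is roughly what the first restart does for you.</P><PRE>;; A hypothetical library package that exports one symbol, FOO.
(defpackage :demo-lib
  (:use :common-lisp)
  (:export :foo))

;; A hypothetical application package that forgot to :use DEMO-LIB.
(defpackage :demo-app
  (:use :common-lisp))

;; Stand-in for the reader seeing the name FOO while DEMO-APP is
;; current: a new symbol is interned directly in DEMO-APP.
(intern &quot;FOO&quot; :demo-app)

;; The second value of FIND-SYMBOL says how the name is accessible.
(defun foo-status ()
  (nth-value 1 (find-symbol &quot;FOO&quot; :demo-app)))

(defparameter *status-before* (foo-status)) ; :INTERNAL, the stray symbol

;; Do by hand what the restart does: remove the stray symbol from
;; DEMO-APP, then use the library package.
(unintern (find-symbol &quot;FOO&quot; :demo-app) :demo-app)
(use-package :demo-lib :demo-app)

(defparameter *status-after* (foo-status))  ; :INHERITED, the library's FOO</PRE><P>Skipping the <CODE><B>UNINTERN</B></CODE> step in this sketch should make the <CODE><B>USE-PACKAGE</B></CODE> call signal the name-conflict error instead.</P><P>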
For instance, if you defined a package, <CODE>MY-APP</CODE>, for code that
was going to use functions with names from the <CODE>FOOLIB</CODE> package,
but forgot to <CODE>:use</CODE> <CODE>FOOLIB</CODE>, when you compile the files
with an <CODE>(in-package :my-app)</CODE> in them, the reader will intern
new symbols in <CODE>MY-APP</CODE> for the names that were supposed to be
read as symbols from <CODE>FOOLIB</CODE>. When you try to run the compiled
code, you'll get undefined function errors. If you then try to
redefine the <CODE>MY-APP</CODE> package to <CODE>:use</CODE> <CODE>FOOLIB</CODE>,
you'll get the conflicting symbols error. The solution is the same:
select the restart to unintern the conflicting symbols from
<CODE>MY-APP</CODE>. You'll then need to recompile the code in the
<CODE>MY-APP</CODE> package so it will refer to the inherited names. </P><P>The next gotcha is essentially the reverse of the first gotcha. In
this case, you'd have defined a package--again, let's say it's
<CODE>MY-APP</CODE>--that uses another package, say, <CODE>FOOLIB</CODE>. Now you
start writing code in the <CODE>MY-APP</CODE> package. Although you used
<CODE>FOOLIB</CODE> in order to be able to refer to the <CODE>foo</CODE>
function, <CODE>FOOLIB</CODE> may export other symbols as well. If you use
one of those exported symbols--say, <CODE>bar</CODE>--as the name of a
function in your own code, Lisp won't complain. Instead, the name of
your function will be the symbol exported by <CODE>FOOLIB</CODE>, so your
definition will clobber the definition of <CODE>bar</CODE> from <CODE>FOOLIB</CODE>.</P><P>This gotcha is more insidious because it doesn't cause an error--from
the evaluator's point of view it's just being asked to associate a
new function with an old name, something that's perfectly legal. It's
suspect only because the code doing the redefining was read with a
different value for <CODE><B>*PACKAGE*</B></CODE> than the name's package. But the
evaluator doesn't necessarily know that. However, in most Lisps
you'll get a warning about &quot;<CODE>redefining BAR, originally defined
in</CODE>?&quot;. You should heed those warnings. If you clobber a definition
from a library, you can restore it by reloading the library code with
<CODE><B>LOAD</B></CODE>.<SUP>13</SUP></P><P>The last package-related gotcha is, by comparison, quite trivial, but
it bites most Lisp programmers at least a few times: you define a
package that uses <CODE>COMMON-LISP</CODE> and maybe a few libraries. Then
at the REPL you change to that package to play around. Then you decide
to quit Lisp altogether and try to call <CODE>(quit)</CODE>. However,
<CODE>quit</CODE> isn't a name from the <CODE>COMMON-LISP</CODE> package--it's
defined by the implementation in some implementation-specific package
that happens to be used by <CODE>COMMON-LISP-USER</CODE>. The solution is
simple--change packages back to <CODE>CL-USER</CODE> to quit. Or use the
SLIME REPL shortcut <CODE>quit</CODE>, which will also save you from having
to remember that in certain Common Lisp implementations the function
to quit is <CODE>exit</CODE>, not <CODE>quit</CODE>.</P><P>You're almost done with your tour of Common Lisp. In the next chapter
I'll discuss the details of the extended <CODE><B>LOOP</B></CODE> macro. After that,
the rest of the book is devoted to &quot;practicals&quot;: a spam filter, a
library for parsing binary files, and various parts of a streaming
MP3 server with a Web interface.
</P><HR/><DIV CLASS="notes"><P><SUP>1</SUP>The kind of programming that
relies on a symbol data type is called, appropriately enough,
<I>symbolic</I> computation. It's typically contrasted to <I>numeric</I>
programming. An example of a primarily symbolic program that all
programmers should be familiar with is a compiler--it treats the text
of a program as symbolic data and translates it into a new form.</P><P><SUP>2</SUP>Every package has one official name and
zero or more <I>nicknames</I> that can be used anywhere you need to use
the package name, such as in package-qualified names or to refer to
the package in a <CODE><B>DEFPACKAGE</B></CODE> or <CODE><B>IN-PACKAGE</B></CODE> form.</P><P><SUP>3</SUP><CODE>COMMON-LISP-USER</CODE> is also allowed to
provide access to symbols exported by other implementation-defined
packages. While this is intended as a convenience for the user--it
makes implementation-specific functionality readily accessible--it
can also cause confusion for new Lispers: Lisp will complain about an
attempt to redefine some name that isn't listed in the language
standard. To see what packages <CODE>COMMON-LISP-USER</CODE> inherits
symbols from in a particular implementation, evaluate this expression
at the REPL:</P><PRE>(mapcar #'package-name (package-use-list :cl-user))</PRE><P>And to find out what package a symbol came from originally, evaluate
this:</P><PRE>(package-name (symbol-package 'some-symbol))</PRE><P>with <CODE>some-symbol</CODE> replaced by the symbol in question. For instance:</P><PRE>(package-name (symbol-package 'car)) ==&gt; &quot;COMMON-LISP&quot;
(package-name (symbol-package 'foo)) ==&gt; &quot;COMMON-LISP-USER&quot;</PRE><P>Symbols inherited from implementation-defined packages will return
some other value.</P><P><SUP>4</SUP>This is
different from the Java package system, which provides a namespace
for classes but is also involved in Java's access control mechanism.
The non-Lisp language with a package system most like Common Lisp's
packages is Perl.</P><P><SUP>5</SUP>All the manipulations performed by
<CODE><B>DEFPACKAGE</B></CODE> can also be performed with functions that
manipulate package objects. However, since a package generally needs to
be fully defined before it can be used, those functions are rarely
used. Also, <CODE><B>DEFPACKAGE</B></CODE> takes care of performing all the package
manipulations in the right order--for instance, <CODE><B>DEFPACKAGE</B></CODE> adds
symbols to the shadowing list before it tries to use the used
packages.</P><P><SUP>6</SUP>In many Lisp implementations the <CODE>:use</CODE> clause is
optional if you want only to <CODE>:use</CODE> <CODE>COMMON-LISP</CODE>--if it's
omitted, the package will automatically inherit names from an
implementation-defined list of packages that will usually include
<CODE>COMMON-LISP</CODE>. However, your code will be more portable if you
always explicitly specify the packages you want to <CODE>:use</CODE>. Those
who are averse to typing can use the package's nickname and write
<CODE>(:use :cl)</CODE>.</P><P><SUP>7</SUP>Using keywords instead of
strings has another advantage--Allegro provides a &quot;modern mode&quot; Lisp
in which the reader does no case conversion of names and which,
instead of a <CODE><B>COMMON-LISP</B></CODE> package with uppercase names, provides a
<CODE>common-lisp</CODE> package with lowercase names. Strictly speaking,
this Lisp isn't a conforming Common Lisp since all the names in the
standard are defined to be uppercase. But if you write your
<CODE><B>DEFPACKAGE</B></CODE> forms using keyword symbols, they will work both in
Common Lisp and in this near relative.</P><P><SUP>8</SUP>Some folks, instead of keywords, use uninterned
symbols, using the <CODE>#:</CODE> syntax.</P><PRE>(defpackage #:com.gigamonkeys.email-db
(:use #:common-lisp))</PRE><P>This saves a tiny bit of memory by not interning any symbols in the
keyword package--the symbol can become garbage after <CODE><B>DEFPACKAGE</B></CODE>
(or the code it expands into) is done with it. However, the difference
is so slight that it really boils down to a matter of aesthetics.</P><P><SUP>9</SUP>The reason to use <CODE><B>IN-PACKAGE</B></CODE> instead of just
<CODE><B>SETF</B></CODE>ing <CODE><B>*PACKAGE*</B></CODE> is that <CODE><B>IN-PACKAGE</B></CODE> expands into code
that will run when the file is compiled by <CODE><B>COMPILE-FILE</B></CODE> as well
as when the file is loaded, changing the way the reader reads the
rest of the file during compilation.</P><P><SUP>10</SUP>In
the REPL buffer in SLIME you can also change packages with a REPL
shortcut. Type a comma, and then enter <CODE>change-package</CODE> at the
<CODE>Command:</CODE> prompt.</P><P><SUP>11</SUP>During development, if you try to
<CODE>:use</CODE> a package that exports a symbol with the same name as a
symbol already interned in the using package, Lisp will signal an
error and typically offer you a restart that will unintern the
offending symbol from the using package. For more on this, see the
section &quot;Package Gotchas.&quot;</P><P><SUP>12</SUP>The code for the &quot;Practical&quot; chapters,
available from this book's Web site, uses the ASDF system definition
library. ASDF stands for Another System Definition Facility.</P><P><SUP>13</SUP>Some Common Lisp implementations, such as Allegro and
SBCL, provide a facility for &quot;locking&quot; the symbols in a particular
package so they can be used in defining forms such as <CODE><B>DEFUN</B></CODE>,
<CODE><B>DEFVAR</B></CODE>, and <CODE><B>DEFCLASS</B></CODE> only when their home package is the
current package.</P></DIV></BODY></HTML>