<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!Converted with LaTeX2HTML 0.6.5 (Tue Nov 15 1994) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds >
<HEAD>
<TITLE>13.2. Predicates on Characters</TITLE>
</HEAD>
<BODY>
<meta name="description" value=" Predicates on Characters">
<meta name="keywords" value="clm">
<meta name="resource-type" value="document">
<meta name="distribution" value="global">
<P>
<b>Common Lisp the Language, 2nd Edition</b>
<BR> <HR><A NAME=tex2html3238 HREF="node138.html"><IMG ALIGN=BOTTOM ALT="next" SRC="icons/next_motif.gif"></A> <A NAME=tex2html3236 HREF="node135.html"><IMG ALIGN=BOTTOM ALT="up" SRC="icons/up_motif.gif"></A> <A NAME=tex2html3230 HREF="node136.html"><IMG ALIGN=BOTTOM ALT="previous" SRC="icons/previous_motif.gif"></A> <A NAME=tex2html3240 HREF="node1.html"><IMG ALIGN=BOTTOM ALT="contents" SRC="icons/contents_motif.gif"></A> <A NAME=tex2html3241 HREF="index.html"><IMG ALIGN=BOTTOM ALT="index" SRC="icons/index_motif.gif"></A> <BR>
<B> Next:</B> <A NAME=tex2html3239 HREF="node138.html"> Character Construction and </A>
<B>Up:</B> <A NAME=tex2html3237 HREF="node135.html"> Characters</A>
<B> Previous:</B> <A NAME=tex2html3231 HREF="node136.html"> Character Attributes</A>
<HR> <P>
<H1><A NAME=SECTION001720000000000000000>13.2. Predicates on Characters</A></H1>
<P>
The predicate <tt>characterp</tt> may be used to determine
whether any Lisp object is a character object.
<P>
<BR><b>[Function]</b><BR>
<tt>standard-char-p <i>char</i></tt><P>The argument <i>char</i> must be a character object.
<tt>standard-char-p</tt> is true if the argument is a "standard character,"
that is, an object of type <tt>standard-char</tt>.
<P>
Note that any character with a non-zero bits or
font attribute is non-standard.
<P>
<BR><b>[Function]</b><BR>
<tt>graphic-char-p <i>char</i></tt><P>The argument <i>char</i> must be a character object.
<tt>graphic-char-p</tt> is true if the argument is a "graphic" (printing)
character, and false if it is a "non-graphic" (formatting or control)
character. Graphic characters have a standard textual representation
as a single glyph, such as <tt>A</tt> or <tt>*</tt> or <tt>=</tt>.
By convention, the space character is considered to be graphic.
Of the standard characters,
all but <tt>#\Newline</tt> are graphic.
The semi-standard characters
<tt>#\Backspace</tt>, <tt>#\Tab</tt>, <tt>#\Rubout</tt>, <tt>#\Linefeed</tt>, <tt>#\Return</tt>,
and <tt>#\Page</tt> are not graphic.
<P>
Programs may assume that
graphic characters of font 0 are all of the same width
when printed, for example, for purposes of columnar
formatting. (This does not prohibit the use of a variable-pitch font
as font 0, but merely implies that every implementation of Common Lisp
must provide <i>some</i> mode of operation in which font 0 is
a fixed-pitch font.)
Portable programs should assume that, in general,
non-graphic characters and characters of
other fonts may be of varying widths.
<P>
Any character with a non-zero bits attribute is non-graphic.
<P>
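As a brief illustration of the distinctions drawn above, the following sketch
classifies a few characters. The helper <tt>describe-char</tt> is hypothetical,
introduced only for this example, and the sketch assumes an implementation
that recognizes the semi-standard name <tt>#\Tab</tt>.
<P><pre>
;; DESCRIBE-CHAR is a hypothetical helper, not part of Common Lisp.
(defun describe-char (char)
  "Return a list of keywords naming the predicates that CHAR satisfies."
  (append (if (standard-char-p char) '(:standard))
          (if (graphic-char-p char) '(:graphic))))

(describe-char #\A)        ; =&gt; (:STANDARD :GRAPHIC)
(describe-char #\Space)    ; =&gt; (:STANDARD :GRAPHIC)
(describe-char #\Newline)  ; =&gt; (:STANDARD)
(describe-char #\Tab)      ; =&gt; NIL
</pre><P>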
<img align=bottom alt="old_change_begin" src="gif/old_change_begin.gif">
<BR><b>[Function]</b><BR>
<tt>string-char-p <i>char</i></tt><P>The argument <i>char</i> must be a character object.
<tt>string-char-p</tt> is true if <i>char</i> can be stored into
a string, and otherwise is false.
Any character that satisfies <tt>standard-char-p</tt>
also satisfies <tt>string-char-p</tt>; other characters may satisfy it as well.
<br><img align=bottom alt="old_change_end" src="gif/old_change_end.gif">
<P>
<img align=bottom alt="change_begin" src="gif/change_begin.gif"><br>
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) <A NAME=14529> </A>
to eliminate <tt>string-char-p</tt>.
<br><img align=bottom alt="change_end" src="gif/change_end.gif">
<P>
<BR><b>[Function]</b><BR>
<tt>alpha-char-p <i>char</i></tt><P>The argument <i>char</i> must be a character object.
<tt>alpha-char-p</tt> is true if the argument is an alphabetic
character, and otherwise is false.
<P>
If a character is alphabetic, then it is perforce graphic.
Therefore no character with a non-zero bits attribute can be alphabetic.
Whether a character is alphabetic may depend on its font number.
<P>
Of the standard characters (as defined by <tt>standard-char-p</tt>),
the letters <tt>A</tt> through <tt>Z</tt> and <tt>a</tt> through <tt>z</tt> are alphabetic.
<P>
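For instance, <tt>alpha-char-p</tt> may be used to pick the letters out of a
string of standard characters. This is only a sketch; the function name
<tt>letters-only</tt> is hypothetical.
<P><pre>
;; LETTERS-ONLY is a hypothetical name used for illustration.
(defun letters-only (string)
  "Return a copy of STRING containing only its alphabetic characters."
  (remove-if-not #'alpha-char-p string))

(alpha-char-p #\a)               ; true
(alpha-char-p #\5)               ; false
(alpha-char-p #\Space)           ; false
(letters-only "CLtL2, 2nd ed.")  ; =&gt; "CLtLnded"
</pre><P>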
<BR><b>[Function]</b><BR>
<tt>upper-case-p <i>char</i> <BR></tt><tt>lower-case-p <i>char</i> <BR></tt><tt>both-case-p <i>char</i></tt><P>The argument <i>char</i> must be a character object.
<P>
<tt>upper-case-p</tt> is true if the argument is an uppercase
character, and otherwise is false.
<P>
<tt>lower-case-p</tt> is true if the argument is a lowercase
character, and otherwise is false.
<P>
<tt>both-case-p</tt> is true if the argument is an uppercase character and
there is a corresponding lowercase character (which can be obtained
using <tt>char-downcase</tt>), or if the argument is a lowercase character and
there is a corresponding uppercase character (which can be obtained
using <tt>char-upcase</tt>).
<P>
If a character is either uppercase or lowercase, it is necessarily
alphabetic (and therefore is graphic, and therefore has a zero bits
attribute). However, it is permissible in theory for an alphabetic
character to be neither uppercase nor lowercase (in a non-Roman font,
for example).
<P>
Of the standard characters (as defined by <tt>standard-char-p</tt>),
the letters <tt>A</tt> through <tt>Z</tt> are uppercase and <tt>a</tt>
through <tt>z</tt> are lowercase.
<P>
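The case predicates may be illustrated on standard characters as follows.
The function <tt>swap-case</tt> is a hypothetical helper written for this
sketch; it relies only on the predicates described above together with
<tt>char-upcase</tt> and <tt>char-downcase</tt>.
<P><pre>
(upper-case-p #\A)  ; true
(lower-case-p #\A)  ; false
(both-case-p #\A)   ; true
(both-case-p #\a)   ; true
(upper-case-p #\7)  ; false
(both-case-p #\7)   ; false

;; SWAP-CASE is a hypothetical helper, not part of Common Lisp.
(defun swap-case (char)
  "Return CHAR with its case inverted, or CHAR itself if it has no case."
  (cond ((upper-case-p char) (char-downcase char))
        ((lower-case-p char) (char-upcase char))
        (t char)))

(map 'string #'swap-case "Common Lisp")  ; =&gt; "cOMMON lISP"
</pre><P>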
<BR><b>[Function]</b><BR>
<tt>digit-char-p <i>char</i> &amp;optional (<i>radix</i> 10)</tt><P>The argument <i>char</i> must be a character object,
and <i>radix</i> must be a non-negative integer.
If <i>char</i> is not a digit of the radix
specified by <i>radix</i>, then <tt>digit-char-p</tt> is
false; otherwise it returns
a non-negative integer that is the "weight" of <i>char</i> in that radix.
<P>
Digits are necessarily graphic characters.
<P>
Of the standard characters (as defined by <tt>standard-char-p</tt>),
the characters <tt>0</tt> through <tt>9</tt>, <tt>A</tt> through <tt>Z</tt>,
and <tt>a</tt> through <tt>z</tt>
are digits. The weights of <tt>0</tt> through <tt>9</tt> are the integers 0 through 9,
and those of <tt>A</tt> through <tt>Z</tt> (and also <tt>a</tt> through <tt>z</tt>) are 10 through 35.
<tt>digit-char-p</tt> returns the weight for one of these digits if and only if
its weight is strictly less than <i>radix</i>. Thus, for example,
the digits for radix 16 are
<P><pre>
0 1 2 3 4 5 6 7 8 9 A B C D E F
</pre><P>
<P>
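The rule that a weight is returned only when it is strictly less than the
radix may be seen in a few sample calls, given here as an illustrative sketch:
<P><pre>
(digit-char-p #\7)     ; =&gt; 7    (radix defaults to 10)
(digit-char-p #\7 8)   ; =&gt; 7
(digit-char-p #\8 8)   ; =&gt; NIL  (weight 8 is not less than 8)
(digit-char-p #\a)     ; =&gt; NIL  (weight 10 is not less than 10)
(digit-char-p #\a 16)  ; =&gt; 10
(digit-char-p #\F 16)  ; =&gt; 15
(digit-char-p #\G 16)  ; =&gt; NIL  (weight 16 is not less than 16)
(digit-char-p #\Z 36)  ; =&gt; 35
</pre><P>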
Here is an example of the use of <tt>digit-char-p</tt>:
<P><pre>
(defun convert-string-to-integer (str &amp;optional (radix 10))
  "Given a digit string and optional radix, return an integer."
  ;; J indexes STR; N accumulates the value, one digit per iteration.
  (do ((j 0 (+ j 1))
       (n 0 (+ (* n radix)
               ;; DIGIT-CHAR-P yields the weight of the digit, or false.
               (or (digit-char-p (char str j) radix)
                   (error "Bad radix-~D digit: ~C"
                          radix
                          (char str j))))))
      ((= j (length str)) n)))
</pre><P>
<P>
<BR><b>[Function]</b><BR>
<tt>alphanumericp <i>char</i></tt><P>The argument <i>char</i> must be a character object.
<tt>alphanumericp</tt> is true if <i>char</i> is either alphabetic
or numeric. By definition,
<P><pre>
(alphanumericp x)
   == (or (alpha-char-p x) (not (null (digit-char-p x))))
</pre><P>
Alphanumeric characters are therefore necessarily graphic
(as defined by the predicate <tt>graphic-char-p</tt>).
<P>
Of the standard characters (as defined by <tt>standard-char-p</tt>),
the characters <tt>0</tt> through <tt>9</tt>, <tt>A</tt> through <tt>Z</tt>,
and <tt>a</tt> through <tt>z</tt> are alphanumeric.
<P>
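The defining equivalence can be checked on a sample of standard characters.
The following is merely a sketch; the function name
<tt>alphanumericp-by-definition</tt> is hypothetical.
<P><pre>
;; A hypothetical restatement of the definition given above.
(defun alphanumericp-by-definition (x)
  (or (alpha-char-p x) (not (null (digit-char-p x)))))

;; Compare the two predicates as truth values on each character.
(every #'(lambda (c)
           (eq (not (alphanumericp c))
               (not (alphanumericp-by-definition c))))
       "Ab9 *-z0")                         ; true

(remove-if-not #'alphanumericp "Ab9 *-z0")  ; =&gt; "Ab9z0"
</pre><P>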
<BR><b>[Function]</b><BR>
<tt>char= <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char/= <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char&lt; <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char&gt; <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char&lt;= <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char&gt;= <i>character</i> &amp;rest <i>more-characters</i></tt><P>The arguments must all be character objects.
These functions compare the objects using the implementation-dependent
total ordering on characters, in a manner analogous to numeric
comparisons by <tt>=</tt> and related functions.
<P>
The total ordering on characters is guaranteed to have the following
properties:
<UL><LI>
The standard alphanumeric characters obey the following partial ordering:
<P><pre>
A&lt;B&lt;C&lt;D&lt;E&lt;F&lt;G&lt;H&lt;I&lt;J&lt;K&lt;L&lt;M&lt;N&lt;O&lt;P&lt;Q&lt;R&lt;S&lt;T&lt;U&lt;V&lt;W&lt;X&lt;Y&lt;Z
a&lt;b&lt;c&lt;d&lt;e&lt;f&lt;g&lt;h&lt;i&lt;j&lt;k&lt;l&lt;m&lt;n&lt;o&lt;p&lt;q&lt;r&lt;s&lt;t&lt;u&lt;v&lt;w&lt;x&lt;y&lt;z
0&lt;1&lt;2&lt;3&lt;4&lt;5&lt;6&lt;7&lt;8&lt;9
<i>either</i> 9&lt;A <i>or</i> Z&lt;0
<i>either</i> 9&lt;a <i>or</i> z&lt;0
</pre><P>
This implies that alphabetic ordering holds within each case (upper and
lower), and that the digits as a group
are not interleaved with letters. However, the ordering
or possible interleaving of
uppercase letters and lowercase letters is unspecified.
(Note that both the ASCII and the EBCDIC character sets
conform to this specification. As it happens, neither ordering
interleaves uppercase and lowercase letters:
in the ASCII ordering, <tt>9&lt;A</tt> and <tt>Z&lt;a</tt>,
whereas in the EBCDIC ordering <tt>z&lt;A</tt> and <tt>Z&lt;0</tt>.)
</UL>
<P>
<img align=bottom alt="old_change_begin" src="gif/old_change_begin.gif"><br>
<UL><LI>
If two characters have the same bits and font attributes,
then their ordering by <tt>char&lt;</tt> is consistent with the numerical
ordering by the predicate <tt>&lt;</tt> on their code attributes.
<P>
<LI>
If two characters differ in any attribute (code, bits, or font), then they
are different.
</UL>
<br><img align=bottom alt="old_change_end" src="gif/old_change_end.gif">
<P>
<img align=bottom alt="change_begin" src="gif/change_begin.gif"><br>
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) <A NAME=14622> </A>
to replace the notion of bits and font attributes with
that of implementation-defined attributes.
<P>
<UL><LI>
If two characters have identical implementation-defined attributes,
then their ordering by <tt>char&lt;</tt> is consistent with the numerical
ordering by the predicate <tt>&lt;</tt> on their codes, and similarly
for <tt>char&gt;</tt>, <tt>char&lt;=</tt>, and <tt>char&gt;=</tt>.
<P>
<LI>
If two characters differ in any implementation-defined
attribute, then they are not <tt>char=</tt>.
</UL>
<br><img align=bottom alt="change_end" src="gif/change_end.gif">
<P>
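The guaranteed properties of the ordering on the standard alphanumeric
characters can be checked in any conforming implementation. The following is
a sketch only; the function name <tt>ascending-p</tt> is hypothetical.
<P><pre>
;; ASCENDING-P is a hypothetical helper, not part of Common Lisp.
(defun ascending-p (string)
  "True if the characters of STRING are strictly increasing under CHAR&lt;."
  (apply #'char&lt; (coerce string 'list)))

(ascending-p "ABCDEFGHIJKLMNOPQRSTUVWXYZ")  ; true
(ascending-p "abcdefghijklmnopqrstuvwxyz")  ; true
(ascending-p "0123456789")                  ; true
(or (char&lt; #\9 #\A) (char&lt; #\Z #\0))        ; true
(or (char&lt; #\9 #\a) (char&lt; #\z #\0))        ; true
(ascending-p "aB")                          ; may be true or false
</pre><P>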
The total ordering is not necessarily the same as the total
ordering on the integers produced by applying <tt>char-int</tt> to the
characters (although it is a reasonable implementation technique to
use that ordering).
<P>
While alphabetic characters of a given case must be
properly ordered, they need not be contiguous; thus
<tt>(char&lt;= #\a x #\z)</tt> is <i>not</i> a valid way of determining whether or not <tt>x</tt> is a
lowercase letter. That is why a separate
<tt>lower-case-p</tt> predicate is provided.
<P>
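A portable program therefore tests for lowercase letters with
<tt>lower-case-p</tt> rather than with a range comparison, which could accept
non-letters in a character set such as EBCDIC, where the lowercase letters are
not contiguous. A minimal sketch follows; the function name
<tt>count-lowercase</tt> is hypothetical.
<P><pre>
;; COUNT-LOWERCASE is a hypothetical helper, not part of Common Lisp.
(defun count-lowercase (string)
  "Count the lowercase letters in STRING without assuming contiguous codes."
  (count-if #'lower-case-p string))

(count-lowercase "Hello, World")  ; =&gt; 8
</pre><P>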
<P><pre>
(char= #\d #\d) is true.
(char/= #\d #\d) is false.
(char= #\d #\x) is false.
(char/= #\d #\x) is true.
(char= #\d #\D) is false.
(char/= #\d #\D) is true.
(char= #\d #\d #\d #\d) is true.
(char/= #\d #\d #\d #\d) is false.
(char= #\d #\d #\x #\d) is false.
(char/= #\d #\d #\x #\d) is false.
(char= #\d #\y #\x #\c) is false.
(char/= #\d #\y #\x #\c) is true.
(char= #\d #\c #\d) is false.
(char/= #\d #\c #\d) is false.
(char&lt; #\d #\x) is true.
(char&lt;= #\d #\x) is true.
(char&lt; #\d #\d) is false.
(char&lt;= #\d #\d) is true.
(char&lt; #\a #\e #\y #\z) is true.
(char&lt;= #\a #\e #\y #\z) is true.
(char&lt; #\a #\e #\e #\y) is false.
(char&lt;= #\a #\e #\e #\y) is true.
(char&gt; #\e #\d) is true.
(char&gt;= #\e #\d) is true.
(char&gt; #\d #\c #\b #\a) is true.
(char&gt;= #\d #\c #\b #\a) is true.
(char&gt; #\d #\d #\c #\a) is false.
(char&gt;= #\d #\d #\c #\a) is true.
(char&gt; #\e #\d #\b #\c #\a) is false.
(char&gt;= #\e #\d #\b #\c #\a) is false.
(char&gt; #\z #\A) may be true or false.
(char&gt; #\Z #\a) may be true or false.
</pre><P>
<P>
There is no requirement that <tt>(eq c1 c2)</tt> be true merely because
<tt>(char= c1 c2)</tt> is true. While <tt>eq</tt> may distinguish two character
objects that <tt>char=</tt> does not, it is distinguishing them not
as <i>characters</i>, but in some sense on the basis of a lower-level
implementation characteristic.
(Of course, if <tt>(eq c1 c2)</tt> is true,
then one may expect <tt>(char= c1 c2)</tt> to be true.)
However, <tt>eql</tt> and <tt>equal</tt>
compare character objects in the same
way that <tt>char=</tt> does.
<P>
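The relationship among these equality predicates may be sketched as follows.
The variable names <tt>c1</tt> and <tt>c2</tt> follow the text above; the
sample strings are chosen only for illustration.
<P><pre>
(let ((c1 (char "aa" 0))
      (c2 (char "aa" 1)))
  (list (char= c1 c2)     ; true
        (eql c1 c2)       ; true
        (equal c1 c2)))   ; true
;; (eq c1 c2) may be true or false.

;; Sequence functions test with EQL by default, so they locate
;; characters reliably without an explicit :TEST argument.
(position #\n "banana")   ; =&gt; 2
(count #\a "banana")      ; =&gt; 3
</pre><P>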
<BR><b>[Function]</b><BR>
<tt>char-equal <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char-not-equal <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char-lessp <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char-greaterp <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char-not-greaterp <i>character</i> &amp;rest <i>more-characters</i> <BR></tt><tt>char-not-lessp <i>character</i> &amp;rest <i>more-characters</i></tt>
<P>
<img align=bottom alt="old_change_begin" src="gif/old_change_begin.gif"><br>
The predicate <tt>char-equal</tt> is like <tt>char=</tt>, and similarly
for the others, except according to a different ordering such that
differences of bits attributes and case are ignored,
and font information is taken into account in an implementation-dependent
manner.
<br><img align=bottom alt="old_change_end" src="gif/old_change_end.gif">
<P>
<img align=bottom alt="change_begin" src="gif/change_begin.gif"><br>
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) <A NAME=14789> </A>
to replace the notion of bits and font attributes with
that of implementation-defined attributes. The effect, if any,
of each such attribute on the behavior of
<tt>char-equal</tt>, <tt>char-not-equal</tt>, <tt>char-lessp</tt>, <tt>char-greaterp</tt>,
<tt>char-not-greaterp</tt>, and <tt>char-not-lessp</tt> must be specified
as part of the definition of that attribute.
<br><img align=bottom alt="change_end" src="gif/change_end.gif">
<P>
For the standard characters, the ordering is such that
<tt>A=a</tt>, <tt>B=b</tt>, and so on, up to <tt>Z=z</tt>, and furthermore either
<tt>9&lt;A</tt> or <tt>Z&lt;0</tt>.
For example:
<P><pre>
(char-equal #\A #\a) is true.
(char= #\A #\a) is false.
(char-equal #\A #\Control-A) is true.
</pre><P>
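The case-insensitive ordering on the standard characters may be further
illustrated by the following sketch, which uses only standard characters
with no bits or font attributes:
<P><pre>
(char-lessp #\a #\B)              ; true
(char&lt; #\a #\B)                   ; may be true or false
(char-not-greaterp #\a #\A #\b)   ; true
(char-greaterp #\z #\A)           ; true
(char-not-equal #\a #\A)          ; false

;; CHAR-EQUAL may serve as the test for a case-insensitive search.
(find #\l "HELLO" :test #'char-equal)      ; =&gt; #\L
(position #\o "HELLO" :test #'char-equal)  ; =&gt; 4
</pre><P>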
<img align=bottom alt="old_change_begin" src="gif/old_change_begin.gif"><br>
The ordering may depend on the font information. For example, an implementation
might decree that <tt>(char-equal #\p #\<i>p</i>)</tt> be true, but that
<tt>(char-equal #\p #\</tt><b>pi</b><tt>)</tt> be false
(where <tt>#\</tt><b>pi</b> is a
lowercase <tt>p</tt> in some font). Assuming italics to be in font 1
and the Greek alphabet in font 2, this is the same as saying that
<tt>(char-equal #0\p #1\p)</tt> may be true and at the same time
<tt>(char-equal #0\p #2\p)</tt> may be false.
<br><img align=bottom alt="old_change_end" src="gif/old_change_end.gif">
<P>
<HR>
<P><ADDRESS>
AI.Repository@cs.cmu.edu
</ADDRESS>
</BODY>