<!DOCTYPE html>
<html lang="en">
<head>
<meta name="generator" content="HTML Tidy for HTML5 for Linux version 5.2.0">
<title>Strings</title>
<meta charset="utf-8">
<meta name="description" content="A collection of examples of using Common Lisp">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="icon" href="assets/cl-logo-blue.png"/>
<link rel="stylesheet" href="assets/style.css">
<script type="text/javascript" src="assets/highlight-lisp.js"></script>
<script type="text/javascript" src="assets/jquery-3.2.1.min.js"></script>
<script type="text/javascript" src="assets/jquery.toc/jquery.toc.min.js"></script>
<script type="text/javascript" src="assets/toggle-toc.js"></script>
<link rel="stylesheet" href="assets/github.css">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
</head>
<body>
<h1 id="title-xs"><a href="index.html">The Common Lisp Cookbook</a> – Strings</h1>
<div id="logo-container">
<a href="index.html">
<img id="logo" src="assets/cl-logo-blue.png"/>
</a>

<div id="searchform-container">
<form onsubmit="duckSearch()" action="javascript:void(0)">
<input id="searchField" type="text" value="" placeholder="Search...">
</form>
</div>

<div id="toc-container" class="toc-close">
<div id="toc-title">Table of Contents</div>
<ul id="toc" class="list-unstyled"></ul>
</div>
</div>

<div id="content-container">
<h1 id="title-non-xs"><a href="index.html">The Common Lisp Cookbook</a> – Strings</h1>
<div id="content">
|
||
<p>The most important thing to know about strings in Common Lisp is probably that
|
||
they are arrays and thus also sequences. This implies that all concepts that are
|
||
applicable to arrays and sequences also apply to strings. If you can’t find a
|
||
particular string function, make sure you’ve also searched for the more general
|
||
<a href="http://www.gigamonkeys.com/book/collections.html">array or sequence functions</a>. We’ll only cover a fraction of what can be done
|
||
with and to strings here.</p>
|
||
|
||
<p>ASDF3, which is included with almost all Common Lisp implementations,
|
||
includes
|
||
<a href="https://gitlab.common-lisp.net/asdf/asdf/blob/master/uiop/README.md">Utilities for Implementation- and OS- Portability (UIOP)</a>,
|
||
which defines functions to work on strings (<code>strcat</code>,
|
||
<code>string-prefix-p</code>, <code>string-enclosed-p</code>, <code>first-char</code>, <code>last-char</code>,
|
||
<code>split-string</code>, <code>stripln</code>).</p>
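<p>As a quick illustration, here is a minimal sketch of a few of these UIOP helpers. It assumes UIOP is already loaded (for instance after <code>(require "asdf")</code>); the sample strings are made up for the example.</p>

<pre><code class="language-lisp">;; Assumes UIOP is available, e.g. after (require "asdf").
(uiop:strcat "Groucho" " " "Marx")                ;; => "Groucho Marx"
(uiop:string-prefix-p "Grou" "Groucho")           ;; => T
(uiop:first-char "Marx")                          ;; => #\M
(uiop:last-char "Marx")                           ;; => #\x
(uiop:split-string "Groucho Marx" :separator " ") ;; => ("Groucho" "Marx")
</code></pre>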
|
||
|
||
<p>Some external libraries available on Quicklisp provide more functionality or shorter ways to do common things.</p>
|
||
|
||
<ul>
|
||
<li><a href="https://github.com/vindarel/cl-str">str</a> defines <code>trim</code>, <code>words</code>,
|
||
<code>unwords</code>, <code>lines</code>, <code>unlines</code>, <code>concat</code>, <code>split</code>, <code>shorten</code>, <code>repeat</code>,
|
||
<code>replace-all</code>, <code>starts-with?</code>, <code>ends-with?</code>, <code>blankp</code>, <code>emptyp</code>, …</li>
|
||
<li><a href="https://github.com/ruricolist/serapeum/blob/master/REFERENCE.md#strings">Serapeum</a> is a large set of utilities with many string manipulation functions.</li>
|
||
<li><a href="https://github.com/rudolfochrist/cl-change-case">cl-change-case</a> has functions to convert strings between camelCase, param-case, snake_case and more. They are also included in <code>str</code>.</li>
|
||
<li><a href="https://github.com/cbaggers/mk-string-metrics">mk-string-metrics</a> has functions to calculate various string metrics efficiently (Damerau-Levenshtein, Hamming, Jaro, Jaro-Winkler, Levenshtein, etc.).</li>
|
||
<li>and <code>cl-ppcre</code> can come in handy, for example <code>ppcre:regex-replace-all</code>. See the <a href="regexp.html">regexp</a> section.</li>
|
||
</ul>
|
||
|
||
<p>Last but not least, when you need to tackle the <code>format</code> construct, don’t miss the following resources:</p>
|
||
|
||
<ul>
|
||
<li>the official <a href="http://www.lispworks.com/documentation/HyperSpec/Body/22_c.htm">CLHS documentation</a></li>
|
||
<li>a <a href="http://clqr.boundp.org/">quick reference</a></li>
|
||
<li>a <a href="https://www.hexstreamsoft.com/articles/common-lisp-format-reference/clhs-summary/#subsections-summary-table">CLHS summary on HexstreamSoft</a></li>
|
||
<li>plus a Slime tip: type <code>C-c C-d ~</code> plus a letter of a format directive to open up its documentation. This is even more useful with <code>ivy-mode</code> or <code>helm-mode</code>.</li>
|
||
</ul>
|
||
|
||
<h2 id="creating-strings">Creating strings</h2>
|
||
|
||
<p>A string literal is written between double quotes, but there are other ways to create a string:</p>
|
||
|
||
<ul>
|
||
<li>using <code>format nil</code> doesn’t <em>print</em> but returns a new string (see
|
||
more examples of <code>format</code> below):</li>
|
||
</ul>
|
||
|
||
<pre><code class="language-lisp">(defparameter *person* "you")
|
||
(format nil "hello ~a" *person*) ;; => "hello you"
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li><code>make-string count</code> creates a string of the given length. The
|
||
<code>:initial-element</code> character is repeated <code>count</code> times:</li>
|
||
</ul>
|
||
|
||
<pre><code class="language-lisp">(make-string 3 :initial-element #\♥) ;; => "♥♥♥"
|
||
</code></pre>
|
||
|
||
<h2 id="accessing-substrings">Accessing Substrings</h2>
|
||
|
||
<p>As a string is a sequence, you can access substrings with the SUBSEQ function. The index into the string is, as always, zero-based. The third, optional, argument is the index of the first character which is not a part of the substring; it is not the length of the substring.</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "Groucho Marx"))
|
||
*MY-STRING*
|
||
* (subseq *my-string* 8)
|
||
"Marx"
|
||
* (subseq *my-string* 0 7)
|
||
"Groucho"
|
||
* (subseq *my-string* 1 5)
|
||
"rouc"
|
||
</code></pre>
|
||
|
||
<p>You can also manipulate the substring if you use SUBSEQ together with SETF.</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "Harpo Marx"))
|
||
*MY-STRING*
|
||
* (subseq *my-string* 0 5)
|
||
"Harpo"
|
||
* (setf (subseq *my-string* 0 5) "Chico")
|
||
"Chico"
|
||
* *my-string*
|
||
"Chico Marx"
|
||
</code></pre>
|
||
|
||
<p>But note that the string isn’t “stretchable”. To cite from the HyperSpec: “If
|
||
the subsequence and the new sequence are not of equal length, the shorter length
|
||
determines the number of elements that are replaced.” For example:</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "Karl Marx"))
|
||
*MY-STRING*
|
||
* (subseq *my-string* 0 4)
|
||
"Karl"
|
||
* (setf (subseq *my-string* 0 4) "Harpo")
|
||
"Harpo"
|
||
* *my-string*
|
||
"Harp Marx"
|
||
* (subseq *my-string* 4)
|
||
" Marx"
|
||
* (setf (subseq *my-string* 4) "o Marx")
|
||
"o Marx"
|
||
* *my-string*
|
||
"Harpo Mar"
|
||
</code></pre>
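
<p>When the replacement has a different length, one option is to build a new string instead of modifying the old one. The following is only a sketch: <code>replace-subseq</code> is a hypothetical helper name, not a standard function.</p>

<pre><code class="language-lisp">;; Hypothetical helper: returns a fresh string in which the characters
;; from START (inclusive) to END (exclusive) are replaced by NEW,
;; whatever the length of NEW.
(defun replace-subseq (string start end new)
  (concatenate 'string
               (subseq string 0 start)
               new
               (subseq string end)))

(replace-subseq "Karl Marx" 0 4 "Harpo") ;; => "Harpo Marx"
(replace-subseq "Harpo Marx" 0 5 "Zep")  ;; => "Zep Marx"
</code></pre>

<p>The original string is left untouched, at the cost of allocating a new one.</p>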
|
||
|
||
<h2 id="accessing-individual-characters">Accessing Individual Characters</h2>
|
||
|
||
<p>You can use the function CHAR to access individual characters of a string. CHAR
|
||
can also be used in conjunction with SETF.</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "Groucho Marx"))
|
||
*MY-STRING*
|
||
* (char *my-string* 11)
|
||
#\x
|
||
* (char *my-string* 7)
|
||
#\Space
|
||
* (char *my-string* 6)
|
||
#\o
|
||
* (setf (char *my-string* 6) #\y)
|
||
#\y
|
||
* *my-string*
|
||
"Grouchy Marx"
|
||
</code></pre>
|
||
|
||
<p>Note that there’s also SCHAR, which only accepts simple strings. If efficiency is important, SCHAR can be a bit faster where it applies.</p>
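
<p>To make the difference concrete, here is a small sketch. The variable name <code>*buffer*</code> is made up for the example, and whether a string with a fill pointer counts as simple is implementation-dependent (it does not on SBCL).</p>

<pre><code class="language-lisp">;; A string literal is a simple string, so SCHAR works on it.
(schar "Groucho" 0)          ;; => #\G

;; A string with a fill pointer is usually not a simple string:
;; use CHAR (or AREF) on it rather than SCHAR.
(defparameter *buffer*
  (make-array 7 :element-type 'character
                :initial-contents "Groucho"
                :fill-pointer 7))
(simple-string-p *buffer*)   ;; typically NIL
(char *buffer* 0)            ;; => #\G
</code></pre>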
|
||
|
||
<p>Because strings are arrays and thus sequences, you can also use the more generic
|
||
functions AREF and ELT (which are more general while CHAR might be implemented
|
||
more efficiently).</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "Groucho Marx"))
|
||
*MY-STRING*
|
||
* (aref *my-string* 3)
|
||
#\u
|
||
* (elt *my-string* 8)
|
||
#\M
|
||
</code></pre>
|
||
|
||
<p>Each character in a string has an integer code. The range of recognized codes and Lisp’s ability to print them is directly related to your implementation’s character set support, e.g. ISO-8859-1 or Unicode. Here are some examples in SBCL with UTF-8, which encodes each character as one to four 8-bit bytes. The first example shows a character outside the first 128 characters, or what is considered the normal Latin character set. The second example shows a character whose code is beyond 255. Notice the Lisp reader can round-trip characters by name.</p>
|
||
|
||
<pre><code class="language-lisp">* (stream-external-format *standard-output*)
|
||
|
||
:UTF-8
|
||
* (code-char 200)
|
||
|
||
#\LATIN_CAPITAL_LETTER_E_WITH_GRAVE
|
||
* (char-code #\LATIN_CAPITAL_LETTER_E_WITH_GRAVE)
|
||
|
||
200
|
||
* (code-char 1488)
|
||
#\HEBREW_LETTER_ALEF
|
||
|
||
* (char-code #\HEBREW_LETTER_ALEF)
|
||
1488
|
||
</code></pre>
|
||
|
||
<p>Check out the UTF-8 Wikipedia article for the range of supported characters and
|
||
their encodings.</p>
|
||
|
||
<h2 id="remove-or-replace-characters-from-a-string">Remove or replace characters from a string</h2>
|
||
|
||
<p>There’s a slew of (sequence) functions that can be used to manipulate a string
|
||
and we’ll only provide some examples here. See the sequences dictionary in the
|
||
HyperSpec for more.</p>
|
||
|
||
<p><code>remove</code> one character from a string:</p>
|
||
|
||
<pre><code class="language-lisp">* (remove #\o "Harpo Marx")
|
||
"Harp Marx"
|
||
* (remove #\a "Harpo Marx")
|
||
"Hrpo Mrx"
|
||
* (remove #\a "Harpo Marx" :start 2)
|
||
"Harpo Mrx"
|
||
* (remove-if #'upper-case-p "Harpo Marx")
|
||
"arpo arx"
|
||
</code></pre>
|
||
|
||
<p>Replace one character with <code>substitute</code> (non-destructive) or <code>replace</code> (destructive):</p>
|
||
|
||
<pre><code class="language-lisp">* (substitute #\u #\o "Groucho Marx")
|
||
"Gruuchu Marx"
|
||
* (substitute-if #\_ #'upper-case-p "Groucho Marx")
|
||
"_roucho _arx"
|
||
* (defparameter *my-string* (string "Zeppo Marx"))
|
||
*MY-STRING*
|
||
* (replace *my-string* "Harpo" :end1 5)
|
||
"Harpo Marx"
|
||
* *my-string*
|
||
"Harpo Marx"
|
||
</code></pre>
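
<p>The functions above work on single characters or on fixed positions. To replace every occurrence of a substring using only the standard, one possible approach combines SEARCH with a string output stream. This is a sketch; <code>replace-substring</code> is a hypothetical helper name, and the searched part is assumed to be non-empty.</p>

<pre><code class="language-lisp">;; Hypothetical helper, not part of the standard: returns a new string
;; in which every occurrence of PART is replaced by REPLACEMENT.
;; PART is assumed to be non-empty.
(defun replace-substring (string part replacement)
  (assert (plusp (length part)))
  (with-output-to-string (out)
    (loop with part-length = (length part)
          for old-pos = 0 then (+ pos part-length)
          for pos = (search part string :start2 old-pos)
          ;; Copy the text between the previous match and this one.
          do (write-string string out :start old-pos
                                      :end (or pos (length string)))
          when pos do (write-string replacement out)
          while pos)))

(replace-substring "Groucho Marx" "o" "0")       ;; => "Gr0uch0 Marx"
(replace-substring "Harpo Marx" "Harpo" "Zeppo") ;; => "Zeppo Marx"
</code></pre>

<p>Libraries such as <code>str</code> and <code>cl-ppcre</code> offer ready-made equivalents.</p>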
|
||
|
||
<h2 id="concatenating-strings">Concatenating Strings</h2>
|
||
|
||
<p>The name says it all: CONCATENATE is your friend. Note that this is a generic
|
||
sequence function and you have to provide the result type as the first argument.</p>
|
||
|
||
<pre><code class="language-lisp">* (concatenate 'string "Karl" " " "Marx")
|
||
"Karl Marx"
|
||
* (concatenate 'list "Karl" " " "Marx")
|
||
(#\K #\a #\r #\l #\Space #\M #\a #\r #\x)
|
||
</code></pre>
|
||
|
||
<p>With UIOP, use <code>strcat</code>:</p>
|
||
|
||
<pre><code class="language-lisp">* (uiop:strcat "karl" " " "marx")
|
||
</code></pre>
|
||
|
||
<p>or with the library <code>str</code>, use <code>concat</code>:</p>
|
||
|
||
<pre><code class="language-lisp">* (str:concat "foo" "bar")
|
||
</code></pre>
|
||
|
||
<p>If you have to construct a string out of many parts, all of these calls to
|
||
CONCATENATE seem wasteful, though. There are at least three other good ways to
|
||
construct a string piecemeal, depending on what exactly your data is. If you
|
||
build your string one character at a time, make it an adjustable VECTOR (a
|
||
one-dimensional ARRAY) of type character with a fill-pointer of zero, then use
|
||
VECTOR-PUSH-EXTEND on it. That way, you can also give hints to the system if you
|
||
can estimate how long the string will be. (See the optional third argument to
|
||
VECTOR-PUSH-EXTEND.)</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (make-array 0
|
||
:element-type 'character
|
||
:fill-pointer 0
|
||
:adjustable t))
|
||
*MY-STRING*
|
||
* *my-string*
|
||
""
|
||
* (dolist (char '(#\Z #\a #\p #\p #\a))
|
||
(vector-push-extend char *my-string*))
|
||
NIL
|
||
* *my-string*
|
||
"Zappa"
|
||
</code></pre>
|
||
|
||
<p>If the string will be constructed out of (the printed representations of)
|
||
arbitrary objects, (symbols, numbers, characters, strings, …), you can use
|
||
FORMAT with an output stream argument of NIL. This directs FORMAT to return the
|
||
indicated output as a string.</p>
|
||
|
||
<pre><code class="language-lisp">* (format nil "This is a string with a list ~A in it"
|
||
'(1 2 3))
|
||
"This is a string with a list (1 2 3) in it"
|
||
</code></pre>
|
||
|
||
<p>We can use the looping constructs of the FORMAT mini language to emulate
|
||
CONCATENATE.</p>
|
||
|
||
<pre><code class="language-lisp">* (format nil "The Marx brothers are:~{ ~A~}."
|
||
'("Groucho" "Harpo" "Chico" "Zeppo" "Karl"))
|
||
"The Marx brothers are: Groucho Harpo Chico Zeppo Karl."
|
||
</code></pre>
|
||
|
||
<p>FORMAT can do a lot more processing but it has a relatively arcane syntax. After
|
||
this last example, you can find the details in the CLHS section about formatted
|
||
output.</p>
|
||
|
||
<pre><code class="language-lisp">* (format nil "The Marx brothers are:~{ ~A~^,~}."
|
||
'("Groucho" "Harpo" "Chico" "Zeppo" "Karl"))
|
||
"The Marx brothers are: Groucho, Harpo, Chico, Zeppo, Karl."
|
||
</code></pre>
|
||
|
||
<p>Another way to create a string out of the printed representation of various objects is to use WITH-OUTPUT-TO-STRING. The value of this handy macro is a string containing everything that was output to the string stream within the body of the macro. This means you also have the full power of FORMAT at your disposal, should you need it.</p>
|
||
|
||
<pre><code class="language-lisp">* (with-output-to-string (stream)
|
||
(dolist (char '(#\Z #\a #\p #\p #\a #\, #\Space))
|
||
(princ char stream))
|
||
(format stream "~S - ~S" 1940 1993))
|
||
"Zappa, 1940 - 1993"
|
||
</code></pre>
|
||
|
||
<h2 id="processing-a-string-one-character-at-a-time">Processing a String One Character at a Time</h2>
|
||
|
||
<p>Use the MAP function to process a string one character at a time.</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "Groucho Marx"))
|
||
*MY-STRING*
|
||
* (map 'string #'(lambda (c) (print c)) *my-string*)
|
||
#\G
|
||
#\r
|
||
#\o
|
||
#\u
|
||
#\c
|
||
#\h
|
||
#\o
|
||
#\Space
|
||
#\M
|
||
#\a
|
||
#\r
|
||
#\x
|
||
"Groucho Marx"
|
||
</code></pre>
|
||
|
||
<p>Or do it with LOOP.</p>
|
||
|
||
<pre><code class="language-lisp">* (loop for char across "Zeppo"
|
||
collect char)
|
||
(#\Z #\e #\p #\p #\o)
|
||
</code></pre>
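
<p>Both constructs can also compute something from the characters rather than just visit them. A short sketch with made-up sample strings:</p>

<pre><code class="language-lisp">;; MAP can build a new string from the transformed characters.
(map 'string #'char-upcase "Groucho")   ;; => "GROUCHO"

;; LOOP can accumulate a result other than a list, here a count.
(loop for char across "Groucho Marx"
      count (char= char #\o))           ;; => 2
</code></pre>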
|
||
|
||
<h2 id="reversing-a-string-by-word-or-character">Reversing a String by Word or Character</h2>
|
||
|
||
<p>Reversing a string by character is easy using the built-in REVERSE function (or
|
||
its destructive counterpart NREVERSE).</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "DSL"))
|
||
*MY-STRING*
|
||
* (reverse *my-string*)
|
||
"LSD"
|
||
</code></pre>
|
||
|
||
<p>There’s no one-liner in CL to reverse a string by word (like you would do it in
|
||
Perl with split and join). You either have to use functions from an external
|
||
library like SPLIT-SEQUENCE or you have to roll your own solution.</p>
|
||
|
||
<p>Here’s an attempt with the <code>str</code> library:</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *singing* "singing in the rain")
|
||
*SINGING*
|
||
* (str:words *SINGING*)
|
||
("singing" "in" "the" "rain")
|
||
* (reverse *)
|
||
("rain" "the" "in" "singing")
|
||
* (str:unwords *)
|
||
"rain the in singing"
|
||
</code></pre>
|
||
|
||
<p>And here’s another one with no external dependencies:</p>
|
||
|
||
<pre><code class="language-lisp">* (defun split-by-one-space (string)
|
||
"Returns a list of substrings of string
|
||
divided by ONE space each.
|
||
Note: Two consecutive spaces will be seen as
|
||
if there were an empty string between them."
|
||
(loop for i = 0 then (1+ j)
|
||
as j = (position #\Space string :start i)
|
||
collect (subseq string i j)
|
||
while j))
|
||
SPLIT-BY-ONE-SPACE
|
||
* (split-by-one-space "Singing in the rain")
|
||
("Singing" "in" "the" "rain")
|
||
* (split-by-one-space "Singing in the  rain")
|
||
("Singing" "in" "the" "" "rain")
|
||
* (split-by-one-space "Cool")
|
||
("Cool")
|
||
* (split-by-one-space " Cool ")
|
||
("" "Cool" "")
|
||
* (defun join-string-list (string-list)
|
||
"Concatenates a list of strings
|
||
and puts spaces between the elements."
|
||
(format nil "~{~A~^ ~}" string-list))
|
||
JOIN-STRING-LIST
|
||
* (join-string-list '("We" "want" "better" "examples"))
|
||
"We want better examples"
|
||
* (join-string-list '("Really"))
|
||
"Really"
|
||
* (join-string-list '())
|
||
""
|
||
* (join-string-list
|
||
(nreverse
|
||
(split-by-one-space
|
||
"Reverse this sentence by word")))
|
||
"word by sentence this Reverse"
|
||
</code></pre>
|
||
|
||
<h2 id="dealing-with-unicode-strings">Dealing with unicode strings</h2>
|
||
|
||
<p>Here we’ll use <a href="http://www.sbcl.org/manual/index.html#String-operations">SBCL’s string operations</a>. More generally, see <a href="http://www.sbcl.org/manual/index.html#Unicode-Support">SBCL’s unicode support</a>.</p>
|
||
|
||
<h3 id="sorting-unicode-strings-alphabetically">Sorting unicode strings alphabetically</h3>
|
||
|
||
<p>Sorting unicode strings with <code>string-lessp</code> as the comparison function
|
||
isn’t satisfying:</p>
|
||
|
||
<pre><code class="language-lisp">(sort '("Aaa" "Ééé" "Zzz") #'string-lessp)
|
||
;; ("Aaa" "Zzz" "Ééé")
|
||
</code></pre>
|
||
|
||
<p>With <a href="http://www.sbcl.org/manual/#String-operations">SBCL</a>, use <code>sb-unicode:unicode<</code>:</p>
|
||
|
||
<pre><code class="language-lisp">(sort '("Aaa" "Ééé" "Zzz") #'sb-unicode:unicode<)
|
||
;; ("Aaa" "Ééé" "Zzz")
|
||
</code></pre>
|
||
|
||
<h3 id="breaking-strings-into-graphenes-sentences-lines-and-words">Breaking strings into graphemes, sentences, lines and words</h3>
|
||
|
||
<p>These functions use SBCL’s <a href="http://www.sbcl.org/manual/#String-operations"><code>sb-unicode</code></a>: they are SBCL specific.</p>
|
||
|
||
<p>Use <code>sb-unicode:sentences</code> to break a string into sentences according
|
||
to the default sentence breaking rules.</p>
|
||
|
||
<p>Use <code>sb-unicode:lines</code> to break a string into lines that are no wider than the <code>:margin</code> keyword argument. Combining marks will always be kept together with their base characters, and spaces (but not other types of whitespace) will be removed from the end of lines. If <code>:margin</code> is unspecified, it defaults to 80 characters.</p>
|
||
|
||
<pre><code class="language-lisp">(sb-unicode:lines "A first sentence. A second somewhat long one." :margin 10)
|
||
;; => ("A first"
|
||
"sentence."
|
||
"A second"
|
||
"somewhat"
|
||
"long one.")
|
||
</code></pre>
|
||
|
||
<p>See also <code>sb-unicode:words</code> and <code>sb-unicode:graphemes</code>.</p>
|
||
|
||
<p>Tip: you can ensure these functions are run only in SBCL with a feature flag:</p>
|
||
|
||
<pre><code>#+sbcl
|
||
(runs on sbcl)
|
||
#-sbcl
|
||
(runs on other implementations)
|
||
</code></pre>
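
<p>Applied to the sorting example above, the feature flags could be used as in the following sketch. <code>sort-strings</code> is a hypothetical helper name, and falling back to <code>string-lessp</code> on other implementations is only one possible choice.</p>

<pre><code class="language-lisp">;; Hypothetical helper: sorts a copy of STRINGS, using the Unicode-aware
;; comparison on SBCL and falling back to STRING-LESSP elsewhere.
(defun sort-strings (strings)
  (sort (copy-list strings)
        #+sbcl #'sb-unicode:unicode<
        #-sbcl #'string-lessp))

(sort-strings '("Zzz" "Aaa")) ;; => ("Aaa" "Zzz")
</code></pre>

<p>The list is copied first because <code>sort</code> is destructive.</p>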
|
||
|
||
<h2 id="controlling-case">Controlling Case</h2>
|
||
|
||
<p>Common Lisp has a couple of functions to control the case of a string.</p>
|
||
|
||
<pre><code class="language-lisp">* (string-upcase "cool")
|
||
"COOL"
|
||
* (string-upcase "Cool")
|
||
"COOL"
|
||
* (string-downcase "COOL")
|
||
"cool"
|
||
* (string-downcase "Cool")
|
||
"cool"
|
||
* (string-capitalize "cool")
|
||
"Cool"
|
||
* (string-capitalize "cool example")
|
||
"Cool Example"
|
||
</code></pre>
|
||
|
||
<p>These functions take the <code>:start</code> and <code>:end</code> keyword arguments so you can optionally only manipulate a part of the string. They also have destructive counterparts whose names start with “N”.</p>
|
||
|
||
<pre><code class="language-lisp">* (string-capitalize "cool example" :start 5)
|
||
"cool Example"
|
||
* (string-capitalize "cool example" :end 5)
|
||
"Cool example"
|
||
* (defparameter *my-string* (string "BIG"))
|
||
*MY-STRING*
|
||
* (defparameter *my-downcase-string* (nstring-downcase *my-string*))
|
||
*MY-DOWNCASE-STRING*
|
||
* *my-downcase-string*
|
||
"big"
|
||
* *my-string*
|
||
"big"
|
||
</code></pre>
|
||
|
||
<p>Note this potential caveat: according to the HyperSpec,</p>
|
||
|
||
<blockquote>
|
||
<p>for STRING-UPCASE, STRING-DOWNCASE, and STRING-CAPITALIZE, string is not modified. However, if no characters in string require conversion, the result may be either string or a copy of it, at the implementation’s discretion.</p>
|
||
</blockquote>
|
||
|
||
<p>This implies that the last result in
|
||
the following example is implementation-dependent - it may either be “BIG” or
|
||
“BUG”. If you want to be sure, use COPY-SEQ.</p>
|
||
|
||
<pre><code class="language-lisp">* (defparameter *my-string* (string "BIG"))
|
||
*MY-STRING*
|
||
* (defparameter *my-upcase-string* (string-upcase *my-string*))
|
||
*MY-UPCASE-STRING*
|
||
* (setf (char *my-string* 1) #\U)
|
||
#\U
|
||
* *my-string*
|
||
"BUG"
|
||
* *my-upcase-string*
|
||
"BIG"
|
||
</code></pre>
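
<p>A minimal sketch of the COPY-SEQ approach, with illustrative variable names:</p>

<pre><code class="language-lisp">(defparameter *original* (copy-seq "BIG"))
;; COPY-SEQ guarantees a fresh string, even if STRING-UPCASE
;; returned its argument unchanged.
(defparameter *upcased* (copy-seq (string-upcase *original*)))
(setf (char *original* 1) #\U)
*original* ;; => "BUG"
*upcased*  ;; => "BIG"
</code></pre>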
|
||
|
||
<h3 id="with-the-format-function">With the format function</h3>
|
||
|
||
<p>The format function has directives to change the case of words:</p>
|
||
|
||
<h4 id="to-lower-case--">To lower case: ~( ~)</h4>
|
||
|
||
<pre><code class="language-lisp">(format t "~(~a~)" "HELLO WORLD")
|
||
;; => hello world
|
||
</code></pre>
|
||
|
||
<h4 id="capitalize-every-word--">Capitalize every word: ~:( ~)</h4>
|
||
|
||
<pre><code class="language-lisp">(format t "~:(~a~)" "HELLO WORLD")
|
||
Hello World
|
||
NIL
|
||
</code></pre>
|
||
|
||
<h4 id="capitalize-the-first-word--">Capitalize the first word: ~@( ~)</h4>
|
||
|
||
<pre><code class="language-lisp">(format t "~@(~a~)" "hello world")
|
||
Hello world
|
||
NIL
|
||
</code></pre>
|
||
|
||
<h4 id="to-upper-case--">To upper case: ~@:( ~)</h4>
|
||
|
||
<p>Where we re-use the colon and the @:</p>
|
||
|
||
<pre><code class="language-lisp">(format t "~@:(~a~)" "hello world")
|
||
HELLO WORLD
|
||
NIL
|
||
</code></pre>
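
<p>The examples above print to standard output. As a side-by-side summary, the same four directives can be used with <code>format nil</code> to get the result back as a string:</p>

<pre><code class="language-lisp">(format nil "~(~a~)" "HELLO WORLD")   ;; => "hello world"
(format nil "~:(~a~)" "hello world")  ;; => "Hello World"
(format nil "~@(~a~)" "hello world")  ;; => "Hello world"
(format nil "~:@(~a~)" "hello world") ;; => "HELLO WORLD"
</code></pre>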
|
||
|
||
<h2 id="trimming-blanks-from-the-ends-of-a-string">Trimming Blanks from the Ends of a String</h2>
|
||
|
||
<p>Not only can you trim blanks, but you can get rid of arbitrary characters. The
|
||
functions STRING-TRIM, STRING-LEFT-TRIM and STRING-RIGHT-TRIM return a substring
|
||
of their second argument where all characters that are in the first argument are
|
||
removed off the beginning and/or the end. The first argument can be any sequence
|
||
of characters.</p>
|
||
|
||
<pre><code class="language-lisp">* (string-trim " " " trim me ")
|
||
"trim me"
|
||
* (string-trim " et" " trim me ")
|
||
"rim m"
|
||
* (string-left-trim " et" " trim me ")
|
||
"rim me "
|
||
* (string-right-trim " et" " trim me ")
|
||
" trim m"
|
||
* (string-right-trim '(#\Space #\e #\t) " trim me ")
|
||
" trim m"
|
||
* (string-right-trim '(#\Space #\e #\t #\m) " trim me ")
" tri"
|
||
</code></pre>
|
||
|
||
<p>Note: The caveat mentioned in the section about Controlling Case also applies
|
||
here.</p>
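
<p>A common use is trimming all kinds of whitespace at once. In this sketch the variable name <code>*whitespace*</code> is made up, and <code>#\Tab</code> and <code>#\Return</code> are semi-standard character names that the common implementations support.</p>

<pre><code class="language-lisp">;; Illustrative list of characters to strip from both ends.
(defparameter *whitespace* '(#\Space #\Tab #\Newline #\Return))

(string-trim *whitespace* (format nil "  trim me ~%")) ;; => "trim me"
</code></pre>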
|
||
|
||
<h2 id="converting-between-symbols-and-strings">Converting between Symbols and Strings</h2>
|
||
|
||
<p>The function INTERN will “convert” a string to a symbol. Actually, it will check
|
||
whether the symbol denoted by the string (its first argument) is already
|
||
accessible in the package (its second, optional, argument which defaults to the
|
||
current package) and enter it, if necessary, into this package. It is beyond the
|
||
scope of this chapter to explain all the concepts involved and to address the
|
||
second return value of this function. See the CLHS chapter about packages for
|
||
details.</p>
|
||
|
||
<p>Note that the case of the string is relevant.</p>
|
||
|
||
<pre><code class="language-lisp">* (in-package "COMMON-LISP-USER")
|
||
#<The COMMON-LISP-USER package, 35/44 internal, 0/9 external>
|
||
* (intern "MY-SYMBOL")
|
||
MY-SYMBOL
|
||
NIL
|
||
* (intern "MY-SYMBOL")
|
||
MY-SYMBOL
|
||
:INTERNAL
|
||
* (export 'MY-SYMBOL)
|
||
T
|
||
* (intern "MY-SYMBOL")
|
||
MY-SYMBOL
|
||
:EXTERNAL
|
||
* (intern "My-Symbol")
|
||
|My-Symbol|
|
||
NIL
|
||
* (intern "MY-SYMBOL" "KEYWORD")
|
||
:MY-SYMBOL
|
||
NIL
|
||
* (intern "MY-SYMBOL" "KEYWORD")
|
||
:MY-SYMBOL
|
||
:EXTERNAL
|
||
</code></pre>
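
<p>If you only want to look a name up without creating a symbol, FIND-SYMBOL takes the same arguments as INTERN. The sketch below also shows the effect of case; the last line assumes the default readtable case (<code>:upcase</code>) and that both forms are evaluated in the same package.</p>

<pre><code class="language-lisp">;; FIND-SYMBOL looks a name up without creating a new symbol.
(find-symbol "CAR" "COMMON-LISP")  ;; => CAR, :EXTERNAL
(find-symbol "car" "COMMON-LISP")  ;; => NIL, NIL

;; Upcase the name first to get the symbol the reader would produce
;; for unescaped input.
(eq (intern (string-upcase "my-symbol")) 'my-symbol) ;; => T
</code></pre>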
|
||
|
||
<p>To do the opposite, convert from a symbol to a string, use SYMBOL-NAME or
|
||
STRING.</p>
|
||
|
||
<pre><code class="language-lisp">* (symbol-name 'MY-SYMBOL)
|
||
"MY-SYMBOL"
|
||
* (symbol-name 'my-symbol)
|
||
"MY-SYMBOL"
|
||
* (symbol-name '|my-symbol|)
|
||
"my-symbol"
|
||
* (string 'howdy)
|
||
"HOWDY"
|
||
</code></pre>
|
||
|
||
<h2 id="converting-between-characters-and-strings">Converting between Characters and Strings</h2>
|
||
|
||
<p>You can use COERCE to convert a string of length 1 to a character. You can also use COERCE to convert any sequence of characters into a string. You cannot use COERCE to convert a character to a string, though - you’ll have to use STRING instead.</p>
|
||
|
||
<pre><code class="language-lisp">* (coerce "a" 'character)
|
||
#\a
|
||
* (coerce (subseq "cool" 2 3) 'character)
|
||
#\o
|
||
* (coerce "cool" 'list)
|
||
(#\c #\o #\o #\l)
|
||
* (coerce '(#\h #\e #\y) 'string)
|
||
"hey"
|
||
* (coerce (nth 2 '(#\h #\e #\y)) 'character)
|
||
#\y
|
||
* (defparameter *my-array* (make-array 5 :initial-element #\x))
|
||
*MY-ARRAY*
|
||
* *my-array*
|
||
#(#\x #\x #\x #\x #\x)
|
||
* (coerce *my-array* 'string)
|
||
"xxxxx"
|
||
* (string 'howdy)
|
||
"HOWDY"
|
||
* (string #\y)
|
||
"y"
|
||
* (coerce #\y 'string)
|
||
#\y can't be converted to type STRING.
|
||
[Condition of type SIMPLE-TYPE-ERROR]
|
||
</code></pre>
|
||
|
||
<h2 id="finding-an-element-of-a-string">Finding an Element of a String</h2>
|
||
|
||
<p>Use FIND, POSITION, and their -IF counterparts to find characters in a string.</p>
|
||
|
||
<pre><code class="language-lisp">* (find #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equal)
|
||
#\t
|
||
* (find #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)
|
||
#\T
|
||
* (find #\z "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)
|
||
NIL
|
||
* (find-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks.")
|
||
#\1
|
||
* (find-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks." :from-end t)
|
||
#\0
|
||
* (position #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equal)
|
||
17
|
||
* (position #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)
|
||
0
|
||
* (position-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks.")
|
||
37
|
||
* (position-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks." :from-end t)
|
||
43
|
||
</code></pre>
|
||
|
||
<p>Or use COUNT and friends to count characters in a string.</p>
|
||
|
||
<pre><code class="language-lisp">* (count #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equal)
|
||
2
|
||
* (count #\t "The Hyperspec contains approximately 110,000 hyperlinks." :test #'equalp)
|
||
3
|
||
* (count-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks.")
|
||
6
|
||
* (count-if #'digit-char-p "The Hyperspec contains approximately 110,000 hyperlinks." :start 38)
|
||
5
|
||
</code></pre>
|
||
|
||
<h2 id="finding-a-substring-of-a-string">Finding a Substring of a String</h2>
|
||
|
||
<p>The function SEARCH can find substrings of a string.</p>
|
||
|
||
<pre><code class="language-lisp">* (search "we" "If we can't be free we can at least be cheap")
|
||
3
|
||
* (search "we" "If we can't be free we can at least be cheap" :from-end t)
|
||
20
|
||
* (search "we" "If we can't be free we can at least be cheap" :start2 4)
|
||
20
|
||
* (search "we" "If we can't be free we can at least be cheap" :end2 5 :from-end t)
|
||
3
|
||
* (search "FREE" "If we can't be free we can at least be cheap")
|
||
NIL
|
||
* (search "FREE" "If we can't be free we can at least be cheap" :test #'char-equal)
|
||
15
|
||
</code></pre>
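
<p>SEARCH returns only one position per call. To collect every occurrence, you can call it repeatedly with an increasing <code>:start2</code>. This is a sketch: <code>all-positions</code> is a hypothetical helper name, the needle is assumed to be non-empty, and overlapping matches are all reported.</p>

<pre><code class="language-lisp">;; Hypothetical helper: returns the list of all start positions of
;; NEEDLE in HAYSTACK. NEEDLE is assumed to be non-empty.
(defun all-positions (needle haystack)
  (loop with start = 0
        for pos = (search needle haystack :start2 start)
        while pos
        collect pos
        ;; Resume just after the start of the last match.
        do (setf start (1+ pos))))

(all-positions "we" "If we can't be free we can at least be cheap")
;; => (3 20)
</code></pre>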
|
||
|
||
<h2 id="converting-a-string-to-a-number">Converting a String to a Number</h2>
|
||
|
||
<h3 id="to-an-integer-parse-integer">To an integer: parse-integer</h3>
|
||
|
||
<p>CL provides the <code>parse-integer</code> function to convert a string representation of an integer
|
||
to the corresponding numeric value. The second return value is the index into
|
||
the string where the parsing stopped.</p>
|
||
|
||
<pre><code class="language-lisp">* (parse-integer "42")
|
||
42
|
||
2
|
||
* (parse-integer "42" :start 1)
|
||
2
|
||
2
|
||
* (parse-integer "42" :end 1)
|
||
4
|
||
1
|
||
* (parse-integer "42" :radix 8)
|
||
34
|
||
2
|
||
* (parse-integer " 42 ")
|
||
42
|
||
3
|
||
* (parse-integer " 42 is forty-two" :junk-allowed t)
|
||
42
|
||
3
|
||
* (parse-integer " 42 is forty-two")
|
||
|
||
Error in function PARSE-INTEGER:
|
||
There's junk in this string: " 42 is forty-two".
|
||
</code></pre>
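
<p>Without <code>:junk-allowed</code>, <code>parse-integer</code> signals an error of type <code>parse-error</code> on bad input. If you would rather get a default value back, one option is to wrap the call in <code>handler-case</code>. This is a sketch; <code>safe-parse-integer</code> is a hypothetical helper name.</p>

<pre><code class="language-lisp">;; Hypothetical helper: returns the parsed integer, or DEFAULT
;; (NIL unless given) when STRING is not a valid integer.
(defun safe-parse-integer (string &optional default)
  (handler-case (values (parse-integer string))
    (parse-error () default)))

(safe-parse-integer "42")          ;; => 42
(safe-parse-integer "forty-two")   ;; => NIL
(safe-parse-integer "forty-two" 0) ;; => 0
</code></pre>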
|
||
|
||
<p><code>parse-integer</code> doesn’t understand radix specifiers like <code>#X</code>, nor is there a
|
||
built-in function to parse other numeric types. You could use <code>read-from-string</code>
|
||
in this case.</p>
|
||
|
||
<h3 id="to-any-number-read-from-string">To any number: read-from-string</h3>
|
||
|
||
<p>Be aware that the full reader is in effect if you’re using this
|
||
function. This can lead to vulnerability issues.</p>
|
||
|
||
<pre><code class="language-lisp">* (read-from-string "#X23")
|
||
35
|
||
4
|
||
* (read-from-string "4.5")
|
||
4.5
|
||
3
|
||
* (read-from-string "6/8")
|
||
3/4
|
||
3
|
||
* (read-from-string "#C(6/8 1)")
|
||
#C(3/4 1)
|
||
9
|
||
* (read-from-string "1.2e2")
|
||
120.00001
|
||
5
|
||
* (read-from-string "symbol")
|
||
SYMBOL
|
||
6
|
||
* (defparameter *foo* 42)
|
||
*FOO*
|
||
* (read-from-string "#.(setq *foo* \"gotcha\")")
|
||
"gotcha"
|
||
23
|
||
* *foo*
|
||
"gotcha"
|
||
</code></pre>
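
<p>One way to reduce the risk shown in the last example is to bind <code>*read-eval*</code> to NIL, which makes the reader refuse <code>#.</code>, and to check that what was read is actually a number. This is a sketch under those assumptions; <code>read-number-safely</code> is a hypothetical helper name.</p>

<pre><code class="language-lisp">;; Hypothetical helper: reads one object from STRING with read-time
;; evaluation disabled and returns it only if it is a number.
;; Any reader error (including a refused #.) yields NIL.
(defun read-number-safely (string)
  (handler-case
      (with-standard-io-syntax
        (let* ((*read-eval* nil)
               (value (read-from-string string)))
          (and (numberp value) value)))
    (error () nil)))

(read-number-safely "#X23")      ;; => 35
(read-number-safely "6/8")       ;; => 3/4
(read-number-safely "#.(+ 1 2)") ;; => NIL
(read-number-safely "symbol")    ;; => NIL
</code></pre>

<p>This does not remove every concern: the reader still interns any symbol it encounters, so a dedicated parsing library remains the more cautious choice for untrusted input.</p>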
|
||
|
||
<h3 id="to-a-float-the-parse-float-library">To a float: the parse-float library</h3>
|
||
|
||
<p>There is no built-in function similar to <code>parse-integer</code> to parse
|
||
other number types. The external library
|
||
<a href="https://github.com/soemraws/parse-float">parse-float</a> does exactly
|
||
that. It doesn’t use <code>read-from-string</code> so it is safe to use.</p>
|
||
|
||
<pre><code class="language-lisp">(ql:quickload "parse-float")
|
||
(parse-float:parse-float "1.2e2")
|
||
;; 120.00001
|
||
;; 5
|
||
</code></pre>
|
||
|
||
<p>LispWorks also has a <a href="http://www.lispworks.com/documentation/lw51/LWRM/html/lwref-228.htm">parse-float</a> function.</p>
|
||
|
||
<p>See also <a href="https://github.com/sharplispers/parse-number">parse-number</a>.</p>
|
||
|
||
<h2 id="converting-a-number-to-a-string">Converting a Number to a String</h2>
|
||
|
||
<p>The general function <code>write-to-string</code> or one of its simpler variants
<code>prin1-to-string</code> or <code>princ-to-string</code> may be used to convert a number to a
string. With <code>write-to-string</code>, the <code>:base</code> keyword argument changes
the output base for a single call. To change the output base globally, set
<code>*print-base*</code>, which defaults to 10. Remember that in Lisp, rational numbers are
represented as quotients of two integers, even when converted to strings.</p>
|
||
|
||
<pre><code class="language-lisp">* (write-to-string 250)
|
||
"250"
|
||
* (write-to-string 250.02)
|
||
"250.02"
|
||
* (write-to-string 250 :base 5)
|
||
"2000"
|
||
* (write-to-string (/ 1 3))
|
||
"1/3"
|
||
</code></pre>
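
<p>A short sketch of both approaches. Binding <code>*print-base*</code> with <code>let</code> keeps the change local to the enclosed forms, which is usually safer than setting it globally:</p>

<pre><code class="language-lisp">;; Per-call base with the :base keyword.
(write-to-string 255 :base 2)
;; "11111111"

;; Dynamic binding: every printing function called inside the LET
;; sees the new base.
(let ((*print-base* 16))
  (princ-to-string 255))
;; "FF"

;; Ratios keep their numerator/denominator form in any base.
(write-to-string 1/3 :base 2)
;; "1/11"
</code></pre>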
|
||
|
||
<h2 id="comparing-strings">Comparing Strings</h2>
|
||
|
||
<p>The general functions <code>equal</code> and <code>equalp</code> can be used to test whether two strings
are equal. The strings are compared element by element, either in a
case-sensitive manner (<code>equal</code>) or not (<code>equalp</code>). There is also a family of
string-specific comparison functions. You’ll want to use these if you
rely on implementation-defined attributes of characters. Check your vendor’s
documentation in this case.</p>
|
||
|
||
<p>Here are a few examples. Note that all functions that test for inequality return the position of the first mismatch as a generalized boolean. You can also use the generic sequence function MISMATCH if you need more versatility.</p>
|
||
|
||
<pre><code class="language-lisp">* (string= "Marx" "Marx")
|
||
T
|
||
* (string= "Marx" "marx")
|
||
NIL
|
||
* (string-equal "Marx" "marx")
|
||
T
|
||
* (string< "Groucho" "Zeppo")
|
||
0
|
||
* (string< "groucho" "Zeppo")
|
||
NIL
|
||
* (string-lessp "groucho" "Zeppo")
|
||
0
|
||
* (mismatch "Harpo Marx" "Zeppo Marx" :from-end t :test #'char=)
|
||
3
|
||
</code></pre>
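
<p>The string-specific functions also accept the <code>:start1</code>, <code>:end1</code>, <code>:start2</code> and <code>:end2</code> keyword arguments, so substrings can be compared without copying them. A brief sketch:</p>

<pre><code class="language-lisp">;; Compare "Marx" with the last four characters of the second string.
(string= "Marx" "Groucho Marx" :start2 8)
;; T

;; Inequality tests return the index of the first difference.
(string/= "Marx" "Mars")
;; 3

;; A proper prefix is "less than" the longer string.
(string< "Harpo" "Harpo Marx")
;; 5
</code></pre>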
|
||
|
||
<h2 id="string-formatting">String formatting</h2>
|
||
|
||
<p>The <code>format</code> function has a lot of directives to print strings,
|
||
numbers, lists, going recursively, even calling Lisp functions,
|
||
etc. We’ll focus here on a few things to print and format strings.</p>
|
||
|
||
<p>Our examples come from a common need: printing many strings and
justifying them. Let’s work with this list of movies:</p>
|
||
|
||
<pre><code class="language-lisp">(defparameter movies '(
|
||
(1 "Matrix" 5)
|
||
(10 "Matrix Trilogy swe sub" 3.3)
|
||
))
|
||
</code></pre>
|
||
|
||
<p>We want an aligned and justified result like this:</p>
|
||
|
||
<pre><code> 1 Matrix                    5
10 Matrix Trilogy swe sub    3.3
|
||
</code></pre>
|
||
|
||
<p>We’ll use <code>mapcar</code> to iterate over our movies and experiment with the
|
||
format constructs.</p>
|
||
|
||
<pre><code class="language-lisp">(mapcar (lambda (it)
          (format t "~a ~a ~a~%" (first it) (second it) (third it)))
        movies)
|
||
</code></pre>
|
||
|
||
<p>which prints:</p>
|
||
|
||
<pre><code>1 Matrix 5
|
||
10 Matrix Trilogy swe sub 3.3
|
||
</code></pre>
|
||
|
||
<h3 id="structure-of-format">Structure of format</h3>
|
||
|
||
<p>Format directives start with <code>~</code>. A final character like <code>A</code> or <code>a</code>
(directives are case-insensitive) identifies the directive. In between, a directive can
accept comma-separated prefix parameters and the <code>:</code> and <code>@</code> modifiers.</p>
|
||
|
||
<p>Print a tilde with <code>~~</code>, or 10 tildes with <code>~10~</code>.</p>
|
||
|
||
<p>Other directives include:</p>
|
||
|
||
<ul>
|
||
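<li><code>R</code>: radix. Without parameters it spells the number out in English: <code>(format t "~R" 20)</code> => “twenty”. Use <code>~@R</code> for Roman numerals: <code>(format t "~@R" 20)</code> => “XX”.</li>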
|
||
<li><code>$</code>: monetary: <code>(format t "~$" 21982)</code> => 21982.00</li>
|
||
<li><code>D</code>, <code>B</code>, <code>O</code>, <code>X</code>: Decimal, Binary, Octal, Hexadecimal.</li>
|
||
<li><code>F</code>: fixed-format Floating point.</li>
|
||
<li><code>P</code>: plural: <code>(format nil "~D famil~:@P/~D famil~:@P" 7 1)</code> => “7 families/1 family”</li>
|
||
</ul>
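
<p>A few of the numeric directives in action. Prefix parameters and the <code>:</code> modifier adjust the output; this sketch only relies on standard behaviour:</p>

<pre><code class="language-lisp">;; The same integer in decimal, binary, octal and hexadecimal.
(format nil "~D ~B ~O ~X" 10 10 10 10)
;; "10 1010 12 A"

;; ~:D groups the digits with commas.
(format nil "~:D" 1234567)
;; "1,234,567"

;; ~8,'0B pads to 8 columns with the character 0.
(format nil "~8,'0B" 5)
;; "00000101"

;; ~,3F keeps three digits after the decimal point.
(format nil "~,3F" 3.14159)
;; "3.142"
</code></pre>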
|
||
|
||
<h3 id="basic-primitive-a-or-a-aesthetics">Basic primitive: ~A or ~a (Aesthetics)</h3>
|
||
|
||
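<p><code>~a</code> is the most basic directive: it prints its argument in human-readable (“aesthetic”) form, as <code>princ</code> would. Strings are printed without their quotes:</p>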
|
||
|
||
<pre><code class="language-lisp">(format nil "~a" movies)
|
||
;; => "((1 Matrix 5) (10 Matrix Trilogy swe sub 3.3))"
|
||
</code></pre>
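
<p>For comparison, <code>~S</code> prints its argument as <code>prin1</code> would, so that it can be read back. A small sketch of the difference:</p>

<pre><code class="language-lisp">(format nil "~a" "Matrix")
;; "Matrix"        (6 characters: no quotes in the output)

(format nil "~s" "Matrix")
;; "\"Matrix\""    (8 characters: the quotes are part of the output)

(format nil "~a and ~s" #\a #\a)
;; "a and #\\a"
</code></pre>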
|
||
|
||
<h3 id="newlines--and-">Newlines: ~% and ~&</h3>
|
||
|
||
<p><code>~%</code> prints a newline. <code>~10%</code> prints 10 newlines.</p>

<p><code>~&</code> prints a newline only if the output stream is not already at the start of a line.</p>
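
<p>A minimal sketch of <code>~%</code>. The fresh-line behaviour depends on whether the implementation can tell the current column of the stream, so no output is claimed for it here:</p>

<pre><code class="language-lisp">;; ~% always emits a newline.
(format nil "one~%two")
;; "one
;; two"

;; A numeric parameter repeats it.
(length (format nil "~3%"))
;; 3

;; Typical use of fresh-line: start a message on its own line,
;; whatever was printed before.
(format t "~&Done.~%")
</code></pre>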
|
||
|
||
<h3 id="tabs">Tabs</h3>
|
||
|
||
<p>Use <code>~T</code> to move to a given column: <code>~10T</code> inserts spaces up to column 10.</p>

<p>See also <code>~I</code> for indentation, which works together with the pretty printer.</p>
|
||
|
||
<h3 id="justifying-text--add-padding-on-the-right">Justifying text / add padding on the right</h3>
|
||
|
||
<p>Use a number as parameter, like <code>~2a</code>:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~20a" "yo")
|
||
;; "yo                  "
|
||
</code></pre>
|
||
|
||
<pre><code class="language-lisp">(mapcar (lambda (it)
          (format t "~2a ~a ~a~%" (first it) (second it) (third it)))
        movies)
|
||
</code></pre>
|
||
|
||
<pre><code>1  Matrix 5
10 Matrix Trilogy swe sub 3.3
|
||
</code></pre>
|
||
|
||
<p>So, expanding:</p>
|
||
|
||
<pre><code class="language-lisp">(mapcar (lambda (it)
          (format t "~2a ~25a ~2a~%" (first it) (second it) (third it)))
        movies)
|
||
</code></pre>
|
||
|
||
<pre><code>1  Matrix                    5
10 Matrix Trilogy swe sub    3.3
|
||
</code></pre>
|
||
|
||
<p>The text is left-aligned: the padding is added on the right.</p>
|
||
|
||
<h4 id="justifying-on-the-left-">Padding on the left: @</h4>
|
||
|
||
<p>Use a <code>@</code> as in <code>~2@A</code>:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~20@a" "yo")
|
||
;; "                  yo"
|
||
</code></pre>
|
||
|
||
<pre><code class="language-lisp">(mapcar (lambda (it)
          (format t "~2@a ~25@a ~2a~%" (first it) (second it) (third it)))
        movies)
</code></pre>

<pre><code> 1                    Matrix 5
10    Matrix Trilogy swe sub 3.3
|
||
</code></pre>
|
||
|
||
<h3 id="justifying-decimals">Justifying decimals</h3>
|
||
|
||
<p>In <code>~,2F</code>, 2 is the number of decimals and F the floats directive:
|
||
<code>(format t "~,2F" 20.1)</code> => “20.10”.</p>
|
||
|
||
<p>With <code>~2,2f</code>:</p>
|
||
|
||
<pre><code class="language-lisp">(mapcar (lambda (it)
          (format t "~2@a ~25a ~2,2f~%" (first it) (second it) (third it)))
        movies)
|
||
</code></pre>
|
||
|
||
<pre><code> 1 Matrix                    5.00
10 Matrix Trilogy swe sub    3.30
|
||
</code></pre>
|
||
|
||
<p>And we’re happy with this result.</p>
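
<p>The first prefix parameter of <code>~F</code> is the total field width. In <code>~2,2f</code> it is too small to matter, so no padding is added. With a larger width the numbers are right-aligned, as this sketch shows:</p>

<pre><code class="language-lisp">(format nil "~8,2F" 3.14159)
;; "    3.14"

(format nil "~8,2F" 250.5)
;; "  250.50"
</code></pre>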
|
||
|
||
<h3 id="iteration">Iteration</h3>
|
||
|
||
<p>Create a string from a list with the iteration construct <code>~{str~}</code>:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~{~A, ~}" '(a b c))
|
||
;; "A, B, C, "
|
||
</code></pre>
|
||
|
||
<p>Use <code>~^</code> to avoid printing the comma and space after the last element:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~{~A~^, ~}" '(a b c))
|
||
;; "A, B, C"
|
||
</code></pre>
|
||
|
||
<p><code>~:{str~}</code> is similar but for a list of sublists:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~:{~S are ~S. ~}" '((pigeons birds) (dogs mammals) (bees insects)))
|
||
;; "PIGEONS are BIRDS. DOGS are MAMMALS. BEES are INSECTS. "
|
||
</code></pre>
|
||
|
||
<p><code>~@{str~}</code> is similar to <code>~{str~}</code>, but instead of using one argument that is a list, all the remaining arguments are used as the list of arguments for the iteration:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~@{~S are ~S. ~}" 'pigeons 'birds 'dogs 'mammals 'bees 'insects)
|
||
;; "PIGEONS are BIRDS. DOGS are MAMMALS. BEES are INSECTS. "
|
||
</code></pre>
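
<p>The iteration directives combine with the justification parameters seen earlier. As a sketch, the movie table can be produced by a single <code>format</code> call. The local variable <code>rows</code> is introduced for this example and simply repeats the sample data:</p>

<pre><code class="language-lisp">(let ((rows '((1 "Matrix" 5)
              (10 "Matrix Trilogy swe sub" 3.3))))
  ;; Each sublist supplies the arguments for one pass over the body.
  (format t "~:{~2@a ~25a ~2,2f~%~}" rows))
;;  1 Matrix                    5.00
;; 10 Matrix Trilogy swe sub    3.30
</code></pre>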
|
||
|
||
<h3 id="formatting-a-format-string-v-">Formatting a format string (<code>~v</code>, <code>~?</code>)</h3>
|
||
|
||
<p>Sometimes you want to justify a string, but the width is itself a
variable, so you can’t hardcode it as in <code>(format nil "~30a" "foo")</code>.
This is where <code>v</code> comes in. Used in place of a prefix parameter,
it takes its value from the next argument:</p>
|
||
|
||
<pre><code class="language-lisp">(let ((padding 30))
  (format nil "~va" padding "foo"))
;; "foo                           "
|
||
</code></pre>
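
<p><code>v</code> can be combined with modifiers and can stand for several prefix parameters at once. A short sketch, where the variable <code>width</code> is only illustrative:</p>

<pre><code class="language-lisp">;; Pad on the left in a field whose width is computed at run time.
(let ((width 6))
  (format nil "~v@a" width "foo"))
;; "   foo"

;; Both the width and the number of decimals come from the arguments.
(format nil "~v,vF" 8 2 3.14159)
;; "    3.14"
</code></pre>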
|
||
|
||
<p>Other times, you would like to insert a complete format directive
at run time. This is what the <code>?</code> directive is for. It takes a format string and a list of arguments for it:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~?" "~30a" '("foo"))
;;                      ^ a list
|
||
</code></pre>
|
||
|
||
<p>or, using <code>~@?</code>:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~@?" "~30a" "foo")
;;                       ^ not a list
|
||
</code></pre>
|
||
|
||
<p>Of course, it is always possible to format a format string beforehand:</p>
|
||
|
||
<pre><code class="language-lisp">(let* ((length 30)
       (directive (format nil "~~~aa" length)))
  (format nil directive "foo"))
|
||
</code></pre>
|
||
|
||
<h3 id="conditional-formatting">Conditional Formatting</h3>
|
||
|
||
<p>Choose one value out of many options by specifying a number:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~[dog~;cat~;bird~:;default~]" 0)
|
||
;; "dog"
|
||
|
||
(format nil "~[dog~;cat~;bird~:;default~]" 1)
|
||
;; "cat"
|
||
</code></pre>
|
||
|
||
<p>If the number is out of range, the default clause (after <code>~:;</code>) is used:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "~[dog~;cat~;bird~:;default~]" 9)
|
||
;; "default"
|
||
</code></pre>
|
||
|
||
<p>Combine it with <code>~:*</code>, which backs up one argument, to implement an irregular plural:</p>
|
||
|
||
<pre><code class="language-lisp">(format nil "I saw ~r el~:*~[ves~;f~:;ves~]." 0) ==> "I saw zero elves."
|
||
(format nil "I saw ~r el~:*~[ves~;f~:;ves~]." 1) ==> "I saw one elf."
|
||
(format nil "I saw ~r el~:*~[ves~;f~:;ves~]." 2) ==> "I saw two elves."
|
||
</code></pre>
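
<p>Two related forms take a generalized boolean instead of a number: <code>~:[…~;…~]</code> selects between a false and a true clause, and <code>~@[…~]</code> prints its clause only for a non-<code>nil</code> argument. A short sketch:</p>

<pre><code class="language-lisp">;; The first clause is for false, the second for true.
(format nil "The list is ~:[empty~;not empty~]." '(1 2))
;; "The list is not empty."
(format nil "The list is ~:[empty~;not empty~]." nil)
;; "The list is empty."

;; ~@[...~] keeps a non-NIL argument available to its clause.
(format nil "Hello~@[, ~a~]!" "Groucho")
;; "Hello, Groucho!"
(format nil "Hello~@[, ~a~]!" nil)
;; "Hello!"
</code></pre>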
|
||
|
||
<h2 id="capturing-what-is-is-printed-into-a-stream">Capturing what is printed into a stream</h2>
|
||
|
||
<p>Inside <code>(with-output-to-string (mystream) …)</code>, everything that is
|
||
printed into the stream <code>mystream</code> is captured and returned as a
|
||
string:</p>
|
||
|
||
<pre><code class="language-lisp">(defun greet (name &key (stream t))
  ;; by default, print to standard output.
  (format stream "hello ~a" name))

(let ((output (with-output-to-string (stream)
                (greet "you" :stream stream))))
  (format t "Output is: '~a'. It is indeed a ~a, aka a string.~&" output (type-of output)))
;; Output is: 'hello you'. It is indeed a (SIMPLE-ARRAY CHARACTER (9)), aka a string.
;; NIL
|
||
</code></pre>
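
<p>The same macro can capture the output of code that offers no stream argument. The sketch below binds <code>*standard-output*</code> itself to the string stream. The function <code>shout</code> is a made-up example that always prints to standard output:</p>

<pre><code class="language-lisp">(defun shout (name)
  ;; No stream parameter: FORMAT T writes to *STANDARD-OUTPUT*.
  (format t "HELLO ~a" name))

(with-output-to-string (*standard-output*)
  (shout "you"))
;; "HELLO you"
</code></pre>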
|
||
|
||
<h2 id="cleaning-up-strings">Cleaning up strings</h2>
|
||
|
||
<p>The following examples use the
|
||
<a href="https://github.com/EuAndreh/cl-slug/">cl-slug</a> library which,
|
||
internally, iterates over the characters of the string and uses
|
||
<code>ppcre:regex-replace-all</code>.</p>
|
||
|
||
<pre><code>(ql:quickload "cl-slug")
|
||
</code></pre>
|
||
|
||
<p>Then it can be used with the <code>slug</code> prefix.</p>
|
||
|
||
<p>Its main function is to transform a string to a slug, suitable for a website’s URL:</p>
|
||
|
||
<pre><code class="language-lisp">(slug:slugify "My new cool article, for the blog (V. 2).")
|
||
;; "my-new-cool-article-for-the-blog-v-2"
|
||
</code></pre>
|
||
|
||
<h3 id="removing-accentuated-letters">Removing accented letters</h3>

<p>Use <code>slug:asciify</code> to replace accented letters with their ASCII equivalents:</p>
|
||
|
||
<pre><code class="language-lisp">(slug:asciify "ñ é ß ğ ö")
|
||
;; => "n e ss g o"
|
||
</code></pre>
|
||
|
||
<p>This function supports many (western) languages:</p>
|
||
|
||
<pre><code class="language-lisp">slug:*available-languages*
|
||
((:TR . "Türkçe (Turkish)") (:SV . "Svenska (Swedish)") (:FI . "Suomi (Finnish)")
|
||
(:UK . "українська (Ukrainian)") (:RU . "Ру́сский (Russian)") (:RO . "Română (Romanian)")
|
||
(:RM . "Rumàntsch (Romansh)") (:PT . "Português (Portuguese)") (:PL . "Polski (Polish)")
|
||
(:NO . "Norsk (Norwegian)") (:LT . "Lietuvių (Lithuanian)") (:LV . "Latviešu (Latvian)")
|
||
(:LA . "Lingua Latīna (Latin)") (:IT . "Italiano (Italian)") (:EL . "ελληνικά (Greek)")
|
||
(:FR . "Français (French)") (:EO . "Esperanto") (:ES . "Español (Spanish)") (:EN . "English")
|
||
(:DE . "Deutsch (German)") (:DA . "Dansk (Danish)") (:CS . "Čeština (Czech)")
|
||
(:CURRENCY . "Currency"))
|
||
</code></pre>
|
||
|
||
<h3 id="removing-punctuation">Removing punctuation</h3>
|
||
|
||
<p>Use <code>(str:remove-punctuation s)</code> or <code>(str:no-case s)</code> (same as
|
||
<code>(cl-change-case:no-case s)</code>):</p>
|
||
|
||
<pre><code class="language-lisp">(str:remove-punctuation "HEY! What's up ??")
|
||
;; "HEY What s up"
|
||
|
||
(str:no-case "HEY! What's up ??")
|
||
;; "hey what s up"
|
||
</code></pre>
|
||
|
||
<p>Both strip the punctuation with a single Unicode-aware regexp,
<code>"[^\\p{L}\\p{N}]+"</code>, passed to <code>ppcre:regex-replace-all</code>. Here <code>\p{L}</code> is the
“letter” category and <code>\p{N}</code> matches any kind of numeric character.</p>
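
<p>If you would rather not depend on a regexp library, a similar result can be sketched in plain Common Lisp with <code>alphanumericp</code>. The function name <code>strip-punctuation</code> is made up for this example, and which non-ASCII characters count as alphanumeric depends on the implementation:</p>

<pre><code class="language-lisp">(defun strip-punctuation (string)
  "Return STRING with every run of non-alphanumeric characters replaced
by one space, and no space at either end."
  (with-output-to-string (out)
    (let ((gap nil)       ; true once a non-alphanumeric run was skipped
          (started nil))  ; true once a character was written
      (loop for ch across string
            do (cond ((alphanumericp ch)
                      ;; Emit the pending separator, except before the first word.
                      (when (and gap started)
                        (write-char #\Space out))
                      (write-char ch out)
                      (setf gap nil
                            started t))
                     (t (setf gap t)))))))

(strip-punctuation "HEY! What's up ??")
;; "HEY What s up"
</code></pre>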
|
||
|
||
<h2 id="see-also">See also</h2>
|
||
|
||
<ul>
|
||
<li><a href="https://gist.github.com/WetHat/a49e6f2140b401a190d45d31e052af8f">Pretty printing table data</a>, in ASCII art, a tutorial as a Jupyter notebook.</li>
|
||
</ul>
|
||
|
||
|
||
<p class="page-source">
|
||
Page source: <a href="https://github.com/LispCookbook/cl-cookbook/blob/master/strings.md">strings.md</a>
|
||
</p>
|
||
</div>
|
||
|
||
<script type="text/javascript">
|
||
|
||
// Don't write the TOC on the index.
|
||
if (window.location.pathname != "/cl-cookbook/") {
|
||
$("#toc").toc({
|
||
content: "#content", // will ignore the first h1 with the site+page title.
|
||
headings: "h1,h2,h3,h4"});
|
||
}
|
||
|
||
$("#two-cols + ul").css({
|
||
"column-count": "2",
|
||
});
|
||
$("#contributors + ul").css({
|
||
"column-count": "4",
|
||
});
|
||
</script>
|
||
|
||
|
||
|
||
<div>
|
||
<footer class="footer">
|
||
<hr/>
|
||
© 2002–2021 the Common Lisp Cookbook Project
|
||
</footer>
|
||
|
||
</div>
|
||
<div id="toc-btn">T<br>O<br>C</div>
|
||
</div>
|
||
|
||
<script text="javascript">
|
||
HighlightLisp.highlight_auto({className: null});
|
||
</script>
|
||
|
||
<script type="text/javascript">
|
||
function duckSearch() {
|
||
var searchField = document.getElementById("searchField");
|
||
if (searchField && searchField.value) {
|
||
var query = escape("site:lispcookbook.github.io/cl-cookbook/ " + searchField.value);
|
||
window.location.href = "https://duckduckgo.com/?kj=b2&kf=-1&ko=1&q=" + query;
|
||
// https://duckduckgo.com/params
|
||
// kj=b2: blue header in results page
|
||
// kf=-1: no favicons
|
||
}
|
||
}
|
||
</script>
|
||
|
||
<script async defer data-domain="lispcookbook.github.io/cl-cookbook" src="https://plausible.io/js/plausible.js"></script>
|
||
|
||
</body>
|
||
</html>
|