<!DOCTYPE html>
< html lang = "en" >
< head >
< meta name = "generator" content =
"HTML Tidy for HTML5 for Linux version 5.2.0">
< title > Regular Expressions< / title >
< meta charset = "utf-8" >
< meta name = "description" content = "A collection of examples of using Common Lisp" >
< meta name = "viewport" content =
"width=device-width, initial-scale=1">
< link rel = "icon" href =
"assets/cl-logo-blue.png"/>
< link rel = "stylesheet" href =
"assets/style.css">
< script type = "text/javascript" src =
"assets/highlight-lisp.js">
< / script >
< script type = "text/javascript" src =
"assets/jquery-3.2.1.min.js">
< / script >
< script type = "text/javascript" src =
"assets/jquery.toc/jquery.toc.min.js">
< / script >
< script type = "text/javascript" src =
"assets/toggle-toc.js">
< / script >
< link rel = "stylesheet" href =
"assets/github.css">
< link rel = "stylesheet" href = "https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity = "sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin = "anonymous" >
< / head >
< body >
< h1 id = "title-xs" > < a href = "index.html" > The Common Lisp Cookbook< / a > – Regular Expressions< / h1 >
< div id = "logo-container" >
< a href = "index.html" >
< img id = "logo" src = "assets/cl-logo-blue.png" / >
< / a >
< div id = "searchform-container" >
< form onsubmit = "duckSearch()" action = "javascript:void(0)" >
< input id = "searchField" type = "text" value = "" placeholder = "Search..." >
< / form >
< / div >
< div id = "toc-container" class = "toc-close" >
< div id = "toc-title" > Table of Contents< / div >
< ul id = "toc" class = "list-unstyled" > < / ul >
< / div >
< / div >
< div id = "content-container" >
< h1 id = "title-non-xs" > < a href = "index.html" > The Common Lisp Cookbook< / a > – Regular Expressions< / h1 >
< p class = "announce-neutral" >
📕 < a href = "index.html#download-in-epub" > Get the EPUB and PDF< / a >
< / p >
< div id = "content" >
< p > The < a href = "http://www.lispworks.com/documentation/HyperSpec/index.html" > ANSI Common Lisp
standard< / a >
does not include facilities for regular expressions, but a couple of
libraries exist for this task, for instance:
< a href = "https://github.com/edicl/cl-ppcre" > cl-ppcre< / a > .< / p >
< p > See also the respective < a href = "http://www.cliki.net/Regular%20Expression" > Cliki:
regexp< / a > page for more
links.< / p >
< p > Note that some CL implementations include regexp facilities, notably
< a href = "http://clisp.sourceforge.net/impnotes.html#regexp" > CLISP< / a > and
< a href = "https://franz.com/support/documentation/current/doc/regexp.htm" > Allegro
CL< / a > . If
in doubt, check your manual or ask your vendor.< / p >
< p > The description provided below is far from complete, so don’t forget
to check the reference manual that comes along with the CL-PPCRE
library.< / p >
< h2 id = "ppcre" > PPCRE< / h2 >
< h3 id = "using-ppcre" > Using PPCRE< / h3 >
< p > < a href = "https://github.com/edicl/cl-ppcre" > CL-PPCRE< / a > (abbreviation for
Portable Perl-compatible regular expressions) is a portable regular
expression library for Common Lisp with a broad set of features and
good performance. It has been ported to a number of Common Lisp
implementations and can be easily installed (or added as a dependency)
via Quicklisp:< / p >
< pre > < code class = "language-lisp" > (ql:quickload "cl-ppcre")
< / code > < / pre >
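< p > To add CL-PPCRE as a dependency of your own project, list it in the
< code > :depends-on< / code > clause of the ASDF system definition. The following is a
minimal sketch: the system name < code > my-app< / code > and the file
< code > main.lisp< / code > are hypothetical placeholders.< / p >
< pre > < code class = "language-lisp" > ;; my-app.asd (hypothetical system name and file layout)
(asdf:defsystem "my-app"
  :depends-on ("cl-ppcre")          ; loaded before the components below
  :components ((:file "main")))     ; main.lisp may then use the ppcre: package
< / code > < / pre >
< p > With such a definition in a place where Quicklisp can find it,
< code > (ql:quickload "my-app")< / code > fetches and loads CL-PPCRE as well.< / p >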
< p > Basic operations with the CL-PPCRE library functions are described
below.< / p >
< h3 id = "looking-for-matching-patterns" > Looking for matching patterns< / h3 >
< p > The < code > scan< / code > function tries to match the given pattern against a target string. On success
it returns four values: the start of the match, the end
of the match, and two arrays holding the start and end positions of the
register matches. On failure it returns < code > NIL< / code > .< / p >
< p > A regular expression can be compiled ahead of time with the < code > create-scanner< / code >
function. It returns a “scanner” that the other CL-PPCRE functions accept
in place of a pattern string.< / p >
< p > For example:< / p >
< pre > < code class = "language-lisp" > (let ((ptrn (ppcre:create-scanner "(a)*b")))
  (ppcre:scan ptrn "xaaabd"))
< / code > < / pre >
< p > will yield the same results as:< / p >
< pre > < code class = "language-lisp" > (ppcre:scan "(a)*b" "xaaabd")
< / code > < / pre >
< p > but will require less time for repeated < code > scan< / code > calls as parsing the
expression and compiling it is done only once.< / p >
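< p > The four values returned by < code > scan< / code > can be captured with
< code > multiple-value-bind< / code > . The sketch below reuses the pattern from the
example above; the variable names are chosen for illustration only.< / p >
< pre > < code class = "language-lisp" > (multiple-value-bind (start end reg-starts reg-ends)
    (ppcre:scan "(a)*b" "xaaabd")
  (list start end reg-starts reg-ends))
;; => (1 5 #(3) #(4))

;; When nothing matches, the single value NIL is returned.
(ppcre:scan "z" "xaaabd")
;; => NIL
< / code > < / pre >
< p > Here the match < code > "aaab"< / code > spans positions 1 to 5, and the register
< code > (a)< / code > last matched the character at position 3.< / p >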
< h3 id = "replacing-text" > Replacing text< / h3 >
< pre > < code class = "language-lisp" > ;; Replace the first match of the pattern "a" in "abc" with "A".
(ppcre:regex-replace "a" "abc" "A") ;; => "Abc"
;; or, with a precompiled scanner:
(let ((pat (ppcre:create-scanner "a")))
  (ppcre:regex-replace pat "abc" "A"))
< / code > < / pre >
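< p > < code > regex-replace< / code > only replaces the first match. As a short sketch of two
related features, < code > regex-replace-all< / code > replaces every match, and the
replacement string may refer to registers with < code > \1< / code > , < code > \2< / code > and so on
(written with doubled backslashes inside a Lisp string). The sample strings are
made up for illustration.< / p >
< pre > < code class = "language-lisp" > ;; Replace every occurrence, not just the first one.
(ppcre:regex-replace-all "a" "banana" "A")
;; => "bAnAnA"

;; \\1 and \\2 stand for the first and second register of the match.
(ppcre:regex-replace "(\\w+) (\\w+)" "hello world" "\\2 \\1")
;; => "world hello"
< / code > < / pre >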
< h3 id = "extracting-information" > Extracting information< / h3 >
< p > CL-PPCRE provides several ways to extract matching fragments, among
them the < code > scan-to-strings< / code > function and the < code > register-groups-bind< / code > macro.< / p >
< p > The < code > scan-to-strings< / code > function is similar to < code > scan< / code > but returns
substrings of target-string instead of positions. This function
returns two values on success: the whole match as a string plus an
array of substrings (or NILs) corresponding to the matched registers.< / p >
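< p > As a brief illustration of the two return values, the calls below use small
sample patterns with zero, one and two registers:< / p >
< pre > < code class = "language-lisp" > ;; No registers: the second value is an empty array.
(ppcre:scan-to-strings "[^b]*b" "aaabd")
;; => "aaab", #()

;; One register: it holds the last substring the group matched.
(ppcre:scan-to-strings "([^b])*b" "aaabd")
;; => "aaab", #("a")

;; Two nested registers.
(ppcre:scan-to-strings "(([^b])*)b" "aaabd")
;; => "aaab", #("aaa" "a")
< / code > < / pre >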
< p > The < code > register-groups-bind< / code > macro tries to match the given pattern
against the target string and, on success, binds the given variables to
the matching fragments before evaluating its body.< / p >
< pre > < code class = "language-lisp" > (ppcre:register-groups-bind (first second third fourth)
    ("((a)|(b)|(c))+" "abababc" :sharedp t)
  (list first second third fourth))
;; => ("c" "a" "b" "c")
< / code > < / pre >
< p > CL-PPCRE also provides a shortcut for calling a function before
assigning the matching fragment to the variable:< / p >
< pre > < code class = "language-lisp" > (ppcre:register-groups-bind (fname lname (#'parse-integer date month year))
    ("(\\w+)\\s+(\\w+)\\s+(\\d{1,2})\\.(\\d{1,2})\\.(\\d{4})" "Frank Zappa 21.12.1940")
  (list fname lname (encode-universal-time 0 0 0 date month year 0)))
;; => ("Frank" "Zappa" 1292889600)
< / code > < / pre >
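< p > When every match in a string is wanted rather than the registers of a single
match, CL-PPCRE also offers < code > all-matches-as-strings< / code > and < code > split< / code > . The
following is a short sketch with made-up sample strings:< / p >
< pre > < code class = "language-lisp" > ;; Collect every substring that matches the pattern.
(ppcre:all-matches-as-strings "\\d+" "a1b22c333")
;; => ("1" "22" "333")

;; Split the target string wherever the pattern matches.
(ppcre:split "\\s+" "foo   bar baz")
;; => ("foo" "bar" "baz")
< / code > < / pre >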
< h3 id = "syntactic-sugar" > Syntactic sugar< / h3 >
< p > It might be more convenient to use CL-PPCRE with the
< a href = "https://github.com/edicl/cl-interpol" > CL-INTERPOL< / a >
library. CL-INTERPOL is a library for Common Lisp which modifies the
reader in a way that introduces interpolation within strings similar
to Perl, Scala, or Unix Shell scripts.< / p >
< p > In addition to loading the CL-INTERPOL library, an initialization call
must be made to configure the Lisp reader. This is
done by calling < code > enable-interpol-syntax< / code > either
from the REPL or in the source file, before any
of its features are used:< / p >
< pre > < code class = "language-lisp" > (interpol:enable-interpol-syntax)
< / code > < / pre >
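< p > Once the syntax is enabled, a pattern can be written with the
< code > #?/.../< / code > reader syntax instead of an ordinary string. The sketch below
assumes that both CL-PPCRE and CL-INTERPOL are loaded and that, in this regex
mode, backslash sequences such as < code > \d< / code > are passed through to CL-PPCRE
unchanged; check the CL-INTERPOL manual for the exact rules. The sample string is
made up.< / p >
< pre > < code class = "language-lisp" > (interpol:enable-interpol-syntax)

;; In an ordinary Lisp string every backslash has to be doubled.
(ppcre:scan-to-strings "(\\d+)-(\\d+)" "pages 10-20")

;; With the #?/.../ syntax the same pattern is written as in Perl.
;; (Assumption: regex mode leaves \d intact for the regex engine.)
(ppcre:scan-to-strings #?/(\d+)-(\d+)/ "pages 10-20")
< / code > < / pre >
< p > Under that assumption both calls are given the same pattern and return the
same values.< / p >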
< p class = "page-source" >
Page source: < a href = "https://github.com/LispCookbook/cl-cookbook/blob/master/regexp.md" > regexp.md< / a >
< / p >
< / div >
< script type = "text/javascript" >
// Don't write the TOC on the index.
if (window.location.pathname != "/cl-cookbook/") {
$("#toc").toc({
content: "#content", // will ignore the first h1 with the site+page title.
headings: "h1,h2,h3,h4"});
}
$("#two-cols + ul").css({
"column-count": "2",
});
$("#contributors + ul").css({
"column-count": "4",
});
< / script >
< div >
< footer class = "footer" >
< hr / >
© 2002–2021 the Common Lisp Cookbook Project
< / footer >
< / div >
< div id = "toc-btn" > T< br > O< br > C< / div >
< / div >
< script type = "text/javascript" >
HighlightLisp.highlight_auto({className: null});
< / script >
< script type = "text/javascript" >
function duckSearch() {
var searchField = document.getElementById("searchField");
if (searchField && searchField.value) {
var query = escape("site:lispcookbook.github.io/cl-cookbook/ " + searchField.value);
window.location.href = "https://duckduckgo.com/?kj=b2&kf=-1&ko=1&q=" + query;
// https://duckduckgo.com/params
// kj=b2: blue header in results page
// kf=-1: no favicons
}
}
< / script >
< script async defer data-domain = "lispcookbook.github.io/cl-cookbook" src = "https://plausible.io/js/plausible.js" > < / script >
< / body >
< / html >