<!DOCTYPE html>
<html lang="en">
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Linux version 5.2.0">
<title>Input/Output</title>
<meta charset="utf-8">
<meta name="description" content="A collection of examples of using Common Lisp">
<meta name="viewport" content=
"width=device-width, initial-scale=1">
<link rel="icon" href=
"assets/cl-logo-blue.png"/>
<link rel="stylesheet" href=
"assets/style.css">
<script type="text/javascript" src=
"assets/highlight-lisp.js">
</script>
<script type="text/javascript" src=
"assets/jquery-3.2.1.min.js">
</script>
<script type="text/javascript" src=
"assets/jquery.toc/jquery.toc.min.js">
</script>
<script type="text/javascript" src=
"assets/toggle-toc.js">
</script>
<link rel="stylesheet" href=
"assets/github.css">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
</head>
<body>
<h1 id="title-xs"><a href="index.html">The Common Lisp Cookbook</a> &ndash; Input/Output</h1>
<div id="logo-container">
<a href="index.html">
<img id="logo" src="assets/cl-logo-blue.png"/>
</a>
<div id="searchform-container">
<form onsubmit="duckSearch()" action="javascript:void(0)">
<input id="searchField" type="text" value="" placeholder="Search...">
</form>
</div>
<div id="toc-container" class="toc-close">
<div id="toc-title">Table of Contents</div>
<ul id="toc" class="list-unstyled"></ul>
</div>
</div>
<div id="content-container">
<h1 id="title-non-xs"><a href="index.html">The Common Lisp Cookbook</a> &ndash; Input/Output</h1>
<!-- Announcement we can keep for 1 month or more. I remove it and re-add it from time to time. -->
<!-- <p class="announce"> -->
<!-- 📢 🤶 ⭐ -->
<!-- <a style="font-size: 120%" href="https://www.udemy.com/course/common-lisp-programming/?couponCode=LISPY-XMAS2023" title="This course is under a paywall on the Udemy platform. Several videos are freely available so you can judge before diving in. vindarel is (I am) the main contributor to this Cookbook."> Discover our contributor's Lisp course with this Christmas coupon.</a> -->
<!-- <strong> -->
<!-- Recently added: 18 videos on MACROS. -->
<!-- </strong> -->
<!-- <a style="font-size: 90%" href="https://github.com/vindarel/common-lisp-course-in-videos/">Learn more</a>. -->
<!-- </p> -->
<p class="announce">
📢 New videos: <a href="https://www.youtube.com/watch?v=h_noB1sI_e8">web dev demo part 1</a>, <a href="https://www.youtube.com/watch?v=xnwc7irnc8k">dynamic page with HTMX</a>, <a href="https://www.youtube.com/watch?v=Zpn86AQRVN8">Weblocks demo</a>
</p>
<p class="announce-neutral">
📕 <a href="index.html#download-in-epub">Get the EPUB and PDF</a>
</p>
<div id="content">
<p><a name="redir"></a></p>
<h2 id="redirecting-the-standard-output-of-your-program">Redirecting the Standard Output of your Program</h2>
<p>You do it like this:</p>
<pre><code class="language-lisp">(let ((*standard-output* &lt;some form generating a stream&gt;))
  ...)
</code></pre>
<p>Because
<a href="http://www.lispworks.com/documentation/HyperSpec/Body/v_debug_.htm"><code>*STANDARD-OUTPUT*</code></a>
is a dynamic variable, all references to it during execution of the body of the
<code>LET</code> form refer to the stream that you bound it to. After exiting the <code>LET</code>
form, the old value of <code>*STANDARD-OUTPUT*</code> is restored, whether the exit
was by normal execution, a <code>RETURN-FROM</code> leaving the whole function, a
signaled error that unwinds the stack, or any other non-local exit. (This is, incidentally, why global variables lose
much of their brokenness in Common Lisp compared to other languages: since they
can be bound for the execution of a specific form without the risk of losing
their former value after the form has finished, their use is quite safe; they
act much like additional parameters that are passed to every function.)</p>
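<p>As a minimal sketch of this behaviour (the function names <code>greet</code>,
<code>capture-greeting</code> and <code>binding-restored-p</code> are made up for
illustration), the following rebinds <code>*STANDARD-OUTPUT*</code> to a string
stream and then checks that the old value is back after a non-local exit:</p>
<pre><code class="language-lisp">(defun greet ()
  ;; Writes to whatever *STANDARD-OUTPUT* is bound to at call time.
  (write-string "hello"))

(defun capture-greeting ()
  ;; Rebind *STANDARD-OUTPUT* to a string stream for the extent of the LET,
  ;; then return everything that was written to it.
  (let ((out (make-string-output-stream)))
    (let ((*standard-output* out))
      (greet))
    (get-output-stream-string out)))

(defun binding-restored-p ()
  ;; Leave the LET with THROW and check that the old binding is visible again.
  (let ((before *standard-output*))
    (catch 'bail
      (let ((*standard-output* (make-broadcast-stream))) ; discards all output
        (throw 'bail nil)))
    (eq before *standard-output*)))
</code></pre>
<p>Calling <code>(capture-greeting)</code> returns the text that <code>greet</code> would
otherwise have printed, although <code>greet</code> itself never mentions a stream.</p>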
<p>If the output of the program should go to a file, you can do the following:</p>
<pre><code class="language-lisp">(with-open-file (*standard-output* "somefile.dat"
                                   :direction :output
                                   :if-exists :supersede)
  ...)
</code></pre>
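<p>A concrete variant of this form might look like the sketch below. The helper
names <code>write-report</code> and <code>first-line</code>, and any file name you pass
in, are hypothetical:</p>
<pre><code class="language-lisp">(defun write-report (path)
  ;; Everything printed to *STANDARD-OUTPUT* inside the body goes to PATH.
  (with-open-file (*standard-output* path
                                     :direction :output
                                     :if-exists :supersede
                                     :if-does-not-exist :create)
    (format t "line one~%line two~%"))
  path)

(defun first-line (path)
  ;; Read the first line back to confirm where the output went.
  (with-open-file (in path :direction :input)
    (read-line in)))
</code></pre>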
<p><a href="http://www.lispworks.com/documentation/HyperSpec/Body/m_w_open.htm"><code>WITH-OPEN-FILE</code></a>
opens the file - creating it if necessary - binds <code>*STANDARD-OUTPUT*</code>, executes
its body, closes the file, and restores <code>*STANDARD-OUTPUT*</code> to its former
value. It doesn't get more comfortable than this!<a name="faith"></a></p>
<h2 id="faithful-output-with-character-streams">Faithful Output with Character Streams</h2>
<p>By <em>faithful output</em> I mean that characters with codes between 0 and 255 will be
written out as is. This means that I can <code>(PRINC (CODE-CHAR 0..255) s)</code> to a
stream and expect 8-bit bytes to be written out, which is not obvious in
times of Unicode and 16- or 32-bit character representations. It does <em>not</em>
require that the characters ä, ß, or þ must have their
<a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_char_c.htm"><code>CHAR-CODE</code></a>
in the range 0..255 - the implementation is free to use any code. But it does
require that no <code>#\Newline</code> to CRLF translation takes place, among other things.</p>
<p>Common Lisp has a long tradition of distinguishing character from byte (binary)
I/O,
e.g. <a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_by.htm"><code>READ-BYTE</code></a>
and
<a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_cha.htm"><code>READ-CHAR</code></a>
are in the standard. Some implementations let both functions be called
interchangeably. Others allow either one or the other. (The
<a href="https://www.cliki.net/simple-stream">simple stream proposal</a> defines the
notion of a <em>bivalent stream</em> where both are possible.)</p>
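<p>To illustrate the distinction, here is a small sketch that opens the same file
once as a character stream and once as a byte stream. The helper names are
hypothetical, and it assumes the default external format is ASCII-compatible:</p>
<pre><code class="language-lisp">(defun write-sample (path)
  ;; Write the single character A using the default external format.
  (with-open-file (out path :direction :output :if-exists :supersede)
    (write-char #\A out))
  path)

(defun first-char (path)
  ;; Character stream: READ-CHAR returns a character object.
  (with-open-file (in path :direction :input :element-type 'character)
    (read-char in)))

(defun first-byte (path)
  ;; Binary stream: READ-BYTE returns an integer.
  (with-open-file (in path :direction :input
                           :element-type '(unsigned-byte 8))
    (read-byte in)))
</code></pre>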
<p>Varying element types are useful because some protocols rely on the ability to send
8-bit output on a channel. For example, with HTTP the header is normally ASCII and
ought to use CRLF as line terminators, whereas the body can have the MIME type
application/octet-stream, where CRLF translation would destroy the data. (This
is how the Netscape browser on MS-Windows destroys data sent by incorrectly
configured web servers which declare unknown files as having MIME type
text/plain - the default in most Apache configurations.)</p>
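<p>One way to sidestep any translation is to build the whole message as octets
before writing it to a binary stream. The following is only a sketch: the
function names are invented for illustration, and it assumes that
<code>CHAR-CODE</code> agrees with ASCII for the header characters, which the
standard does not promise:</p>
<pre><code class="language-lisp">(defun ascii-octets (string)
  ;; Assumes CHAR-CODE agrees with ASCII for these characters, which holds
  ;; on the common implementations but is not guaranteed by the standard.
  (map '(vector (unsigned-byte 8)) #'char-code string))

(defun header-line (text)
  ;; End the line with an explicit CR LF pair instead of #\Newline,
  ;; so no platform-specific newline translation is involved.
  (concatenate 'string text (string (code-char 13)) (string (code-char 10))))

(defun response-octets (body)
  ;; BODY is a vector of octets; it is appended byte for byte.
  (let ((head (concatenate 'string
                           (header-line "HTTP/1.0 200 OK")
                           (header-line "Content-Type: application/octet-stream")
                           (header-line (format nil "Content-Length: ~D"
                                                (length body)))
                           (header-line ""))))
    (concatenate '(vector (unsigned-byte 8)) (ascii-octets head) body)))
</code></pre>
<p>The result can then be sent with <code>WRITE-SEQUENCE</code> on a stream whose
element type is <code>(UNSIGNED-BYTE 8)</code>.</p>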
<p>What follows is a list of implementation-dependent choices and behaviours, and some code to experiment with.</p>
<h3 id="sbcl">SBCL</h3>
<p>To load arbitrary bytes into a string, use the <code>:iso-8859-1</code> external format. For example:</p>
<pre><code class="language-lisp">(uiop:read-file-string "/path/to/file" :external-format :iso-8859-1)
</code></pre>
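<p>The same external format can be passed to <code>WITH-OPEN-FILE</code> directly.
Below is a round-trip sketch, assuming SBCL; the helper names
<code>write-octets</code> and <code>read-latin-1</code> are hypothetical. It writes raw
octets and reads them back as characters with the same codes:</p>
<pre><code class="language-lisp">(defun write-octets (path octets)
  ;; Write the raw octets through a binary stream.
  (with-open-file (out path :direction :output :if-exists :supersede
                            :element-type '(unsigned-byte 8))
    (write-sequence octets out))
  path)

(defun read-latin-1 (path)
  ;; In ISO-8859-1 every octet maps to exactly one character, so the
  ;; file length in bytes is also the number of characters to read.
  (with-open-file (in path :direction :input
                           :external-format :iso-8859-1)
    (let* ((s (make-string (file-length in)))
           (n (read-sequence s in)))
      (subseq s 0 n))))
</code></pre>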
<h3 id="clisp">CLISP</h3>
<p>On CLISP, faithful output is possible using</p>
<pre><code class="language-lisp">:external-format
(ext:make-encoding :charset 'charset:iso-8859-1
                   :line-terminator :unix)
</code></pre>
<p>You can also use <code>(SETF (STREAM-ELEMENT-TYPE F) '(UNSIGNED-BYTE 8))</code>, where the
ability to <code>SETF</code> is a CLISP-specific extension. Using <code>:EXTERNAL-FORMAT :UNIX</code>
will cause portability problems, since the default character set on MS-Windows
is <code>CHARSET:CP1252</code>. <code>CHARSET:CP1252</code> doesn't allow output of e.g. <code>(CODE-CHAR #x81)</code>:</p>
<pre><code>;*** - Character #\u0080 cannot be represented in the character set CHARSET:CP1252
</code></pre>
<p>Characters with code &gt; 127 cannot be represented in ASCII:</p>
<pre><code>;*** - Character #\u0080 cannot be represented in the character set CHARSET:ASCII
</code></pre>
<h3 id="allegrocl">AllegroCL</h3>
<p><code>#+(AND ALLEGRO UNIX) :DEFAULT</code> (untested) - seems enough on UNIX, but would not
work on the MS-Windows port of AllegroCL.</p>
<h3 id="lispworks">LispWorks</h3>
<p><code>:EXTERNAL-FORMAT '(:LATIN-1 :EOL-STYLE :LF)</code> (confirmed by Marc Battyani)</p>
<h3 id="example">Example</h3>
<p>Here's some sample code to play with:</p>
<pre><code class="language-lisp">(defvar *unicode-test-file* "faithtest-out.txt")

;; Write the characters with codes 0..255 to FILENAME and return the
;; resulting file position.
(defun generate-256 (&amp;key (filename *unicode-test-file*)
                          #+CLISP (charset 'charset:iso-8859-1)
                          external-format)
  (let ((e (or external-format
               #+CLISP (ext:make-encoding :charset charset
                                          :line-terminator :unix)
               ;; Outside CLISP, fall back to the implementation default.
               #-CLISP :default)))
    (describe e)
    (with-open-file (f filename :direction :output
                                :external-format e)
      (write-sequence
       (loop with s = (make-string 256)
             for i from 0 to 255
             do (setf (char s i) (code-char i))
             finally (return s))
       f)
      (file-position f))))

;(generate-256 :external-format :default)
;#+CLISP (generate-256 :external-format :unix)
;#+CLISP (generate-256 :external-format 'charset:ascii)
;(generate-256)

;; Read FILENAME back as bytes, report every position whose byte differs
;; from its index, and return the file length.
(defun check-256 (&amp;optional (filename *unicode-test-file*))
  (with-open-file (f filename :direction :input
                              :element-type '(unsigned-byte 8))
    (loop for i from 0
          for c = (read-byte f nil nil)
          while c
          unless (= c i)
            do (format t "~&amp;Position ~D found ~D(#x~X)." i c c)
          when (and (= i 33) (= c 32))
            do (let ((c (read-byte f)))
                 (format t "~&amp;Resync back 1 byte ~D(#x~X) - cause CRLF?." c c)))
    (file-length f)))

#| CLISP
(check-256 *unicode-test-file*)
(progn (generate-256 :external-format :unix) (check-256))
; uses UTF-8 -&gt; 385 bytes
(progn (generate-256 :charset 'charset:iso-8859-1) (check-256))
(progn (generate-256 :external-format :default) (check-256))
; uses UTF-8 + CRLF(on MS-Windows) -&gt; 387 bytes
(progn (generate-256 :external-format
                     (ext:make-encoding :charset 'charset:iso-8859-1 :line-terminator :mac))
       (check-256))
(progn (generate-256 :external-format
                     (ext:make-encoding :charset 'charset:iso-8859-1 :line-terminator :dos))
       (check-256))
|#
</code></pre>
<p><a name="bulk"></a></p>
<h2 id="fast-bulk-io">Fast Bulk I/O</h2>
<p>If you need to copy a lot of data and the source and destination are both
streams (of the same
<a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_e.htm#element_type">element type</a>),
it's very fast to use
<a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_rd_seq.htm"><code>READ-SEQUENCE</code></a>
and
<a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_wr_seq.htm"><code>WRITE-SEQUENCE</code></a>:</p>
<pre><code class="language-lisp">(let ((buf (make-array 4096 :element-type (stream-element-type input-stream))))
  (loop for pos = (read-sequence buf input-stream)
        while (plusp pos)
        do (write-sequence buf output-stream :end pos)))
</code></pre>
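<p>Wrapped into a complete function, the loop above could look like the following
sketch. The name <code>copy-file-octets</code> is hypothetical, and both files are
opened as byte streams so that the element types match:</p>
<pre><code class="language-lisp">(defun copy-file-octets (from to)
  ;; Copy FROM to TO in 4096-byte chunks and return the number of bytes copied.
  (with-open-file (in from :direction :input
                           :element-type '(unsigned-byte 8))
    (with-open-file (out to :direction :output
                            :if-exists :supersede
                            :element-type '(unsigned-byte 8))
      (let ((buf (make-array 4096 :element-type '(unsigned-byte 8)))
            (total 0))
        (loop for pos = (read-sequence buf in)
              while (plusp pos)
              ;; Only the first POS elements of BUF are valid in this round.
              do (write-sequence buf out :end pos)
                 (incf total pos))
        total))))
</code></pre>
<p>The <code>:end pos</code> argument matters for the last chunk, which is usually
shorter than the buffer.</p>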
<p class="page-source">
Page source: <a href="https://github.com/LispCookbook/cl-cookbook/blob/master/io.md">io.md</a>
</p>
</div>
<script type="text/javascript">
// Don't write the TOC on the index.
if (window.location.pathname != "/cl-cookbook/") {
$("#toc").toc({
content: "#content", // will ignore the first h1 with the site+page title.
headings: "h1,h2,h3,h4"});
}
$("#two-cols + ul").css({
"column-count": "2",
});
$("#contributors + ul").css({
"column-count": "4",
});
</script>
<div>
<footer class="footer">
<hr/>
&copy; 2002&ndash;2023 the Common Lisp Cookbook Project
<div>
📹 Discover <a style="color: darkgrey; text-decoration: underline" href="https://www.udemy.com/course/common-lisp-programming/?referralCode=2F3D698BBC4326F94358">our contributor's Common Lisp video course on Udemy</a>
</div>
</footer>
</div>
<div id="toc-btn">T<br>O<br>C</div>
</div>
<script type="text/javascript">
HighlightLisp.highlight_auto({className: null});
</script>
<script type="text/javascript">
function duckSearch() {
var searchField = document.getElementById("searchField");
if (searchField && searchField.value) {
var query = encodeURIComponent("site:lispcookbook.github.io/cl-cookbook/ " + searchField.value);
window.location.href = "https://duckduckgo.com/?kj=b2&kf=-1&ko=1&q=" + query;
// https://duckduckgo.com/params
// kj=b2: blue header in results page
// kf=-1: no favicons
}
}
</script>
<script async defer data-domain="lispcookbook.github.io/cl-cookbook" src="https://plausible.io/js/plausible.js"></script>
</body>
</html>