183 lines
8.8 KiB
HTML
183 lines
8.8 KiB
HTML
<!DOCTYPE html>
|
|
<html>
|
|
<!-- Created by GNU Texinfo 7.1, https://www.gnu.org/software/texinfo/ -->
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
<!-- This manual documents Guile version 3.0.10.
|
|
|
|
Copyright (C) 1996-1997, 2000-2005, 2009-2023 Free Software Foundation,
|
|
Inc.
|
|
|
|
Copyright (C) 2021 Maxime Devos
|
|
|
|
Copyright (C) 2024 Tomas Volf
|
|
|
|
|
|
Permission is granted to copy, distribute and/or modify this document
|
|
under the terms of the GNU Free Documentation License, Version 1.3 or
|
|
any later version published by the Free Software Foundation; with no
|
|
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
|
|
copy of the license is included in the section entitled "GNU Free
|
|
Documentation License." -->
|
|
<title>Types and the Web (Guile Reference Manual)</title>
|
|
|
|
<meta name="description" content="Types and the Web (Guile Reference Manual)">
|
|
<meta name="keywords" content="Types and the Web (Guile Reference Manual)">
|
|
<meta name="resource-type" content="document">
|
|
<meta name="distribution" content="global">
|
|
<meta name="Generator" content=".texi2any-real">
|
|
<meta name="viewport" content="width=device-width,initial-scale=1">
|
|
|
|
<link href="index.html" rel="start" title="Top">
|
|
<link href="Concept-Index.html" rel="index" title="Concept Index">
|
|
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
|
|
<link href="Web.html" rel="up" title="Web">
|
|
<link href="URIs.html" rel="next" title="URIs">
|
|
<style type="text/css">
|
|
<!--
|
|
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
|
|
div.example {margin-left: 3.2em}
|
|
span:hover a.copiable-link {visibility: visible}
|
|
-->
|
|
</style>
|
|
<link rel="stylesheet" type="text/css" href="https://www.gnu.org/software/gnulib/manual.css">
|
|
|
|
|
|
</head>
|
|
|
|
<body lang="en">
|
|
<div class="subsection-level-extent" id="Types-and-the-Web">
|
|
<div class="nav-panel">
|
|
<p>
|
|
Next: <a href="URIs.html" accesskey="n" rel="next">Universal Resource Identifiers</a>, Up: <a href="Web.html" accesskey="u" rel="up"><abbr class="acronym">HTTP</abbr>, the Web, and All That</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
|
|
</div>
|
|
<hr>
|
|
<h4 class="subsection" id="Types-and-the-Web-1"><span>7.3.1 Types and the Web<a class="copiable-link" href="#Types-and-the-Web-1"> ¶</a></span></h4>
|
|
|
|
<p>It is a truth universally acknowledged, that a program with good use of
|
|
data types, will be free from many common bugs. Unfortunately, the
|
|
common practice in web programming seems to ignore this maxim. This
|
|
subsection makes the case for expressive data types in web programming.
|
|
</p>
|
|
<p>By “expressive data types”, we mean that the data types <em class="emph">say</em>
|
|
something about how a program solves a problem. For example, if we
|
|
choose to represent dates using SRFI 19 date records (see <a class="pxref" href="SRFI_002d19.html">SRFI-19 - Time/Date Library</a>),
|
|
this indicates that there is a part of the program that will always have
|
|
valid dates. Error handling for a number of basic cases, like invalid
|
|
dates, occurs on the boundary in which we produce a SRFI 19 date record
|
|
from other types, like strings.
|
|
</p>
|
|
<p>With regards to the web, data types are helpful in the two broad phases
|
|
of HTTP messages: parsing and generation.
|
|
</p>
|
|
<p>Consider a server, which has to parse a request, and produce a response.
|
|
Guile will parse the request into an HTTP request object
|
|
(see <a class="pxref" href="Requests.html">HTTP Requests</a>), with each header parsed into an appropriate Scheme
|
|
data type. This transition from an incoming stream of characters to
|
|
typed data is a state change in a program—the strings might parse, or
|
|
they might not, and something has to happen if they do not. (Guile
|
|
throws an error in this case.) But after you have the parsed request,
|
|
“client” code (code built on top of the Guile web framework) will not
|
|
have to check for syntactic validity. The types already make this
|
|
information manifest.
|
|
</p>
|
|
<p>This state change on the parsing boundary makes programs more robust,
|
|
as they themselves are freed from the need to do a number of common
|
|
error checks, and they can use normal Scheme procedures to handle a
|
|
request instead of ad-hoc string parsers.
|
|
</p>
|
|
<p>The need for types on the response generation side (in a server) is more
|
|
subtle, though not less important. Consider the example of a POST
|
|
handler, which prints out the text that a user submits from a form.
|
|
Such a handler might include a procedure like this:
|
|
</p>
|
|
<div class="example">
|
|
<pre class="example-preformatted">;; First, a helper procedure
|
|
(define (para . contents)
|
|
(string-append "<p>" (string-concatenate contents) "</p>"))
|
|
|
|
;; Now the meat of our simple web application
|
|
(define (you-said text)
|
|
(para "You said: " text))
|
|
|
|
(display (you-said "Hi!"))
|
|
-| <p>You said: Hi!</p>
|
|
</pre></div>
|
|
|
|
<p>This is a perfectly valid implementation, provided that the incoming
|
|
text does not contain the special HTML characters ‘<samp class="samp"><</samp>’, ‘<samp class="samp">></samp>’, or
|
|
‘<samp class="samp">&</samp>’. But this provision of a restricted character set is not
|
|
reflected anywhere in the program itself: we must <em class="emph">assume</em> that the
|
|
programmer understands this, and performs the check elsewhere.
|
|
</p>
|
|
<p>Unfortunately, the short history of the practice of programming does not
|
|
bear out this assumption. A <em class="dfn">cross-site scripting</em> (<abbr class="acronym">XSS</abbr>)
|
|
vulnerability is just such a common error in which unfiltered user input
|
|
is allowed into the output. A user could submit a crafted comment to
|
|
your web site which results in visitors running malicious Javascript,
|
|
within the security context of your domain:
|
|
</p>
|
|
<div class="example">
|
|
<pre class="example-preformatted">(display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
|
|
-| <p>You said: <script src="http://bad.com/nasty.js" /></p>
|
|
</pre></div>
|
|
|
|
<p>The fundamental problem here is that both user data and the program
|
|
template are represented using strings. This identity means that types
|
|
can’t help the programmer to make a distinction between these two, so
|
|
they get confused.
|
|
</p>
|
|
<p>There are a number of possible solutions, but perhaps the best is to
|
|
treat HTML not as strings, but as native s-expressions: as SXML. The
|
|
basic idea is that HTML is either text, represented by a string, or an
|
|
element, represented as a tagged list. So ‘<samp class="samp">foo</samp>’ becomes
|
|
‘<samp class="samp">"foo"</samp>’, and ‘<samp class="samp"><b>foo</b></samp>’ becomes ‘<samp class="samp">(b "foo")</samp>’.
|
|
Attributes, if present, go in a tagged list headed by ‘<samp class="samp">@</samp>’, like
|
|
‘<samp class="samp">(img (@ (src "http://example.com/foo.png")))</samp>’. See <a class="xref" href="SXML.html">SXML</a>, for
|
|
more information.
|
|
</p>
|
|
<p>The good thing about SXML is that HTML elements cannot be confused with
|
|
text. Let’s make a new definition of <code class="code">para</code>:
|
|
</p>
|
|
<div class="example">
|
|
<pre class="example-preformatted">(define (para . contents)
|
|
`(p ,@contents))
|
|
|
|
(use-modules (sxml simple))
|
|
(sxml->xml (you-said "Hi!"))
|
|
-| <p>You said: Hi!</p>
|
|
|
|
(sxml->xml (you-said "<i>Rats, foiled again!</i>"))
|
|
-| <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
|
|
</pre></div>
|
|
|
|
<p>So we see in the second example that HTML elements cannot be unwittingly
|
|
introduced into the output. However it is now perfectly acceptable to
|
|
pass SXML to <code class="code">you-said</code>; in fact, that is the big advantage of SXML
|
|
over everything-as-a-string.
|
|
</p>
|
|
<div class="example">
|
|
<pre class="example-preformatted">(sxml->xml (you-said (you-said "<Hi!>")))
|
|
-| <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
|
|
</pre></div>
|
|
|
|
<p>The SXML types allow procedures to <em class="emph">compose</em>. The types make
|
|
manifest which parts are HTML elements, and which are text. So you
|
|
needn’t worry about escaping user input; the type transition back to a
|
|
string handles that for you. <abbr class="acronym">XSS</abbr> vulnerabilities are a thing
|
|
of the past.
|
|
</p>
|
|
<p>Well. That’s all very nice and opinionated and such, but how do I use
|
|
the thing? Read on!
|
|
</p>
|
|
</div>
|
|
<hr>
|
|
<div class="nav-panel">
|
|
<p>
|
|
Next: <a href="URIs.html">Universal Resource Identifiers</a>, Up: <a href="Web.html"><abbr class="acronym">HTTP</abbr>, the Web, and All That</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
|
|
</div>
|
|
|
|
|
|
|
|
</body>
|
|
</html>
|