emacs.d/clones/ruslanspivak.com/lsbasi-part3/index.html

683 lines
50 KiB
HTML
Raw Normal View History

2022-10-07 19:32:11 +02:00
<!DOCTYPE html>
<html lang="en"
xmlns:og="http://ogp.me/ns#"
xmlns:fb="https://www.facebook.com/2008/fbml">
<head>
<title>Lets Build A Simple Interpreter. Part 3. - Ruslan's Blog</title>
<!-- Using the latest rendering mode for IE -->
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="canonical" href="index.html">
<meta name="author" content="Ruslan Spivak" />
<meta name="description" content="I woke up this morning and I thought to myself: “Why do we find it so difficult to learn a new skill?” I dont think its just because of the hard work. I think that one of the reasons might be that we spend a lot of time …" />
<meta property="og:site_name" content="Ruslan's Blog" />
<meta property="og:type" content="article"/>
<meta property="og:title" content="Lets Build A Simple Interpreter. Part 3."/>
<meta property="og:url" content="https://ruslanspivak.com/lsbasi-part3/"/>
<meta property="og:description" content="I woke up this morning and I thought to myself: “Why do we find it so difficult to learn a new skill?” I dont think its just because of the hard work. I think that one of the reasons might be that we spend a lot of time …"/>
<meta property="article:published_time" content="2015-08-12" />
<meta property="article:section" content="blog" />
<meta property="article:author" content="Ruslan Spivak" />
<meta name="twitter:card" content="summary">
<meta name="twitter:domain" content="https://ruslanspivak.com">
<!-- Bootstrap -->
<link rel="stylesheet" href="../theme/css/bootstrap.min.css" type="text/css"/>
<link href="../theme/css/font-awesome.min.css" rel="stylesheet">
<link href="../theme/css/pygments/tango.css" rel="stylesheet">
<link href="../theme/css/typogrify.css" rel="stylesheet">
<link rel="stylesheet" href="../theme/css/style.css" type="text/css"/>
<link href="../static/custom.css" rel="stylesheet">
<link href="../feeds/all.atom.xml" type="application/atom+xml" rel="alternate"
title="Ruslan's Blog ATOM Feed"/>
</head>
<body>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-ex1-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="../index.html" class="navbar-brand">
Ruslan's Blog </a>
</div>
<div class="collapse navbar-collapse navbar-ex1-collapse">
<ul class="nav navbar-nav">
</ul>
<ul class="nav navbar-nav navbar-right">
<li><a href="../pages/about.html"><i class="fa fa-question"></i><span class="icon-label">About</span></a></li>
<li><a href="../archives.html"><i class="fa fa-th-list"></i><span class="icon-label">Archives</span></a></li>
</ul>
</div>
<!-- /.navbar-collapse -->
</div>
</div> <!-- /.navbar -->
<!-- Banner -->
<!-- End Banner -->
<div class="container">
<div class="row">
<div class="col-sm-9">
<section id="content">
<article>
<header class="page-header">
<h1>
<a href="index.html"
rel="bookmark"
title="Permalink to Lets Build A Simple Interpreter. Part 3.">
Let&#8217;s Build A Simple Interpreter. Part&nbsp;3.
</a>
</h1>
</header>
<div class="entry-content">
<div class="panel">
<div class="panel-body">
<footer class="post-info">
<span class="label label-default">Date</span>
<span class="published">
<i class="fa fa-calendar"></i><time datetime="2015-08-12T07:00:00-04:00"> Wed, August 12, 2015</time>
</span>
</footer><!-- /.post-info --> </div>
</div>
<p>I woke up this morning and I thought to myself: &#8220;Why do we find it so difficult to learn a new&nbsp;skill?&#8221;</p>
<p>I dont think its just because of the hard work. I think that one of the reasons might be that we spend a lot of time and hard work acquiring knowledge by reading and watching and not enough time translating that knowledge into a skill by practicing it. Take swimming, for example. You can spend a lot of time reading hundreds of books about swimming, talk for hours with experienced swimmers and coaches, watch all the training videos available, and you still will sink like a rock the first time you jump in the&nbsp;pool.</p>
<p>The bottom line is: it doesnt matter how well you think you know the subject - you have to put that knowledge into practice to turn it into a skill. To help you with the practice part I put exercises into <a href="../lsbasi-part1/index.html" title="Part 1">Part 1</a> and <a href="../lsbasi-part2/index.html" title="Part 2">Part 2</a> of the series. And yes, you will see more exercises in todays article and in future articles, I promise&nbsp;:)</p>
<p>Okay, lets get started with todays material, shall&nbsp;we?</p>
<p><br/>
So far, youve learned how to interpret arithmetic expressions that add or subtract two integers like &#8220;7 + 3&#8221; or &#8220;12 - 9&#8221;. Today Im going to talk about how to parse (recognize) and interpret arithmetic expressions that have any number of plus or minus operators in it, for example &#8220;7 - 3 + 2 -&nbsp;1&#8221;.</p>
<p>Graphically, the arithmetic expressions in this article can be represented with the following syntax&nbsp;diagram:</p>
<p><img alt="" src="lsbasi_part3_syntax_diagram.png" width="640"></p>
<p>What is a syntax diagram? A <strong>syntax diagram</strong> is a graphical representation of a programming languages syntax rules. Basically, a syntax diagram visually shows you which statements are allowed in your programming language and which are&nbsp;not.</p>
<p>Syntax diagrams are pretty easy to read: just follow the paths indicated by the arrows. Some paths indicate choices. And some paths indicate&nbsp;loops.</p>
<p>You can read the above syntax diagram as following: a term optionally followed by a plus or minus sign, followed by another term, which in turn is optionally followed by a plus or minus sign followed by another term and so on. You get the picture, literally. You might wonder what a <em>&#8220;term&#8221;</em> is. For the purpose of this article a <em>&#8220;term&#8221;</em> is just an&nbsp;integer.</p>
<p>Syntax diagrams serve two main&nbsp;purposes:</p>
<ul>
<li>They graphically represent the specification (grammar) of a programming&nbsp;language.</li>
<li>They can be used to help you write your parser - you can map a diagram to code by following simple&nbsp;rules.</li>
</ul>
<p>Youve learned that the process of recognizing a phrase in the stream of tokens is called <strong>parsing</strong>. And the part of an interpreter or compiler that performs that job is called a <strong>parser</strong>. Parsing is also called <strong>syntax analysis</strong>, and the parser is also aptly called, you guessed it right, a <strong>syntax analyzer</strong>.</p>
<p>According to the syntax diagram above, all of the following arithmetic expressions are&nbsp;valid:</p>
<ul>
<li>3</li>
<li>3 +&nbsp;4</li>
<li>7 - 3 + 2 -&nbsp;1</li>
</ul>
<p>Because syntax rules for arithmetic expressions in different programming languages are very similar we can use a Python shell to &#8220;test&#8221; our syntax diagram. Launch your Python shell and see for&nbsp;yourself:</p>
<div class="highlight"><pre><span></span>&gt;&gt;&gt; <span class="m">3</span>
<span class="m">3</span>
&gt;&gt;&gt; <span class="m">3</span> + <span class="m">4</span>
<span class="m">7</span>
&gt;&gt;&gt; <span class="m">7</span> - <span class="m">3</span> + <span class="m">2</span> - <span class="m">1</span>
<span class="m">5</span>
</pre></div>
<p>No surprises&nbsp;here.</p>
<p>The expression &#8220;3 + &#8221; is not a valid arithmetic expression though because according to the syntax diagram the plus sign must be followed by a <em>term</em> (integer), otherwise its a syntax error. Again, try it with a Python shell and see for&nbsp;yourself:</p>
<div class="highlight"><pre><span></span>&gt;&gt;&gt; <span class="m">3</span> +
File <span class="s2">&quot;&lt;stdin&gt;&quot;</span>, line <span class="m">1</span>
<span class="m">3</span> +
^
SyntaxError: invalid syntax
</pre></div>
<p>Its great to be able to use a Python shell to do some testing but lets map the above syntax diagram to code and use our own interpreter for testing, all&nbsp;right?</p>
<p>You know from the previous articles (<a href="../lsbasi-part1/index.html" title="Part 1">Part 1</a> and <a href="../lsbasi-part2/index.html" title="Part 2">Part 2</a>) that the <em>expr</em> method is where both our parser and interpreter live. Again, the parser just recognizes the structure making sure that it corresponds to some specifications and the interpreter actually evaluates the expression once the parser has successfully recognized (parsed)&nbsp;it.</p>
<p>The following code snippet shows the parser code corresponding to the diagram. The rectangular box from the syntax diagram (<em>term</em>) becomes a <em>term</em> method that parses an integer and the <em>expr</em> method just follows the syntax diagram&nbsp;flow:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">term</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">INTEGER</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">expr</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="c1"># set current token to the first token taken from the input</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_next_token</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="ow">in</span> <span class="p">(</span><span class="n">PLUS</span><span class="p">,</span> <span class="n">MINUS</span><span class="p">):</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="k">if</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">PLUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">PLUS</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">MINUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">MINUS</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
</pre></div>
<p>You can see that <em>expr</em> first calls the <em>term</em> method. Then the <em>expr</em> method has a <em>while</em> loop which can execute zero or more times. And inside the loop the parser makes a choice based on the token (whether its a plus or minus sign). Spend some time proving to yourself that the code above does indeed follow the syntax diagram flow for arithmetic&nbsp;expressions.</p>
<p>The parser itself does not interpret anything though: if it recognizes an expression its silent and if it doesnt, it throws out a syntax error. Lets modify the <em>expr</em> method and add the interpreter&nbsp;code:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">term</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Return an INTEGER token value&quot;&quot;&quot;</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">INTEGER</span><span class="p">)</span>
<span class="k">return</span> <span class="n">token</span><span class="o">.</span><span class="n">value</span>
<span class="k">def</span> <span class="nf">expr</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Parser / Interpreter &quot;&quot;&quot;</span>
<span class="c1"># set current token to the first token taken from the input</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_next_token</span><span class="p">()</span>
<span class="n">result</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="ow">in</span> <span class="p">(</span><span class="n">PLUS</span><span class="p">,</span> <span class="n">MINUS</span><span class="p">):</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="k">if</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">PLUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">PLUS</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">MINUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">MINUS</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">return</span> <span class="n">result</span>
</pre></div>
<p>Because the interpreter needs to evaluate an expression the <em>term</em> method was modified to return an integer value and the <em>expr</em> method was modified to perform addition and subtraction at the appropriate places and return the result of interpretation. Even though the code is pretty straightforward I recommend spending some time studying&nbsp;it.</p>
<p>Les get moving and see the complete code of the interpreter now,&nbsp;okay?</p>
<p>Here is the source code for your new version of the calculator that can handle valid arithmetic expressions containing integers and any number of addition and subtraction&nbsp;operators:</p>
<div class="highlight"><pre><span></span><span class="c1"># Token types</span>
<span class="c1">#</span>
<span class="c1"># EOF (end-of-file) token is used to indicate that</span>
<span class="c1"># there is no more input left for lexical analysis</span>
<span class="n">INTEGER</span><span class="p">,</span> <span class="n">PLUS</span><span class="p">,</span> <span class="n">MINUS</span><span class="p">,</span> <span class="n">EOF</span> <span class="o">=</span> <span class="s1">&#39;INTEGER&#39;</span><span class="p">,</span> <span class="s1">&#39;PLUS&#39;</span><span class="p">,</span> <span class="s1">&#39;MINUS&#39;</span><span class="p">,</span> <span class="s1">&#39;EOF&#39;</span>
<span class="k">class</span> <span class="nc">Token</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="nb">type</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span>
<span class="c1"># token type: INTEGER, PLUS, MINUS, or EOF</span>
<span class="bp">self</span><span class="o">.</span><span class="n">type</span> <span class="o">=</span> <span class="nb">type</span>
<span class="c1"># token value: non-negative integer value, &#39;+&#39;, &#39;-&#39;, or None</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="k">def</span> <span class="fm">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;String representation of the class instance.</span>
<span class="sd"> Examples:</span>
<span class="sd"> Token(INTEGER, 3)</span>
<span class="sd"> Token(PLUS, &#39;+&#39;)</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">return</span> <span class="s1">&#39;Token({type}, {value})&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
<span class="nb">type</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">type</span><span class="p">,</span>
<span class="n">value</span><span class="o">=</span><span class="nb">repr</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="p">)</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="fm">__str__</span><span class="p">()</span>
<span class="k">class</span> <span class="nc">Interpreter</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">text</span><span class="p">):</span>
<span class="c1"># client string input, e.g. &quot;3 + 5&quot;, &quot;12 - 5 + 3&quot;, etc</span>
<span class="bp">self</span><span class="o">.</span><span class="n">text</span> <span class="o">=</span> <span class="n">text</span>
<span class="c1"># self.pos is an index into self.text</span>
<span class="bp">self</span><span class="o">.</span><span class="n">pos</span> <span class="o">=</span> <span class="mi">0</span>
<span class="c1"># current token instance</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">None</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">text</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">pos</span><span class="p">]</span>
<span class="c1">##########################################################</span>
<span class="c1"># Lexer code #</span>
<span class="c1">##########################################################</span>
<span class="k">def</span> <span class="nf">error</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">Exception</span><span class="p">(</span><span class="s1">&#39;Invalid syntax&#39;</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">advance</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Advance the `pos` pointer and set the `current_char` variable.&quot;&quot;&quot;</span>
<span class="bp">self</span><span class="o">.</span><span class="n">pos</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">pos</span> <span class="o">&gt;</span> <span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">text</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">=</span> <span class="bp">None</span> <span class="c1"># Indicates end of input</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">text</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">pos</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">skip_whitespace</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isspace</span><span class="p">():</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">integer</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Return a (multidigit) integer consumed from the input.&quot;&quot;&quot;</span>
<span class="n">result</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isdigit</span><span class="p">():</span>
<span class="n">result</span> <span class="o">+=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">get_next_token</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Lexical analyzer (also known as scanner or tokenizer)</span>
<span class="sd"> This method is responsible for breaking a sentence</span>
<span class="sd"> apart into tokens. One token at a time.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isspace</span><span class="p">():</span>
<span class="bp">self</span><span class="o">.</span><span class="n">skip_whitespace</span><span class="p">()</span>
<span class="k">continue</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isdigit</span><span class="p">():</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">INTEGER</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">integer</span><span class="p">())</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">==</span> <span class="s1">&#39;+&#39;</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">PLUS</span><span class="p">,</span> <span class="s1">&#39;+&#39;</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">==</span> <span class="s1">&#39;-&#39;</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">MINUS</span><span class="p">,</span> <span class="s1">&#39;-&#39;</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">error</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">EOF</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
<span class="c1">##########################################################</span>
<span class="c1"># Parser / Interpreter code #</span>
<span class="c1">##########################################################</span>
<span class="k">def</span> <span class="nf">eat</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">token_type</span><span class="p">):</span>
<span class="c1"># compare the current token type with the passed token</span>
<span class="c1"># type and if they match then &quot;eat&quot; the current token</span>
<span class="c1"># and assign the next token to the self.current_token,</span>
<span class="c1"># otherwise raise an exception.</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">token_type</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_next_token</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">error</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">term</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Return an INTEGER token value.&quot;&quot;&quot;</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">INTEGER</span><span class="p">)</span>
<span class="k">return</span> <span class="n">token</span><span class="o">.</span><span class="n">value</span>
<span class="k">def</span> <span class="nf">expr</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Arithmetic expression parser / interpreter.&quot;&quot;&quot;</span>
<span class="c1"># set current token to the first token taken from the input</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_next_token</span><span class="p">()</span>
<span class="n">result</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="ow">in</span> <span class="p">(</span><span class="n">PLUS</span><span class="p">,</span> <span class="n">MINUS</span><span class="p">):</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="k">if</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">PLUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">PLUS</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">MINUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">MINUS</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">return</span> <span class="n">result</span>
<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
<span class="k">try</span><span class="p">:</span>
<span class="c1"># To run under Python3 replace &#39;raw_input&#39; call</span>
<span class="c1"># with &#39;input&#39;</span>
<span class="n">text</span> <span class="o">=</span> <span class="nb">raw_input</span><span class="p">(</span><span class="s1">&#39;calc&gt; &#39;</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">EOFError</span><span class="p">:</span>
<span class="k">break</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">text</span><span class="p">:</span>
<span class="k">continue</span>
<span class="n">interpreter</span> <span class="o">=</span> <span class="n">Interpreter</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">interpreter</span><span class="o">.</span><span class="n">expr</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">&#39;__main__&#39;</span><span class="p">:</span>
<span class="n">main</span><span class="p">()</span>
</pre></div>
<p>Save the above code into the <em>calc3.py</em> file or download it directly from <a href="https://github.com/rspivak/lsbasi/blob/master/part3/calc3.py">GitHub</a>. Try it out. See for yourself that it can handle arithmetic expressions that you can derive from the syntax diagram I showed you&nbsp;earlier.</p>
<p>Here is a sample session that I ran on my&nbsp;laptop:</p>
<div class="highlight"><pre><span></span>$ python calc3.py
calc&gt; <span class="m">3</span>
<span class="m">3</span>
calc&gt; <span class="m">7</span> - <span class="m">4</span>
<span class="m">3</span>
calc&gt; <span class="m">10</span> + <span class="m">5</span>
<span class="m">15</span>
calc&gt; <span class="m">7</span> - <span class="m">3</span> + <span class="m">2</span> - <span class="m">1</span>
<span class="m">5</span>
calc&gt; <span class="m">10</span> + <span class="m">1</span> + <span class="m">2</span> - <span class="m">3</span> + <span class="m">4</span> + <span class="m">6</span> - <span class="m">15</span>
<span class="m">5</span>
calc&gt; <span class="m">3</span> +
Traceback <span class="o">(</span>most recent call last<span class="o">)</span>:
File <span class="s2">&quot;calc3.py&quot;</span>, line <span class="m">147</span>, in &lt;module&gt;
main<span class="o">()</span>
File <span class="s2">&quot;calc3.py&quot;</span>, line <span class="m">142</span>, in main
<span class="nv">result</span> <span class="o">=</span> interpreter.expr<span class="o">()</span>
File <span class="s2">&quot;calc3.py&quot;</span>, line <span class="m">123</span>, in expr
<span class="nv">result</span> <span class="o">=</span> result + self.term<span class="o">()</span>
File <span class="s2">&quot;calc3.py&quot;</span>, line <span class="m">110</span>, in term
self.eat<span class="o">(</span>INTEGER<span class="o">)</span>
File <span class="s2">&quot;calc3.py&quot;</span>, line <span class="m">105</span>, in eat
self.error<span class="o">()</span>
File <span class="s2">&quot;calc3.py&quot;</span>, line <span class="m">45</span>, in error
raise Exception<span class="o">(</span><span class="s1">&#39;Invalid syntax&#39;</span><span class="o">)</span>
Exception: Invalid syntax
</pre></div>
<p><br/>
Remember those exercises I mentioned at the beginning of the article: here they are, as promised&nbsp;:)</p>
<p><img alt="" src="lsbasi_part3_exercises.png" width="280"></p>
<ul>
<li>Draw a syntax diagram for arithmetic expressions that contain only multiplication and division, for example &#8220;7 * 4 / 2 * 3&#8221;. Seriously, just grab a pen or a pencil and try to draw&nbsp;one.</li>
<li>Modify the source code of the calculator to interpret arithmetic expressions that contain only multiplication and division, for example &#8220;7 * 4 / 2 *&nbsp;3&#8221;.</li>
<li>Write an interpreter that handles arithmetic expressions like &#8220;7 - 3 + 2 - 1&#8221; from scratch. Use any programming language youre comfortable with and write it off the top of your head without looking at the examples. When you do that, think about components involved: a <em>lexer</em> that takes an input and converts it into a stream of tokens, a <em>parser</em> that feeds off the stream of the tokens provided by the <em>lexer</em> and tries to recognize a structure in that stream, and an <em>interpreter</em> that generates results after the <em>parser</em> has successfully parsed (recognized) a valid arithmetic expression. String those pieces together. Spend some time translating the knowledge youve acquired into a working interpreter for arithmetic&nbsp;expressions.</li>
</ul>
<p><strong>Check your&nbsp;understanding.</strong></p>
<ol>
<li>What is a syntax&nbsp;diagram?</li>
<li>What is syntax&nbsp;analysis?</li>
<li>What is a syntax&nbsp;analyzer?</li>
</ol>
<p><br/>
Hey, look! You read all the way to the end. Thanks for hanging out here today and dont forget to do the exercises. :) Ill be back next time with a new article - stay&nbsp;tuned.</p>
<p>Here is a list of books I recommend that will help you in your study of interpreters and&nbsp;compilers:</p>
<ol>
<li>
<p><a href="http://www.amazon.com/gp/product/193435645X/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=193435645X&linkCode=as2&tag=russblo0b-20&linkId=MP4DCXDV6DJMEJBL">Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages (Pragmatic Programmers)</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=193435645X" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/0470177071/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0470177071&linkCode=as2&tag=russblo0b-20&linkId=UCLGQTPIYSWYKRRM">Writing Compilers and Interpreters: A Software Engineering Approach</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=0470177071" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/052182060X/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=052182060X&linkCode=as2&tag=russblo0b-20&linkId=ZSKKZMV7YWR22NMW">Modern Compiler Implementation in Java</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=052182060X" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/1461446988/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1461446988&linkCode=as2&tag=russblo0b-20&linkId=PAXWJP5WCPZ7RKRD">Modern Compiler Design</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=1461446988" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/0321486811/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0321486811&linkCode=as2&tag=russblo0b-20&linkId=GOEGDQG4HIHU56FQ">Compilers: Principles, Techniques, and Tools (2nd Edition)</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=0321486811" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
</ol>
<p><br/>
<p>If you want to get my newest articles in your inbox, then enter your email address below and click "Get Updates!"</p>
<!-- Begin MailChimp Signup Form -->
<link href="https://cdn-images.mailchimp.com/embedcode/classic-081711.css"
rel="stylesheet" type="text/css">
<style type="text/css">
#mc_embed_signup {
background: #f5f5f5;
clear: left;
font: 18px Helvetica,Arial,sans-serif;
}
#mc_embed_signup form {
text-align: center;
padding: 20px 0 10px 3%;
}
#mc_embed_signup .mc-field-group input {
display: inline;
width: 40%;
}
#mc_embed_signup div.response {
width: 100%;
}
</style>
<div id="mc_embed_signup">
<form
action="https://ruslanspivak.us4.list-manage.com/subscribe/post?u=7dde30eedc045f4670430c25f&amp;id=6f69f44e03"
method="post"
id="mc-embedded-subscribe-form"
name="mc-embedded-subscribe-form"
class="validate"
target="_blank" novalidate>
<div id="mc_embed_signup_scroll">
<div class="mc-field-group">
<label for="mce-NAME">Enter Your First Name *</label>
<input type="text" value="" name="NAME" class="required" id="mce-NAME">
</div>
<div class="mc-field-group">
<label for="mce-EMAIL">Enter Your Best Email *</label>
<input type="email" value="" name="EMAIL" class="required email" id="mce-EMAIL">
</div>
<div id="mce-responses" class="clear">
<div class="response" id="mce-error-response" style="display:none"></div>
<div class="response" id="mce-success-response" style="display:none"></div>
</div>
<!-- real people should not fill this in and expect good things - do not remove this or risk form bot signups-->
<div style="position: absolute; left: -5000px;"><input type="text" name="b_7dde30eedc045f4670430c25f_6f69f44e03" tabindex="-1" value=""></div>
<div class="clear"><input type="submit" value="Get Updates!" name="subscribe" id="mc-embedded-subscribe" class="button" style="background-color: rgb(63, 146, 236);"></div>
</div>
</form>
</div>
<!-- <script type='text/javascript' src='//s3.amazonaws.com/downloads.mailchimp.com/js/mc-validate.js'></script><script type='text/javascript'>(function($) {window.fnames = new Array(); window.ftypes = new Array();fnames[1]='NAME';ftypes[1]='text';fnames[0]='EMAIL';ftypes[0]='email';}(jQuery));var $mcj = jQuery.noConflict(true);</script> -->
<!--End mc_embed_signup-->
</p>
<p><br/>
<strong>All articles in this series:</strong>
<ul>
<li>
<a href="../lsbasi-part1/index.html">Let's Build A Simple Interpreter. Part 1.</a>
</li>
<li>
<a href="../lsbasi-part2/index.html">Let's Build A Simple Interpreter. Part 2.</a>
</li>
<li>
<a href="index.html">Let's Build A Simple Interpreter. Part 3.</a>
</li>
<li>
<a href="../lsbasi-part4/index.html">Let's Build A Simple Interpreter. Part 4.</a>
</li>
<li>
<a href="../lsbasi-part5/index.html">Let's Build A Simple Interpreter. Part 5.</a>
</li>
<li>
<a href="../lsbasi-part6/index.html">Let's Build A Simple Interpreter. Part 6.</a>
</li>
<li>
<a href="../lsbasi-part7/index.html">Let's Build A Simple Interpreter. Part 7: Abstract Syntax Trees</a>
</li>
<li>
<a href="../lsbasi-part8/index.html">Let's Build A Simple Interpreter. Part 8.</a>
</li>
<li>
<a href="../lsbasi-part9/index.html">Let's Build A Simple Interpreter. Part 9.</a>
</li>
<li>
<a href="../lsbasi-part10/index.html">Let's Build A Simple Interpreter. Part 10.</a>
</li>
<li>
<a href="../lsbasi-part11/index.html">Let's Build A Simple Interpreter. Part 11.</a>
</li>
<li>
<a href="../lsbasi-part12/index.html">Let's Build A Simple Interpreter. Part 12.</a>
</li>
<li>
<a href="../lsbasi-part13.html">Let's Build A Simple Interpreter. Part 13: Semantic Analysis</a>
</li>
<li>
<a href="../lsbasi-part14/index.html">Let's Build A Simple Interpreter. Part 14: Nested Scopes and a Source-to-Source Compiler</a>
</li>
<li>
<a href="../lsbasi-part15/index.html">Let's Build A Simple Interpreter. Part 15.</a>
</li>
<li>
<a href="../lsbasi-part16/index.html">Let's Build A Simple Interpreter. Part 16: Recognizing Procedure Calls</a>
</li>
<li>
<a href="../lsbasi-part17.html">Let's Build A Simple Interpreter. Part 17: Call Stack and Activation Records</a>
</li>
<li>
<a href="../lsbasi-part18/index.html">Let's Build A Simple Interpreter. Part 18: Executing Procedure Calls</a>
</li>
<li>
<a href="../lsbasi-part19/index.html">Let's Build A Simple Interpreter. Part 19: Nested Procedure Calls</a>
</li>
</ul>
</p>
</div>
<!-- /.entry-content -->
<hr/>
<section class="comments" id="comments">
<h2>Comments</h2>
<div id="disqus_thread"></div>
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = 'ruslanspivak'; // required: replace example with your forum shortname
var disqus_identifier = 'lets-build-a-simple-interpreter-part-3';
var disqus_url = 'https://ruslanspivak.com/lsbasi-part3/';
var disqus_config = function () {
this.language = "en";
};
/* * * DON'T EDIT BELOW THIS LINE * * */
(function () {
var dsq = document.createElement('script');
dsq.type = 'text/javascript';
dsq.async = true;
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by
Disqus.</a></noscript>
<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</section>
</article>
</section>
</div>
<div class="col-sm-3" id="sidebar">
<aside>
<section class="well well-sm">
<ul class="list-group list-group-flush">
<li class="list-group-item"><h4><i class="fa fa-home fa-lg"></i><span class="icon-label">Social</span></h4>
<ul class="list-group" id="social">
<li class="list-group-item"><a href="https://github.com/rspivak/"><i class="fa fa-github-square fa-lg"></i> github</a></li>
<li class="list-group-item"><a href="https://twitter.com/rspivak"><i class="fa fa-twitter-square fa-lg"></i> twitter</a></li>
<li class="list-group-item"><a href="https://linkedin.com/in/ruslanspivak/"><i class="fa fa-linkedin-square fa-lg"></i> linkedin</a></li>
</ul>
</li>
<li class="list-group-item"><h4><i class="fa fa-home fa-lg"></i><span class="icon-label">Popular posts</span></h4>
<ul class="list-group" id="popularposts">
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbaws-part1/index.html">
Let's Build A Web Server. Part 1.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbasi-part1/index.html">
Let's Build A Simple Interpreter. Part 1.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbaws-part2/index.html">
Let's Build A Web Server. Part 2.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbaws-part3/index.html">
Let's Build A Web Server. Part 3.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbasi-part2/index.html">
Let's Build A Simple Interpreter. Part 2.
</a>
</li>
</ul>
</li>
<li class="list-group-item">
<h4>
<span>Disclaimer</span>
</h4>
<p id="disclaimer-text"> Some of the links on this site
have my Amazon referral id, which provides me with a small
commission for each sale. Thank you for your support.
</p>
</li>
</ul>
</section>
</aside>
</div>
</div>
</div>
<footer>
<div class="container">
<hr>
<div class="row">
<div class="col-xs-10">&copy; 2020 Ruslan Spivak
<!-- &middot; Powered by <a href="https://github.com/DandyDev/pelican-bootstrap3" target="_blank">pelican-bootstrap3</a>, -->
<!-- <a href="http://docs.getpelican.com/" target="_blank">Pelican</a>, -->
<!-- <a href="http://getbootstrap.com" target="_blank">Bootstrap</a> -->
<!-- -->
</div>
<div class="col-xs-2"><p class="pull-right"><i class="fa fa-arrow-up"></i> <a href="index.html#">Back to top</a></p></div>
</div>
</div>
</footer>
<script src="../theme/js/jquery.min.js"></script>
<!-- Include all compiled plugins (below), or include individual files as needed -->
<script src="../theme/js/bootstrap.min.js"></script>
<!-- Enable responsive features in IE8 with Respond.js (https://github.com/scottjehl/Respond) -->
<script src="../theme/js/respond.min.js"></script>
<!-- Disqus -->
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = 'ruslanspivak'; // required: replace example with your forum shortname
/* * * DON'T EDIT BELOW THIS LINE * * */
(function () {
var s = document.createElement('script');
s.async = true;
s.type = 'text/javascript';
s.src = '//' + disqus_shortname + '.disqus.com/count.js';
(document.getElementsByTagName('HEAD')[0] || document.getElementsByTagName('BODY')[0]).appendChild(s);
}());
</script>
<!-- End Disqus Code -->
<!-- Google Analytics Universal -->
<script type="text/javascript">
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-2572871-3', 'auto');
ga('send', 'pageview');
</script>
<!-- End Google Analytics Universal Code -->
</body>
</html>