emacs.d/clones/ruslanspivak.com/lsbasi-part5/index.html

683 lines
50 KiB
HTML
Raw Normal View History

2022-10-07 19:32:11 +02:00
<!DOCTYPE html>
<html lang="en"
xmlns:og="http://ogp.me/ns#"
xmlns:fb="https://www.facebook.com/2008/fbml">
<head>
<title>Lets Build A Simple Interpreter. Part 5. - Ruslan's Blog</title>
<!-- Using the latest rendering mode for IE -->
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="canonical" href="index.html">
<meta name="author" content="Ruslan Spivak" />
<meta name="description" content="How do you tackle something as complex as understanding how to create an interpreter or compiler? In the beginning it all looks pretty much like a tangled mess of yarn that you need to untangle to get that perfect ball. The way to get there is to just untangle it …" />
<meta property="og:site_name" content="Ruslan's Blog" />
<meta property="og:type" content="article"/>
<meta property="og:title" content="Lets Build A Simple Interpreter. Part 5."/>
<meta property="og:url" content="https://ruslanspivak.com/lsbasi-part5/"/>
<meta property="og:description" content="How do you tackle something as complex as understanding how to create an interpreter or compiler? In the beginning it all looks pretty much like a tangled mess of yarn that you need to untangle to get that perfect ball. The way to get there is to just untangle it …"/>
<meta property="article:published_time" content="2015-10-14" />
<meta property="article:section" content="blog" />
<meta property="article:author" content="Ruslan Spivak" />
<meta name="twitter:card" content="summary">
<meta name="twitter:domain" content="https://ruslanspivak.com">
<!-- Bootstrap -->
<link rel="stylesheet" href="../theme/css/bootstrap.min.css" type="text/css"/>
<link href="../theme/css/font-awesome.min.css" rel="stylesheet">
<link href="../theme/css/pygments/tango.css" rel="stylesheet">
<link href="../theme/css/typogrify.css" rel="stylesheet">
<link rel="stylesheet" href="../theme/css/style.css" type="text/css"/>
<link href="../static/custom.css" rel="stylesheet">
<link href="../feeds/all.atom.xml" type="application/atom+xml" rel="alternate"
title="Ruslan's Blog ATOM Feed"/>
</head>
<body>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-ex1-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a href="../index.html" class="navbar-brand">
Ruslan's Blog </a>
</div>
<div class="collapse navbar-collapse navbar-ex1-collapse">
<ul class="nav navbar-nav">
</ul>
<ul class="nav navbar-nav navbar-right">
<li><a href="../pages/about.html"><i class="fa fa-question"></i><span class="icon-label">About</span></a></li>
<li><a href="../archives.html"><i class="fa fa-th-list"></i><span class="icon-label">Archives</span></a></li>
</ul>
</div>
<!-- /.navbar-collapse -->
</div>
</div> <!-- /.navbar -->
<!-- Banner -->
<!-- End Banner -->
<div class="container">
<div class="row">
<div class="col-sm-9">
<section id="content">
<article>
<header class="page-header">
<h1>
<a href="index.html"
rel="bookmark"
title="Permalink to Lets Build A Simple Interpreter. Part 5.">
Let&#8217;s Build A Simple Interpreter. Part&nbsp;5.
</a>
</h1>
</header>
<div class="entry-content">
<div class="panel">
<div class="panel-body">
<footer class="post-info">
<span class="label label-default">Date</span>
<span class="published">
<i class="fa fa-calendar"></i><time datetime="2015-10-14T07:00:00-04:00"> Wed, October 14, 2015</time>
</span>
</footer><!-- /.post-info --> </div>
</div>
<p>How do you tackle something as complex as understanding how to create an interpreter or compiler?
In the beginning it all looks pretty much like a tangled mess of yarn that you need to untangle to get that perfect&nbsp;ball.</p>
<p>The way to get there is to just untangle it one thread, one knot at a time. Sometimes, though, you might feel like you dont
understand something right away, but you have to keep going. It will eventually &#8220;click&#8221; if youre persistent enough, I promise
you (Gee, if I put aside 25 cents every time I didnt understand something right away I would have become rich a long time ago&nbsp;:).</p>
<p>Probably one of the best pieces of advice I could give you on your way to understanding how to create an interpreter
and compiler is to read the explanations in the articles, read the code, and then write code yourself, and
even write the same code several times over a period of time to make the material and code feel natural to you,
and only then move on to learn new topics. Do not rush, just slow down and take your time to understand the basic ideas deeply.
This approach, while seemingly slow, will pay off down the road. Trust&nbsp;me.</p>
<p>You will eventually get your perfect ball of yarn in the end. And, you know what? Even if it is not that perfect it is still
better than the alternative, which is to do nothing and not learn the topic or quickly skim over it and forget it in a couple of&nbsp;days.</p>
<p>Remember - just keep untangling: one thread, one knot at a time and practice what youve learned by writing code, a lot of&nbsp;it:</p>
<p><img alt="" src="lsbasi_part5_ballofyarn.png" width="640"></p>
<p>Today youre going to use all the knowledge youve gained from previous articles in the series and learn how to parse and interpret arithmetic expressions that have any number of addition, subtraction, multiplication, and division operators. You will write an interpreter that will be able to evaluate expressions like &#8220;14 + 2 * 3 - 6 /&nbsp;2&#8221;.</p>
<p>Before diving in and writing some code lets talk about the <strong>associativity</strong> and <strong>precedence</strong> of&nbsp;operators.</p>
<p>By convention 7 + 3 + 1 is the same as (7 + 3) + 1 and 7 - 3 - 1 is equivalent to (7 - 3) - 1. No surprises here. We all learned that at some point and have been taking it for granted since then. If we treated 7 - 3 - 1 as 7 - (3 - 1) the result would be unexpected 5 instead of the expected&nbsp;3.</p>
<p>In ordinary arithmetic and most programming languages addition, subtraction, multiplication, and division are <em>left-associative</em>:</p>
<div class="highlight"><pre><span></span>7 + 3 + 1 is equivalent to (7 + 3) + 1
7 - 3 - 1 is equivalent to (7 - 3) - 1
8 * 4 * 2 is equivalent to (8 * 4) * 2
8 / 4 / 2 is equivalent to (8 / 4) / 2
</pre></div>
<p>What does it mean for an operator to be <em>left-associative</em>?</p>
<p>When an operand like 3 in the expression 7 + 3 + 1 has plus signs on both sides, we need a convention to decide which operator applies to 3. Is it the one to the left or the one to the right of the operand 3? The operator + <em>associates</em> to the left because an operand that has plus signs on both sides belongs to the operator to its left and so we say that the operator + is <em>left-associative</em>. Thats why 7 + 3 + 1 is equivalent to (7 + 3) + 1 by the <em>associativity</em>&nbsp;convention.</p>
<p>Okay, what about an expression like 7 + 5 * 2 where we have different kinds of operators on both sides of the operand 5? Is the expression equivalent to 7 + (5 * 2) or (7 + 5) * 2? How do we resolve this&nbsp;ambiguity?</p>
<p>In this case, the associativity convention is of no help to us because it applies only to operators of one kind, either additive (+, -) or multiplicative (*, /). We need another convention to resolve the ambiguity when we have different kinds of operators in the same expression. We need a convention that defines relative <em>precedence</em> of&nbsp;operators.</p>
<p>And here it is: we say that if the operator * takes its operands before + does, then it has <em>higher precedence</em>. In the arithmetic that we know and use, multiplication and division have <em>higher precedence</em> than addition and subtraction. As a result the expression 7 + 5 * 2 is equivalent to 7 + (5 * 2) and the expression 7 - 8 / 4 is equivalent to 7 - (8 /&nbsp;4).</p>
<p>In a case where we have an expression with operators that have the same <em>precedence</em>, we just use the <em>associativity</em> convention and execute the operators from left to&nbsp;right:</p>
<div class="highlight"><pre><span></span>7 + 3 - 1 is equivalent to (7 + 3) - 1
8 / 4 * 2 is equivalent to (8 / 4) * 2
</pre></div>
<p>I hope you didnt think I wanted to bore you to death by talking so much about the associativity and precedence of operators. The nice thing about those conventions is that we can construct a grammar for arithmetic expressions from a table that shows the associativity and precedence of arithmetic operators. Then, we can translate the grammar into code by following the guidelines I outlined in <a href="../lsbasi-part4/index.html" title="Part 4">Part 4</a>, and our interpreter will be able to handle the precedence of operators in addition to&nbsp;associativity.</p>
<p>Okay, here is our precedence&nbsp;table:</p>
<p><img alt="" src="lsbasi_part5_precedence.png" width="640"></p>
<p>From the table, you can tell that operators + and - have the same precedence level and they are both left-associative. You can also see that operators * and / are also left-associative, have the same precedence among themselves but have higher-precedence than addition and subtraction&nbsp;operators.</p>
<p>Here are the rules for how to construct a grammar from the precedence&nbsp;table:</p>
<ol>
<li>For each level of precedence define a non-terminal. The body of a production for the non-terminal should contain arithmetic operators from that level and non-terminals for the next higher level of&nbsp;precedence.</li>
<li>Create an additional non-terminal <em>factor</em> for basic units of expression, in our case, integers. The general rule is that if you have N levels of precedence, you will need N + 1 non-terminals in total: one non-terminal for each level plus one non-terminal for basic units of&nbsp;expression.</li>
</ol>
<p>Onward!</p>
<p>Lets follow the rules and construct our&nbsp;grammar.</p>
<p>According to Rule 1 we will define two non-terminals: a non-terminal called <em>expr</em> for level 2 and a non-terminal called <em>term</em> for level 1.
And by following Rule 2 we will define a <em>factor</em> non-terminal for basic units of arithmetic expressions,&nbsp;integers.</p>
<p>The <em>start symbol</em> of our new grammar will be <em>expr</em> and the <em>expr</em> production will contain a body representing the use of operators from level 2, which in our case are operators + and - , and will contain <em>term</em> non-terminals for the next higher level of precedence, level&nbsp;1:</p>
<p><img alt="" src="lsbasi_part5_cfg_expr.png" width="640"></p>
<p>The <em>term</em> production will have a body representing the use of operators from level 1, which are operators * and / , and it will contain the non-terminal <em>factor</em> for the basic units of expression,&nbsp;integers:</p>
<p><img alt="" src="lsbasi_part5_cfg_term.png" width="640"></p>
<p>And the production for the non-terminal <em>factor</em> will&nbsp;be:</p>
<p><img alt="" src="lsbasi_part5_cfg_factor.png" width="340"></p>
<p>Youve already seen above productions as part of grammars and syntax diagrams from previous articles, but here we combine them into one grammar that takes care of the associativity and the precedence of&nbsp;operators:</p>
<p><img alt="" src="lsbasi_part5_grammar.png" width="640"></p>
<p>Here is a syntax diagram that corresponds to the grammar&nbsp;above:</p>
<p><img alt="" src="lsbasi_part5_syntaxdiagram.png" width="640"></p>
<p>Each rectangular box in the diagram is a &#8220;method call&#8221; to another diagram. If you take the expression 7 + 5 * 2 and start with the top diagram <em>expr</em> and walk your way down to the bottommost diagram <em>factor</em>, you should be able to see that <em>higher-precedence</em> operators * and / in the lower diagram execute before operators + and - in the higher&nbsp;diagram.</p>
<p>To drive the precedence of operators point home, lets take a look at the decomposition of the same arithmetic expression 7 + 5 * 2 done in accordance with our grammar and syntax diagrams above. This is just another way to show that <em>higher-precedence</em> operators execute before operators with <em>lower precedence</em>:</p>
<p><img alt="" src="lsbasi_part5_exprdecomp.png" width="640"></p>
<p>Okay, lets convert the grammar to code following guidelines from <a href="../lsbasi-part4/index.html" title="Part 4">Part 4</a> and see how our new interpreter works, shall&nbsp;we?</p>
<p>Here is the grammar&nbsp;again:</p>
<p><img alt="" src="lsbasi_part5_grammar.png" width="640"></p>
<p>And here is the complete code of a calculator that can handle valid arithmetic expressions containing integers and any number of addition, subtraction, multiplication, and division&nbsp;operators.</p>
<p>The following are the main changes compared with the code from <a href="../lsbasi-part4/index.html" title="Part 4">Part 4</a>:</p>
<ul>
<li>The <em>Lexer</em> class can now tokenize +, -, *, and / (Nothing new here, we just combined code from previous articles into one class that supports all those&nbsp;tokens)</li>
<li>Recall that each rule (production), <strong>R</strong>, defined in the grammar, becomes a method with the same name, and references to that rule become a method call: <strong>R()</strong>. As a result the <em>Interpreter</em> class now has three methods that correspond to non-terminals in the grammar: <em>expr</em>, <em>term</em>, and <em>factor</em>.</li>
</ul>
<p>Source&nbsp;code:</p>
<div class="highlight"><pre><span></span><span class="c1"># Token types</span>
<span class="c1">#</span>
<span class="c1"># EOF (end-of-file) token is used to indicate that</span>
<span class="c1"># there is no more input left for lexical analysis</span>
<span class="n">INTEGER</span><span class="p">,</span> <span class="n">PLUS</span><span class="p">,</span> <span class="n">MINUS</span><span class="p">,</span> <span class="n">MUL</span><span class="p">,</span> <span class="n">DIV</span><span class="p">,</span> <span class="n">EOF</span> <span class="o">=</span> <span class="p">(</span>
<span class="s1">&#39;INTEGER&#39;</span><span class="p">,</span> <span class="s1">&#39;PLUS&#39;</span><span class="p">,</span> <span class="s1">&#39;MINUS&#39;</span><span class="p">,</span> <span class="s1">&#39;MUL&#39;</span><span class="p">,</span> <span class="s1">&#39;DIV&#39;</span><span class="p">,</span> <span class="s1">&#39;EOF&#39;</span>
<span class="p">)</span>
<span class="k">class</span> <span class="nc">Token</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="nb">type</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span>
<span class="c1"># token type: INTEGER, PLUS, MINUS, MUL, DIV, or EOF</span>
<span class="bp">self</span><span class="o">.</span><span class="n">type</span> <span class="o">=</span> <span class="nb">type</span>
<span class="c1"># token value: non-negative integer value, &#39;+&#39;, &#39;-&#39;, &#39;*&#39;, &#39;/&#39;, or None</span>
<span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="k">def</span> <span class="fm">__str__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;String representation of the class instance.</span>
<span class="sd"> Examples:</span>
<span class="sd"> Token(INTEGER, 3)</span>
<span class="sd"> Token(PLUS, &#39;+&#39;)</span>
<span class="sd"> Token(MUL, &#39;*&#39;)</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">return</span> <span class="s1">&#39;Token({type}, {value})&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
<span class="nb">type</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">type</span><span class="p">,</span>
<span class="n">value</span><span class="o">=</span><span class="nb">repr</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="p">)</span>
<span class="k">def</span> <span class="fm">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="fm">__str__</span><span class="p">()</span>
<span class="k">class</span> <span class="nc">Lexer</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">text</span><span class="p">):</span>
<span class="c1"># client string input, e.g. &quot;3 * 5&quot;, &quot;12 / 3 * 4&quot;, etc</span>
<span class="bp">self</span><span class="o">.</span><span class="n">text</span> <span class="o">=</span> <span class="n">text</span>
<span class="c1"># self.pos is an index into self.text</span>
<span class="bp">self</span><span class="o">.</span><span class="n">pos</span> <span class="o">=</span> <span class="mi">0</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">text</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">pos</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">error</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">Exception</span><span class="p">(</span><span class="s1">&#39;Invalid character&#39;</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">advance</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Advance the `pos` pointer and set the `current_char` variable.&quot;&quot;&quot;</span>
<span class="bp">self</span><span class="o">.</span><span class="n">pos</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">pos</span> <span class="o">&gt;</span> <span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">text</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">=</span> <span class="bp">None</span> <span class="c1"># Indicates end of input</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">text</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">pos</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">skip_whitespace</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isspace</span><span class="p">():</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">integer</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Return a (multidigit) integer consumed from the input.&quot;&quot;&quot;</span>
<span class="n">result</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isdigit</span><span class="p">():</span>
<span class="n">result</span> <span class="o">+=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">get_next_token</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Lexical analyzer (also known as scanner or tokenizer)</span>
<span class="sd"> This method is responsible for breaking a sentence</span>
<span class="sd"> apart into tokens. One token at a time.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isspace</span><span class="p">():</span>
<span class="bp">self</span><span class="o">.</span><span class="n">skip_whitespace</span><span class="p">()</span>
<span class="k">continue</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span><span class="o">.</span><span class="n">isdigit</span><span class="p">():</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">INTEGER</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">integer</span><span class="p">())</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">==</span> <span class="s1">&#39;+&#39;</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">PLUS</span><span class="p">,</span> <span class="s1">&#39;+&#39;</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">==</span> <span class="s1">&#39;-&#39;</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">MINUS</span><span class="p">,</span> <span class="s1">&#39;-&#39;</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">==</span> <span class="s1">&#39;*&#39;</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">MUL</span><span class="p">,</span> <span class="s1">&#39;*&#39;</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_char</span> <span class="o">==</span> <span class="s1">&#39;/&#39;</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">advance</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">DIV</span><span class="p">,</span> <span class="s1">&#39;/&#39;</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">error</span><span class="p">()</span>
<span class="k">return</span> <span class="n">Token</span><span class="p">(</span><span class="n">EOF</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">Interpreter</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">lexer</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">lexer</span> <span class="o">=</span> <span class="n">lexer</span>
<span class="c1"># set current token to the first token taken from the input</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">lexer</span><span class="o">.</span><span class="n">get_next_token</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">error</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">Exception</span><span class="p">(</span><span class="s1">&#39;Invalid syntax&#39;</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">eat</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">token_type</span><span class="p">):</span>
<span class="c1"># compare the current token type with the passed token</span>
<span class="c1"># type and if they match then &quot;eat&quot; the current token</span>
<span class="c1"># and assign the next token to the self.current_token,</span>
<span class="c1"># otherwise raise an exception.</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">token_type</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">current_token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">lexer</span><span class="o">.</span><span class="n">get_next_token</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">error</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">factor</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;factor : INTEGER&quot;&quot;&quot;</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">INTEGER</span><span class="p">)</span>
<span class="k">return</span> <span class="n">token</span><span class="o">.</span><span class="n">value</span>
<span class="k">def</span> <span class="nf">term</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;term : factor ((MUL | DIV) factor)*&quot;&quot;&quot;</span>
<span class="n">result</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">factor</span><span class="p">()</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="ow">in</span> <span class="p">(</span><span class="n">MUL</span><span class="p">,</span> <span class="n">DIV</span><span class="p">):</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="k">if</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">MUL</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">MUL</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">factor</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">DIV</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">DIV</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">/</span> <span class="bp">self</span><span class="o">.</span><span class="n">factor</span><span class="p">()</span>
<span class="k">return</span> <span class="n">result</span>
<span class="k">def</span> <span class="nf">expr</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Arithmetic expression parser / interpreter.</span>
<span class="sd"> calc&gt; 14 + 2 * 3 - 6 / 2</span>
<span class="sd"> 17</span>
<span class="sd"> expr : term ((PLUS | MINUS) term)*</span>
<span class="sd"> term : factor ((MUL | DIV) factor)*</span>
<span class="sd"> factor : INTEGER</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="n">result</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">while</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span><span class="o">.</span><span class="n">type</span> <span class="ow">in</span> <span class="p">(</span><span class="n">PLUS</span><span class="p">,</span> <span class="n">MINUS</span><span class="p">):</span>
<span class="n">token</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">current_token</span>
<span class="k">if</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">PLUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">PLUS</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">token</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">MINUS</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">eat</span><span class="p">(</span><span class="n">MINUS</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">result</span> <span class="o">-</span> <span class="bp">self</span><span class="o">.</span><span class="n">term</span><span class="p">()</span>
<span class="k">return</span> <span class="n">result</span>
<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
<span class="k">try</span><span class="p">:</span>
<span class="c1"># To run under Python3 replace &#39;raw_input&#39; call</span>
<span class="c1"># with &#39;input&#39;</span>
<span class="n">text</span> <span class="o">=</span> <span class="nb">raw_input</span><span class="p">(</span><span class="s1">&#39;calc&gt; &#39;</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">EOFError</span><span class="p">:</span>
<span class="k">break</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">text</span><span class="p">:</span>
<span class="k">continue</span>
<span class="n">lexer</span> <span class="o">=</span> <span class="n">Lexer</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="n">interpreter</span> <span class="o">=</span> <span class="n">Interpreter</span><span class="p">(</span><span class="n">lexer</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">interpreter</span><span class="o">.</span><span class="n">expr</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">&#39;__main__&#39;</span><span class="p">:</span>
<span class="n">main</span><span class="p">()</span>
</pre></div>
<p>Save the above code into the <em>calc5.py</em> file or download it directly from <a href="https://github.com/rspivak/lsbasi/blob/master/part5/calc5.py">GitHub</a>. As usual, try it out and see for yourself that the interpreter properly evaluates arithmetic expressions that have operators with different&nbsp;precedence.</p>
<p>Here is a sample session on my&nbsp;laptop:</p>
<div class="highlight"><pre><span></span>$ python calc5.py
calc&gt; <span class="m">3</span>
<span class="m">3</span>
calc&gt; <span class="m">2</span> + <span class="m">7</span> * <span class="m">4</span>
<span class="m">30</span>
calc&gt; <span class="m">7</span> - <span class="m">8</span> / <span class="m">4</span>
<span class="m">5</span>
calc&gt; <span class="m">14</span> + <span class="m">2</span> * <span class="m">3</span> - <span class="m">6</span> / <span class="m">2</span>
<span class="m">17</span>
</pre></div>
<p><br/>
Here are new exercises for&nbsp;today:</p>
<p><img alt="" src="lsbasi_part5_exercises.png" width="240"></p>
<ul>
<li>
<p>Write an interpreter as described in this article off the top of your head, without peeking into the code from the article. Write some tests for your interpreter, and make sure they&nbsp;pass.</p>
</li>
<li>
<p>Extend the interpreter to handle arithmetic expressions containing parentheses so that your interpreter could evaluate deeply nested arithmetic expressions like: 7 + 3 * (10 / (12 / (3 + 1) -&nbsp;1))</p>
</li>
</ul>
<p><br/>
<strong>Check your&nbsp;understanding.</strong></p>
<ol>
<li>What does it mean for an operator to be <em>left-associative</em>?</li>
<li>Are operators + and - <em>left-associative</em> or <em>right-associative</em>? What about * and /&nbsp;?</li>
<li>Does operator + have <em>higher precedence</em> than operator *&nbsp;?</li>
</ol>
<p><br/>
Hey, you read all the way to the end! Thats really great. Ill be back next time with a new article - stay tuned, be brilliant, and, as usual, dont forget to do the&nbsp;exercises.</p>
<p><br/>
Here is a list of books I recommend that will help you in your study of interpreters and&nbsp;compilers:</p>
<ol>
<li>
<p><a href="http://www.amazon.com/gp/product/193435645X/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=193435645X&linkCode=as2&tag=russblo0b-20&linkId=MP4DCXDV6DJMEJBL">Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages (Pragmatic Programmers)</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=193435645X" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/0470177071/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0470177071&linkCode=as2&tag=russblo0b-20&linkId=UCLGQTPIYSWYKRRM">Writing Compilers and Interpreters: A Software Engineering Approach</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=0470177071" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/052182060X/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=052182060X&linkCode=as2&tag=russblo0b-20&linkId=ZSKKZMV7YWR22NMW">Modern Compiler Implementation in Java</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=052182060X" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/1461446988/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1461446988&linkCode=as2&tag=russblo0b-20&linkId=PAXWJP5WCPZ7RKRD">Modern Compiler Design</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=1461446988" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
<li>
<p><a href="http://www.amazon.com/gp/product/0321486811/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0321486811&linkCode=as2&tag=russblo0b-20&linkId=GOEGDQG4HIHU56FQ">Compilers: Principles, Techniques, and Tools (2nd Edition)</a><img src="http://ir-na.amazon-adsystem.com/e/ir?t=russblo0b-20&l=as2&o=1&a=0321486811" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>
</li>
</ol>
<p><br/>
<p>If you want to get my newest articles in your inbox, then enter your email address below and click "Get Updates!"</p>
<!-- Begin MailChimp Signup Form -->
<link href="https://cdn-images.mailchimp.com/embedcode/classic-081711.css"
rel="stylesheet" type="text/css">
<style type="text/css">
#mc_embed_signup {
background: #f5f5f5;
clear: left;
font: 18px Helvetica,Arial,sans-serif;
}
#mc_embed_signup form {
text-align: center;
padding: 20px 0 10px 3%;
}
#mc_embed_signup .mc-field-group input {
display: inline;
width: 40%;
}
#mc_embed_signup div.response {
width: 100%;
}
</style>
<div id="mc_embed_signup">
<form
action="https://ruslanspivak.us4.list-manage.com/subscribe/post?u=7dde30eedc045f4670430c25f&amp;id=6f69f44e03"
method="post"
id="mc-embedded-subscribe-form"
name="mc-embedded-subscribe-form"
class="validate"
target="_blank" novalidate>
<div id="mc_embed_signup_scroll">
<div class="mc-field-group">
<label for="mce-NAME">Enter Your First Name *</label>
<input type="text" value="" name="NAME" class="required" id="mce-NAME">
</div>
<div class="mc-field-group">
<label for="mce-EMAIL">Enter Your Best Email *</label>
<input type="email" value="" name="EMAIL" class="required email" id="mce-EMAIL">
</div>
<div id="mce-responses" class="clear">
<div class="response" id="mce-error-response" style="display:none"></div>
<div class="response" id="mce-success-response" style="display:none"></div>
</div>
<!-- real people should not fill this in and expect good things - do not remove this or risk form bot signups-->
<div style="position: absolute; left: -5000px;"><input type="text" name="b_7dde30eedc045f4670430c25f_6f69f44e03" tabindex="-1" value=""></div>
<div class="clear"><input type="submit" value="Get Updates!" name="subscribe" id="mc-embedded-subscribe" class="button" style="background-color: rgb(63, 146, 236);"></div>
</div>
</form>
</div>
<!-- <script type='text/javascript' src='//s3.amazonaws.com/downloads.mailchimp.com/js/mc-validate.js'></script><script type='text/javascript'>(function($) {window.fnames = new Array(); window.ftypes = new Array();fnames[1]='NAME';ftypes[1]='text';fnames[0]='EMAIL';ftypes[0]='email';}(jQuery));var $mcj = jQuery.noConflict(true);</script> -->
<!--End mc_embed_signup-->
</p>
<p><br/>
<strong>All articles in this series:</strong>
<ul>
<li>
<a href="../lsbasi-part1/index.html">Let's Build A Simple Interpreter. Part 1.</a>
</li>
<li>
<a href="../lsbasi-part2/index.html">Let's Build A Simple Interpreter. Part 2.</a>
</li>
<li>
<a href="../lsbasi-part3/index.html">Let's Build A Simple Interpreter. Part 3.</a>
</li>
<li>
<a href="../lsbasi-part4/index.html">Let's Build A Simple Interpreter. Part 4.</a>
</li>
<li>
<a href="index.html">Let's Build A Simple Interpreter. Part 5.</a>
</li>
<li>
<a href="../lsbasi-part6/index.html">Let's Build A Simple Interpreter. Part 6.</a>
</li>
<li>
<a href="../lsbasi-part7/index.html">Let's Build A Simple Interpreter. Part 7: Abstract Syntax Trees</a>
</li>
<li>
<a href="../lsbasi-part8/index.html">Let's Build A Simple Interpreter. Part 8.</a>
</li>
<li>
<a href="../lsbasi-part9/index.html">Let's Build A Simple Interpreter. Part 9.</a>
</li>
<li>
<a href="../lsbasi-part10/index.html">Let's Build A Simple Interpreter. Part 10.</a>
</li>
<li>
<a href="../lsbasi-part11/index.html">Let's Build A Simple Interpreter. Part 11.</a>
</li>
<li>
<a href="../lsbasi-part12/index.html">Let's Build A Simple Interpreter. Part 12.</a>
</li>
<li>
<a href="../lsbasi-part13.html">Let's Build A Simple Interpreter. Part 13: Semantic Analysis</a>
</li>
<li>
<a href="../lsbasi-part14/index.html">Let's Build A Simple Interpreter. Part 14: Nested Scopes and a Source-to-Source Compiler</a>
</li>
<li>
<a href="../lsbasi-part15/index.html">Let's Build A Simple Interpreter. Part 15.</a>
</li>
<li>
<a href="../lsbasi-part16/index.html">Let's Build A Simple Interpreter. Part 16: Recognizing Procedure Calls</a>
</li>
<li>
<a href="../lsbasi-part17.html">Let's Build A Simple Interpreter. Part 17: Call Stack and Activation Records</a>
</li>
<li>
<a href="../lsbasi-part18/index.html">Let's Build A Simple Interpreter. Part 18: Executing Procedure Calls</a>
</li>
<li>
<a href="../lsbasi-part19/index.html">Let's Build A Simple Interpreter. Part 19: Nested Procedure Calls</a>
</li>
</ul>
</p>
</div>
<!-- /.entry-content -->
<hr/>
<section class="comments" id="comments">
<h2>Comments</h2>
<div id="disqus_thread"></div>
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = 'ruslanspivak'; // required: replace example with your forum shortname
var disqus_identifier = 'lets-build-a-simple-interpreter-part-5';
var disqus_url = 'https://ruslanspivak.com/lsbasi-part5/';
var disqus_config = function () {
this.language = "en";
};
/* * * DON'T EDIT BELOW THIS LINE * * */
(function () {
var dsq = document.createElement('script');
dsq.type = 'text/javascript';
dsq.async = true;
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by
Disqus.</a></noscript>
<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</section>
</article>
</section>
</div>
<div class="col-sm-3" id="sidebar">
<aside>
<section class="well well-sm">
<ul class="list-group list-group-flush">
<li class="list-group-item"><h4><i class="fa fa-home fa-lg"></i><span class="icon-label">Social</span></h4>
<ul class="list-group" id="social">
<li class="list-group-item"><a href="https://github.com/rspivak/"><i class="fa fa-github-square fa-lg"></i> github</a></li>
<li class="list-group-item"><a href="https://twitter.com/rspivak"><i class="fa fa-twitter-square fa-lg"></i> twitter</a></li>
<li class="list-group-item"><a href="https://linkedin.com/in/ruslanspivak/"><i class="fa fa-linkedin-square fa-lg"></i> linkedin</a></li>
</ul>
</li>
<li class="list-group-item"><h4><i class="fa fa-home fa-lg"></i><span class="icon-label">Popular posts</span></h4>
<ul class="list-group" id="popularposts">
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbaws-part1/index.html">
Let's Build A Web Server. Part 1.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbasi-part1/index.html">
Let's Build A Simple Interpreter. Part 1.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbaws-part2/index.html">
Let's Build A Web Server. Part 2.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbaws-part3/index.html">
Let's Build A Web Server. Part 3.
</a>
</li>
<li class="list-group-item"
style="font-size: 15px; word-break: normal;">
<a href="../lsbasi-part2/index.html">
Let's Build A Simple Interpreter. Part 2.
</a>
</li>
</ul>
</li>
<li class="list-group-item">
<h4>
<span>Disclaimer</span>
</h4>
<p id="disclaimer-text"> Some of the links on this site
have my Amazon referral id, which provides me with a small
commission for each sale. Thank you for your support.
</p>
</li>
</ul>
</section>
</aside>
</div>
</div>
</div>
<footer>
<div class="container">
<hr>
<div class="row">
<div class="col-xs-10">&copy; 2020 Ruslan Spivak
<!-- &middot; Powered by <a href="https://github.com/DandyDev/pelican-bootstrap3" target="_blank">pelican-bootstrap3</a>, -->
<!-- <a href="http://docs.getpelican.com/" target="_blank">Pelican</a>, -->
<!-- <a href="http://getbootstrap.com" target="_blank">Bootstrap</a> -->
<!-- -->
</div>
<div class="col-xs-2"><p class="pull-right"><i class="fa fa-arrow-up"></i> <a href="index.html#">Back to top</a></p></div>
</div>
</div>
</footer>
<script src="../theme/js/jquery.min.js"></script>
<!-- Include all compiled plugins (below), or include individual files as needed -->
<script src="../theme/js/bootstrap.min.js"></script>
<!-- Enable responsive features in IE8 with Respond.js (https://github.com/scottjehl/Respond) -->
<script src="../theme/js/respond.min.js"></script>
<!-- Disqus -->
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = 'ruslanspivak'; // required: replace example with your forum shortname
/* * * DON'T EDIT BELOW THIS LINE * * */
(function () {
var s = document.createElement('script');
s.async = true;
s.type = 'text/javascript';
s.src = '//' + disqus_shortname + '.disqus.com/count.js';
(document.getElementsByTagName('HEAD')[0] || document.getElementsByTagName('BODY')[0]).appendChild(s);
}());
</script>
<!-- End Disqus Code -->
<!-- Google Analytics Universal -->
<script type="text/javascript">
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-2572871-3', 'auto');
ga('send', 'pageview');
</script>
<!-- End Google Analytics Universal Code -->
</body>
</html>