<!DOCTYPE html>
|
|
<html lang='en'><head><meta charset='utf-8' /><meta name='pinterest' content='nopin' /><link href='http://stevelosh.com/static/css/style.css' rel='stylesheet' type='text/css' /><link href='http://stevelosh.com/static/css/print.css' rel='stylesheet' type='text/css' media='print' /><title>CHIP-8 in Common Lisp: Disassembly / Steve Losh</title></head><body><header><a id='logo' href='http://stevelosh.com/'>Steve Losh</a><nav><a href='http://stevelosh.com/blog/'>Blog</a> - <a href='http://stevelosh.com/projects/'>Projects</a> - <a href='http://stevelosh.com/photography/'>Photography</a> - <a href='http://stevelosh.com/links/'>Links</a> - <a href='http://stevelosh.com/rss.xml'>Feed</a></nav></header><hr class='main-separator' /><main id='page-blog-entry'><article><h1><a href='index.html'>CHIP-8 in Common Lisp: Disassembly</a></h1><p class='date'>Posted on January 2nd, 2017.</p><p>In the previous posts we looked at how to emulate a <a href="https://en.wikipedia.org/wiki/CHIP-8">CHIP-8</a> CPU with Common
|
|
Lisp. After adding a screen, input, and sound the core of the emulator is
|
|
essentially complete.</p>
|
|
|
|
<p>I've been guiding you through the code step by step and it might look simple,
|
|
but that's only because I went down all the dead ends myself first. In
|
|
practice, when you're writing an emulator for a system you'll need a way to
|
|
debug the execution of code. The first step is to be able to <em>read</em> the code,
|
|
so let's look at how to add a disassembler to our simple CHIP-8 emulator.</p>
|
|
|
|
<p>The full series of posts so far:</p>
|
|
|
|
<ol>
|
|
<li><a href="../../../2016/12/chip8-cpu/index.html">CHIP-8 in Common Lisp: The CPU</a></li>
|
|
<li><a href="../../../2016/12/chip8-graphics/index.html">CHIP-8 in Common Lisp: Graphics</a></li>
|
|
<li><a href="../../../2016/12/chip8-input/index.html">CHIP-8 in Common Lisp: Input</a></li>
|
|
<li><a href="../../../2016/12/chip8-sound/index.html">CHIP-8 in Common Lisp: Sound</a></li>
|
|
<li><a href="index.html">CHIP-8 in Common Lisp: Disassembly</a></li>
|
|
<li><a href="../chip8-debugging-infrastructure/index.html">CHIP-8 in Common Lisp: Debugging Infrastructure</a></li>
|
|
<li><a href="../chip8-menus/index.html">CHIP-8 in Common Lisp: Menus</a></li>
|
|
</ol>
|
|
|
|
<p>The full emulator source is on <a href="https://bitbucket.org/sjl/cl-chip8">BitBucket</a> and <a href="https://github.com/sjl/cl-chip8">GitHub</a>.</p>
|
|
|
|
<ol class="table-of-contents"><li><a href="index.html#s1-disassembling-single-instructions">Disassembling Single Instructions</a></li><li><a href="index.html#s2-disassembling-entire-roms">Disassembling Entire ROMs</a></li><li><a href="index.html#s3-sprites">Sprites</a></li><li><a href="index.html#s4-result">Result</a></li><li><a href="index.html#s5-future">Future</a></li></ol>
|
|
|
|
<h2 id="s1-disassembling-single-instructions"><a href="index.html#s1-disassembling-single-instructions">Disassembling Single Instructions</a></h2>
|
|
|
|
<p>The first thing we'll need is a way to take a single instruction like <code>#x8055</code>
|
|
and turn it into something we can read. The easiest way to do this seemed to be
|
|
to copy the dispatch loop from the CPU emulator and turn it into a disassembly
|
|
function:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> disassemble-instruction <span class="paren2">(<span class="code">instruction</span>)</span>
|
|
<span class="paren2">(<span class="code"><i><span class="symbol">flet</span></i> <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">v <span class="paren5">(<span class="code">n</span>)</span> <span class="paren5">(<span class="code">symb 'v <span class="paren6">(<span class="code">format nil <span class="string">"~X"</span> n</span>)</span></span>)</span></span>)</span></span>)</span>
|
|
<span class="paren3">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren4">(<span class="code"><span class="paren5">(<span class="code">_x__ <span class="paren6">(<span class="code">ldb <span class="paren1">(<span class="code">byte 4 8</span>)</span> instruction</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">__x_ <span class="paren6">(<span class="code">ldb <span class="paren1">(<span class="code">byte 4 4</span>)</span> instruction</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">___x <span class="paren6">(<span class="code">ldb <span class="paren1">(<span class="code">byte 4 0</span>)</span> instruction</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">__xx <span class="paren6">(<span class="code">ldb <span class="paren1">(<span class="code">byte 8 0</span>)</span> instruction</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">_xxx <span class="paren6">(<span class="code">ldb <span class="paren1">(<span class="code">byte 12 0</span>)</span> instruction</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren4">(<span class="code">case <span class="paren5">(<span class="code">logand #xF000 instruction</span>)</span>
|
|
<span class="paren5">(<span class="code">#x0000 <span class="paren6">(<span class="code">case instruction
|
|
<span class="paren1">(<span class="code">#x00E0 '<span class="paren2">(<span class="code">cls</span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x00EE '<span class="paren2">(<span class="code">ret</span>)</span></span>)</span></span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x1000 `<span class="paren6">(<span class="code">jp ,_xxx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x2000 `<span class="paren6">(<span class="code">call ,_xxx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x3000 `<span class="paren6">(<span class="code">se ,<span class="paren1">(<span class="code">v _x__</span>)</span> ,__xx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x4000 `<span class="paren6">(<span class="code">sne ,<span class="paren1">(<span class="code">v _x__</span>)</span> ,__xx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x5000 <span class="paren6">(<span class="code">case <span class="paren1">(<span class="code">logand #x000F instruction</span>)</span>
|
|
<span class="paren1">(<span class="code">#x0 `<span class="paren2">(<span class="code">se ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x6000 `<span class="paren6">(<span class="code">ld ,<span class="paren1">(<span class="code">v _x__</span>)</span> ,__xx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x7000 `<span class="paren6">(<span class="code">add ,<span class="paren1">(<span class="code">v _x__</span>)</span> ,__xx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x8000 <span class="paren6">(<span class="code">case <span class="paren1">(<span class="code">logand #x000F instruction</span>)</span>
|
|
<span class="paren1">(<span class="code">#x0 `<span class="paren2">(<span class="code">ld ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x1 `<span class="paren2">(<span class="code">or ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x2 `<span class="paren2">(<span class="code">and ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x3 `<span class="paren2">(<span class="code">xor ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x4 `<span class="paren2">(<span class="code">add ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x5 `<span class="paren2">(<span class="code">sub ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x6 `<span class="paren2">(<span class="code">shr ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x7 `<span class="paren2">(<span class="code">subn ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#xE `<span class="paren2">(<span class="code">shl ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#x9000 <span class="paren6">(<span class="code">case <span class="paren1">(<span class="code">logand #x000F instruction</span>)</span>
|
|
<span class="paren1">(<span class="code">#x0 `<span class="paren2">(<span class="code">sne ,<span class="paren3">(<span class="code">v _x__</span>)</span> ,<span class="paren3">(<span class="code">v __x_</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#xA000 `<span class="paren6">(<span class="code">ld i ,_xxx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#xB000 `<span class="paren6">(<span class="code">jp ,<span class="paren1">(<span class="code">v 0</span>)</span> ,_xxx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#xC000 `<span class="paren6">(<span class="code">rnd ,<span class="paren1">(<span class="code">v _x__</span>)</span> ,__xx</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#xD000 `<span class="paren6">(<span class="code">drw ,<span class="paren1">(<span class="code">v _x__</span>)</span> ,<span class="paren1">(<span class="code">v __x_</span>)</span> ,___x</span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#xE000 <span class="paren6">(<span class="code">case <span class="paren1">(<span class="code">logand #x00FF instruction</span>)</span>
|
|
<span class="paren1">(<span class="code">#x9E `<span class="paren2">(<span class="code">skp ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#xA1 `<span class="paren2">(<span class="code">sknp ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span>
|
|
<span class="paren5">(<span class="code">#xF000 <span class="paren6">(<span class="code">case <span class="paren1">(<span class="code">logand #x00FF instruction</span>)</span>
|
|
<span class="paren1">(<span class="code">#x07 `<span class="paren2">(<span class="code">ld ,<span class="paren3">(<span class="code">v _x__</span>)</span> dt</span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x0A `<span class="paren2">(<span class="code">ld ,<span class="paren3">(<span class="code">v _x__</span>)</span> k</span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x15 `<span class="paren2">(<span class="code">ld dt ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x18 `<span class="paren2">(<span class="code">ld st ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x1E `<span class="paren2">(<span class="code">add i ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x29 `<span class="paren2">(<span class="code">ld f ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x33 `<span class="paren2">(<span class="code">ld b ,<span class="paren3">(<span class="code">v _x__</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x55 `<span class="paren2">(<span class="code">ld <span class="paren3">(<span class="code">mem i</span>)</span> ,_x__</span>)</span></span>)</span>
|
|
<span class="paren1">(<span class="code">#x65 `<span class="paren2">(<span class="code">ld ,_x__ <span class="paren3">(<span class="code">mem i</span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
|
|
|
|
<p>There are a lot of other ways we could have done this, like making a proper
parser or adding functionality to <code>define-opcode</code>, but since there aren't that
many instructions I think this is reasonable. Now we can pass in a raw,
two-byte instruction and get out something readable:</p>
|
|
|
|
<pre><code>[SBCL] CHIP8> (disassemble-instruction #x8055)
|
|
(SUB V0 V5)
|
|
|
|
[SBCL] CHIP8> (disassemble-instruction #x4077)
|
|
(SNE V0 119)</code></pre>
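<p>To make the field extraction concrete, here's a small sketch that pulls an
instruction apart with the same <code>ldb</code>/<code>byte</code> calls. <code>instruction-fields</code>
and <code>symb-sketch</code> are hypothetical names used only for illustration.
<code>symb</code> itself isn't defined in this post, so the sketch assumes it simply
glues its arguments together and interns the result as a symbol:</p>

<pre><code>;; Hypothetical stand-in for SYMB (an assumption, not the real helper):
;; concatenate the printed forms of two things and intern the result.
(defun symb-sketch (first-part second-part)
  (values (intern (format nil "~A~A" first-part second-part))))

;; Slice an instruction into the fields DISASSEMBLE-INSTRUCTION looks at.
;; (byte SIZE POSITION) names SIZE bits starting at bit POSITION.
(defun instruction-fields (instruction)
  (list :opcode (ldb (byte 4 12) instruction)
        :x      (ldb (byte 4 8) instruction)
        :y      (ldb (byte 4 4) instruction)
        :n      (ldb (byte 4 0) instruction)
        :kk     (ldb (byte 8 0) instruction)
        :nnn    (ldb (byte 12 0) instruction)))

(instruction-fields #x8055)
;; returns (:OPCODE 8 :X 0 :Y 5 :N 5 :KK 85 :NNN 85)

(symb-sketch 'v (format nil "~X" 10))
;; returns the symbol VA (with the default *print-case*)</code></pre>

<p>For <code>#x8055</code> the opcode nibble is 8 and the low nibble is 5, which is why
the disassembler picks the <code>sub</code> clause with registers 0 and 5.</p>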
|
|
|
|
<h2 id="s2-disassembling-entire-roms"><a href="index.html#s2-disassembling-entire-roms">Disassembling Entire ROMs</a></h2>
|
|
|
|
<p>Disassembling a single instruction will be useful, but it would also be nice to
|
|
disassemble an entire ROM at once to see what its code looks like. Let's make
|
|
a little helper function to handle that:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> dump-disassembly <span class="paren2">(<span class="code">array &optional <span class="paren3">(<span class="code">start 0</span>)</span> <span class="paren3">(<span class="code">end <span class="paren4">(<span class="code">length array</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren2">(<span class="code">iterate
|
|
<span class="paren3">(<span class="code">for i <span class="keyword">:from</span> start <span class="keyword">:below</span> end <span class="keyword">:by</span> 2</span>)</span>
|
|
<span class="paren3">(<span class="code">print-disassembled-instruction array i</span>)</span>
|
|
<span class="paren3">(<span class="code">sleep 0.001</span>)</span></span>)</span></span>)</span></span></code></pre>
|
|
|
|
<p>The <code>sleep</code> is there because Neovim's terminal seems to shit the bed if you dump
|
|
too much text at it at once. Computers are garbage.</p>
|
|
|
|
<p>Other than that, <code>dump-disassembly</code> is pretty straightforward: just iterate
through the array of instructions two bytes at a time and print the information.
Let's look at the printing function now:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> print-disassembled-instruction <span class="paren2">(<span class="code">array index</span>)</span>
|
|
<span class="paren2">(<span class="code">destructuring-bind <span class="paren3">(<span class="code">address instruction disassembly</span>)</span>
|
|
<span class="paren3">(<span class="code">instruction-information array index</span>)</span>
|
|
<span class="paren3">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren4">(<span class="code"><span class="paren5">(<span class="code"><span class="special">*print-base*</span> 16</span>)</span></span>)</span>
|
|
<span class="paren4">(<span class="code">format t <span class="string">"~3,'0X: ~4,'0X ~24A~%"</span>
|
|
address
|
|
instruction
|
|
<span class="paren5">(<span class="code">or disassembly <span class="string">""</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
|
|
|
|
<p>Once again we'll delegate to a helper function.
|
|
<code>print-disassembled-instruction</code> just handles the string formatting to dump an
|
|
instruction to the screen. Running it for a single instruction would print
|
|
something like:</p>
|
|
|
|
<pre class="lineart">
200: 8055 (SUB V0 V5)
 ^    ^    ^
 |    |    |
 |    |    Disassembly
 |    |
 |    Raw instruction
 |
 Address
</pre>
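<p>If the <code>format</code> directives look cryptic, this sketch shows what each one
contributes. <code>format-listing-line</code> is a hypothetical helper that returns the
line as a string instead of printing it, with a trailing <code>|</code> added so the
padding is visible:</p>

<pre><code>;; ~3,'0X : hexadecimal, left-padded with zeros to at least 3 columns
;; ~4,'0X : hexadecimal, left-padded with zeros to at least 4 columns
;; ~24A   : printed as by PRINC, right-padded with spaces to 24 columns
(defun format-listing-line (address instruction disassembly)
  (let ((*print-base* 16)) ; numbers inside the disassembly print in hex too
    (format nil "~3,'0X: ~4,'0X ~24A|"
            address
            instruction
            (or disassembly ""))))

(format-listing-line #x50 #xE0 '(cls))
;; returns a 35-character string that starts with "050: 00E0 (CLS)"
;; and ends with "|" after the padding</code></pre>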
|
|
|
|
<p>The helper function <code>instruction-information</code> is simple, but we'll be using it
|
|
in the future for something else, so it's nice to have:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> instruction-information <span class="paren2">(<span class="code">array index</span>)</span>
|
|
<span class="paren2">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">instruction <span class="paren5">(<span class="code">retrieve-instruction array index</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren3">(<span class="code">list index
|
|
instruction
|
|
<span class="paren4">(<span class="code">disassemble-instruction instruction</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
|
|
|
|
<p>It just takes an address and memory array and returns a list of:</p>
|
|
|
|
<ul>
|
|
<li>The address</li>
|
|
<li>The raw instruction at the address</li>
|
|
<li>The disassembly for that instruction</li>
|
|
</ul>
|
|
|
|
<p><code>retrieve-instruction</code> is simple (for now):</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> retrieve-instruction <span class="paren2">(<span class="code">array index</span>)</span>
|
|
<span class="paren2">(<span class="code">cat-bytes <span class="paren3">(<span class="code">aref array index</span>)</span>
|
|
<span class="paren3">(<span class="code">aref array <span class="paren4">(<span class="code">1+ index</span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
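<p><code>cat-bytes</code> isn't defined in this post. As a rough sketch, assuming it takes
the high byte first and the low byte second (CHIP-8 stores instructions
most-significant byte first), it could look something like this. The
<code>-sketch</code> names are hypothetical:</p>

<pre><code>;; Hypothetical stand-in for CAT-BYTES: deposit HIGH-BYTE into bits 8-15
;; of LOW-BYTE, producing a single 16-bit integer.
(defun cat-bytes-sketch (high-byte low-byte)
  (dpb high-byte (byte 8 8) low-byte))

(defun retrieve-instruction-sketch (array index)
  (cat-bytes-sketch (aref array index)
                    (aref array (1+ index))))

(retrieve-instruction-sketch (vector #x80 #x55) 0)
;; returns 32853, which is #x8055</code></pre>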
|
|
|
|
<p>These functions <em>could</em> be combined into a single bigger function, but I'm
|
|
a strong believer in having each function do exactly one thing. And as we'll
|
|
see, each of these "simple" tasks is going to get more complicated later.</p>
|
|
|
|
<p>Now we can dump the disassembly for a ROM to see how it works:</p>
|
|
|
|
<pre><code>(run "roms/ufo.rom") ; stores the current chip struct in *c*
|
|
|
|
(dump-disassembly (chip-memory *c*) #x200 #x220)
|
|
200: A2CD (LD I 2CD)
|
|
202: 6938 (LD V9 38)
|
|
204: 6A08 (LD VA 8)
|
|
206: D9A3 (DRW V9 VA 3)
|
|
208: A2D0 (LD I 2D0)
|
|
20A: 6B00 (LD VB 0)
|
|
20C: 6C03 (LD VC 3)
|
|
20E: DBC3 (DRW VB VC 3)
|
|
210: A2D6 (LD I 2D6)
|
|
212: 641D (LD V4 1D)
|
|
214: 651F (LD V5 1F)
|
|
216: D451 (DRW V4 V5 1)
|
|
218: 6700 (LD V7 0)
|
|
21A: 680F (LD V8 F)
|
|
21C: 22A2 (CALL 2A2)
|
|
21E: 22AC (CALL 2AC)</code></pre>
|
|
|
|
<h2 id="s3-sprites"><a href="index.html#s3-sprites">Sprites</a></h2>
|
|
|
|
<p>Take a look at <code>print-disassembled-instruction</code> again:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> print-disassembled-instruction <span class="paren2">(<span class="code">array index</span>)</span>
|
|
<span class="paren2">(<span class="code">destructuring-bind <span class="paren3">(<span class="code">address instruction disassembly</span>)</span>
|
|
<span class="paren3">(<span class="code">instruction-information array index</span>)</span>
|
|
<span class="paren3">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren4">(<span class="code"><span class="paren5">(<span class="code"><span class="special">*print-base*</span> 16</span>)</span></span>)</span>
|
|
<span class="paren4">(<span class="code">format t <span class="string">"~3,'0X: ~4,'0X ~24A~%"</span>
|
|
address
|
|
instruction
|
|
<span class="paren5">(<span class="code">or disassembly <span class="string">""</span></span>)</span></span>)</span></span>)</span></span>)</span></span>)</span></span></code></pre>
|
|
|
|
<p>Notice how we used <code>(or disassembly "")</code> instead of just passing in the
|
|
disassembly. If you look back at <code>disassemble-instruction</code> you'll see it uses
|
|
normal <code>case</code> statements, not <code>ecase</code>, so if the instruction doesn't match any
|
|
valid opcodes it will return <code>nil</code>.</p>
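<p>Here's a tiny illustration of the difference, using a cut-down version of the
<code>#x0000</code> branch. The function names are made up for this example:</p>

<pre><code>(defun decode-zero-group (instruction)
  (case instruction
    (#x00E0 '(cls))
    (#x00EE '(ret))))

(decode-zero-group #x00E0) ; returns (CLS)
(decode-zero-group #x0123) ; returns NIL because no clause matched

(defun strict-decode-zero-group (instruction)
  (ecase instruction
    (#x00E0 '(cls))
    (#x00EE '(ret))))

;; ECASE signals an error when nothing matches, so we catch it here
;; just to show what would happen.
(handler-case (strict-decode-zero-group #x0123)
  (error () :no-matching-clause))
;; returns :NO-MATCHING-CLAUSE</code></pre>

<p>Returning <code>nil</code> is what we want for a disassembler, because plenty of the
bytes in a ROM won't be valid instructions at all.</p>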
|
|
|
|
<p>The CHIP-8 (like <a href="https://en.wikipedia.org/wiki/Von_Neumann_architecture">most computers</a>) uses the same memory to hold both
|
|
program code (instructions) and data. Data includes things like player
|
|
health, score, location, and most importantly: the sprites that will be drawn on
|
|
the screen.</p>
|
|
|
|
<p>Unfortunately there's no way to know for sure whether a given memory location
|
|
contains an instruction (and thus needs to be disassembled) or is intended to be
|
|
a piece of data. Indeed, someone like <a href="http://www.catb.org/jargon/html/story-of-mel.html">Mel</a> could conceivably figure out
|
|
a way to make a particular sequence of instructions perform double duty as
|
|
a sprite! So our disassembler will just always show the disassembly for
|
|
anything that <em>might</em> be an instruction.</p>
|
|
|
|
<p>But with that said, we can probably make some educated guesses. If we had some
|
|
way to visualize what a hunk of memory would look like <em>if</em> it were rendered as
|
|
a sprite, we could probably figure out where most of the program's sprites are
|
|
kept. It's unlikely that any given sequence of instructions would just <em>happen</em>
|
|
to look like a ghost from Pac Man or something.</p>
|
|
|
|
<p>We could add a separate function to draw out the sprite data, but the CHIP-8's
|
|
sprites are so simple that we can just tack it on to the disassembly output.
|
|
<a href="http://devernay.free.fr/hacks/chip8/C8TECH10.HTM#2.4">Remember</a> that each byte of memory defines one eight-pixel-wide
|
|
row of a sprite, and that <code>DRW X, Y, Size</code> will draw <code>Size</code> rows of a sprite
|
|
using contiguous bytes in memory. So if memory contains something like this (at
|
|
the location specified by the index register):</p>
|
|
|
|
<pre><code>Address Data
|
|
#x300 #b11110000
|
|
#x301 #b00010000
|
|
#x302 #b11110000
|
|
#x303 #b00010000
|
|
#x304 #b11110000</code></pre>
|
|
|
|
<p>A <code>DRW X, Y, 5</code> instruction would draw a <code>3</code> sprite to the screen:</p>
|
|
|
|
<pre class="lineart">
████
   █
████
   █
████
</pre>
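<p>As a quick sanity check, here's a sketch that renders sprite bytes one row per
line. <code>sprite-row-string</code> and <code>print-sprite</code> are hypothetical helpers, and
the sketch assumes a Lisp with Unicode characters so that
<code>(code-char #x2588)</code> is the full block character:</p>

<pre><code>(defun sprite-row-string (byte)
  "Return an 8-character string: a full block for each set bit, a space otherwise."
  (let ((row (make-string 8 :initial-element #\Space)))
    (dotimes (column 8 row)
      ;; Column 0 is the leftmost pixel, which is the most significant bit.
      (when (logbitp (- 7 column) byte)
        (setf (char row column) (code-char #x2588))))))

(defun print-sprite (bytes)
  (map nil
       (lambda (byte) (write-line (sprite-row-string byte)))
       bytes))

;; The five bytes from the table above.
(print-sprite (vector #b11110000 #b00010000 #b11110000 #b00010000 #b11110000))</code></pre>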
|
|
|
|
<p>It would be trivial to simply render the bits of any given instruction as spaces
|
|
and some other ASCII character and tack it onto the end of the disassembly, but
|
|
there's a snag: instructions are <em>two</em> bytes each, but each row in a sprite is
|
|
<em>one</em> byte long. Our sprites would get pretty mangled if we printed two of
|
|
their rows per line of disassembly — for example, <code>4</code> would look like this:</p>
|
|
|
|
<pre class="lineart">
                                   byte 1  byte 2
                                   1111111122222222
064: 9090 (SNE V0 V9)              █  █    █  █
066: F010                          ████       █
068: 10F0 (JP F0)                     █    ████
</pre>
|
|
|
|
<p>Not ideal. One option would be to give every instruction two lines of
disassembly output, but that's painful to read when trying to look at the code
portions of the ROM. We can get around this with a delightful little hack: using
characters from <a href="https://en.wikipedia.org/wiki/Block_Elements">Unicode Block Elements</a> to cram two rows of sprite data
into a single line of output. Let's start by defining a <code>bit-diagram</code> function
that will take a two-byte-wide integer and return a string diagram of its bits,
with the first byte as the top row and the second byte as the bottom row:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> bit-diagram <span class="paren2">(<span class="code">integer</span>)</span>
|
|
<span class="paren2">(<span class="code">iterate <span class="paren3">(<span class="code">for high-bit <span class="keyword">:from</span> 15 <span class="keyword">:downto</span> 8</span>)</span>
|
|
<span class="paren3">(<span class="code">for low-bit <span class="keyword">:from</span> 7 <span class="keyword">:downto</span> 0</span>)</span>
|
|
<span class="paren3">(<span class="code">for hi = <span class="paren4">(<span class="code">logbitp high-bit integer</span>)</span></span>)</span>
|
|
<span class="paren3">(<span class="code">for lo = <span class="paren4">(<span class="code">logbitp low-bit integer</span>)</span></span>)</span>
|
|
<span class="paren3">(<span class="code">collect <span class="paren4">(<span class="code"><i><span class="symbol">cond</span></i> <span class="paren5">(<span class="code"><span class="paren6">(<span class="code">and hi lo</span>)</span> <span class="character">#\full_block</span></span>)</span>
|
|
<span class="paren5">(<span class="code">hi <span class="character">#\upper_half_block</span></span>)</span>
|
|
<span class="paren5">(<span class="code">lo <span class="character">#\lower_half_block</span></span>)</span>
|
|
<span class="paren5">(<span class="code">t <span class="character">#\space</span></span>)</span></span>)</span>
|
|
<span class="keyword">:result-type</span> 'string</span>)</span></span>)</span></span>)</span></span></code></pre>
|
|
|
|
<pre><code><span class="code"><span class="comment">; Example rows of sprite data:
|
|
</span><span class="comment">; 11110000
|
|
</span><span class="comment">; 11001100
|
|
</span>
|
|
<span class="paren1">(<span class="code">bit-diagram #b1111000011001100</span>)</span>
"██▀▀▄▄  "</span></code></pre>
|
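<p>The version above leans on the <code>iterate</code> library and on character names
like <code>#\full_block</code>, which not every implementation is guaranteed to
recognize. If that's a problem, something along these lines should behave the
same way using only standard operators and Unicode code points.
<code>bit-diagram-portable</code> is a hypothetical name, and the sketch assumes your Lisp
supports Unicode characters:</p>

<pre><code>(defun bit-diagram-portable (integer)
  (let ((diagram (make-string 8 :initial-element #\Space)))
    (dotimes (column 8 diagram)
      (let ((hi (logbitp (- 15 column) integer)) ; pixel from the first byte
            (lo (logbitp (- 7 column) integer))) ; pixel from the second byte
        (setf (char diagram column)
              (cond ((and hi lo) (code-char #x2588)) ; FULL BLOCK
                    (hi (code-char #x2580))          ; UPPER HALF BLOCK
                    (lo (code-char #x2584))          ; LOWER HALF BLOCK
                    (t #\Space)))))))

(bit-diagram-portable #b1111000011001100)
;; two full blocks, two upper halves, two lower halves, two spaces</code></pre>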
|
|
|
<p>Now that we've got this we can easily add it into our disassembly functions:</p>
|
|
|
|
<pre><code><span class="code"><span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> instruction-information <span class="paren2">(<span class="code">array index</span>)</span>
|
|
<span class="paren2">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren3">(<span class="code"><span class="paren4">(<span class="code">instruction <span class="paren5">(<span class="code">retrieve-instruction array index</span>)</span></span>)</span></span>)</span>
|
|
<span class="paren3">(<span class="code">list index
|
|
instruction
|
|
<span class="paren4">(<span class="code">disassemble-instruction instruction</span>)</span>
|
|
<span class="paren4">(<span class="code">bit-diagram instruction</span>)</span></span>)</span></span>)</span></span>)</span> <span class="comment">; NEW
|
|
</span>
|
|
<span class="paren1">(<span class="code"><i><span class="symbol">defun</span></i> print-disassembled-instruction <span class="paren2">(<span class="code">array index</span>)</span>
|
|
<span class="paren2">(<span class="code">destructuring-bind <span class="paren3">(<span class="code">address instruction disassembly bits</span>)</span>
|
|
<span class="paren3">(<span class="code">instruction-information array index</span>)</span>
|
|
<span class="paren3">(<span class="code"><i><span class="symbol">let</span></i> <span class="paren4">(<span class="code"><span class="paren5">(<span class="code"><span class="special">*print-base*</span> 16</span>)</span></span>)</span>
|
|
<span class="comment">; NEW
|
|
</span> <span class="paren4">(<span class="code">format t <span class="string">"~3,'0X: ~4,'0X ~24A ~8A~%"</span>
|
|
address
|
|
instruction
|
|
<span class="paren5">(<span class="code">or disassembly <span class="string">""</span></span>)</span>
|
|
bits</span>)</span></span>)</span></span>)</span></span>)</span> <span class="comment">; NEW</span></span></code></pre>
|
|
|
|
<p>Now when we dump the disassembly of a ROM we'll also see what each instruction
|
|
would look like if drawn as a sprite. For program code this will tend to look
|
|
like garbage (unless some crazy person has managed to write code that also works
|
|
as sprites):</p>
|
|
|
|
<pre class="lineart" style="line-height: 1.0 !important;">
200: A2CD (LD I 2CD)               █▄▀ ▄▄▀▄
202: 6938 (LD V9 38)                ▀█▄█  ▀
204: 6A08 (LD VA 8)                 ▀▀ █ ▀
206: D9A3 (DRW V9 VA 3)            █▀▄▀▀ ▄█
208: A2D0 (LD I 2D0)               █▄▀▄  ▀
20A: 6B00 (LD VB 0)                 ▀▀ ▀ ▀▀
20C: 6C03 (LD VC 3)                 ▀▀ ▀▀▄▄
20E: DBC3 (DRW VB VC 3)            ██ ▀▀ ██
</pre>
|
|
|
|
<p>But when we look at areas of the ROM that <em>do</em> contain sprites, they look pretty
|
|
recognizable:</p>
|
|
|
|
<pre class="lineart" style="line-height: 1.0 !important;">
050: F090                          █▀▀█
052: 9090 (SNE V0 V9)              █  █
054: F020                          ▀▀█▀
056: 6020 (LD V0 20)                ▀█
058: 2070 (CALL 70)                 ▄█▄
05A: F010                          ▀▀▀█
05C: F080                          █▀▀▀
05E: F0F0                          ████
060: 10F0 (JP F0)                  ▄▄▄█
062: 10F0 (JP F0)                  ▄▄▄█
064: 9090 (SNE V0 V9)              █  █
066: F010                          ▀▀▀█
068: 10F0 (JP F0)                  ▄▄▄█
06A: 80F0 (LD V0 VF)               █▄▄▄
06C: 10F0 (JP F0)                  ▄▄▄█
06E: F080                          █▀▀▀
</pre>
|
|
|
|
<p>Human eyes are pretty good at picking out patterns, so when you're scrolling
through a disassembled ROM it's pretty easy to tell which sections are sprites
and which are code, even if it's not perfectly rendered.</p>
|
|
|
|
<h2 id="s4-result"><a href="index.html#s4-result">Result</a></h2>
|
|
|
|
<p>We've now got a way to dump the disassembly of a ROM to see what its code and
|
|
data look like.</p>
|
|
|
|
<p>We can also inspect the rest of our emulator's state at runtime with NREPL or
|
|
SLIME by running things like <code>(chip-program-counter *c*)</code>.</p>
|
|
|
|
<h2 id="s5-future"><a href="index.html#s5-future">Future</a></h2>
|
|
|
|
<p>Manually querying for information and dumping the disassembly isn't very
|
|
ergonomic, so in the future we'll look at adding:</p>
|
|
|
|
<ul>
|
|
<li>Debugging infrastructure like pausing and breakpoints</li>
|
|
<li>A graphical debugger/disassembly viewer</li>
|
|
</ul>
|
|
|
|
<p>As well as a few other niceties like menus for loading ROMs, etc.</p>
|
|
</article></main><hr class='main-separator' /><footer><nav><a href='https://github.com/sjl/'>GitHub</a> ・ <a href='https://twitter.com/stevelosh/'>Twitter</a> ・ <a href='https://instagram.com/thirtytwobirds/'>Instagram</a> ・ <a href='https://hg.stevelosh.com/.plan/'>.plan</a></nav></footer></body></html>