<!DOCTYPE html>
<html lang='en'><head><meta charset='utf-8' /><meta name='pinterest' content='nopin' /><link href='../../../../static/css/style.css' rel='stylesheet' type='text/css' /><link href='../../../../static/css/print.css' rel='stylesheet' type='text/css' media='print' /><title>Going Paper-Free for $220 / Steve Losh</title></head><body><header><a id='logo' href='https://stevelosh.com/'>Steve Losh</a><nav><a href='../../../index.html'>Blog</a> - <a href='https://stevelosh.com/projects/'>Projects</a> - <a href='https://stevelosh.com/photography/'>Photography</a> - <a href='https://stevelosh.com/links/'>Links</a> - <a href='https://stevelosh.com/rss.xml'>Feed</a></nav></header><hr class='main-separator' /><main id='page-blog-entry'><article><h1><a href='index.html'>Going Paper-Free for $220</a></h1><p class='date'>Posted on May 26th, 2011.</p><p>It's 2011. Personal computers have been around and popular for well over a decade
now, and yet we still have to deal with a huge amount of physical paper.</p>

<p>I've been wanting to go paper-free for a long time now. The advantages are obvious:</p>

<ul>
<li>Paper takes up physical space in our homes that digital files don't.</li>
<li>Digital files, if properly encrypted, are far more secure than sheets of paper that
could be stolen.</li>
<li>Digital files can be searched in an instant, while papers have to be laboriously
sorted through.</li>
<li>Digital files can be backed up perfectly and easily.</li>
</ul>

<p>After reading <a href="http://ryanwaggoner.com/2010/11/how-i-filled-two-dumpsters-and-went-paperless-with-the-fujitsu-scansnap-s1500/">this article</a> I was psyched to scan and shred all the boxes of paper
sitting in my apartment, but the $420+ price tag was hard to swallow. I started
looking around for other options.</p>

<p>Here are the requirements I have for any paper-free system:</p>

<ul>
<li>The scanned files need to be OCR'ed so I can search them easily. I'm too lazy to
categorize and tag files manually.</li>
<li>I need to be able to scan files anywhere. If I'm out at dinner I want to be able
to snap a picture of my receipt and tear it up right there.</li>
<li>No "cloud" services allowed for unencrypted important documents. I simply don't
trust Google/Dropbox/etc enough to put my bank statements and such there.</li>
<li>Files need to be backed up securely in case my apartment burns down.</li>
<li>The entire process needs to be automated as much as possible, otherwise I'll get
lazy and not scan things.</li>
</ul>

<p>It's taken me a while, but I've finally got a system I'm happy with. This post will
describe each part and how they fit together. The total cost is about $220, $160 of
which is for a physical scanner.</p>

<p>Note: I use OS X and an iPhone, so this post will focus on that platform. However,
the important pieces of software will run on Windows and I'm sure there are
Windows/Android equivalents to the other pieces.</p>

<ol class="table-of-contents"><li><a href="index.html#s1-scanning-at-home">Scanning at Home</a></li><li><a href="index.html#s2-scanning-on-the-go">Scanning on the Go</a></li><li><a href="index.html#s3-ocr-ing-scanned-documents">OCR'ing Scanned Documents</a></li><li><a href="index.html#s4-gluing-everything-together">Gluing Everything Together</a></li><li><a href="index.html#s5-backing-up">Backing Up</a></li><li><a href="index.html#s6-destroying-the-originals">Destroying the Originals</a></li><li><a href="index.html#s7-summary">Summary</a></li></ol>

<h2 id="s1-scanning-at-home"><a href="index.html#s1-scanning-at-home">Scanning at Home</a></h2>

<p>The first step to becoming paper-free is obviously scanning your documents. There
are a lot of scanners out there, some more expensive than others. I eventually
settled on a <a href="http://www.getdoxie.com/">Doxie</a> for $160.</p>

<p>I chose the Doxie because:</p>

<ul>
<li>It's compact.</li>
<li>It runs with a single USB cable.</li>
<li>It's cross-platform.</li>
<li>Its software has a great, polished UI.</li>
<li>It has a "multiple-function button" that lets you control it without the
mouse/keyboard.</li>
</ul>

<p>The first, second, and last points mean that (with a USB extension cable) I can scan
documents while sitting on the couch and watching Netflix, which is critical for lazy
people like me.</p>

<p>I set Doxie to save scans on my Desktop. The scanning process is pretty simple so
I won't describe it here. Check out Doxie's documentation for more information.</p>

<p><strong>Note:</strong> When I first received my Doxie and tried to calibrate it, it simply made
a grinding noise and wouldn't feed the paper. I emailed their tech support and
within half an hour I got a response back saying they were shipping me a replacement
immediately.</p>

<p>When I got the replacement it worked like a charm. Their customer service was so
great that I'd still recommend the Doxie even though my first one was a dud.</p>

<h2 id="s2-scanning-on-the-go"><a href="index.html#s2-scanning-on-the-go">Scanning on the Go</a></h2>

<p>As I mentioned before, I want to be able to scan things while out and about with my
iPhone. There are a bunch of iPhone document-scanning apps out there. I settled on
<a href="http://itunes.apple.com/us/app/jotnot-scanner-pro/id307868751?mt=8">JotNot</a> for $7 because it has a decent UI and supports multiple-page PDFs.</p>

<p>JotNot's UI is pretty easy to get the hang of so I won't go over it here.</p>

<p>Once I finish scanning something I send the PDF to a <a href="http://www.dropbox.com/">Dropbox</a> folder called
"JotNot".</p>

<p>I know I said in my requirements that "cloud" services weren't allowed, but I make an
exception for non-critical things that I'd be scanning with my phone. I don't care if
Dropbox knows how much I spent on dinner.</p>

<h2 id="s3-ocr-ing-scanned-documents"><a href="index.html#s3-ocr-ing-scanned-documents">OCR'ing Scanned Documents</a></h2>

<p>The next step is to run the scanned PDFs through an OCR program so they can be
searched with Spotlight.</p>

<p>I looked at a lot of OCR software and finally settled on <a href="http://solutions.weblite.ca/pdfocrx/">PDF OCR X</a> for $30. It
has a simple interface, does a pretty good job at OCR'ing, has a free version so
I could try it out, and is cross-platform.</p>

<p>Using it is simple: you drag a PDF onto the app and select your desired settings
(make sure to choose "searchable PDF" as the output format). The app will think for
a while and then create a new PDF next to the old one with the searchable text
embedded.</p>

<p>Once you've done this once you should go into the preferences and change it to
non-interactive mode so that it won't prompt you for the settings every time you use
it.</p>

<h2 id="s4-gluing-everything-together"><a href="index.html#s4-gluing-everything-together">Gluing Everything Together</a></h2>

<p>So far we've got two folders with scanned PDFs and a method for OCR'ing them. The
next step is to automate the process.</p>

<p>I use an app called <a href="http://www.noodlesoft.com/hazel.php">Hazel</a> to do this. It's $21 for a license and well worth it.
We'll set up four rules to make our lives easier.</p>

<p>Before we start we need to create two folders somewhere (you can name them whatever
you like):</p>

<ul>
<li>Pending OCR: A folder to hold documents that are waiting to be OCR'ed.</li>
<li>Dead Trees: A folder to hold the final, OCR'ed versions of our documents.</li>
</ul>

<p>The first rule watches the Desktop for scans from Doxie. Any files placed on the
Desktop whose name starts with "Doxie Doc" will be renamed to include the current
date and time, and then moved to the "Pending OCR" folder.</p>

<p><img src="../../../../static/images/blog/2011/05/rules-1-doxie.png" alt="Rule 1 Screenshot" title="Rule 1"></p>

<p><strong>Note:</strong> you'll need to click the <code>date created</code> bubble and then "Edit Date" to get
the time as well as the date into the filename.</p>
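
<p>Hazel rules are configured through its UI rather than written as code, but the logic
of the first rule can be sketched in a few lines of Python. This is only an
illustration under some assumptions: the function names, the timestamp format, and the
use of the modification time as a stand-in for the creation date are all hypothetical
choices, not anything Hazel exposes.</p>

```python
import datetime
import os
import shutil

PREFIX = "Doxie Doc"  # name prefix the rule matches, as described in the post


def timestamped_name(name, when):
    """Return name with a date-and-time stamp inserted before the extension."""
    stem, ext = os.path.splitext(name)
    # The stamp format is an assumption made for this sketch.
    return "%s %s%s" % (stem, when.strftime("%Y-%m-%d %H-%M-%S"), ext)


def file_doxie_scans(desktop, pending):
    """Move every Doxie scan from desktop into pending under a stamped name."""
    moved = []
    for name in sorted(os.listdir(desktop)):
        source = os.path.join(desktop, name)
        if not (name.startswith(PREFIX) and os.path.isfile(source)):
            continue
        # The rule uses the creation date; modification time is a portable stand-in.
        when = datetime.datetime.fromtimestamp(os.path.getmtime(source))
        target = os.path.join(pending, timestamped_name(name, when))
        shutil.move(source, target)
        moved.append(target)
    return moved
```

<p>Including the time as well as the date is what keeps names unique when several
documents are scanned on the same day.</p>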

<p>The second rule watches the "JotNot" folder for scans from the iPhone app. Any PDFs
that appear in here (i.e. that are synced down from Dropbox) will be moved to the
"Pending OCR" folder. We don't need to rename them like we did with the Doxie scans
because JotNot already includes the date and time of scans in the filenames by
default.</p>

<p><img src="../../../../static/images/blog/2011/05/rules-2-jotnot.png" alt="Rule 2 Screenshot" title="Rule 2"></p>

<p>Now that we've got all of our scans going into the same folder (with unique names) we
can set up a rule to OCR them. The third rule watches the "Pending OCR" folder for
PDFs. When a PDF lands in the folder it will be moved to its final destination
folder ("Dead Trees" in my case) and then opened in PDF OCR X. Because I've put PDF
OCR X in non-interactive mode the files will automatically be OCR'ed without any
intervention from me.</p>

<p><img src="../../../../static/images/blog/2011/05/rules-3-ocr.png" alt="Rule 3 Screenshot" title="Rule 3"></p>

<p>The fourth and final rule watches for the OCR'ed copies of our scans and runs
a script to move the originals to the trash once the searchable versions are ready.
It doesn't delete the files completely because I want a safety net in case something
goes wrong.</p>

<p><img src="../../../../static/images/blog/2011/05/rules-4-clean.png" alt="Rule 4 Screenshot" title="Rule 4"></p>

<p><strong>Note:</strong> make sure you change the Shell to <code>/usr/bin/python</code>. Here's the text of
the script so you can copy and paste it:</p>

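<pre><code>import sys, os

# Ask Finder to move a file to the trash instead of deleting it outright.
RM_CMD = r"""osascript -e 'tell app "Finder" to move the POSIX file "%s" to trash'"""

# The first argument is the OCR'ed copy; dropping the last two dot-separated
# pieces of its name gives the path of the original scan.
old_file = sys.argv[1].rsplit('.', 2)[0]
if os.path.exists(old_file):
    os.system(RM_CMD % os.path.abspath(old_file))
</code></pre>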
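
<p>To make the <code>rsplit('.', 2)[0]</code> step in the script above concrete, here is
a small standalone sketch. The file names are hypothetical: they assume the OCR'ed copy
is named after the original with two extra dot-separated suffixes, which may not match
what your version of PDF OCR X produces.</p>

```python
def original_path(ocr_path):
    """Strip the last two dot-separated pieces, mirroring the script's rsplit call."""
    return ocr_path.rsplit('.', 2)[0]


# Hypothetical OCR'ed name: the original's full name plus two more suffixes.
print(original_path("/Users/me/Dead Trees/receipt.pdf.searchable.pdf"))
# prints: /Users/me/Dead Trees/receipt.pdf

# A name with only two dots loses its own extension as well.
print(original_path("report.final.pdf"))
# prints: report
```

<p>If the derived path doesn't point at a real file, the <code>os.path.exists</code>
check in the script means nothing gets trashed.</p>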

<p>Once these four rules are in place we can simply scan a document with Doxie or JotNot
and it will automatically be OCR'ed and placed in our "Dead Trees" folder, with no
intervention from us!</p>

<h2 id="s5-backing-up"><a href="index.html#s5-backing-up">Backing Up</a></h2>

<p>A while ago I was using Mozy for full backups. Recently they changed their pricing so
it was no longer unlimited. When that happened I switched to <a href="http://www.backblaze.com/">Backblaze</a> and
couldn't be happier.</p>

<p>Backblaze's UI is leaps and bounds above Mozy's, and they offer an option to generate
a secure encryption key for encrypting your backups. I highly recommend this, but be
sure to have a few copies of your key because you'll need it to restore your backups.</p>

<p>Backblaze is also only $5 per month (less if you pay for a year in advance) for
unlimited backups, which is definitely a bargain. As a bonus, they just released
a <a href="http://blog.backblaze.com/2011/05/23/lost-your-computer-get-it-back-backblaze-launches-locate-my-computer/">"find my computer" feature</a> that's kind of like a lightweight
version of <a href="http://www.orbicule.com/undercover/">Undercover</a>, so it's an even better deal.</p>

<h2 id="s6-destroying-the-originals"><a href="index.html#s6-destroying-the-originals">Destroying the Originals</a></h2>

<p>Once the documents are scanned and backed up it's time to destroy the physical paper.
If you live in a rural area you could burn them for free.</p>

<p>Those of us who can't start random fires need a paper shredder. I use a shredder
I picked up a long time ago; any crosscut shredder will do the job.</p>

<h2 id="s7-summary"><a href="index.html#s7-summary">Summary</a></h2>

<p>After all of this I've now got a mostly-automated system that lets me go paper-free.
The costs are:</p>

<ul>
<li>Doxie Scanner: $160</li>
<li>JotNot: $7</li>
<li>PDF OCR X: $30</li>
<li>Hazel: $21</li>
<li>Backblaze: $5 per month</li>
</ul>
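
<p>Adding up the list above is simple arithmetic, sketched here in Python. The
first-year figure assumes paying Backblaze month to month at the $5 rate, so it would
come out a little lower with the annual discount.</p>

```python
# One-time purchases, in dollars, taken from the list above.
one_time = {"Doxie Scanner": 160, "JotNot": 7, "PDF OCR X": 30, "Hazel": 21}
backblaze_monthly = 5  # assumes month-to-month billing, no annual discount

initial = sum(one_time.values())
first_year = initial + 12 * backblaze_monthly

print(initial)     # 218
print(first_year)  # 278
```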

<p>For me the $218 initial cost is worth it. Now I can search all of my paper in
an instant and my apartment is much less cluttered. If you have the money to spare I'd
definitely consider trying it.</p>
</article></main><hr class='main-separator' /><footer><nav><a href='https://github.com/sjl/'>GitHub</a> ・ <a href='https://twitter.com/stevelosh/'>Twitter</a> ・ <a href='https://instagram.com/thirtytwobirds/'>Instagram</a> ・ <a href='https://hg.stevelosh.com/.plan/'>.plan</a></nav></footer></body></html>