Book Scanning

I mentioned this on Twitter a couple of times over the past week and it seemed to generate a fair amount of interest. I figured a blog post would clear things up.

A number of months ago, I bought myself a Canon P-150 scanner (Amazon.co.uk) for scanning letters, receipts and so on. I found it rather addictive in a weird sort of way and since then have progressively moved to being completely paperless.

Being Paperless

My paperless workflow isn't that complex, really. It's based on being an Evernote premium subscriber. I scan pieces of paper and put them in a notebook. For business receipts, I have one notebook for each financial year; everything else goes into a notebook called "Personal Archive". When I want something, I search for it. It's very Gmail-inspired, and the Gmail approach has worked well for me for years now.
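The filing rule is simple enough to sketch in a few lines of Python. This is only an illustration, not anything Evernote provides: the `financial_year` and `notebook_for` functions, the "Receipts ..." notebook names, and the assumption that the financial year starts on 6 April (the UK tax year) are all made up for the example. Only "Personal Archive" comes from the actual setup.

```python
from datetime import date

# Assumption: the financial year starts on 6 April (UK tax year).
# Change these two values if your year starts on a different day.
FY_START_MONTH, FY_START_DAY = 4, 6


def financial_year(d: date) -> str:
    """Return a label like '2012-13' for the financial year containing d."""
    if (d.month, d.day) >= (FY_START_MONTH, FY_START_DAY):
        start = d.year
    else:
        start = d.year - 1
    return f"{start}-{(start + 1) % 100:02d}"


def notebook_for(is_business_receipt: bool, d: date) -> str:
    """Pick the notebook a scan is filed in (receipt notebook names are hypothetical)."""
    if is_business_receipt:
        return f"Receipts {financial_year(d)}"
    return "Personal Archive"
```

Everything that isn't a business receipt falls through to the single archive notebook, which is the whole point: no filing decisions, just search.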

ARCHIVE ALL THE THINGS!

Since I've been on this road, I've realised that there's essentially no limit to the stuff I can archive in Evernote in case I ever need it again. It's out of sight and it's retrievable by search. Evernote is increasingly becoming the 'fourth service' behind Twitter, Facebook and Instapaper for apps to share to. Here are some other routes for data into my Evernote.

I already wrote in detail about my Web-to-Evernote workflow. I've been using Pocket recently instead of Instapaper but sharing to Evernote is also supported there, so the workflow still works.

PDFs that I download in Safari are just opened directly in Evernote.

When I'm out and about, I've come to rely on Readdle's Scanner Pro iPhone app. Scanner Pro can snap a piece of paper, correct the keystoning effect and send the document to Evernote in a few taps. It's how I get rid of paper on the go.

I like podcasts and, while they're mostly ephemeral, occasionally there are ones that I'll want to keep. I use Instacast on my iPhone to subscribe and listen to podcasts. Instacast supports "Open In" as a mechanism to send the downloaded audio file elsewhere. Opening an MP3 file in Evernote on iOS just about works, although the app will occasionally freak out if the file is large. As awesome as Evernote search is, it can't yet index the audio inside an MP3 file, so for podcasts I also add the title and episode text to the body of the note.

You Can't Beat the Smell of New Bits

As you probably know if you're a regular reader of this blog, I'm the kind of person who likes to push ideas as far as they can go. So now, I'm thinking: "what if everything went into Evernote?" I'm a big fan of eBooks, having owned a Kindle for many years, so I started wondering whether I could use my scan-to-Evernote workflow to convert my paper books into eBooks.

My first test was with a copy of Ground Control by Anna Minton which, by virtue of a 1-click accident, I had two copies of. I used a Stanley knife to cut the pages out and fed them into the Canon scanner. That worked well enough, but the Stanley knife made quite a ragged cut on the binding edge of the paper. As a result, the pages tended to catch on each other and mis-feed through the scanner, which made the scan take a while.

My next try was with a copy of Nicholas Negroponte's "Being Digital" (ironically not available on Kindle). This time, I used a modelling scalpel to cut the pages out of the book. This was much more successful in creating a clean cut down the spine of the book. In scanning "Being Digital", I ended up with about three mis-feeds in the entire book.

Because the Canon P-150 is a multi-page feeder scanner that scans both sides in one pass, scanning the entire stack of pages was painless. It's not quite as simple as putting the book in the scanner and walking away - the hopper isn't big enough to take the whole book - but it didn't take that long to scan the book in batches of about 25 pages.

The result is that, armed with a £200 scanner, a scalpel and a steel rule, I can turn a book from paper to digital in about 20 minutes. That's a pretty cool capability.
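To get a feel for the batching, here is a rough sketch of the arithmetic in Python. The 25-page batch and the 20-minute total come from the text; the 250-page book is a hypothetical figure chosen for round numbers, not the length of either book mentioned here, and `plan_batches` is just an illustrative helper.

```python
def plan_batches(total_pages: int, batch_size: int = 25) -> list[range]:
    """Split page numbers 1..total_pages into consecutive feeder batches."""
    return [
        range(start, min(start + batch_size, total_pages + 1))
        for start in range(1, total_pages + 1, batch_size)
    ]


batches = plan_batches(250)  # hypothetical 250-page book
minutes_per_batch = 20 / len(batches)  # 20 minutes is the rough total from the text
print(len(batches), "batches, about", minutes_per_batch, "minutes each")
```

Since the 20 minutes also covers cutting the pages out, the per-batch figure is an upper bound on the time spent actually feeding the scanner.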

It was not my intent to take a paper book and turn it into a functional ePub file. All I wanted to do was make my book portable and readable on my iPad alongside all my Kindle books. That kind of precision book scanning to extract usable text definitely requires more advanced OCR software than I have at my disposal. Unless the book is absolutely not available digitally, it would probably be a better use of time and money to just re-buy it as an eBook.

The resulting book file was around 40MB in size, scanned at 200dpi. I dropped it in Evernote and synced it to my devices, where I was able to open it in the Kindle app and read it.

Scanned book in Kindle

I have also uploaded a couple of pages from the book to give you an idea of the quality. Download: Being Digital Sample PDF.