Google Turns Vacuum Cleaner Into Book Scanner

Google has spent years scanning books, recreating the world's leading libraries as digital files we can read and search on PCs and smartphones. The project has faced a fair amount of opposition -- including a lawsuit from authors and publishers who complained Google didn't have the legal right to put many of its scans online -- but whatever you think of Google's approach to copyright law, the project was an impressive feat, technically speaking.
The Linear Book Scanner is an open source, DIY book scanning project created by Google engineer Dany Qumsiyeh.

Many of the 7 million books scanned by Google were rare and out of print, and the trick was to scan them without damaging the pages or the binding. Though it's unclear how Google did this, the company has patented a scanner that avoids such damage with the help of an infrared projector and two infrared cameras.

That scanner is likely a rather expensive contraption, but now the company has built a dirt-cheap alternative. Google engineer Dany Qumsiyeh recently built a binding-friendly scanner that can digitize a 1,000-page book in about 90 minutes, and it cost him only about $1,500. Typically, if you want to scan a book without destroying the binding, you have to manually scan or photograph each page, or fork over many thousands of dollars for a high-end scanner from companies like Kirtas or Treventus.
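To put that speed in perspective, here is a quick back-of-the-envelope calculation. It uses only the two figures quoted above (1,000 pages, about 90 minutes) and assumes, for simplicity, that the pace is constant from the first page to the last, which a prototype may not manage in practice.

```python
# Figures quoted in the article; everything else is derived from them.
PAGES = 1_000
MINUTES = 90

# Assumption: the scanner runs at a constant pace for the whole book.
pages_per_minute = PAGES / MINUTES
seconds_per_page = MINUTES * 60 / PAGES

print(f"{pages_per_minute:.1f} pages per minute")  # 11.1 pages per minute
print(f"{seconds_per_page:.1f} seconds per page")  # 5.4 seconds per page
```

That works out to roughly one page every five and a half seconds.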

If you like, you can build one too. Google has open sourced the design. The primary components are a vacuum cleaner, an off-the-shelf Canon document scanner, and some sheet metal. The device can simply use Canon's drivers and software, reducing the amount of custom code and electronics work that needs to be done.

Qumsiyeh explained how the scanner works in a presentation at Google in May, which you can find online at YouTube. As part of the open source release, Google has published a 21-page document on the device's design and construction.

The machine uses the vacuum to pull the book down the prism-shaped body of the device containing the scanner sensors. The pages are separated by suctioning the top sheet and using a piece of sheet metal to divide the page from the rest of the book. Qumsiyeh says it's possible for the pages to be stuck together, but he has taken pains to make the separator as efficient as possible.

One of the biggest concerns is that the machine will badly mangle the books if the device malfunctions. In the presentation, Qumsiyeh points out that even the high-end commercial machines still occasionally tear pages, though it's rare. He says that he spoke with a conservationist from Berkeley about the risks, and was told that libraries are likely to think being able to preserve and disseminate these books is worth the risk of damaging them.

Qumsiyeh says that there are sensors in place to detect whether a page has been turned and, if not, stop the machine. He also hopes the machine can be improved to lower the risks of damaging a work, and to add features for detecting when pages have stuck together. The current model is just a prototype, and in the document Qumsiyeh suggests further improvements could be made to help the machine scan multiple books in parallel.
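The stop-on-failure behavior Qumsiyeh describes can be sketched as a simple loop. This is not the scanner's actual control code, which the article does not describe. The function `run_scanner` and the `page_turned` callback are hypothetical names invented for illustration, and the sensor is simulated.

```python
from typing import Callable, Optional, Tuple


def run_scanner(
    total_passes: int,
    page_turned: Callable[[int], bool],
) -> Tuple[int, Optional[int]]:
    """Simulate scanning passes, halting when a page turn is not detected.

    `page_turned` is a hypothetical stand-in for the real sensor check.
    Returns (passes completed, pass number where the machine halted),
    with None in the second slot if every pass succeeded.
    """
    completed = 0
    for pass_number in range(total_passes):
        if not page_turned(pass_number):
            # No page turn detected: stop rather than risk damaging the book.
            return completed, pass_number
        completed += 1
    return completed, None


# Simulated sensor that fails to see a page turn on pass 3.
completed, halted_at = run_scanner(10, lambda n: n != 3)
print(completed, halted_at)  # 3 3
```

Detecting pages that have stuck together, which Qumsiyeh lists as a future improvement, would need an additional check that this sketch does not attempt.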

Google's scanner isn't the only open source book scanner out there. In 2009, artist/engineer Daniel Reetz published his own DIY Book Scanner plans, and launched a campaign to put a book scanner in every community in the world. Reetz estimates the cost of his machines to be $300. Qumsiyeh suggests that some of the cost of his machine could be shaved off by buying individual sensors instead of cannibalizing a document scanner, though it would require more work.

These projects are just two more examples of the burgeoning open source hardware movement, which ranges from computer-controlled beer brewing temperature regulators to aerial drones. They point to how cheaper components -- like the Raspberry Pi single-board computer or the Arduino programmable circuit board -- and open source designs are opening up a new world of innovation in hardware.