60 Million Captchas a Day

Products by on May 25, 2007 at 5:37 pm

That’s how many captchas are filled out by people proving themselves to be human.

Enter reCAPTCHA: they have figured out a way to make all of this captcha solving useful by using it to digitize books.

Each reCAPTCHA captcha has two words:

  1. An unidentified word from a scanned book
  2. A known word

So, you get to digitize the world’s information by doing something that machines can’t. Assume an average book is 80K words (200 pages at 400 words per page) and that OCR is 95% accurate: the remaining 5%, or 4K words, need a human, so 4K captchas mean 1 book is digitized. At 60 million captchas a day, that creates a potential of 15K books a day.
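Here is that back-of-the-envelope math as a small Python sketch. The numbers are the ones from this post; the function and variable names are made up for illustration, and it assumes every solved captcha yields exactly one correctly transcribed word, which is a simplification.

```python
# Rough estimate of how many books a day of captcha solving could digitize.
# All figures are the assumptions from the post, not measured data.

CAPTCHAS_PER_DAY = 60_000_000   # captchas solved per day
WORDS_PER_BOOK = 200 * 400      # 200 pages at 400 words per page = 80,000 words
OCR_ACCURACY_PERCENT = 95       # share of words OCR reads correctly


def books_per_day(captchas_per_day, words_per_book, ocr_accuracy_percent):
    """Return the number of whole books digitized per day.

    Simplifying assumption: each solved captcha fixes exactly one word
    that OCR could not read.
    """
    # Words per book that OCR misses and a human has to read.
    unread_words = words_per_book * (100 - ocr_accuracy_percent) // 100
    return captchas_per_day // unread_words


print(books_per_day(CAPTCHAS_PER_DAY, WORDS_PER_BOOK, OCR_ACCURACY_PERCENT))
```

Changing the OCR accuracy or the book length shows how sensitive the estimate is: at 90% accuracy, for example, each book needs twice as many captchas.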

Cool. It kind of reminds me of the first SETI screensavers.

I’ve installed it on my comment form.

1 Comment

  1. Nik — June 9, 2007 @ 9:43 am

    I have been reading about this for some time now. But this is the simplest explanation of what it’s about. Now I get it! What a great idea…


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. | Dave Naffziger's Blog | Dave & Iva Naffziger