Last week, while I was talking to a friend about how more and more books are becoming available online, he mentioned reCaptcha. I didn’t know this one, so when he explained that it uses Captchas (those ‘are you human?’ tests you get when you comment on a website) to have the crowd digitize texts that are hard to OCR (“Optical Character Recognition”), my mind was boggled. I love it! By showing people words from actual texts that computers “can’t” read and cross-checking each answer against other users’ answers, they effectively transcribe hundreds of pages’ worth of text each day. Such a simple but effective idea. I love how the internet and some quirky ingenuity are making things like this possible.
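For the curious, the cross-checking part can be sketched in a few lines of Python. This is only my own illustration of the general idea, not how reCaptcha is actually implemented: the function name `consensus` and the thresholds `min_votes` and `min_share` are made up for the example. A word is accepted only once enough people have typed it and most of them agree.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.75):
    """Return the agreed transcription, or None if there is no clear agreement.

    min_votes and min_share are illustrative assumptions, not real reCaptcha values.
    """
    # Normalise answers so "Morning " and "morning" count as the same word,
    # and drop empty submissions.
    cleaned = [a.strip().lower() for a in answers if a.strip()]

    # Too few answers to trust any of them yet.
    if len(cleaned) < min_votes:
        return None

    # Take the most common answer and check how dominant it is.
    word, count = Counter(cleaned).most_common(1)[0]
    return word if count / len(cleaned) >= min_share else None


if __name__ == "__main__":
    print(consensus(["morning", "Morning ", "morning", "modern"]))
    print(consensus(["morning", "modern"]))
```

With these made-up thresholds, a word that three out of four people read the same way is accepted, while a word with only two answers is held back until more people have seen it.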
A similar concept is DuoLingo: a simple idea whereby people learn a new language by translating pieces of text. The double benefit here is that while you translate, and while you check others’ translations, you are not just learning a new language but also helping to translate actual texts. I haven’t been able to test it myself yet, as it is in beta, by invitation only for now, and already overwhelmed by invitation requests. But the video looks promising.
Is this just great? Well, yes, as long as it is used altruistically for translating or capturing texts that would otherwise never be translated or captured. But no doubt it will be used commercially as well…
Companies like Google (reCaptcha) and DuoLingo can sell this as a paid service that lets other companies have their old ‘paper’ documents and texts indexed and translated quickly and easily, and make big bucks off it.
Is that really a problem, though? We seem to expect services on the web to be free, but things like language tutors, or a bouncer to keep unwanted people out of your club (which is more or less what Captchas do), were never free, so why do we expect them to be free on the web? At least this way we get the services and someone else pays. I think that is actually not a bad deal!