In this post, I present the different flavors of CAPTCHAs and analyze their security, usability and future directions.
reCAPTCHA: Human-Based Character Recognition via Web Security Measures
Let’s start by looking at text-based CAPTCHAs. reCAPTCHA is the most popular and widely used text-based CAPTCHA. Solving a CAPTCHA requires humans to perform a task that computers cannot yet perform. Research focused on whether this effort could be put to useful work. The result was a CAPTCHA that helps digitize old printed material by asking users to decipher scanned words from books that optical character recognition (OCR) software failed to recognize.
A reCAPTCHA challenge contains two words: an already known “control” word and a word that needs to be deciphered. The words are distorted to make sure that automated programs are not able to recognize them. If users correctly type the control word, the system assumes they are human and gains confidence that they also typed the other word correctly. To account for human error, each unknown word is sent to multiple users.
It’s a really cool idea to harness otherwise wasted human effort to achieve something that computers cannot do. Moreover, the words are chosen precisely because OCR has already failed on them, so standard automated recognition techniques are likely to fail on them too.
As OCR techniques improve, the level of distortion and noise in text-based CAPTCHAs will have to increase, which hurts usability. Future research will therefore have to focus on other forms of CAPTCHA that can provide a better user experience.
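The control-word check and the multi-user agreement described above can be sketched as follows. This is a minimal illustration, not reCAPTCHA's actual implementation: the class name, the `votes_needed` threshold and the case-insensitive matching are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class TwoWordChallenge:
    """Toy model of a control word plus an unknown word (hypothetical design)."""

    def __init__(self, votes_needed=3):
        # votes_needed is an assumed threshold, not a value taken from reCAPTCHA.
        self.votes_needed = votes_needed
        self.votes = defaultdict(Counter)  # unknown word id -> guesses seen so far

    def submit(self, control_answer, typed_control, unknown_id, typed_unknown):
        """Return True if the user passes; a pass also records a vote."""
        if typed_control.strip().lower() != control_answer.lower():
            return False  # control word wrong: treat as a bot, ignore the guess
        self.votes[unknown_id][typed_unknown.strip().lower()] += 1
        return True

    def transcription(self, unknown_id):
        """Return the agreed transcription, or None if agreement is too low."""
        if not self.votes[unknown_id]:
            return None
        guess, count = self.votes[unknown_id].most_common(1)[0]
        return guess if count >= self.votes_needed else None
```

Only answers from users who passed the control word are counted, so a bot that guesses randomly cannot pollute the transcription of the unknown word.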
NuCaptcha
NuCaptcha is one of the most popular video-based CAPTCHAs. Its makers claim to provide the best security and usability, and to serve millions of CAPTCHAs every day. The following link shows an instance of NuCaptcha.
NuCaptcha Documentation Basic2 Outdoor_1
As we discussed in class, there are automated techniques to break these. Most of these techniques make one of the following assumptions:
- The color of the codeword characters is known to the attacker
- The codewords have a distinctly different trajectory from the non-codeword characters and other objects in the background
To make video CAPTCHAs stronger, designers need to ensure that neither of the above assumptions holds. On the other hand, the usability of these CAPTCHAs is excellent. Research could also focus on CAPTCHAs based on emerging images.
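To see why the first assumption matters, consider what an attacker can do once the codeword color is known: a simple per-pixel filter separates the characters from the background. The sketch below is only an illustration of that idea, not the attack from the USENIX paper; the frame representation (nested lists of RGB tuples), the function name and the tolerance value are all assumptions.

```python
def color_mask(frame, target, tol=30):
    """Return a 0/1 mask marking pixels within `tol` of the target RGB color.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    `tol` is an assumed per-channel tolerance, chosen arbitrarily here.
    """
    return [
        [1 if all(abs(c - t) <= tol for c, t in zip(px, target)) else 0 for px in row]
        for row in frame
    ]


# Tiny 2x2 "frame": two reddish pixels (codeword) and two background pixels.
frame = [[(255, 0, 0), (0, 0, 255)],
         [(250, 10, 5), (0, 255, 0)]]
mask = color_mask(frame, target=(255, 0, 0))
```

Randomizing the codeword color per challenge, or reusing it in the background, removes this shortcut.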
ESP-PIX and PICATCHA
Instead of typing letters, the user needs to authenticate himself as a human by recognizing what object is common in a set of images. ESP-PIX was the first example of a CAPTCHA based on image recognition. PICATCHA took one step further by using CAPTCHAs as a platform for advertising.
The usability of these CAPTCHAs is clearly much better than that of the other CAPTCHAs, but they require a large database of images. Machine-learning-based techniques could break these CAPTCHAs. Image-interaction CAPTCHAs also face many potential problems that have not been fully studied.
Audio CAPTCHAs
Audio CAPTCHAs were introduced for visually impaired users who surf the Web using screen-reading programs. Typical audio CAPTCHAs consist of one or several speakers saying letters or digits at randomly spaced intervals. A user must correctly identify the digits or characters spoken in the audio file to pass the CAPTCHA. To make this test difficult for current computer systems, specifically automatic speech recognition (ASR) programs, background noise is injected into the audio files. Tam et al. [8] present attacks that use machine learning techniques to break about 70% of the audio CAPTCHAs.
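The structure described above (digits at randomly spaced intervals, plus injected background noise) can be sketched roughly as follows. This is a toy model using only the standard library: it produces a schedule of when each digit is spoken rather than real speech, and the function names, gap range and noise level are assumptions for illustration.

```python
import random


def make_audio_challenge(n_digits=6, min_gap=0.3, max_gap=1.5, seed=None):
    """Pick random digits and the offset (seconds) at which each is spoken."""
    rng = random.Random(seed)
    t = 0.0
    schedule = []
    for _ in range(n_digits):
        t += rng.uniform(min_gap, max_gap)  # randomly spaced interval
        schedule.append((round(t, 2), rng.randrange(10)))
    return schedule


def add_noise(samples, noise_level=0.2, seed=None):
    """Mix uniform background noise into samples, clamped to [-1, 1]."""
    rng = random.Random(seed)
    return [
        max(-1.0, min(1.0, s + rng.uniform(-noise_level, noise_level)))
        for s in samples
    ]


def check_answer(schedule, typed):
    """Compare the typed digits (spaces ignored) with the spoken sequence."""
    expected = "".join(str(d) for _, d in schedule)
    return typed.strip().replace(" ", "") == expected
```

A higher `noise_level` makes the test harder for ASR programs, but also harder for human listeners, which is the same security versus usability trade-off seen with text distortion.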
Conclusion
Text-based CAPTCHAs continue to be the most widely used, and they will remain so until OCR techniques can match a human being. Most of the newer techniques are vulnerable to automated attacks and need to improve their security before they can be widely adopted.
References
- http://www.captcha.net/
- http://picatcha.com
- http://nucaptcha.com
- Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham and Manuel Blum. reCAPTCHA: Human-Based Character Recognition via Web Security Measures. In Science.
- Y. Xu, G. Reynaga, S. Chiasson, J.-M. Frahm, F. Monrose, P. van Oorschot, Security and Usability Challenges of Moving-Object CAPTCHAs: Decoding Codewords in Motion, USENIX Security 2012 (Seattle, WA, August 2012).
- http://server251.theory.cs.cmu.edu/cgi-bin/esp-pix/esp-pix
- http://www.ischool.berkeley.edu/files/student_projects/picatcha_mims_final_report_summary_0.pdf
- Jennifer Tam, Jiri Simsa, Sean Hyde, and Luis von Ahn. Breaking Audio CAPTCHAs. In Advances in Neural Information Processing Systems (NIPS).
Great article. Have you also read about PlayThru captchas? They seem to be a pretty good way to overcome many of the problems that earlier captchas suffer from. I recently watched a clip on YouTube showing how to attack these captchas, but it turns out PlayThru is no longer vulnerable to those attacks.
Here is the video:
http://www.youtube.com/watch?v=q_EYl83vlIw
Thus, I talked to one of its founders and he told me “We currently track a number of different points of user interaction, and have a pattern recognition system, to defeat exactly the kind of attacks you’re talking about.”
You can see a brief demo in this YouTube video: https://www.youtube.com/watch?v=z35Q3TtJ-h4.
I went to the ESP-PIX website and tried to use it. I think the average time to solve this type of captcha would be higher than for a traditional text-based captcha, so I am a little skeptical about its usability. First, you have to look at multiple pictures, then relate them to each other, and then go and find the specific word that came to your mind. Note that the first word that comes to your mind may not be the word in the menu. For instance, the word “mice” came to my mind when I was looking at one captcha. First, I spent time looking for this word. Since this word was not in the menu, I had to go back and look at all the pictures again. There were two options in my head at that point: should I look for a word for a related species, like “rats”? Or is there some other higher-level common theme that these pictures are trying to convey? Obviously all this thinking took some time. Finally, I went and looked at the menu again for “rats”, and there it was! Having to scroll through the menu again was also not exactly pleasant. The point is that people think differently, and some people may waste more time than others on this type of captcha. Sometimes a synonym may not readily come to mind, and when you don’t know the word you are looking for, it takes even longer.
I think one way to improve its usability is to include the words that people are likely to guess. For instance, since mice and rats look awfully similar, both words should be in the menu, and selecting either should be enough to be validated as a human. This would still be hard for a computer to do.
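The suggestion above amounts to accepting any label from a group of near-synonyms. A minimal sketch, assuming a hand-curated synonym table (the table contents and function names here are made up for illustration and are not part of ESP-PIX):

```python
# Hypothetical synonym groups; a real deployment would need a curated list.
SYNONYMS = {
    "rats": {"rats", "rat", "mice", "mouse"},
}


def build_menu(distractors, answer):
    """Menu lists every accepted label for the answer plus the distractors."""
    return sorted(set(distractors) | SYNONYMS.get(answer, {answer}))


def is_human(answer, selected):
    """Accept the canonical answer or any of its listed synonyms."""
    return selected.strip().lower() in SYNONYMS.get(answer, {answer})
```

One caveat with this design: each extra accepted label also raises the chance that a random guess from the menu passes, so the synonym groups should stay small.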