Toward Improved Audio CAPTCHAs Based on Auditory Perception and Language Understanding

Hendrik Meutzner, Santosh Gupta, Viet-Hung Nguyen, Thorsten Holz, Dorothea Kolossa

ACM Transactions on Privacy and Security (TOPS), Volume 19, Issue 4, February 2017


A so-called completely automated public Turing test to tell computers and humans apart (CAPTCHA) is a challenge-response test that is widely used on the Internet to distinguish human users from fraudulent computer programs, often referred to as bots. To enable access for visually impaired users, most Web sites offer audio CAPTCHAs in addition to a conventional image-based scheme. Recent research has shown that most currently available audio CAPTCHAs are insecure, as they can be broken by means of machine learning at relatively low cost. Moreover, most audio CAPTCHAs suffer from low human success rates that arise from severe signal distortions.

This article proposes two different audio CAPTCHA schemes that systematically exploit differences between humans and computers in terms of auditory perception and language understanding, yielding a better trade-off between usability and security than currently available schemes. Furthermore, we provide an elaborate analysis of Google’s prominent reCAPTCHA, which serves as a baseline when evaluating our proposed CAPTCHA designs.


tags: Captcha