A CAPTCHA is a Completely Automated Public Turing test to tell Computers and Humans Apart. Typical CAPTCHAs present a challenge string consisting of a visually distorted sequence of letters and possibly digits, which in theory only a human can read. Attackers have two primary points of leverage: Optical Character Recognition (OCR) can identify some characters, while nonuniform probabilities make other characters relatively easy to guess. This paper uses a mathematical theory of assurance to characterize the probability that a correct answer to a CAPTCHA is not just a lucky guess. We examine the three most common types of challenge strings (dictionary words, Markov text, and random strings) and find substantial weaknesses in each. We therefore propose improvements to Markov text, as well as new challenges based on the consonant-vowel-consonant (CVC) trigrams of psychology. Theory and experiment together quantify the problems in current challenges and the improvements offered by the modifications.
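The second point of leverage, nonuniform character probabilities, can be illustrated with a minimal Python sketch. It is not the paper's theory of assurance, only a simplified stand-in that assumes each character position is independent and that the attacker submits the single most likely string. All names here (`best_guess_probability`, `uniform`, `apply_ocr`, `cvc_trigram`) are hypothetical, the `skewed` distribution is made up for illustration rather than measured, and treating `y` as a consonant is an assumption.

```python
import random
from math import prod

CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # assumption: 'y' counted as a consonant
VOWELS = "aeiou"
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def uniform(alphabet):
    """Uniform distribution over the characters of `alphabet`."""
    p = 1.0 / len(alphabet)
    return {ch: p for ch in alphabet}


def best_guess_probability(position_dists):
    """Chance that the attacker's most likely string is correct.

    Assumes positions are independent, so the best guess picks the most
    probable character at each position and the probabilities multiply.
    """
    return prod(max(dist.values()) for dist in position_dists)


def apply_ocr(position_dists, recognized_positions):
    """Model OCR success: a recognized position no longer needs guessing."""
    return [
        {"?": 1.0} if i in recognized_positions else dist
        for i, dist in enumerate(position_dists)
    ]


def cvc_trigram(rng):
    """Draw one consonant-vowel-consonant trigram uniformly at random."""
    return rng.choice(CONSONANTS) + rng.choice(VOWELS) + rng.choice(CONSONANTS)


# Six uniformly random letters versus two uniformly random CVC trigrams.
random_string = [uniform(LETTERS)] * 6
cvc_string = [uniform(CONSONANTS), uniform(VOWELS), uniform(CONSONANTS)] * 2

# Made-up skewed distribution: one character is far likelier than the rest.
skewed = {"e": 0.5, "t": 0.3, "a": 0.2}
skewed_string = [skewed] * 6

if __name__ == "__main__":
    print("random letters:", best_guess_probability(random_string))
    print("two CVC trigrams:", best_guess_probability(cvc_string))
    print("skewed letters:", best_guess_probability(skewed_string))
    # OCR reads positions 0 and 1, leaving four positions to guess.
    print("random + OCR:", best_guess_probability(apply_ocr(random_string, {0, 1})))
    print("sample trigram:", cvc_trigram(random.Random()))
```

Under these assumptions, a skewed distribution raises the attacker's success probability by orders of magnitude over a uniform one, and each character read by OCR removes one factor from the product. The sketch ignores dependencies between adjacent characters, which matter for dictionary words and Markov text.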
We propose a design methodology for "implicit" CAPTCHAs to relieve drawbacks of present technology. CAPTCHAs are tests, administered automatically over networks, that distinguish people from machines and thus protect web services from abuse by programs masquerading as human users. The challenges of all existing CAPTCHAs require significant conscious effort from the person answering them (e.g. reading and typing a nonsense word), whereas implicit CAPTCHAs may require as little as a single click. Many CAPTCHAs distract and interrupt users, since the challenge is perceived as an irrelevant intrusion; implicit CAPTCHAs can be woven into the expected sequence of browsing using cues tailored to the site. Most existing CAPTCHAs are vulnerable to "farming-out" attacks, in which challenges are passed to a networked community of human readers; by contrast, implicit CAPTCHAs are not "fungible" (in the sense of easily answerable in isolation), since they are meaningful only in the specific context of the website being protected. Many existing CAPTCHAs irritate or threaten users, since they are obviously tests of skill; implicit CAPTCHAs appear to be elementary and inevitable acts of browsing. It is often difficult to detect when CAPTCHAs are under attack; implicit CAPTCHAs can be designed so that certain failure modes are correlated with failed bot attacks. We illustrate these design principles with examples.