What is CAPTCHA and how does it work?
 
Deciphering squiggly letters, tapping on every image with a traffic light, or sliding a puzzle piece into place… We increasingly have to solve these small digital challenges, otherwise known as CAPTCHAs, when registering accounts, posting comments, or submitting online forms.
But what are they, exactly? And why do they exist? Read on to learn what a CAPTCHA is and the important role it plays in cybersecurity.
Why was CAPTCHA invented?
The history of CAPTCHA, which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart,” goes back to the turn of the 21st century. The term was coined in 2003 by a group of researchers at Carnegie Mellon University, who had developed the test to solve a specific problem.
Around that time, leading email service Yahoo was struggling with spammers. Malicious users were employing computer programs to create vast numbers of email accounts and misusing them to distribute spam messages far and wide.
Researchers at Carnegie Mellon wanted to help, so they devised a test in which users would have to interpret and type out a series of squiggly letters and numbers to prove they were real people when making new accounts. Yahoo quickly implemented this new technology to combat spam signups, and other platforms soon followed.
When and why you see CAPTCHA tests
While CAPTCHA was originally created to stop email spammers from creating accounts, it has since taken on a variety of additional applications. Most of us see CAPTCHA tests when logging into accounts or submitting online forms.
Logging into accounts
Many platforms, like online banks, email accounts, and social media sites, routinely ask users to complete a CAPTCHA when logging into their profiles. The CAPTCHA essentially provides an extra layer of login protection, ensuring that a real human user is attempting to log in instead of a robot.
This helps to prevent a range of automated, bot-based login attacks, like brute force attacks, in which bots rapidly try out a large number of user credentials to find the right ones and access an account.
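As a rough illustration of this kind of login protection, the Python sketch below asks for a CAPTCHA only after several failed attempts on the same account. The `LoginGuard` class, its method names, and the three-attempt limit are assumptions made for this example, not a description of how any particular platform works.

```python
from collections import Counter


class LoginGuard:
    """Tracks failed logins per account and decides when a CAPTCHA is needed.

    Illustrative only: the class name and default limit are hypothetical.
    """

    def __init__(self, free_attempts=3):
        self.free_attempts = free_attempts
        self._failures = Counter()  # username -> consecutive failed attempts

    def captcha_required(self, username):
        """Return True once the account has used up its CAPTCHA-free attempts."""
        return self._failures[username] >= self.free_attempts

    def record_result(self, username, success):
        """Reset the counter on success, otherwise count one more failure."""
        if success:
            self._failures.pop(username, None)
        else:
            self._failures[username] += 1
```

A legitimate user who mistypes a password once or twice never sees a challenge, while a bot cycling through credentials hits the CAPTCHA almost immediately.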
Submitting forms
You may also encounter a CAPTCHA when clicking the “Submit” button on an online form, like a comment submission page, a contact form, or a survey.
This, again, helps to stop bot activity. Bots may be used to spam comment sections with unwanted messaging or even phishing links. They could also be used to skew or influence polls, contests, or online surveys. CAPTCHAs help to prevent this from happening.
How does CAPTCHA work?
The purpose of any CAPTCHA test is to discern whether a user is a real person or a computer program, like a bot. The tests come in numerous forms, but all have this same end goal in mind. Here’s a closer look at how, exactly, that works.
User vs. bot: How the test identifies you
CAPTCHAs essentially work on the principle that there are certain problems or puzzles that are quite easy for humans but very challenging or almost impossible for bots to solve. A CAPTCHA presents one of these puzzles to the user, like asking them to type out a series of symbols or answer a relatively simple math problem.
A famous CAPTCHA example is when you see a sequence of letters and numbers that are distorted in some way. They might look like they’ve been scribbled onto the screen at random angles, for example, or are obscured by colors and visual elements. While humans can read and repeat these sequences quite easily, most bots cannot.
So, if the user is able to pass the test, it signals to the system that they are most likely human.
Behind the scenes: Server logic and triggers
Fortunately, you don’t always have to complete CAPTCHAs when logging into accounts, entering information into forms, making online purchases, and so on. Many of them only appear in specific circumstances when website servers notice signs of unusual or possibly suspicious behavior, such as:
IP address anomalies
Sites may ask you to complete a CAPTCHA if they notice you’re using a suspicious, shared, or proxy IP address. This can also affect VPN users, since a VPN masks your real IP address with one of its own.
If a site notices your IP address change suddenly or receives a large number of requests from users sharing the same IP, it may see that as suspicious. This is because bots often hide behind proxies or VPNs to carry out their activities discreetly.
If you’re experiencing this issue often with your VPN, switching servers may help. You can also try clearing your browser history or cache, obtaining a dedicated IP address, or switching to a different browser.
Suspicious browser behavior
CAPTCHAs may also appear if a site detects that a user is behaving suspiciously or showing signs of bot-like activity. This might happen if the user is completing forms very quickly, clicking buttons in a very calculated, precise way, or sending multiple successive requests to a server.
All of these are potential signs of bot behavior, and CAPTCHAs can be deployed to slow or stop the bot in its tracks. This can help to stop bot-based scams in fields like e-commerce, such as networks of bots being used to buy large numbers of event tickets or limited-edition products for scammers to resell at a higher price.
Sudden traffic spikes
Certain cyberattacks, like distributed denial-of-service (DDoS) attacks, use botnets to flood a website with requests all at once. CAPTCHAs can serve as DDoS mitigation tools, deploying automatically if a site detects a sudden, unexpected rise in traffic.
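To make the idea of server-side triggers more concrete, here is a minimal Python sketch of one possible rule: show a CAPTCHA when a single IP address sends too many requests within a short window. The `CaptchaTrigger` name and the limits used are hypothetical, and real sites are likely to combine many more signals than this.

```python
from collections import defaultdict, deque


class CaptchaTrigger:
    """Flags an IP address for a CAPTCHA when its request rate is too high.

    Illustrative only: the class name and default limits are assumptions.
    """

    def __init__(self, max_requests=5, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # ip -> timestamps of recent requests

    def should_challenge(self, ip, now):
        """Record a request at time `now` (in seconds) and report whether to challenge it."""
        recent = self._history[ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_requests
```

Because the count is kept per IP address, this also shows why many people sharing one VPN or proxy address can trigger a challenge even when each of them is browsing normally.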
Types of CAPTCHA
CAPTCHAs started off as simple text-based tests but have evolved over the years into numerous forms, incorporating audio, images, and other elements.
Text-based CAPTCHA
The original CAPTCHA test, a text-based CAPTCHA, uses words, phrases, or random sequences of letters and numbers that have been distorted or displayed in a way that makes them difficult for bots to read.
Different techniques are used to create text-based CAPTCHAs, like the Gimpy technique, which involves a random selection of words from a dictionary, or Simard’s HIP, which selects random letters and numbers, then distorts them with unusual arcs and colors. The user simply has to read the text and type it into the box provided to solve the test.
However, while you’re still likely to come across CAPTCHAs based on text distortion now and then, Google’s 2014 research revealed just how ineffective they had become at distinguishing humans from bots. Google’s AI researchers applied advanced machine learning algorithms to the kinds of distorted text CAPTCHAs commonly used online. The results were striking: Their system was able to solve the most distorted CAPTCHAs with over 99% accuracy. In comparison, human success rates hovered around 33%, largely due to increasing complexity making them frustrating for real users.
Image-based CAPTCHA (reCAPTCHA v2)
These CAPTCHAs present the user with images and instruct them to click those that match a particular theme or select ones that don’t fit with the others. Well-known examples include having to click all images in a series that contain crosswalks or fire hydrants or being asked to click specific squares of a larger picture. Image-based CAPTCHAs were designed to replace text-based ones and have done so on many platforms, with some users finding them easier to understand and solve.
They’re also believed to be trickier for bots to solve, due to the fact that they involve both semantic understanding and image recognition.
Audio CAPTCHA
Audio CAPTCHA tests play a small sound file, often with a voice reading out a series of letters or numbers. The user then has to enter what they heard into a text box. Often, the audio also has some level of background noise, which makes it difficult for bots to understand what is being said.
Audio CAPTCHAs are less common than text- and image-based variants but provide a helpful alternative for people with visual impairments.
Math and logic puzzles
Some CAPTCHAs pose simple math or logic problems to the user. They might ask you to enter a missing word in a sentence, solve an addition problem, or slide a puzzle piece into the right position, for example.
Again, these puzzles are generally very straightforward for the average person, but computer programs can struggle to understand what is being asked of them. Based on the user’s response, the CAPTCHA can usually distinguish between a human and a bot.
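A simple addition challenge of this kind could be sketched in Python as follows. The function names and the in-memory `_pending` store are assumptions made for illustration; a real site would more likely keep the expected answer in the user’s session.

```python
import random
import secrets

# challenge_id -> expected answer (hypothetical in-memory store for this sketch)
_pending = {}


def make_challenge(rng=random):
    """Create an addition puzzle and remember its answer on the server side."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = a + b
    return challenge_id, f"What is {a} + {b}?"


def check_answer(challenge_id, user_input):
    """Return True only for the correct answer, and only on the first attempt."""
    expected = _pending.pop(challenge_id, None)  # each challenge is single use
    if expected is None:
        return False
    try:
        return int(user_input.strip()) == expected
    except ValueError:
        return False  # non-numeric input can never be correct
```

Keeping the answer on the server and discarding it after one attempt means a bot cannot read the solution from the page or keep guessing against the same puzzle.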
No CAPTCHA reCAPTCHA
No CAPTCHA reCAPTCHA is a very simple form of CAPTCHA that was introduced by Google back in 2014. It simply asks the user to click a box beside the message “I’m not a robot.”
Both bots and humans are capable of clicking the box. However, bots tend to click right in the center, while humans are less predictable, with slight variations in position, timing, and movement leading up to the click. Based on these factors (mouse movement, timing, click characteristics, scrolling behaviors, etc.), the CAPTCHA can make an informed assumption about whether the user is human.
If the interaction seems suspicious in any way, a follow-up CAPTCHA test will be deployed.
Invisible CAPTCHA (reCAPTCHA v3)
As the name suggests, you can’t see invisible CAPTCHAs, nor do you need to interact with them directly or solve any particular puzzle. They run in the background, tracking a user’s activity and assigning them a score based on their actions.
The score is based on interactions between users and the site, though Google hasn’t revealed much information about how, exactly, each score is calculated.
However, multiple sources state that the system involves collecting and processing elements of user data, which may include the visitor’s IP address, mouse movements, keyboard inputs, plus the browser and operating system they’re using.
For instance, in 2023, the French data protection authority, the CNIL, fined NS Cards France €105,000 (roughly $114,000) following an investigation that revealed that this online payment solution provider used Google’s reCAPTCHA technology without informing users that their device and application data would be collected and sent to Google for analysis.
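On the website’s side, a background score still has to be turned into a decision. reCAPTCHA v3 reports scores between 0.0 and 1.0, with higher values indicating a more likely human. The Python sketch below shows one way a site might act on such a score; the `decide_action` function and both thresholds are assumptions chosen for illustration, not values recommended by Google.

```python
def decide_action(score, allow_at=0.7, challenge_at=0.3):
    """Map a bot-likelihood score (0.0 = likely bot, 1.0 = likely human) to an action.

    The thresholds are hypothetical; each site would tune its own.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= allow_at:
        return "allow"        # let the request through silently
    if score >= challenge_at:
        return "challenge"    # fall back to a visible test or extra verification
    return "block"            # treat the request as automated
```

With these example thresholds, a visitor scoring 0.9 never sees a test, one scoring 0.5 gets an extra check, and one scoring 0.1 is turned away.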
Honeypot CAPTCHA
An easy way to think of a honeypot CAPTCHA is as a bot trap. These CAPTCHA tests are invisible to humans but visible to bots. Bots see them as empty fields in forms and will often attempt to interact with them and fill them in, which real human users wouldn’t do, since they can’t see the fields to begin with.
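The server-side check for a honeypot can be very small. In the Python sketch below, the field name `website` is a hypothetical choice, and the form is assumed to hide that field from human visitors with CSS.

```python
HONEYPOT_FIELD = "website"  # hypothetical name of a field humans cannot see


def looks_like_bot(form_data):
    """Return True if the hidden honeypot field was filled in."""
    value = form_data.get(HONEYPOT_FIELD, "")
    # Real visitors never see the field, so any non-blank value is suspicious.
    return bool(value.strip())
```

For example, `looks_like_bot({"name": "Ana", "website": ""})` returns `False`, while a submission with a URL in the hidden field returns `True`.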
Time-based forms
Time-based CAPTCHAs attempt to tell humans apart from bots by timing how long it takes them to fill out a form or enter data into a field. Bots tend to perform these actions much faster than human users. So, if the CAPTCHA detects an unnaturally rapid response, it will tend to assume that the user must be a bot.
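A time-based check might look something like the following Python sketch. The three-second minimum and the function name are assumptions for this example, and the right threshold would depend on how long the form actually is.

```python
MIN_FILL_SECONDS = 3.0  # hypothetical threshold; tune per form


def submitted_too_fast(rendered_at, submitted_at, min_seconds=MIN_FILL_SECONDS):
    """Return True if the form came back faster than a person could plausibly fill it.

    Both arguments are timestamps in seconds, recorded by the server.
    """
    elapsed = submitted_at - rendered_at
    # A negative value means the timestamps are inconsistent, which is also suspicious.
    return elapsed < min_seconds
```

The render time should be recorded or signed on the server rather than trusted from a form field, since a bot could otherwise submit whatever timestamp it likes.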
CAPTCHA and AI: What is the Turing test?
As mentioned earlier, CAPTCHA stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.” The Turing test, or imitation game, is a way to test a machine’s ability to behave like a human. It was proposed by British mathematician, computer scientist, and philosopher Alan Turing in 1950.
Turing’s original concept for the test involved three separate terminals: one operated by a computer and two by humans. One human asks questions, while the other human and the computer provide answers. Based on their responses, the questioner must determine which is the machine.
CAPTCHA works on a similar principle. It presents a challenge and judges the response to decide whether it came from a human or a bot. The key difference is that, in CAPTCHA, the role of the evaluator is performed by software rather than a person.
Real-world AI applications from CAPTCHA data
Companies like Google have used CAPTCHA tests to train AI models and improve machine learning for years. They collect data based on how users respond to different CAPTCHA puzzles and problems and then feed that data into artificial intelligence systems, improving their semantic understanding, image recognition, and other capabilities.
One of the most common early examples of CAPTCHA tests was a text-based CAPTCHA that asked users to type a sequence of two words. Unbeknownst to the person solving it, one of the words was an image of a word from a real, physical book. Each time a user completed a CAPTCHA by typing both words, they were actually helping to transcribe and digitize physical books and documents. This process helped digitize the entire Google Books archive, along with millions of old New York Times articles.
As machine learning and AI technology started to emerge, and image-based CAPTCHAs became more prevalent, Google came up with another idea.
It gathered data from completed CAPTCHA tests to improve AI image recognition. For example, a CAPTCHA could present users with a bunch of different images and ask them to click the ones containing cars. Google could then feed that data into its machine learning models to help AI become better at spotting cars in photographs.
This same technology has also been used to improve Google Maps results (for example, by improving the algorithm’s ability to read house number plaques), provide Google Image Search results, and even allow users to search through their Google Photos libraries for pictures containing specific items or elements. It’s even being incorporated into autonomous cars, helping them spot and identify street signs.
Pros and cons of CAPTCHA
CAPTCHA brings both benefits and drawbacks for everyday users and site owners.
Benefits for site owners
Site owners largely benefit from implementing CAPTCHA technology. It helps them block bot activity, stop spam and fake accounts, prevent fraud, and more. CAPTCHAs can also be seen as a sign of credibility, making a site appear more secure.
Usability and accessibility issues
Many people find CAPTCHAs annoying or inconvenient, since they often delay access to the sites or content they want to see. People may not always understand the need for CAPTCHAs, which can add to their frustration, and some users, particularly those with visual impairments or other disabilities, may find certain types of CAPTCHA tests particularly difficult to solve.
Negative impact on conversions
Because CAPTCHAs take time and effort to complete, they can negatively impact website conversion rates, making it harder for site owners to make sales, generate leads, and gain customers.
A 2010 Stanford University study found that certain types of CAPTCHAs (audio CAPTCHAs in particular) were so frustrating for participants in their tests that they would give up on solving them altogether in 50% of cases.
Alternatives to CAPTCHA
Given the notable downsides of CAPTCHAs, site owners may want to consider alternative means of securing their sites and blocking bots.
Social login
Social sign-in buttons are an increasingly popular and secure form of anti-bot protection and website security. These buttons often appear when users want to access a platform or piece of content, asking the user to sign in with a social account on platforms like Google or Facebook.
Since it takes large amounts of time and effort to make individual social media profiles for bots, the social login button can prove invaluable in preventing fraud, spam, and scam activity.
On the downside, social logins can be seen as a form of privacy infringement. Some users don’t necessarily want to log into their social profiles and essentially reveal their identity when accessing certain sites or platforms.
Behavioral analysis
Behavioral analysis tools track patterns of user behavior and look for suspicious signs of bot activity.
For example, they’re able to see how a user behaves on a site, like where they move the mouse cursor and how often they click a button or press a key on their keyboard. They can also track a user’s movements through a site and see how long it takes for them to complete certain actions.
Bots tend to do all of these things in very quick, calculated, and precise ways, while humans may take more time or appear more random and unpredictable in their actions. This helps analytical tools tell bots and people apart.
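One very simplified way to express this difference is to measure how regular the timing of a user’s actions is. The Python sketch below flags a series of clicks or keypresses whose gaps are almost perfectly even. The function name and thresholds are hypothetical, and real behavioral analysis tools are likely to weigh many more signals than timing alone.

```python
from statistics import mean, pstdev


def timing_looks_scripted(event_times, min_events=5, min_variation=0.05):
    """Return True if the gaps between events are suspiciously uniform.

    `event_times` is a list of timestamps in seconds, in the order they occurred.
    """
    if len(event_times) < min_events:
        return False  # not enough data to judge
    gaps = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return True  # all events at the same instant is not human behavior
    # Spread of the gaps relative to their average; humans show noticeable variation.
    return pstdev(gaps) / average_gap < min_variation
```

Evenly spaced events such as `[0, 1, 2, 3, 4, 5]` are flagged, while an irregular, human-like rhythm is not.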
Multi-factor authentication (MFA)
MFA and two-factor authentication (2FA) are helpful tools for preventing bots from creating and logging into user accounts. They provide an extra layer of login security, asking the user to enter a passkey or use biometrics to enter their account, which bots generally cannot do.
FAQ: Common questions about CAPTCHA
How do I get rid of CAPTCHA?
You can’t get rid of CAPTCHA, since it’s a design feature of many modern sites. However, you can reduce your chances of encountering CAPTCHA tests by browsing at a normal speed, disabling any unnecessary browser extensions, and clearing your cache, cookies, and browser history on a regular basis. This should help to reduce some of the triggers behind CAPTCHA tests.
Why does Google show CAPTCHA?
CAPTCHA tests help to prevent various forms of bot activity, including fraud, spam, and cyberattacks. They’re designed to make sites safer and prevent malicious behavior. Google also uses CAPTCHA data to improve some of its services and train AI.
What does “invalid CAPTCHA” mean?
If you see the invalid CAPTCHA message, it usually means that the CAPTCHA verification has failed, and the system can’t be sure if you’re a human or a bot. This might happen due to technical issues or incorrect input. Refreshing the page or restarting your browser may help to fix the issue.
Can bots bypass CAPTCHA?
Yes, some bots are able to bypass CAPTCHA tests, and CAPTCHAs vary widely in terms of how difficult they are for bots to solve. This is one of the reasons behind the evolution of CAPTCHA technology over the years, with developers looking for various ways to strengthen their tests and prevent bots from passing them.
Is CAPTCHA safe and private?
Legitimate CAPTCHA services are generally safe to use, yes. However, privacy levels can vary, since some methods, like Google’s reCAPTCHA, can involve the collection of user data, sometimes without the user being aware or being asked for consent.
What is CAPTCHA vs. reCAPTCHA?
CAPTCHA is a broad term, referring to any online authentication test for telling humans and bots apart. reCAPTCHA is a specific type of CAPTCHA technology, created by Luis von Ahn, who was also a pioneer of initial CAPTCHA technology, and later acquired by Google. reCAPTCHA is generally more secure than conventional CAPTCHA, using more advanced algorithms and technologies, and tends to be more convenient for users, with tests that are faster to complete or run in the background.
Comments
The most helpful thing for me is often a magnifying glass. The pictures are often blurry or too small to see what I am looking for.
Sometimes now I'm not getting the CAPTCHA; I'm simply denied access to the site. How do I prevent this (see: Ticketmaster)?