Bot or Human? Creating the invisible Turing Test for the Internet

AI systems have detectable behavioral signatures that can be used to improve bot detection. Roundtable's Proof-of-Human API verifies humans invisibly, continuously, and instantaneously.

1
Want to see behavioral differences in action? Skip to Section 2 for interactive keystroke and mouse movement demos, or Section 3 for a cognitive psychology experiment.

Google reCAPTCHA v3 boasts a commanding market share in bot detection today. It claims to analyze patterns of user behavior across the web, including mouse movements, typing patterns, and browsing history. However, reCAPTCHA v3 fails to detect AI agents using real browser environments.

To illustrate, we tested OpenAI's Operator agent with reCAPTCHA v3. Despite Operator's non-humanlike interaction patterns (such as perfectly centered mouse clicks and repeatedly pasted text), reCAPTCHA v3 assigned a high "human" score and did not flag the agent as suspicious:

Figure 1: OpenAI's Operator agent passing reCAPTCHA v3 despite non-humanlike interaction patterns, demonstrating the limitations of current bot detection systems.

Today, LLMs from companies like OpenAI and Anthropic repeatedly pass as humans in the classic Turing Test, necessitating new approaches that focus, for example, on behavioral patterns and cognitive signatures.

2

Behavioral methods leverage the unique patterns in how humans physically interact with computers. For example, human keystroke dynamics are irregular and context-dependent. Bots, by contrast, often paste text instantly or simulate key-by-key typing with unnatural regularity. Similarly, human mouse movements are characterized by micro-adjustments, overshoots, and corrections, while bots tend to move in straight lines or teleport between points. These differences are not only visually apparent but also quantifiable.
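To give a concrete sense of how quantifiable these differences are, here is a minimal sketch that computes inter-key interval statistics from keypress timestamps. The function name and sample data are hypothetical, not part of any production detector:

```python
import statistics

def interkey_stats(timestamps_ms):
    """Compute inter-key interval (IKI) statistics from keypress timestamps (ms)."""
    ikis = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    mean = statistics.mean(ikis)
    stdev = statistics.stdev(ikis) if len(ikis) > 1 else 0.0
    # Coefficient of variation: human typing is irregular (high CV), while
    # scripted key-by-key typing is metronomic (CV near zero).
    return {"mean_ms": mean, "cv": stdev / mean if mean else 0.0}

# Hypothetical samples: a human typing burst vs. a bot pressing keys
# at a fixed 50 ms interval.
human = [0, 95, 180, 310, 365, 520, 570, 760]
bot = [0, 50, 100, 150, 200, 250, 300, 350]

print(interkey_stats(human)["cv"])  # clearly positive
print(interkey_stats(bot)["cv"])    # 0.0
```

Even this two-number summary separates the two samples; real systems use many more features per keystroke (dwell time, flight time, digraph latencies).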

Try out the interactive demos below to see the difference between human and bot behavior:

⌨️

Keystroke Dynamics Analysis

Compare bot vs human typing patterns in real-time. See how AI agents exhibit different keystroke timing signatures compared to natural human typing.

Figure 2: Interactive keystroke latency tracker comparing bot and human typing patterns.
🖱️

Mouse Movement Analysis

Compare bot vs human mouse movement patterns during form interactions. Observe how AI agents exhibit different cursor trajectories compared to natural human movements.

Figure 3: Interactive mouse movement analysis comparing bot and human patterns during form filling.
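The straight-line-versus-meandering distinction in mouse trajectories can likewise be reduced to a single illustrative metric. The sketch below (function name and traces are hypothetical) compares the straight-line distance between a path's endpoints to its total traveled length:

```python
import math

def path_linearity(points):
    """Ratio of endpoint distance to total path length (1.0 = perfectly straight)."""
    total = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    direct = math.dist(points[0], points[-1])
    return direct / total if total else 1.0

# Hypothetical traces: a bot moving in a straight line vs. a human path
# with an overshoot past the target and a correction back.
bot_path = [(0, 0), (50, 50), (100, 100)]
human_path = [(0, 0), (30, 42), (70, 55), (108, 96), (100, 100)]

print(path_linearity(bot_path))    # ~1.0
print(path_linearity(human_path))  # noticeably below 1.0
```

Human cursor paths almost never score near 1.0 because of the micro-adjustments and overshoots described above.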
3

How much can these behavioral patterns be spoofed? This remains an open question, but the evidence to date is encouraging. Academic studies have found behavioral biometrics to be robust under adversarial conditions, and industry validation from top financial institutions demonstrates real-world resilience.

The underlying reason appears to be cost. Fraud is, after all, an economic game. Traditional credentials like passwords or device fingerprints are static, finite, and easily replayed, whereas behavioral signatures encode fine-grained variations that are difficult to reverse-engineer. While AI agents can in principle simulate these patterns, the effort likely exceeds that of cheaper attack vectors.

To further illustrate the point, we can extend the challenge: can a bot completely replicate human cognitive psychology?

Take, for example, the Stroop task: a classic psychology experiment in which participants select the color a word is written in, not what the word says. Humans typically respond more slowly when the meaning of a word conflicts with its color (e.g., the word "BLUE" written in green), reflecting the effort of overriding an automatic behavior. Bots and AI agents, by contrast, are not subject to such interference and can respond with consistent speed regardless of the stimulus.
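The interference effect is easy to measure from trial logs. The sketch below (trial data and function name are hypothetical) computes the classic Stroop statistic, the mean reaction-time gap between incongruent and congruent trials:

```python
import statistics

def stroop_interference(trials):
    """Mean reaction-time difference (ms) between incongruent and congruent trials."""
    congruent = [t["rt"] for t in trials if t["congruent"]]
    incongruent = [t["rt"] for t in trials if not t["congruent"]]
    return statistics.mean(incongruent) - statistics.mean(congruent)

# Hypothetical trial logs. Humans typically slow down by ~100 ms or more on
# incongruent trials; a scripted bot answers at an essentially flat rate.
human_trials = [
    {"congruent": True, "rt": 520}, {"congruent": True, "rt": 560},
    {"congruent": False, "rt": 690}, {"congruent": False, "rt": 720},
]
bot_trials = [
    {"congruent": True, "rt": 210}, {"congruent": True, "rt": 212},
    {"congruent": False, "rt": 209}, {"congruent": False, "rt": 211},
]

print(stroop_interference(human_trials))  # large positive effect
print(stroop_interference(bot_trials))    # near zero
```

A near-zero (or negative) interference effect over enough trials is itself a signal that the respondent is not processing the stimuli the way a human does.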

Try out the interactive demo below to see the difference between human and bot behavior:

🧠

Cognitive Psychology Analysis

Compare bot vs human cognitive interference using the classic Stroop task. Observe how humans show slower responses during conflicting stimuli while bots maintain consistent performance.

Figure 4: Interactive Stroop Task comparing bot vs. human reaction time and interference.

The Stroop task and other canonical experimental paradigms pose an additional obstacle to AI agents. To completely replicate human cognitive psychology, an AI agent would need to simulate not only our cognitive outputs but our cognitive processes as well. These processes are a function of our neural and environmental constraints. While someone can easily create a Stroop Bot that replicates human biases, fully replicating end-to-end human processing is a hard, unsolved problem.

4

We first started identifying bots in graduate school for online data collection and model training. We would create simple tasks that humans could complete but bots would struggle with. One example was the Boston Temperature Test: guessing the twelve monthly average high temperatures in Boston.

Humans err in predictable ways but follow macro seasonal patterns, whereas bots and AI agents were either completely random or too perfect. Figure 5 plots the individual user curves estimated for each agent type.

Figure 5: User temperature estimates grouped by agent type. Black line shows ground truth. Humans err in predictable ways but follow macro seasonal patterns, whereas bots and AI agents were completely random or too perfect.

Once we have a mapping between agent type and temperature estimates, we can use it to create a confidence score for any given user.

While we can't bombard Internet users with experimental stimuli (e.g., the Stroop task or a traditional CAPTCHA), we can measure similar patterns across normal web interactions. For example, we can collect keystroke dynamics for users, bootstrap an initial dataset with artificial lab experiments like the Stroop or the Boston Temperature Test, and then create a direct mapping between keystroke dynamics and proof-of-human (Figure 6).

Figure 6: Keystroke dynamics analysis demonstrating the mapping between typing patterns and proof-of-human verification.

Similar detection logic can be applied to other behavioral patterns like mouse movement, click behavior, and scroll tracking, or to more cognitively demanding stimuli like those that appear on an e-commerce site or a React app.
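One simple, purely illustrative way to combine such channels is a weighted average of per-channel human-likeness scores. The channel names and weights below are assumptions for the sketch, not Roundtable's actual model:

```python
def proof_of_human_score(signals, weights=None):
    """Fuse per-channel human-likeness scores (each in [0, 1]) into a single
    confidence value via a weighted average over the channels present."""
    weights = weights or {"keystrokes": 0.4, "mouse": 0.4, "scroll": 0.2}
    total = sum(weights[k] for k in signals)
    return sum(weights[k] * v for k, v in signals.items()) / total

# Hypothetical session: strong keystroke and mouse evidence, weaker scroll data.
session = {"keystrokes": 0.92, "mouse": 0.85, "scroll": 0.60}
print(round(proof_of_human_score(session), 3))
```

Normalizing by the weights of the channels actually observed lets the score degrade gracefully when a session has, say, no scroll activity at all.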

We offer this behavioral and cognitive approach for bot detection and cybersecurity. Rather than rely on privacy-invasive methods like biometric scans or cookie tracking, the Roundtable Proof-of-Human API presents online agents with an economic challenge: replicate the full range of human cognition naturally and continuously.

About the Authors

Mayank Agrawal and Mathew Hardy are co-founders of Roundtable Technologies Inc., where they work on building behavioral and cognitive proof-of-human systems.
