Cloudflare has a new feature—available to free users as well—that uses AI to generate random pages to feed to AI web crawlers:
Instead of simply blocking bots, Cloudflare’s new system lures them into a “maze” of realistic-looking but irrelevant pages, wasting the crawler’s computing resources. The approach is a notable shift from the standard block-and-defend strategy used by most website protection services. Cloudflare says blocking bots sometimes backfires because it alerts the crawler’s operators that they’ve been detected.
“When we detect unauthorized crawling, rather than blocking the request, we will link to a series of AI-generated pages that are convincing enough to entice a crawler to traverse them,” writes Cloudflare. “But while real looking, this content is not actually the content of the site we are protecting, so the crawler wastes time and resources.”
The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).
It’s basically an AI-generated honeypot. And AI scraping is a growing problem:
The scale of AI crawling on the web appears substantial, according to Cloudflare’s data that lines up with anecdotal reports we’ve heard from sources. The company says that AI crawlers generate more than 50 billion requests to their network daily, amounting to nearly 1 percent of all web traffic they process. Many of these crawlers collect website data to train large language models without permission from site owners….
Presumably the crawlers will now have to up both their scraping stealth and their ability to filter out AI-generated content like this. Which means the honeypots will have to get better at detecting scrapers and more stealthy in their fake content. This arms race is likely to go back and forth, wasting a lot of energy in the process.
More Stories
Friday Squid Blogging: Two-Man Giant Squid
The Brooklyn indie art-punk group, Two-Man Giant Squid, just released a new album. As usual, you can also use this...
Cyber Agencies Warn of Fast Flux Threat Bypassing Network Defenses
A joint cybersecurity advisory warns organizations globally about the defense gap in detecting and blocking fast flux techniques, which are...
Troy Hunt Gets Phished
In case you need proof that anyone, even people who do cybersecurity for a living, Troy Hunt has a long,...
Tj-actions Supply Chain Attack Traced Back to Single GitHub Token Compromise
The threat actors initially attempted to compromise projects associated with the Coinbase cryptocurrency exchange, said Palo Alto Networks Read More
Chinese State Hackers Exploiting Newly Disclosed Ivanti Flaw
Mandiant warned that Chinese espionage actor UNC5221 is actively exploiting a critical Ivanti vulnerability, which can lead to remote code...
Major Online Platform for Child Exploitation Dismantled
An international law enforcement operation has shut down Kidflix, a platform for child sexual exploitation with 1.8m registered users Read...