Cloudflare has a new feature—available to free users as well—that uses AI to generate random pages to feed to AI web crawlers:
Instead of simply blocking bots, Cloudflare’s new system lures them into a “maze” of realistic-looking but irrelevant pages, wasting the crawler’s computing resources. The approach is a notable shift from the standard block-and-defend strategy used by most website protection services. Cloudflare says blocking bots sometimes backfires because it alerts the crawler’s operators that they’ve been detected.
“When we detect unauthorized crawling, rather than blocking the request, we will link to a series of AI-generated pages that are convincing enough to entice a crawler to traverse them,” writes Cloudflare. “But while real looking, this content is not actually the content of the site we are protecting, so the crawler wastes time and resources.”
The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).
It’s basically an AI-generated honeypot. And AI scraping is a growing problem:
The scale of AI crawling on the web appears substantial, according to Cloudflare’s data that lines up with anecdotal reports we’ve heard from sources. The company says that AI crawlers generate more than 50 billion requests to their network daily, amounting to nearly 1 percent of all web traffic they process. Many of these crawlers collect website data to train large language models without permission from site owners….
Presumably the crawlers will now have to up both their scraping stealth and their ability to filter out AI-generated content like this. Which means the honeypots will have to get better at detecting scrapers and more stealthy in their fake content. This arms race is likely to go back and forth, wasting a lot of energy in the process.
More Stories
Friday Squid Blogging: Squid Werewolf Hacking Group
In another rare squid/cybersecurity intersection, APT37 is also known as “Squid Werewolf.” As usual, you can also use this squid...
Solar Power System Vulnerabilities Could Result in Blackouts
Forescout researchers found multiple vulnerabilities in leading solar power system manufacturers, which could be exploited to cause emergencies and blackouts...
Nine in Ten Healthcare Organizations Use the Most Vulnerable IoT Devices
Claroty revealed that 89% of healthcare organizations use the top 1% of riskiest Internet-of-Medical-Things (IoMT) devices Read More
VanHelsing ransomware: what you need to know
First reported earlier in March 2025, VanHelsing is a new ransomware-as-a-service operation. Read more in my article on the Tripwire...
Trump CISA Cuts Threaten US Election Integrity, Experts Warn
Expert speakers discussed the impact of reported cutbacks to CISA on the ability of local officials to protect against surging...
Morphing Meerkat PhaaS Platform Spoofs 100+ Brands
A PhaaS platform, dubbed 'Morphing Meerkat,' uses DNS MX records to spoof over 100 brands and steal credentials, according to...