Cloudflare has a new feature—available to free users as well—that uses AI to generate random pages to feed to AI web crawlers:
Instead of simply blocking bots, Cloudflare’s new system lures them into a “maze” of realistic-looking but irrelevant pages, wasting the crawler’s computing resources. The approach is a notable shift from the standard block-and-defend strategy used by most website protection services. Cloudflare says blocking bots sometimes backfires because it alerts the crawler’s operators that they’ve been detected.
“When we detect unauthorized crawling, rather than blocking the request, we will link to a series of AI-generated pages that are convincing enough to entice a crawler to traverse them,” writes Cloudflare. “But while real looking, this content is not actually the content of the site we are protecting, so the crawler wastes time and resources.”
The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).
It’s basically an AI-generated honeypot. And AI scraping is a growing problem:
The scale of AI crawling on the web appears substantial, according to Cloudflare’s data that lines up with anecdotal reports we’ve heard from sources. The company says that AI crawlers generate more than 50 billion requests to their network daily, amounting to nearly 1 percent of all web traffic they process. Many of these crawlers collect website data to train large language models without permission from site owners….
Presumably the crawlers will now have to up both their scraping stealth and their ability to filter out AI-generated content like this. Which means the honeypots will have to get better at detecting scrapers and more stealthy in their fake content. This arms race is likely to go back and forth, wasting a lot of energy in the process.
More Stories
Friday Squid Blogging: Squid Facts on Your Phone
Text “SQUID” to 1-833-SCI-TEXT for daily squid facts. The website has merch. As usual, you can also use this squid...
Law Enforcement Crackdowns Drive Novel Ransomware Affiliate Schemes
Increased law enforcement pressure has forced ransomware groups like DragonForce and Anubis to move away from traditional affiliate models Read...
SAP Fixes Critical Vulnerability After Evidence of Exploitation
A maximum severity flaw affecting SAP NetWeaver has been exploited by threat actors Read More
M&S Shuts Down Online Orders Amid Ongoing Cyber Incident
British retailer M&S continues to tackle a cyber incident with online orders now paused for customers Read More
Security Experts Flag Chrome Extension Using AI Engine to Act Without User Input
Researchers have found a Chrome extension that can act on the user’s behalf by using a popular AI agent orchestration...
Cryptocurrency Thefts Get Physical
Long story of a $250 million cryptocurrency theft that, in a complicated chain events, resulted in a pretty brutal kidnapping....