There are no reliable ways to distinguish text written by a human from text written by a large language model. OpenAI writes:
Do AI detectors work?
In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.
Additionally, ChatGPT has no “knowledge” of what content could be AI-generated. It will sometimes make up responses to questions like “did you write this [essay]?” or “could this have been written by AI?” These responses are random and have no basis in fact.
To elaborate on our research into the shortcomings of detectors, one of our key findings was that these tools sometimes suggest that human-written content was generated by AI.
When we at OpenAI tried to train an AI-generated content detector, we found that it labeled human-written text like Shakespeare and the Declaration of Independence as AI-generated.
There were also indications that it could disproportionately impact students who had learned or were learning English as a second language and students whose writing was particularly formulaic or concise.
Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.
There is some good research on watermarking LLM-generated text, but the watermarks are generally not robust.
I don’t think the detectors are going to win this arms race.