This is interesting research:
In this paper, we develop a new mechanism for detecting audio deepfakes using techniques from the field of articulatory phonetics. Specifically, we apply fluid dynamics to estimate the arrangement of the human vocal tract during speech generation and show that deepfakes often model impossible or highly-unlikely anatomical arrangements. When parameterized to achieve 99.9% precision, our detection mechanism achieves a recall of 99.5%, correctly identifying all but one deepfake sample in our dataset.
From an article by two of the researchers:
The first step in differentiating speech produced by humans from speech generated by deepfakes is understanding how to acoustically model the vocal tract. Luckily scientists have techniques to estimate what someone—or some being such as a dinosaur—would sound like based on anatomical measurements of its vocal tract.
We did the reverse. By inverting many of these same techniques, we were able to extract an approximation of a speaker’s vocal tract during a segment of speech. This allowed us to effectively peer into the anatomy of the speaker who created the audio sample.
From here, we hypothesized that deepfake audio samples would fail to be constrained by the same anatomical limitations humans have. In other words, the analysis of deepfaked audio samples simulated vocal tract shapes that do not exist in people.
Our testing results not only confirmed our hypothesis but revealed something interesting. When extracting vocal tract estimations from deepfake audio, we found that the estimations were often comically incorrect. For instance, it was common for deepfake audio to result in vocal tracts with the same relative diameter and consistency as a drinking straw, in contrast to human vocal tracts, which are much wider and more variable in shape.
This is, of course, not the last word. Deepfake generators will figure out how to use these techniques to create harder-to-detect fake voices. And the deepfake detectors will figure out another, better, detection technique. And the arms race will continue.
Slashdot thread.
More Stories
The AI Fix #30: ChatGPT reveals the devastating truth about Santa (Merry Christmas!)
In episode 30 of The AI Fix, AIs are caught lying to avoid being turned off, Apple’s AI flubs a...
US and Japan Blame North Korea for $308m Crypto Heist
A joint US-Japan alert attributed North Korean hackers with a May 2024 crypto heist worth $308m from Japan-based company DMM...
Spyware Maker NSO Group Found Liable for Hacking WhatsApp
A judge has found that NSO Group, maker of the Pegasus spyware, has violated the US Computer Fraud and Abuse...
Spyware Maker NSO Group Liable for WhatsApp User Hacks
A US judge has ruled in favor of WhatsApp in a long-running case against commercial spyware-maker NSO Group Read More
Major Biometric Data Farming Operation Uncovered
Researchers at iProov have discovered a dark web group compiling identity documents and biometric data to bypass KYC checks Read...
Ransomware Attack Exposes Data of 5.6 Million Ascension Patients
US healthcare giant Ascension revealed that 5.6 million individuals have had their personal, medical and financial information breached in a...