What is it about?
Audio deepfakes are a serious threat to cybersecurity, as they can be used to spread misinformation or to extort data and money. We show that transferring audio signals into the visual domain enables successful identification of synthetically generated speech samples.
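The core idea is to turn a one-dimensional waveform into a two-dimensional "image" that a vision model can inspect for artifacts. As a minimal sketch of that transformation (the frame length, hop size, and log scaling below are illustrative assumptions, not the exact settings used in the paper), a log-magnitude spectrogram can be computed with plain NumPy:

```python
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Turn a 1-D audio signal into a 2-D log-magnitude spectrogram,
    i.e. an image-like array a vision model such as PIPNet could consume.
    Frame length and hop size are illustrative, not the paper's settings."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping windowed frames
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Magnitude of the real FFT per frame, compressed with log1p
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(mag)  # shape: (time_frames, freq_bins)

# Example: one second of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t)
img = spectrogram(audio)
```

The resulting 2-D array can then be treated like any grayscale image, which is what makes prototype-based visual detectors applicable to speech.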
Featured Image
Photo by Catherine Breslin on Unsplash
Why is it important?
Our new approach utilizes a prototype-based deep neural network (PIPNet) and well-established datasets. It is therefore well suited to handle the most popular attack vectors in use today. Modern detectors of this kind can bridge the security gap we currently observe.
Perspectives
I hope this article inspires other scientists to look for information "outside the box". I believe there is huge potential in searching for artifacts in visual representations of audio signals.
Alicja Martinek
NASK
Read the Original
This page is a summary of: Real until proven fake - Source-Level Audio Deepfake Detection (with PIPNet), November 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3746252.3761659.