Deepfake detection expressions UC Riverside Research 2022

Computer scientists can now detect manipulated facial expressions in deepfake videos with higher accuracy than current methods allow. In experiments on two challenging facial manipulation data sets, the new tech accurately pinpointed 99% of manipulated expressions in video clips. The achievement heralds a new era in the development of automated tools for detecting manipulated videos.

Identity swaps vs. facial expressions

Prior to this point, researchers had created tools capable of detecting deepfake identity swaps with relative accuracy. For example, tools could generally determine the authenticity of a video showing your chief executive (yes, that is the real chief executive talking or no, that is not the real chief executive talking). But tools had a tougher time discerning whether a genuine video of your chief executive had been manipulated to show inaccurate facial expressions.

While the detection of inaccurate facial expressions may seem trivial, consider the power of facial expressions in person-to-person communications. Facial expressions communicate emotions, intentions, and even action requests. A smile and a scowl can signal entirely different preferences about business deliverables or desired outcomes.

Significance of the work

Developments in the deepfake world have made it relatively simple to swap one talking head for another. The isolation and manipulation of a person’s facial expressions is also possible. But, up until this point, few accurate methods had surfaced for flagging videos or faces where only the facial expression had been tampered with. That gap is why this work, by University of California, Riverside researchers, is considered significant.

UC Riverside method

How does the UC Riverside method of deepfake detection work? It divides the task into two components within a deep neural network. The first component observes facial expressions and sends information about the regions that contain the expression (such as the eyes, nose or mouth) into a second component of the system, known as the encoder-decoder. The encoder-decoder architecture is responsible for manipulation detection and localization.

The aforementioned framework, known as Expression Manipulation Detection (EMD), can both detect manipulation and localize the specific areas within an image that have been manipulated. In other words, it can create ‘heat maps’ of the specific areas of the face subjected to manipulation.
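To make the "detect and localize" idea concrete, here is a minimal sketch in plain Python. It is not the researchers' network. It assumes that some model has already produced a per-position manipulation score and a mask of expression regions, and it only shows how those two outputs could be combined into a heat map and a yes/no decision. The function names, the 0.5 threshold, the minimum pixel count and the toy 4x4 values are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[int]]


def restrict_to_expression_regions(scores: Grid, region_mask: Mask) -> Grid:
    """Zero out scores that fall outside the expression regions."""
    return [
        [score if inside else 0.0 for score, inside in zip(score_row, mask_row)]
        for score_row, mask_row in zip(scores, region_mask)
    ]


def localize(heat_map: Grid, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, column) positions whose score reaches the threshold."""
    return [
        (r, c)
        for r, row in enumerate(heat_map)
        for c, score in enumerate(row)
        if score >= threshold
    ]


def is_manipulated(heat_map: Grid, threshold: float = 0.5, min_pixels: int = 2) -> bool:
    """Flag the frame when enough positions reach the threshold."""
    return len(localize(heat_map, threshold)) >= min_pixels


# Hypothetical per-position manipulation scores for a tiny 4x4 "face".
scores = [
    [0.1, 0.0, 0.2, 0.1],
    [0.1, 0.6, 0.1, 0.0],
    [0.0, 0.8, 0.9, 0.1],
    [0.0, 0.7, 0.2, 0.0],
]

# Hypothetical output of the first component: 1 marks the mouth region.
mouth_mask = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

heat_map = restrict_to_expression_regions(scores, mouth_mask)
flagged = localize(heat_map)
print("Flagged positions:", flagged)
print("Frame manipulated:", is_manipulated(heat_map))
```

With these toy values, the high score outside the mouth region is ignored, three positions inside the region pass the threshold, and the frame is flagged. A real system would learn both the region information and the scores from data rather than rely on fixed thresholds.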

Further details

Experimental analyses show that, on average, the Expression Manipulation Detection method outperforms other tools both in detecting facial expression manipulations and in detecting identity swaps. According to UC Riverside researchers, EMD accurately detected 99% of manipulated videos, indicating a significant breakthrough in the detection of manipulated content.

Real-world applications

The detection of genuine or falsified emotional expressions is useful in a variety of disciplines, including image processing, cyber security, robotics, psychological studies and virtual reality development.

Learn more about deepfake detection

For more about the researchers’ work, see the paper “Detection and Localization of Facial Expression Manipulations,” which was presented at the 2022 Winter Conference on Applications of Computer Vision. Or learn more about deepfakes in the interview with Cyabra CEO Dan Brahmy.

