Exploring spatial–temporal features fusion model for Deepfake video detection
Jiujiu Wu, Jiyu Zhou, Danyu Wang, Lin Wang
Abstract

The rapid development of Deepfake technology has made detecting fake videos significantly more challenging. To address open problems in reference frame selection, spatial–temporal feature mining, and feature fusion in face-swapping video detection, we propose a face-swapping video detection model based on spatial–temporal feature fusion. First, key frame sequences are selected using interframe differences in the facial edge region. These sequences are then fed separately into a spatial branch, which extracts hidden artifacts, and a temporal branch, which extracts interframe inconsistencies. Finally, the spatial and temporal features are fused with a self-attention mechanism and passed to a classifier to produce the detection result. To validate the proposed model, we conducted experiments on the FaceForensics++ and Celeb-DF open-source Deepfake datasets. The results show that the model achieves higher detection accuracy and stronger generalization than state-of-the-art competitors.
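The abstract does not spell out how the interframe facial edge region differences are computed. The sketch below is a minimal, hypothetical Python illustration of the general idea, assuming Canny edge maps over aligned face crops and a mean absolute difference between consecutive maps as the change score; the function names, thresholds, and selection strategy are assumptions, not the authors' implementation.

```python
import cv2
import numpy as np

def edge_difference_scores(face_frames):
    # Hypothetical scoring: Canny edge map of each aligned face crop,
    # then mean absolute difference between consecutive maps.
    edges = [cv2.Canny(cv2.cvtColor(f, cv2.COLOR_BGR2GRAY), 100, 200)
             for f in face_frames]
    return [np.abs(edges[i + 1].astype(np.int16) - edges[i].astype(np.int16)).mean()
            for i in range(len(edges) - 1)]

def select_key_frames(face_frames, seq_len=8):
    # Keep the seq_len frames that follow the largest edge-map changes,
    # restored to their original temporal order.
    scores = edge_difference_scores(face_frames)
    picked = sorted(int(i) + 1 for i in np.argsort(scores)[-seq_len:])
    return [face_frames[i] for i in picked]
```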

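The self-attention fusion stage can be sketched in a similar spirit. The following minimal PyTorch sketch assumes each branch emits one fixed-length feature vector per video and treats the two vectors as a two-token sequence attended over jointly; the module name, feature dimension, and head count are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    # Hypothetical fusion head: self-attention over the two branch
    # outputs, mean pooling, then a binary real/fake classifier.
    def __init__(self, dim=512, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, 2)

    def forward(self, spatial_feat, temporal_feat):
        # Stack branch outputs as a 2-token sequence: (batch, 2, dim).
        tokens = torch.stack([spatial_feat, temporal_feat], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)
        return self.classifier(fused.mean(dim=1))

# Usage with dummy branch outputs:
model = AttentionFusion()
s, t = torch.randn(4, 512), torch.randn(4, 512)
logits = model(s, t)  # shape (4, 2)
```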
© 2023 SPIE and IS&T
Jiujiu Wu, Jiyu Zhou, Danyu Wang, and Lin Wang "Exploring spatial–temporal features fusion model for Deepfake video detection," Journal of Electronic Imaging 32(6), 063025 (8 December 2023). https://doi.org/10.1117/1.JEI.32.6.063025
Received: 8 June 2023; Accepted: 27 November 2023; Published: 8 December 2023
KEYWORDS
Video, Feature fusion, Performance modeling, Feature extraction, Data modeling, Education and training, Machine learning