Unsupervised Learning-Based Framework for Deepfake Video Detection

Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2023, Vol. 25, pp. 4785-4799
Main authors: Zhang, Li; Qiao, Tong; Xu, Ming; Zheng, Ning; Xie, Shichuang
Format: Article
Language: English
Description
Abstract: With the continuing development of computer hardware and deep learning, it has become easier to swap faces in videos using emerging multimedia tampering tools, the most popular of which is deepfake. This brings a series of new security threats. Although much forensic research has focused on this new type of manipulation and achieved high detection accuracy, most of it relies on supervised learning, which requires a large number of labeled training samples. In this paper, we develop a novel unsupervised method for identifying deepfake videos. The basic idea behind the proposed method is that the face region in a real video is captured by a camera, while its counterpart in a deepfake video is usually generated by a computer, so the provenance of the two is entirely different. Specifically, our method comprises two clustering stages based on Photo-Response Non-Uniformity (PRNU) and noiseprint features. First, the PRNU fingerprint of each video frame is extracted and used to cluster full-size videos from the same source, regardless of whether they are real or fake. Second, the noiseprint is extracted from the face region of each video and used to identify the deepfake samples in each cluster by re-clustering, which amounts to binary classification. Numerical experiments verify that the proposed unsupervised method performs very well on our own dataset and on the benchmark FF++ dataset. More importantly, its performance rivals that of state-of-the-art supervised detectors.
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2022.3182509
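
The two clustering stages described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation. Synthetic random vectors stand in for PRNU fingerprints and noiseprint features, and neither extraction step is implemented. The abstract does not name the clustering algorithms, so the correlation-threshold grouping and the 2-means split used here are assumptions, as are the function names `cluster_by_source` and `split_in_two` and every numeric parameter.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def cluster_by_source(fingerprints, threshold=0.5):
    """Stage 1 stand-in (assumed algorithm): greedily group videos whose
    fingerprint correlates above `threshold` with a cluster's first member."""
    clusters = []
    for idx, fp in enumerate(fingerprints):
        for members in clusters:
            if correlation(fp, fingerprints[members[0]]) > threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])  # no match: start a new source cluster
    return clusters


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_in_two(features, iterations=10):
    """Stage 2 stand-in (assumed algorithm): plain 2-means.
    Label 0 is the group seeded by the first item; the sketch does not
    decide which of the two groups is the fake one."""
    first = features[0]
    far = max(features, key=lambda f: sq_dist(f, first))
    centres = [list(first), list(far)]
    labels = [0] * len(features)
    for _ in range(iterations):
        labels = [0 if sq_dist(f, centres[0]) <= sq_dist(f, centres[1]) else 1
                  for f in features]
        for c in (0, 1):
            group = [f for f, lab in zip(features, labels) if lab == c]
            if group:
                centres[c] = [sum(col) / len(group) for col in zip(*group)]
    return labels


# Synthetic data: 2 hypothetical cameras, 6 videos each, the last 2 of each fake.
rng = random.Random(0)
camera_patterns = [[rng.gauss(0, 1) for _ in range(256)] for _ in range(2)]
camera_of = [0] * 6 + [1] * 6
is_fake = [False, False, False, False, True, True] * 2  # ground truth, for checking only

# A full-frame fingerprint is the camera pattern plus noise (real or fake alike).
fingerprints = [[p + 0.3 * rng.gauss(0, 1) for p in camera_patterns[cam]]
                for cam in camera_of]
# Face-region features are shifted for fakes, mimicking a different provenance.
face_features = [[(2.0 if fake else 0.0) + 0.2 * rng.gauss(0, 1) for _ in range(8)]
                 for fake in is_fake]

clusters = cluster_by_source(fingerprints)
labels_per_cluster = [split_in_two([face_features[i] for i in members])
                      for members in clusters]
for members, labels in zip(clusters, labels_per_cluster):
    print(members, labels)
```

Note that the 2-means step always returns two groups when the features differ, so a cluster containing only real videos would still be split. The abstract does not say how the method handles that case.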