SimSwap++: Towards Faster and High-Quality Identity Swapping


Bibliographic details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024-01, Vol. 46 (1), p. 1-18
Main authors: Chen, Xuanhong; Ni, Bingbing; Liu, Yutian; Liu, Naiyuan; Zeng, Zhilin; Wang, Hang
Format: Article
Language: English
Description
Abstract: Face identity editing (FIE) shows great value in AI content creation. Low-resolution FIE approaches have made tremendous progress, but high-quality FIE still struggles. Two major challenges hinder the development of higher-resolution, higher-performance FIE: the lack of a high-resolution dataset and computational complexity that is prohibitive on mobile platforms. To address both issues, we establish a novel large-scale, high-quality dataset tailored for FIE. Building on our SimSwap [1], we propose an upgraded version named SimSwap++ with significantly boosted model efficiency. SimSwap++ features two major innovations for high-performance model compression. First, a novel computational primitive named Conditional Dynamic Convolution (CD-Conv) is proposed to address the inefficiency of conditional schemes (e.g., AdaIN) in tiny models. CD-Conv achieves anisotropic processing and injection with significantly lower complexity than standard conditional operators such as modulated convolution. Second, Morphable Knowledge Distillation (MKD) is presented to further trim the overall model. Unlike conventional homogeneous teacher-student structures, MKD is designed to be heterogeneous and mutually compensable, endowing the student with a multi-path morphable property; thus, the student maximally inherits the teacher's knowledge after distillation while further reducing its complexity through structure re-parameterization. Extensive experiments demonstrate that SimSwap++ achieves state-of-the-art performance (97.55% ID accuracy on FaceForensics++) with extremely low complexity (2.5 GFLOPs).
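Note on "structure re-parameterization": the abstract does not describe how the multi-path student is collapsed, so the following is only a generic, minimal sketch of the idea (in the spirit of multi-branch networks that are fused for inference), not the paper's MKD procedure. It assumes a toy 1-D setting with hypothetical names (`conv1d_same`, `multi_branch`, `merge_branches`): a 3-tap branch, a 1-tap branch, and an identity shortcut are summed at training time and then folded into a single 3-tap kernel that yields the same output at lower cost.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`, zero-padded to keep the length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_branch(signal, k3, k1):
    """Multi-path form: 3-tap branch + 1-tap branch + identity shortcut, summed."""
    out3 = conv1d_same(signal, k3)
    out1 = conv1d_same(signal, k1)
    return [a + b + c for a, b, c in zip(out3, out1, signal)]


def merge_branches(k3, k1):
    """Single-path form: fold the 1-tap kernel and the identity into the 3-tap kernel."""
    merged = list(k3)
    merged[1] += k1[0] + 1.0  # centre tap absorbs the 1-tap weight and the identity
    return merged


# Illustrative values only; they are not taken from the paper.
signal = [1.0, 2.0, -1.0, 0.5, 3.0]
k3 = [0.25, -0.5, 0.75]
k1 = [2.0]

reference = multi_branch(signal, k3, k1)
fused = conv1d_same(signal, merge_branches(k3, k1))
max_error = max(abs(a - b) for a, b in zip(reference, fused))
```

Because every branch is linear, the fused kernel reproduces the multi-path output up to floating-point rounding, which is why the extra branches add no cost once they have been merged.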
ISSN: 0162-8828, 2160-9292, 1939-3539
DOI: 10.1109/TPAMI.2023.3307156