Audio-driven speaker video synthesis method and system fused with a neural radiance field


Detailed Description

Bibliographic Details
Inventors: FENG SIWEI, ZHU YUEBING, LI YONGYUAN
Format: Patent
Language: Chinese; English
Description
Abstract: The invention provides an audio-driven speaker video synthesis method and system fused with a neural radiance field. The method comprises the steps of: obtaining a video data set in an environment, randomly selecting a period of the video data set, and parsing a video sequence and an audio sequence from it; extracting face features from the video sequence and audio features from the audio sequence; constructing an audio-conditioned implicit function F_θ, feeding the extracted face-feature and audio-feature parameters into F_θ for training, and computing audio-conditioned color values and volume densities; and, according to those color values and volume densities, using a volume rendering technique to render the visible face and background from the dynamic neural radiance field, synthesizing a high-fidelity speaker video corresponding to the audio signal. According to the method, the
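The pipeline the abstract describes can be sketched in miniature: an audio-conditioned implicit function F_θ maps a 3D point and an audio feature to a color and a volume density, and a volume-rendering pass alpha-composites samples along each camera ray. The sketch below is a toy illustration under stated assumptions (random untrained weights, a placeholder 16-dimensional audio feature, a tiny one-hidden-layer network); the names `f_theta` and `render_ray` and all shapes are hypothetical, not the patent's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_theta(x, a, W1, W2):
    """Toy audio-conditioned implicit function: (3D point x, audio feature a) -> (rgb, sigma)."""
    h = np.tanh(np.concatenate([x, a]) @ W1)   # single hidden layer
    out = h @ W2                               # 4 outputs: rgb (3) + density (1)
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))       # sigmoid keeps colors in [0, 1]
    sigma = np.log1p(np.exp(out[3]))           # softplus keeps density >= 0
    return rgb, sigma

def render_ray(origin, direction, a, W1, W2, n_samples=32, near=0.0, far=4.0):
    """Classic volume rendering: alpha-composite samples along one ray."""
    t = np.linspace(near, far, n_samples)
    delta = np.diff(t, append=t[-1] + (far - near) / n_samples)
    color = np.zeros(3)
    transmittance = 1.0
    for ti, di in zip(t, delta):
        rgb, sigma = f_theta(origin + ti * direction, a, W1, W2)
        alpha = 1.0 - np.exp(-sigma * di)      # opacity of this segment
        color += transmittance * alpha * rgb   # composite front-to-back
        transmittance *= 1.0 - alpha
    return color

audio_feat = rng.normal(size=16)               # placeholder audio feature vector
W1 = rng.normal(size=(19, 32)) * 0.1           # 3 (xyz) + 16 (audio) inputs
W2 = rng.normal(size=(32, 4)) * 0.1
pixel = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]), audio_feat, W1, W2)
print(pixel.shape)  # one RGB pixel, shape (3,)
```

In a real system the random weights would be replaced by a trained MLP, the audio feature would come from a speech encoder, and one such ray would be rendered per output pixel; this sketch only shows how the audio-conditioned density and color feed the compositing step.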