Method and device for generating speech video using audio signal

Bibliographic details
Main authors:	Hwang, Guembuel; Jang, Seyoung; Park, Sungwoo; Chae, Gyeongsu
Format:	Patent
Language:	eng
Subjects:	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC COMMUNICATION TECHNIQUE; ELECTRICITY; MUSICAL INSTRUMENTS; PHYSICS; PICTORIAL COMMUNICATION, e.g. TELEVISION; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Abstract:	A device according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors. The device includes a first encoder configured to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder configured to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner configured to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder configured to reconstruct the speech video of the person using the combined vector as an input.
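
The data flow named in the abstract (first encoder, second encoder, combiner, decoder) can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the class name `SpeechVideoModel`, the helper functions, the vector sizes, the use of random linear projections in place of trained networks, and the choice of concatenation as the combining step are not taken from the patent, which does not specify them in this record.

```python
import random


def make_linear(n_in, n_out, seed):
    """Return a random n_out x n_in weight matrix (stand-in for a trained layer)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, vector):
    """Multiply a weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


class SpeechVideoModel:
    """Hypothetical two-encoder, one-decoder layout; sizes are arbitrary."""

    def __init__(self, image_size, audio_size, image_dim=8, voice_dim=4):
        # First encoder: person background image -> image feature vector.
        self.first_encoder = make_linear(image_size, image_dim, seed=0)
        # Second encoder: speech audio signal -> voice feature vector.
        self.second_encoder = make_linear(audio_size, voice_dim, seed=1)
        # Decoder: combined vector -> reconstructed frame (same size as input image).
        self.decoder = make_linear(image_dim + voice_dim, image_size, seed=2)

    def forward(self, person_background_image, speech_audio_signal):
        image_feature = apply_linear(self.first_encoder, person_background_image)
        voice_feature = apply_linear(self.second_encoder, speech_audio_signal)
        # Combiner: concatenation is assumed here; the abstract only says "combining".
        combined = image_feature + voice_feature
        return apply_linear(self.decoder, combined)


# A flattened 16-value "image" and a 10-sample "audio" snippet, both made up.
model = SpeechVideoModel(image_size=16, audio_size=10)
frame = model.forward([0.5] * 16, [0.1] * 10)
print(len(frame))  # 16: the reconstructed frame has the same size as the input image
```

In a real system each projection would be a trained neural network and the decoder output would be a full video frame; the sketch only shows how the two feature vectors meet in a single combined vector before decoding.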