ELECTRONIC DEVICE AND SPEECH RECOGNITION METHOD THEREFOR, AND MEDIUM

Bibliographic details
Main authors:	LU, Yuewan; QIN, Lei; LIU, Hao; ZHANG, Lele
Format:	Patent
Languages:	English; French; German
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Abstract:	Embodiments of this application provide an electronic device, a speech recognition method for the device, and a medium, and relate to speech recognition technology in the field of artificial intelligence (AI). The speech recognition method includes: obtaining a facial depth image and a to-be-recognized voice of a user, where the facial depth image is collected by a depth camera; recognizing a mouth shape feature from the facial depth image and a voice feature from the to-be-recognized voice; fusing the voice feature and the mouth shape feature into an audio-video feature; and recognizing, based on the audio-video feature, the voice uttered by the user. Because the mouth shape feature extracted from the facial depth image is not affected by ambient light, it more accurately reflects how the user's mouth shape changes while the user utters the voice. Fusing this mouth shape feature with the voice feature can therefore improve speech recognition accuracy.
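
The fusion step in the abstract can be illustrated with a minimal sketch. The abstract does not say how the features are extracted or fused, so everything below is an assumption made for illustration: the function names (voice_features, mouth_features, fuse), the choice of log energy and zero-crossing rate as stand-in acoustic features, the mean-removed depth patch as the mouth shape feature, and plain concatenation with nearest-frame alignment as the fusion rule. It is not the patented method.

```python
import math

def voice_features(samples, frame_len):
    """Per-frame [log energy, zero-crossing rate].

    Assumed stand-in for whatever acoustic features a real system would use.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append([math.log(energy + 1e-10), crossings / (frame_len - 1)])
    return feats

def mouth_features(depth_frame, box):
    """Flattened mouth-region depths with the region mean removed.

    box = (top, left, height, width) is assumed to come from some
    face or mouth detector that is outside this sketch. Removing the mean
    is an illustrative choice that keeps relative depth (shape) only.
    """
    top, left, height, width = box
    patch = [depth_frame[r][c]
             for r in range(top, top + height)
             for c in range(left, left + width)]
    mean = sum(patch) / len(patch)
    return [d - mean for d in patch]

def fuse(voice_feats, mouth_feats):
    """Concatenate each audio frame with the video frame nearest in time.

    Assumes both streams cover the same time span, so frame indices can be
    mapped proportionally.
    """
    n_a, n_v = len(voice_feats), len(mouth_feats)
    fused = []
    for i, vf in enumerate(voice_feats):
        j = min(n_v - 1, i * n_v // n_a)
        fused.append(vf + mouth_feats[j])
    return fused

# Toy input: 80 audio samples (4 frames of 20) and 2 depth frames in millimetres.
samples = [math.sin(2 * math.pi * 5 * t / 80) for t in range(80)]
depth_frames = [
    [[500, 500, 500, 500], [500, 510, 512, 500], [500, 508, 511, 500]],
    [[500, 500, 500, 500], [500, 520, 525, 500], [500, 518, 522, 500]],
]
audio = voice_features(samples, frame_len=20)
video = [mouth_features(f, box=(1, 1, 2, 2)) for f in depth_frames]
fused = fuse(audio, video)  # one audio-video feature vector per audio frame
```

In this toy run each fused vector holds two audio values followed by four depth values. The depth values depend only on the geometry seen by the depth camera, not on scene brightness, which is the property the abstract relies on. A real recognizer would pass the fused sequence to a trained model; that stage is omitted here.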