Phoneme segmentation method and system based on pre-training model and sequence modeling

Bibliographic details
Main authors:	DONG LING, YU ZHENGTAO, YANG SHANGLONG
Format:	Patent
Language:	Chinese; English
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	The invention relates to a phoneme segmentation method and system based on a pre-trained model and sequence modeling, and belongs to the technical field of natural language processing. In the proposed method, a pre-trained model extracts acoustic features from the raw audio sequence, and the contextual representation is further enhanced to capture long-term dependencies in that sequence. Sequence modeling then learns the dependencies between labels, and decoding selects the label sequence with the highest probability as the final output. The system comprises the corresponding modules for executing the method. According to the reported experiments, the method achieves better segmentation performance than existing methods on the same data set.
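
The final decoding step in the summary (choosing the label sequence with the highest probability) can be illustrated with a small sketch. The record does not say which sequence model or decoder the patent uses, so everything below is an assumption made for illustration: a linear-chain model with per-frame label scores and label-to-label transition scores, decoded with the standard Viterbi algorithm. The function names viterbi_decode and boundaries_from_labels, the two-label scheme (0 = no boundary, 1 = phoneme boundary), and all score values are hypothetical; in a real system the per-frame scores would come from the pre-trained acoustic model rather than being written by hand.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for a linear-chain label model.

    emissions[t][k]:   log-domain score of label k at frame t.
    transitions[j][k]: log-domain score of moving from label j to label k.
    Both inputs are assumed, hand-written values in this sketch.
    """
    if not emissions:
        return [], 0.0
    num_labels = len(emissions[0])
    # scores[k] is the best score of any path that ends in label k
    # at the frame processed so far.
    scores = list(emissions[0])
    backpointers = []
    for frame in emissions[1:]:
        new_scores = []
        pointers = []
        for k in range(num_labels):
            # Best previous label for arriving at label k.
            best_prev = max(
                range(num_labels),
                key=lambda j: scores[j] + transitions[j][k],
            )
            new_scores.append(scores[best_prev] + transitions[best_prev][k] + frame[k])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path.
    last = max(range(num_labels), key=lambda k: scores[k])
    best_score = scores[last]
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path, best_score


def boundaries_from_labels(labels, boundary_label=1):
    """Return the frame indices whose decoded label marks a boundary."""
    return [t for t, label in enumerate(labels) if label == boundary_label]


# Hypothetical scores for five frames and two labels (0 = no boundary, 1 = boundary).
example_emissions = [
    [0.0, -2.0],
    [-0.2, -1.5],
    [-3.0, -0.1],
    [-0.3, -1.4],
    [-0.2, -1.5],
]
# Two boundaries in a row are penalized heavily (transition 1 -> 1).
example_transitions = [
    [0.0, -0.5],
    [-0.1, -3.0],
]

example_path, example_score = viterbi_decode(example_emissions, example_transitions)
print(example_path, boundaries_from_labels(example_path))
```

The transition scores are what let the decoder take label dependencies into account: a frame is labeled as a boundary only if that choice also fits with the labels of its neighbors, instead of each frame being classified in isolation.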