SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES

Bibliographic Details
Main Authors:	BASOGLU, Christopher Hakan, CHANG, Shuangyu, BEHRE, Piyush, TAN, Sharman W, PATHAK, Sayan Dev, WU, Jian, PARIHAR, Naveen, SHARMA, Eva, LIU, Yang, LIN, Edward C, KHALIL, Hosam Adel, AGARWAL, Amit Kumar
Format:	Patent
Language:	English; French
Subjects:	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Description
Summary:	Systems and methods are provided for smart audio segmentation using look-ahead based acousto-linguistic features. For example, systems and methods are provided for obtaining audio, processing the audio, identifying a potential segmentation boundary within the audio, and determining whether to generate a segment break at the potential segmentation boundary. One or more look-ahead words occurring after the potential segmentation boundary are identified, and an acoustic segmentation score and a language segmentation score associated with the potential segmentation boundary and the one or more look-ahead words are generated. Systems then either generate the segment break at the potential segmentation boundary or refrain from generating it, depending on whether the acoustic and/or language segmentation score meets or exceeds a segmentation score threshold.
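
The decision step described in the summary can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `Candidate`, `should_break`, and `segment`, the default threshold of 0.5, and the `require_both` switch (one reading of "and/or") are all hypothetical. The two scores are assumed to have been produced already by acoustic and language models that saw the look-ahead words; those models are outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Scores for one potential segmentation boundary (hypothetical structure).

    Both scores are assumed to be precomputed by models that had access to
    the look-ahead words following the boundary.
    """
    acoustic_score: float
    language_score: float


def should_break(c: Candidate, threshold: float = 0.5,
                 require_both: bool = False) -> bool:
    """Return True if a segment break should be generated at this boundary.

    A score passes when it meets or exceeds the threshold. `require_both`
    chooses between the "and" and "or" readings of the summary.
    """
    passes = [c.acoustic_score >= threshold, c.language_score >= threshold]
    return all(passes) if require_both else any(passes)


def segment(words: list[str], candidates: dict[int, Candidate],
            threshold: float = 0.5,
            require_both: bool = False) -> list[list[str]]:
    """Split `words` at the candidate boundaries that pass the threshold test.

    `candidates` maps a word index i (0 < i < len(words)) to the scores for
    a potential break placed just before words[i].
    """
    segments, start = [], 0
    for idx in sorted(candidates):
        if should_break(candidates[idx], threshold, require_both):
            segments.append(words[start:idx])
            start = idx
    segments.append(words[start:])
    return segments


words = ["i", "went", "to", "the", "store", "then", "i", "came", "home"]
candidates = {
    2: Candidate(acoustic_score=0.9, language_score=0.1),  # pause mid-phrase
    5: Candidate(acoustic_score=0.8, language_score=0.7),  # likely sentence end
}
strict = segment(words, candidates, require_both=True)
loose = segment(words, candidates, require_both=False)
```

With `require_both=True`, the boundary at index 2 is rejected because only the acoustic score passes, so the system refrains from breaking there. With `require_both=False`, the same boundary is accepted on the acoustic score alone.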