IMPLEMENTATION OF UNSUPERVISED TOPIC SEGMENTATION IN A DATA COMMUNICATIONS ENVIRONMENT

Bibliographic details
Main authors: GADDE VENKATA RAMANA RAO, DIAO QIAN
Format: Patent
Language: English
Description
Summary: A method is provided in one example embodiment and includes extracting sentences from data, which comprises a speech transcript; tokenizing the plurality of sentences to develop for each of the plurality of sentences a sentence vector and at least one feature vector; and performing topic segmentation on the speech transcript using the sentence vectors and feature vectors, the topic segmentation resulting in a listing of segments corresponding to the speech transcript. In certain embodiments, the feature vector may be at least one of a cue word feature vector, a speaker change feature vector, and a scene change feature vector.
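
The three steps named in the summary (sentence extraction, per-sentence vectors, segmentation) can be illustrated with a minimal Python sketch. It is not the patented method: this record does not disclose the actual segmentation algorithm, so the sketch substitutes a simple adjacent-sentence cosine-distance rule. The "SPEAKER: text" transcript layout, the CUE_WORDS list, the weights, the threshold, and all function names are assumptions made for illustration, and the scene change feature is omitted because it would need video metadata that a plain transcript does not carry.

```python
import math
import re
from collections import Counter

# Hypothetical cue-word list; the record does not enumerate the cue words.
CUE_WORDS = {"okay", "so", "now", "next", "anyway"}


def extract_sentences(transcript):
    """Split an assumed 'SPEAKER: text' transcript into (speaker, sentence) pairs."""
    sentences = []
    for line in transcript.strip().splitlines():
        speaker, _, text = line.partition(":")
        for sent in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sent:
                sentences.append((speaker.strip(), sent))
    return sentences


def tokenize(sentence):
    """Lowercase the sentence and return its word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def build_vectors(sentences):
    """For each sentence, return (bag-of-words vector, cue feature, speaker-change feature)."""
    rows = []
    prev_speaker = None
    for speaker, sent in sentences:
        tokens = tokenize(sent)
        # Sentence vector: word counts, with cue words left out.
        sentence_vec = Counter(t for t in tokens if t not in CUE_WORDS)
        # Cue word feature: 1.0 if the sentence opens with a cue word.
        cue = 1.0 if tokens and tokens[0] in CUE_WORDS else 0.0
        # Speaker change feature: 1.0 if the speaker differs from the previous sentence.
        change = 1.0 if prev_speaker is not None and speaker != prev_speaker else 0.0
        rows.append((sentence_vec, cue, change))
        prev_speaker = speaker
    return rows


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(sentences, threshold=1.0, w_cue=0.5, w_change=0.25):
    """Start a new segment wherever lexical distance plus weighted features reaches the threshold."""
    if not sentences:
        return []
    rows = build_vectors(sentences)
    segments, current = [], [sentences[0][1]]
    for i in range(1, len(rows)):
        vec, cue, change = rows[i]
        score = (1.0 - cosine(rows[i - 1][0], vec)) + w_cue * cue + w_change * change
        if score >= threshold:
            segments.append(current)
            current = []
        current.append(sentences[i][1])
    segments.append(current)
    return segments
```

Calling segment(extract_sentences(transcript)) returns a list of segments, each a list of sentences, which corresponds to the "listing of segments" in the summary. Combining the lexical distance with the feature values in a single additive score is only one possible design; the weights and threshold shown here are arbitrary and would need tuning on real transcripts.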