Story segmentation and detection of commercials in broadcast news video

The Informedia Digital Library Project allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Hauptmann, A.G., Witbrock, M.J.
Format: Tagungsbericht
Sprache:eng ; jpn
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The Informedia Digital Library Project allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently accurate speech recognition transcripts from the broadcast audio and that we can segment the broadcast into video paragraphs, or stories, that are useful for information retrieval. In previous papers we have shown that speech recognition is sufficient for information retrieval of pre-segmented video news stories. We now address the issue of segmentation and demonstrate that a fully automatic system can extract story boundaries using available audio, video and closed-captioning cues. The story segmentation step for the Informedia Digital Video Library splits full-length news broadcasts into individual news stories. During this phase the system also labels commercials as separate "stories". We explain how the Informedia system takes advantage of the closed captioning frequently broadcast with the news, how it extracts timing information by aligning the closed-captions with the result of the speech recognition, and how the system integrates closed-caption cues with the results of image and audio processing.
ISSN:1092-9959
2378-7104
DOI:10.1109/ADL.1998.670392