Multimedia document summarization
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | Multimedia document summarization techniques are described. Given a document that includes text and a set of images, various implementations generate a summary by extracting relevant text segments and relevant image segments from the document, subject to constraints on the amount of text and the number and size of images in the summary. In one embodiment, a given document is divided into elements. One class of elements pertains to a first content type, such as text units, while another class pertains to a second, different content type, such as image units. Budget constraints associated with the elements are ascertained and can include the size or number of images and the number of sentences, words, or characters. A second approach is graph-based. Specifically, a graph is created whose nodes represent different content, e.g., text elements or image elements. Each element has a corresponding reward that is based on the inherent value of the element, without considering other elements in the corresponding document. The graph-based approach tries to ensure maximum cross-cohesion between segments of the different types (text and images) while also ensuring diversity of content and overall coverage of information. |
---|
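
The summary names the ingredients (typed elements, per-type budgets, an inherent reward per element, and a graph linking elements) but does not state an objective function or a selection algorithm. The following is a minimal sketch under stated assumptions, not the patented method: it assumes a greedy selection rule, a similarity score on graph edges, a bonus for text-image similarity (cross-cohesion), and a penalty for same-type similarity (diversity). Coverage is not modeled. The names `Element`, `marginal_gain`, `summarize`, and all weights and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str      # "text" or "image"
    cost: int      # assumed budget units: sentences for text, 1 per image
    reward: float  # inherent value, independent of other elements


def marginal_gain(cand, chosen, sim, cohesion_w=1.0, diversity_w=1.0):
    """Score a candidate against the elements already in the summary.

    Assumption: `sim` maps frozenset({name_a, name_b}) to an edge weight.
    Cross-type similarity is rewarded; same-type similarity is penalized.
    """
    gain = cand.reward
    for other in chosen:
        s = sim.get(frozenset((cand.name, other.name)), 0.0)
        if other.kind != cand.kind:
            gain += cohesion_w * s   # text-image cohesion
        else:
            gain -= diversity_w * s  # redundancy within one content type
    return gain


def summarize(elements, sim, budgets):
    """Greedily pick elements by gain per unit cost within per-type budgets."""
    remaining = dict(budgets)
    chosen = []
    pool = list(elements)
    while True:
        best, best_score = None, 0.0
        for cand in pool:
            if cand.cost > remaining.get(cand.kind, 0):
                continue  # would exceed the budget for this content type
            score = marginal_gain(cand, chosen, sim) / cand.cost
            if score > best_score:
                best, best_score = cand, score
        if best is None:
            break  # nothing affordable improves the summary
        chosen.append(best)
        pool.remove(best)
        remaining[best.kind] -= best.cost
    return chosen


# Hypothetical document: three sentences and two images.
elements = [
    Element("s1", "text", 1, 0.9),
    Element("s2", "text", 1, 0.8),
    Element("s3", "text", 1, 0.5),
    Element("img1", "image", 1, 0.6),
    Element("img2", "image", 1, 0.4),
]
sim = {
    frozenset(("s1", "s2")): 0.9,    # near-duplicate sentences
    frozenset(("s1", "img1")): 0.7,  # image illustrates sentence s1
    frozenset(("s3", "img1")): 0.2,
    frozenset(("s2", "img2")): 0.1,
}
summary = summarize(elements, sim, {"text": 2, "image": 1})
print([e.name for e in summary])  # ['s1', 'img1', 's3']
```

In this toy run, `s2` has a higher inherent reward than `s3` but is skipped because it is nearly redundant with `s1`, while `img1` is picked early because it coheres with the already selected text.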