TextTiling: Segmenting Text into Multi-Paragraph Subtopic Passages

Bibliographic details
Published in: Computational Linguistics - Association for Computational Linguistics, 1997-03, Vol. 23 (1), p. 33-64
Main author: Hearst, Marti A
Format: Article
Language: English
Online access: Full text
Description
Summary: TextTiling is a technique for subdividing texts into multi-paragraph units that represent passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts. Multi-paragraph subtopic segmentation should be useful for many text analysis tasks, including information retrieval and summarization. 3 tables, 6 figures, 79 references. Adapted from the source document.
ISSN: 0891-2017
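
Illustrative sketch: the summary names patterns of lexical co-occurrence and distribution as the cue for subtopic shifts. The Python below is a simplified illustration of that general idea, not the article's algorithm. It splits a text into fixed-size token blocks, compares the vocabulary of adjacent blocks with cosine similarity, and flags gaps where similarity is low. The function names, the block size, and the fixed threshold are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gap_similarities(text: str, block_size: int = 20) -> list:
    """Score the gap between each pair of adjacent token blocks.

    block_size is an assumed parameter for this sketch, not a value
    taken from the article.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    blocks = [
        Counter(tokens[i:i + block_size])
        for i in range(0, len(tokens), block_size)
    ]
    return [cosine(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1)]


def low_cohesion_gaps(scores: list, threshold: float = 0.5) -> list:
    """Return indices of gaps whose similarity falls below the threshold.

    A fixed threshold is a simplification assumed here; it stands in for
    whatever boundary criterion a full implementation would use.
    """
    return [i for i, s in enumerate(scores) if s < threshold]


if __name__ == "__main__":
    sample = "cat dog cat dog cat dog cat dog sun moon sun moon sun moon sun moon"
    scores = gap_similarities(sample, block_size=4)
    print(scores)
    print(low_cohesion_gaps(scores))
```

In the toy sample, the vocabulary changes completely halfway through, so the middle gap is the one with low similarity. Real text would need stopword handling and a more careful boundary criterion than a fixed threshold.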