TextTiling: Segmenting Text into Multi-Paragraph Subtopic Passages
Published in: | Computational Linguistics - Association for Computational Linguistics, 1997-03, Vol. 23 (1), pp. 33-64 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | TextTiling is a technique for subdividing texts into multi-paragraph units that represent passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts. Multi-paragraph subtopic segmentation should be useful for many text analysis tasks, including information retrieval and summarization. 3 Tables, 6 Figures, 79 References. Adapted from the source document. |
---|---|
ISSN: | 0891-2017 |
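
The abstract's central idea, that subtopic shifts show up as changes in lexical co-occurrence and distribution, can be illustrated with a minimal sketch. This is not the article's implementation: the function names, the `seq_len` and `block_size` parameters, and the cutoff rule are assumptions chosen for illustration, and the sketch leaves out steps a full system would likely need (for example stop-word removal, smoothing, and snapping boundaries to paragraph breaks). It compares the vocabulary of adjacent token blocks and places a boundary where similarity dips sharply.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gap_similarities(tokens, seq_len, block_size):
    """Similarity between the blocks left and right of each sequence gap."""
    seqs = [Counter(tokens[i:i + seq_len])
            for i in range(0, len(tokens), seq_len)]
    sims = []
    for gap in range(1, len(seqs)):
        left = sum(seqs[max(0, gap - block_size):gap], Counter())
        right = sum(seqs[gap:gap + block_size], Counter())
        sims.append(cosine(left, right))
    return sims


def depth_scores(sims):
    """Depth of each valley: climb to the nearest peak on both sides."""
    depths = []
    for i, s in enumerate(sims):
        left = i
        while left > 0 and sims[left - 1] >= sims[left]:
            left -= 1
        right = i
        while right < len(sims) - 1 and sims[right + 1] >= sims[right]:
            right += 1
        depths.append((sims[left] - s) + (sims[right] - s))
    return depths


def texttile(text, seq_len=20, block_size=2):
    """Return indices of token sequences that start a new segment.

    Assumption: a gap counts as a boundary when its depth is positive
    and exceeds mean - std/2 of all depth scores.
    """
    sims = gap_similarities(tokenize(text), seq_len, block_size)
    depths = depth_scores(sims)
    if not depths:
        return []
    mean = sum(depths) / len(depths)
    std = math.sqrt(sum((d - mean) ** 2 for d in depths) / len(depths))
    cutoff = mean - std / 2
    return [i + 1 for i, d in enumerate(depths) if d > 0 and d > cutoff]
```

Each returned index `k` means a new segment starts at token offset `k * seq_len`. Working with fixed-length token sequences rather than sentences keeps the compared blocks equal in size, so the similarity scores are comparable across gaps.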