Text Segmentation by Cross Segment Attention
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Document and discourse segmentation are two fundamental NLP tasks pertaining
to breaking up text into constituents, which are commonly used to help
downstream tasks such as information retrieval or text summarization. In this
work, we propose three transformer-based architectures and provide
comprehensive comparisons with previously proposed approaches on three standard
datasets. We establish a new state-of-the-art, reducing in particular the error
rates by a large margin in all cases. We further analyze model sizes and find
that we can build models with many fewer parameters while keeping good
performance, thus facilitating real-world applications. |
---|---|
DOI: | 10.48550/arxiv.2004.14535 |