A Novel Computational and Modeling Foundation for Automatic Coherence Assessment
Main Authors: | , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Coherence is an essential property of well-written texts that refers to the
way textual units relate to one another. In the era of generative AI, coherence
assessment is essential for many NLP tasks: summarization, generation,
long-form question answering, and more. However, in NLP coherence is an
ill-defined notion, lacking a formal definition or evaluation metrics that
would allow for large-scale, automatic, and systematic coherence assessment. To
bridge this gap, in this work we employ the formal linguistic definition of
Reinhart (1980) of what makes a discourse coherent, consisting of three
conditions (*cohesion*, *consistency*, and *relevance*), and formalize
these conditions as respective computational tasks. We hypothesize that (i) a
model trained on all of these tasks will learn the features required for
coherence detection, and that (ii) a joint model for all tasks will exceed the
performance of models trained on each task individually. On two benchmarks for
coherence scoring rated by humans, one containing 500 automatically generated
short stories and another containing 4k real-world texts, our experiments
confirm that jointly training on the proposed tasks leads to better performance
on each task compared with task-specific models, and to better performance on
assessing coherence overall, compared with strong baselines. We conclude that
the formal and computational setup of coherence as proposed here provides a
solid foundation for advanced methods of large-scale automatic assessment of
coherence. |
---|---|
DOI: | 10.48550/arxiv.2310.00598 |
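
The summary describes training one joint model on tasks derived from the three coherence conditions, but it does not say how the per-task objectives are combined. The sketch below is only an illustration of one common choice, a weighted sum of per-task cross-entropy losses. The task names, the assumption that each task is a classification problem, the equal default weights, and the helper names `cross_entropy` and `joint_loss` are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical task names, borrowed from the three conditions in the summary.
TASKS = ("cohesion", "consistency", "relevance")


def cross_entropy(probs, gold):
    """Negative log-likelihood of the gold class under a predicted distribution."""
    return -math.log(probs[gold])


def joint_loss(predictions, gold_labels, weights=None):
    """Weighted sum of per-task cross-entropy losses (an assumed objective).

    predictions: dict mapping task name -> list of class probabilities
    gold_labels: dict mapping task name -> index of the gold class
    weights:     dict mapping task name -> float; equal weights if omitted
    """
    if weights is None:
        weights = {task: 1.0 for task in TASKS}
    return sum(
        weights[task] * cross_entropy(predictions[task], gold_labels[task])
        for task in TASKS
    )


# Toy predictions for a single training example (made-up numbers).
example_predictions = {
    "cohesion": [0.5, 0.5],
    "consistency": [0.25, 0.75],
    "relevance": [0.5, 0.5],
}
example_gold = {"cohesion": 0, "consistency": 1, "relevance": 1}

loss = joint_loss(example_predictions, example_gold)
```

In a task-specific setup, the same function could be called with a weight of zero for every task but one, which makes the contrast with joint training explicit in this toy setting.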