Enhancing Abstractiveness of Summarization Models through Calibrated Distillation
Main authors: , , , , ,
Format: Article
Language: eng
Subjects:
Online access: Order full text
Summary: Sequence-level knowledge distillation reduces the size of Seq2Seq models for more efficient abstractive summarization. However, it often leads to a loss of abstractiveness in the generated summaries. In this paper, we propose a novel approach named DisCal to enhance the level of abstractiveness (measured by n-gram overlap) without sacrificing the informativeness (measured by ROUGE) of generated summaries. DisCal exposes the student model to diverse pseudo summaries with two types of supervision. First, the best pseudo summary in terms of abstractiveness and informativeness is identified and used for sequence-level distillation. Second, the ranks of the pseudo summaries are used to ensure that the student model assigns higher prediction scores to higher-ranked summaries. Our experiments show that DisCal outperforms prior methods in abstractive summarization distillation, producing highly abstractive and informative summaries.
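The summary above names two supervision signals: sequence-level distillation on the best pseudo summary and a rank-based calibration over all candidates. The following is a minimal PyTorch sketch of how such an objective could look, assuming the candidate pseudo summaries are already sorted from best to worst by a combined abstractiveness and informativeness score. The function names `calibration_loss` and `distillation_objective`, the rank-dependent margin, and the weighting `alpha` are illustrative assumptions, not the paper's exact formulation; in practice `seq_log_probs` would be the student model's length-normalized log-likelihoods of teacher-generated candidates.

```python
# Hedged sketch of a calibrated-distillation objective (not the official DisCal code).
import torch


def calibration_loss(seq_log_probs: torch.Tensor, margin: float = 0.001) -> torch.Tensor:
    """Pairwise ranking loss over ranked pseudo summaries.

    seq_log_probs[i] is the student's (length-normalized) log-likelihood of the
    i-th candidate, with candidates sorted best-to-worst by a combined
    abstractiveness + informativeness score. The loss encourages the student to
    assign higher scores to higher-ranked candidates.
    """
    loss = seq_log_probs.new_zeros(())
    n = seq_log_probs.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # Candidate i outranks candidate j, so its score should exceed
            # candidate j's by at least a rank-dependent margin.
            gap = (j - i) * margin
            loss = loss + torch.relu(gap - (seq_log_probs[i] - seq_log_probs[j]))
    return loss / (n * (n - 1) / 2)


def distillation_objective(seq_log_probs: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Combine both signals: maximize the likelihood of the best pseudo summary
    (sequence-level distillation) and add the rank-calibration term."""
    nll_best = -seq_log_probs[0]  # index 0 holds the best-ranked pseudo summary
    return nll_best + alpha * calibration_loss(seq_log_probs)


# Usage: hypothetical scores for four ranked pseudo summaries.
scores = torch.tensor([-0.9, -1.2, -1.1, -1.6], requires_grad=True)
loss = distillation_objective(scores)
loss.backward()
print(float(loss), scores.grad)
```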
DOI: 10.48550/arxiv.2310.13760