MGCoT: Multi-Grained Contextual Transformer for table-based text generation

Bibliographic details
Published in:	Expert systems with applications 2024-09, Vol.250, p.123742, Article 123742
Main authors:	Mo, Xianjie; Xiang, Yang; Pan, Youcheng; Hou, Yongshuai; Luo, Ping
Format:	Article
Language:	English
Subjects:	Abstractive table question answering; Multi-grained contexts; Table-to-text generation; Transformer

Description
Summary:	Recent advances in Transformer have led to the revolution of table-based text generation. However, most existing Transformer-based architectures ignore the rich contexts among input tokens distributed in multi-level units (e.g., cell, row, or column), leading to sometimes unfaithful text generation that fails to establish accurate association relationships and misses vital information. In this paper, we propose Multi-Grained Contextual Transformer (MGCoT), a novel architecture that fully capitalizes on the multi-grained contexts among input tokens and thus strengthens the capacity of table-based text generation. The key primitive, Multi-Grained Contexts (MGCo) module, involves two components: a local context sub-module that adaptively gathers neighboring tokens to form the token-wise local context features, and a global context sub-module that consistently aggregates tokens from a broader range to form the shared global context feature. The former aims at modeling the short-range dependencies that reflect the salience of tokens within similar fine-grained units (e.g., cell and row) attending to the query token, while the latter aims at capturing the long-range dependencies that reflect the significance of each token within similar coarse-grained units (e.g., multiple rows or columns). Based on the fused multi-grained contexts, MGCoT can flexibly and holistically model the content of a table across multi-level structures. On three benchmark datasets, ToTTo, FeTaQA, and Tablesum, MGCoT outperforms strong baselines by a large margin on the quality of the generated texts, demonstrating the effectiveness of multi-grained context modeling. Our source code is available at https://github.com/Cedric-Mo/MGCoT.
•The contexts for each token in a table vary from a structural perspective.
•Forming the local contexts allows models to capture contexts in a dynamic range.
•Forming the shared global context allows models to capture the consensus.
•Models can flexibly and holistically comprehend a table via multi-grained contexts.
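The split between token-wise local context and a shared global context described in the summary can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that). Everything here is an assumption made for illustration: the function names, the fixed `window` with dot-product softmax weights standing in for adaptive gathering, mean pooling for the global feature, and additive fusion.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    # Combine equal-length vectors using the given weights.
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def local_context(tokens, window=1):
    # Hypothetical local sub-module: each token attends only to neighbors
    # within `window` positions, so every token gets its own context vector.
    contexts = []
    for i, query in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        neighbors = tokens[lo:hi]
        weights = softmax([dot(query, n) for n in neighbors])
        contexts.append(weighted_sum(weights, neighbors))
    return contexts

def global_context(tokens):
    # Hypothetical global sub-module: one vector shared by all tokens
    # (mean pooling is an assumption, not taken from the paper).
    n = len(tokens)
    return weighted_sum([1.0 / n] * n, tokens)

def fuse(tokens, window=1):
    # Assumed fusion: add the local and the shared global context to each token.
    local = local_context(tokens, window)
    shared = global_context(tokens)
    return [[t + l + g for t, l, g in zip(tok, loc, shared)]
            for tok, loc in zip(tokens, local)]

# Four toy token embeddings, e.g. tokens from adjacent table cells.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
fused = fuse(tokens)
```

In this sketch the local context differs per token because each query weights its own neighborhood, while the global context is computed once and reused for every position, which mirrors the "token-wise" versus "shared" distinction in the summary.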
ISSN:	0957-4174
DOI:	10.1016/j.eswa.2024.123742