Enhancing Structure Preservation in Coreference Resolution by Constrained Graph Encoding

Detailed Description

Bibliographic Details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, Vol. 30, pp. 2557-2567
Main Authors: Fan, Chuang; Li, Jiaming; Luo, Xuan; Xu, Ruifeng
Format: Article
Language: English
Subjects:
Online Access: Order full text
Description
Summary: Coreference resolution is a challenging yet practical problem. Most previous methods are designed to exploit the sequential features of language but can hardly capture the structural associations between mentions. In addition, it is often observed that during long-term training, the embeddings projected from unrelated mentions tend to move closer together or even mix, which makes learning decision boundaries more difficult. To tackle these issues: i) we propose a general graph schema derived from diverse knowledge sources (e.g., lemma, type, and semantic roles) to directly link mentions, so that rich information can be exchanged via the relevant connections; ii) we impose two adaptive constraints during graph encoding to regularize the embedding space. One forces different sub-modules to generate consistent predictions for the same mention pairs; the other makes the learned embeddings of unrelated mentions more distinguishable and those of coreferential mentions more similar. Results on two public datasets (ECB+ and ACE05) show that our model consistently outperforms state-of-the-art baselines under different settings (p < 0.01, t-test) and, in particular, learns effectively from limited labeled data.
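The two adaptive constraints described in (ii) can be sketched as toy loss terms. This is a minimal illustration, not the paper's actual formulation: the `margin` hyperparameter, the cosine-based embedding loss, and the symmetric-KL consistency loss are all assumptions made for the sake of the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def embedding_constraint_loss(u, v, coreferential, margin=0.4):
    """Margin-style constraint: pull embeddings of coreferential mentions
    together, and push unrelated mentions apart until their cosine
    similarity drops below `margin` (illustrative hyperparameter)."""
    sim = cosine(u, v)
    if coreferential:
        return max(0.0, 1.0 - sim)   # coreferential pairs: similarity near 1
    return max(0.0, sim - margin)    # unrelated pairs: similarity below margin

def consistency_loss(p_a, p_b, eps=1e-12):
    """Symmetric KL divergence between the probability distributions
    (p = [P(coref), P(not coref)]) that two sub-modules assign to the
    same mention pair; zero when the predictions agree."""
    kl_ab = sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(p_a, p_b))
    kl_ba = sum(b * math.log((b + eps) / (a + eps)) for a, b in zip(p_a, p_b))
    return 0.5 * (kl_ab + kl_ba)
```

Losses of this shape penalize unrelated mentions whose embeddings drift together during long training runs, while the consistency term keeps the sub-modules' pairwise predictions aligned.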
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2022.3193222