Learning Context-Aware Convolutional Filters for Implicit Discourse Relation Classification

Implicit discourse relation classification (IDRC) is considered the most difficult component of shallow discourse parsing as the relation prediction in the absence of necessary clues requires a deep understanding of the context information of the sentences. Convolutional neural networks (CNNs) have...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE/ACM transactions on audio, speech, and language processing speech, and language processing, 2021, Vol.29, p.2421-2433
Hauptverfasser: Munir, Kashif, Zhao, Hai, Li, Zuchao
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Implicit discourse relation classification (IDRC) is considered the most difficult component of shallow discourse parsing as the relation prediction in the absence of necessary clues requires a deep understanding of the context information of the sentences. Convolutional neural networks (CNNs) have emerged as an important encoding block for sentences in natural language processing (NLP). CNNs use a specific set of filters for the inputs which may lead to the partial coverage of contextual clues. Furthermore, conventional CNNs may not allow the initial communication between the sentences which is a crucial step for IDRC. We present an adaptive convolution approach for IDRC that utilizes context aware filters for the convolution operation. The goal is to abstract the context of sentences in the filters and let them interact with sentence representations, i.e. learning the representations through learned filters. Our model acts as a cross questioning agent by generating filters from one argument and convolving them with the other for the IDRC task. This process is analogous to the attention mechanism because both methods aim at abstracting contextual information. Different from the attention mechanism, our approach directly encodes the contextual representations in the form of filters and allows the initial communication between arguments during encoding. Furthermore, the adaptive convolution can also work alongside the attention mechanism to enhance the representational ability of the adaptive CNN encoder. Experiments on PDTB 2.0 and CDTB datasets show that our approach outperforms all the baselines by a fair margin and achieves excellent results.
ISSN:2329-9290
2329-9304
DOI:10.1109/TASLP.2021.3096041