DESYR: Definition and Syntactic Representation Based Claim Detection on the Web
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The formulation of a claim rests at the core of argument mining.
Demarcating a claim from a non-claim is arduous for both humans and machines,
owing to latent linguistic variance between the two and the inadequacy of
extensive definition-based formalization. Furthermore, the increased use of
online social media has resulted in an explosion of unsolicited information on
the web presented as informal text. To account for this, in this paper we
propose DESYR, a framework that aims to mitigate these issues for informal
web-based text by leveraging a combination of hierarchical representation
learning (dependency-inspired Poincaré embedding), definition-based alignment,
and feature projection. We do away with fine-tuning compute-heavy language
models in favor of a more domain-centric but lighter approach. Experimental
results indicate that DESYR improves upon the state-of-the-art system across
four benchmark claim datasets, most of which were constructed from informal
texts. We see an increase of 3 claim-F1 points on the LESA-Twitter dataset, an
increase of 1 claim-F1 point and 9 macro-F1 points on the Online Comments (OC)
dataset, an increase of 24 claim-F1 points and 17 macro-F1 points on the Web
Discourse (WD) dataset, and an increase of 8 claim-F1 points and 5 macro-F1
points on the Micro Texts (MT) dataset. We also perform an extensive analysis
of the results. We make a 100-D pre-trained version of our Poincaré variant
available along with the source code. |
---|---|
DOI: | 10.48550/arxiv.2108.08759 |
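
The summary names a Poincaré embedding as the hierarchical representation. As a point of orientation only, the following is a minimal sketch of the standard Poincaré-ball distance that such embeddings are generally built on. It is not taken from the DESYR source code, and the function name `poincare_distance` is a hypothetical choice for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Standard Poincare-ball formula (an illustration, not DESYR's code):
        d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


# The same Euclidean gap costs more hyperbolic distance near the boundary,
# which is what lets tree-like hierarchies spread out in few dimensions.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_edge = poincare_distance([0.8, 0.0], [0.9, 0.0])
```

In this sketch `near_edge` is larger than `near_origin` even though both pairs are 0.1 apart in Euclidean terms, which is the property usually cited for embedding hierarchies in hyperbolic space.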