Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence
Format: Article
Language: English
Abstract: Unsupervised constituency parsing has been studied extensively but is still far from solved. Conventional unsupervised constituency parsers capture only the unlabeled structure of sentences. Towards unsupervised full constituency parsing, we propose an unsupervised and training-free labeling procedure that exploits a property of a recently introduced metric, Neighboring Distribution Divergence (NDD), which evaluates the semantic similarity between sentences before and after edits. For implementation, we develop NDD into Dual POS-NDD (DP-NDD) and build "molds" to detect constituents and their labels in sentences. We show that DP-NDD not only labels constituents precisely but also induces more accurate unlabeled constituency trees than all previous unsupervised methods, using simpler rules. With two frameworks for labeled constituency tree inference, we set both a new state of the art for unlabeled F1 and strong baselines for labeled F1. In contrast with the conventional predict-and-evaluate scenario, our method is a plausible example of inversely applying an evaluation metric for prediction.
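The abstract's core idea, scoring an edit by how much it shifts a language model's predicted distributions over neighboring words, can be illustrated with a toy sketch. All distributions and words below are invented for illustration; the paper's actual NDD formulation and DP-NDD molds are not reproduced here.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary of words, in nats."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

def ndd_score(before_dists, after_dists):
    """Toy neighboring-distribution divergence: average KL divergence
    between predicted distributions at each neighboring position,
    before vs. after an edit to the sentence."""
    pairs = list(zip(before_dists, after_dists))
    return sum(kl_divergence(b, a) for b, a in pairs) / len(pairs)

# Hypothetical predicted distributions for two neighboring positions.
before = [{"cat": 0.7, "dog": 0.2, "car": 0.1},
          {"sat": 0.6, "ran": 0.3, "flew": 0.1}]
# A meaning-preserving edit barely shifts the neighbors' distributions...
after_small = [{"cat": 0.65, "dog": 0.25, "car": 0.1},
               {"sat": 0.55, "ran": 0.35, "flew": 0.1}]
# ...while a meaning-changing edit shifts them substantially.
after_large = [{"cat": 0.1, "dog": 0.1, "car": 0.8},
               {"sat": 0.1, "ran": 0.1, "flew": 0.8}]

print(ndd_score(before, after_small) < ndd_score(before, after_large))  # prints True
```

A low score thus marks an edit as semantics-preserving, which is the property the paper's labeling procedure inverts: instead of evaluating a predicted parse, the metric itself is probed to decide which spans behave like constituents.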
DOI: 10.48550/arxiv.2110.15931