Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence

Unsupervised constituency parsing has been explored extensively but remains far from solved. Conventional unsupervised constituency parsers capture only the unlabeled structure of sentences. Towards unsupervised full constituency parsing, we propose an unsupervised, training-free labeling procedure that exploits a recently introduced metric, Neighboring Distribution Divergence (NDD), which evaluates the semantic similarity between sentences before and after edits. For implementation, we develop NDD into Dual POS-NDD (DP-NDD) and build "molds" to detect constituents and their labels in sentences. We show that DP-NDD not only labels constituents precisely but also induces more accurate unlabeled constituency trees than all previous unsupervised methods, with simpler rules. With two frameworks for inferring labeled constituency trees, we set a new state of the art for unlabeled F1 and establish strong baselines for labeled F1. In contrast with the conventional predict-and-evaluate scenario, our method is a plausible example of inversely applying an evaluation metric for prediction.

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Peng, Letian; Li, Zuchao; Zhao, Hai
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Peng, Letian; Li, Zuchao; Zhao, Hai
description Unsupervised constituency parsing has been explored extensively but remains far from solved. Conventional unsupervised constituency parsers capture only the unlabeled structure of sentences. Towards unsupervised full constituency parsing, we propose an unsupervised, training-free labeling procedure that exploits a recently introduced metric, Neighboring Distribution Divergence (NDD), which evaluates the semantic similarity between sentences before and after edits. For implementation, we develop NDD into Dual POS-NDD (DP-NDD) and build "molds" to detect constituents and their labels in sentences. We show that DP-NDD not only labels constituents precisely but also induces more accurate unlabeled constituency trees than all previous unsupervised methods, with simpler rules. With two frameworks for inferring labeled constituency trees, we set a new state of the art for unlabeled F1 and establish strong baselines for labeled F1. In contrast with the conventional predict-and-evaluate scenario, our method is a plausible example of inversely applying an evaluation metric for prediction.
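The description above characterizes NDD as a divergence between the predicted distributions of a sentence's neighboring words before and after an edit: if an edit preserves the sentence's semantics, the neighbors' distributions barely move. A minimal, stdlib-only sketch of that idea follows; the per-position distributions here are hand-made toy values and the sum-of-KL aggregation is an assumption for illustration (in the paper they would come from a masked language model):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def ndd(neighbors_before, neighbors_after):
    """Toy Neighboring Distribution Divergence: sum the KL divergence of each
    neighboring position's predicted token distribution before vs. after an
    edit. Low NDD suggests the edit preserved the sentence's semantics."""
    return sum(kl_divergence(p, q)
               for p, q in zip(neighbors_before, neighbors_after))

# Toy example: two neighboring positions, a vocabulary of 3 tokens.
before    = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
unchanged = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]  # identical → NDD = 0.0
changed   = [[0.1, 0.2, 0.7], [0.2, 0.3, 0.5]]  # shifted → NDD > 0

print(ndd(before, unchanged))  # 0.0
print(ndd(before, changed))
```

Under this reading, a span whose replacement by a "mold" word yields low NDD is a plausible constituent; the mold's part of speech then hints at the constituent label.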
doi_str_mv 10.48550/arxiv.2110.15931
format Article
identifier DOI: 10.48550/arxiv.2110.15931
language eng
source arXiv.org
subjects Computer Science - Computation and Language
title Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence