Learning labeling functions in distantly supervised relation extraction

Bibliographic details
Published in: Intelligent data analysis 2020-01, Vol.24 (2), p.427-443
Main authors: Gui, Yaocheng; Liu, Qian; Gao, Zhiqiang
Format: Article
Language: English
Description
Summary: Distant supervision has become the leading method for training large-scale information extractors. It can be encoded in the form of labeling functions, which employ knowledge bases to provide labels for the data. However, most previous work uses only simple labeling functions, which introduces too much noise into the training data and leaves the knowledge bases far from fully explored. To improve the labeling quality of the training data for distantly supervised relation extraction, this paper proposes to use existing knowledge bases to learn labeling functions effectively. Specifically, labeling functions are represented in Markov Logic, which can naturally integrate various resources into a unified model. Experimental results show that the training data produced by the learned labeling functions is significantly improved in quality. Different distantly supervised relation extraction models trained on the produced training data also achieve better performance.
ISSN: 1088-467X, 1571-4128
DOI: 10.3233/IDA-194492
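
The "simple labeling functions" that the summary contrasts with learned ones can be illustrated with a minimal sketch. This is not the authors' method (the paper learns labeling functions represented in Markov Logic); it only shows the common baseline heuristic of labeling any sentence that mentions both entities of a knowledge-base fact. The knowledge base `TOY_KB`, the relation name `born_in`, and the function `simple_labeling_function` are hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
TOY_KB = {
    ("Barack Obama", "Honolulu"): "born_in",
    ("Marie Curie", "Warsaw"): "born_in",
}


def simple_labeling_function(sentence: str, head: str, tail: str) -> Optional[str]:
    """Return the KB relation for (head, tail) if the sentence mentions both
    entities; otherwise return None (no label)."""
    if head in sentence and tail in sentence:
        return TOY_KB.get((head, tail))
    return None


sentences = [
    "Barack Obama was born in Honolulu.",
    # Mentions both entities but does not express the relation:
    # the heuristic still labels it, which is a noisy training example.
    "Barack Obama visited Honolulu last week.",
]
labels = [simple_labeling_function(s, "Barack Obama", "Honolulu") for s in sentences]
```

Both sentences receive the same label even though only the first expresses the relation, which is the kind of noise the summary attributes to simple labeling functions.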