RelHunter: a machine learning method for relation extraction from text

We propose RelHunter , a machine learning-based method for the extraction of structured information from text. RelHunter ’s key idea is to model the target structures as a relation over entities. Hence, the modeling effort is reduced to the identification of entities and the generation of a candidat...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of the Brazilian Computer Society 2010-09, Vol.16 (3), p.191-199
Hauptverfasser: Fernandes, Eraldo R., Milidiú, Ruy L., Rentería, Raúl P.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We propose RelHunter , a machine learning-based method for the extraction of structured information from text. RelHunter ’s key idea is to model the target structures as a relation over entities. Hence, the modeling effort is reduced to the identification of entities and the generation of a candidate relation, which are simpler problems than the original one. RelHunter fits a very broad spectrum of complex computational linguistic problems. We apply it to five tasks: phrase chunking, clause identification, hedge detection, quotation extraction, and dependency parsing. We compare RelHunter to token classification approaches through several computational experiments on seven multilingual corpora. RelHunter outperforms the token classification approaches by 2.14% on average. Moreover, we compare the derived systems against state-of-the-art systems for each corpus. Our systems achieve state-of-the-art performances for three corpora: Portuguese phrase chunking, Portuguese clause identification, and English quotation extraction. Additionally, the derived systems show good quality performance for the other four corpora.
ISSN:0104-6500
1678-4804
DOI:10.1007/s13173-010-0018-y