PragmaticOIE: a pragmatic open information extraction for Portuguese language


Bibliographic details
Published in: Knowledge and Information Systems, 2020-09, Vol. 62 (9), p. 3811-3836
Main authors: Sena, Cleiton Fernando Lima; Claro, Daniela Barreiro
Format: Article
Language: English
Description
Summary: Information extraction (IE) involves the extraction of useful facts from texts. IE approaches have been categorized into two types: Traditional IE and Open IE. Traditional IE recognizes a predefined set of relationships between the arguments, and it has typically been applied to specific domains. Open IE extracts relationship descriptors expressing any semantic relationship between a pair of arguments in different domains. Because a sentence can have different meanings depending on the context and intention with which it is used, semantic analysis alone does not guarantee useful extractions. Extractions depend on the context and the intention inherent in a sentence, which go beyond its semantic meaning. Thus, a pragmatic analysis enhances the set of extractions by considering the contextual and intentional aspects. As a consequence, new facts can be extracted from this set of sentences. The combination of inference, context, and intention enables the extraction of implicit facts from texts, achieving a first pragmatic level. This novel approach increases the number of facts by extracting relationships from a sentence through the analysis of inference, context, and intention. This is the first method to analyze a first pragmatic level from a sentence within a set of Portuguese text documents. Our method was evaluated on a set of Portuguese text documents and outperforms the most relevant related work in accuracy, number of extracted facts, and minimality measures.
ISSN: 0219-1377, 0219-3116
DOI: 10.1007/s10115-020-01442-7