Linking Surface Facts to Large-Scale Knowledge Graphs
Open Information Extraction (OIE) methods extract facts from natural language text in the form of ("subject"; "relation"; "object") triples. These facts are, however, merely surface forms, the ambiguity of which impedes their downstream usage; e.g., the surface phrase &...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Open Information Extraction (OIE) methods extract facts from natural language
text in the form of ("subject"; "relation"; "object") triples. These facts are,
however, merely surface forms, the ambiguity of which impedes their downstream
usage; e.g., the surface phrase "Michael Jordan" may refer to either the former
basketball player or the university professor. Knowledge Graphs (KGs), on the
other hand, contain facts in a canonical (i.e., unambiguous) form, but their
coverage is limited by a static schema (i.e., a fixed set of entities and
predicates). To bridge this gap, we need the best of both worlds: (i) high
coverage of free-text OIEs, and (ii) semantic precision (i.e., monosemy) of
KGs. In order to achieve this goal, we propose a new benchmark with novel
evaluation protocols that can, for example, measure fact linking performance on
a granular triple slot level, while also measuring if a system has the ability
to recognize that a surface form has no match in the existing KG. Our extensive
evaluation of several baselines show that detection of out-of-KG entities and
predicates is more difficult than accurate linking to existing ones, thus
calling for more research efforts on this difficult task. We publicly release
all resources (data, benchmark and code) on
https://github.com/nec-research/fact-linking. |
---|---|
DOI: | 10.48550/arxiv.2310.14909 |