Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | The quality of web sources has been traditionally evaluated using exogenous
signals such as the hyperlink structure of the graph. We propose a new approach
that relies on endogenous signals, namely, the correctness of factual
information provided by the source. A source that has few false facts is
considered to be trustworthy. The facts are automatically extracted from each
source by information extraction methods commonly used to construct knowledge
bases. We propose a way to distinguish errors made in the extraction process
from factual errors in the web source per se, by using joint inference in a
novel multi-layer probabilistic model. We call the trustworthiness score we
compute Knowledge-Based Trust (KBT). On synthetic data, we show that our
method can reliably compute the true trustworthiness levels of the sources. We
then apply it to a database of 2.8B facts extracted from the web, and thereby
estimate the trustworthiness of 119M webpages. Manual evaluation of a subset of
the results confirms the effectiveness of the method. |
---|---|
DOI: | 10.48550/arxiv.1502.03519 |