Knowledge Graph Guided Semantic Evaluation of Language Models For User Trust
Saved in:
Main authors:
Format: Article
Language: eng
Subjects:
Online access: Order full text
Summary: A fundamental question in natural language processing is: what kind of language structure and semantics does a language model capture? Graph formats such as knowledge graphs are easy to evaluate because they express language semantics and structure explicitly. This study evaluates the semantics encoded in self-attention transformers by leveraging explicit knowledge graph structures. We propose novel metrics that measure the reconstruction error when graph path sequences from a knowledge graph are provided as input and the same sequences are reconstructed from the outputs of self-attention transformer models. The opacity of language models has an immense bearing on societal issues of trust and explainable decision outcomes. Our findings suggest that language models are models of stochastic control processes for plausible language pattern generation; however, they do not ascribe object- and concept-level meaning and semantics to the learned stochastic patterns, such as those described in knowledge graphs. Furthermore, to enable robust evaluation of concept understanding by language models, we construct and make public an augmented language understanding benchmark built on the General Language Understanding Evaluation (GLUE) benchmark. This has significant application-level user trust implications, as stochastic patterns without a strong sense of meaning cannot be trusted in high-stakes applications.
DOI: 10.48550/arxiv.2305.04989
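
The abstract does not specify the paper's actual metrics, so the following is only a minimal illustrative sketch of the general idea: verbalise knowledge-graph paths as token sequences, mask tokens one at a time, and measure how often a masked language model fails to reconstruct them. The model name, the example paths, and the per-token masking scheme are assumptions, not the authors' method.

```python
# Hypothetical sketch: token-level reconstruction error for knowledge-graph
# path sequences fed through a masked language model. Model choice, input
# format, and masking scheme are illustrative assumptions, not the paper's
# actual procedure.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Knowledge-graph paths verbalised as plain text sequences (assumed format).
kg_paths = [
    "paris capital_of france located_in europe",
    "aspirin treats headache symptom_of migraine",
]

def reconstruction_error(sequence: str) -> float:
    """Mask each token in turn and check whether the model recovers it.

    Returns the fraction of tokens the model fails to reconstruct
    (0.0 = perfect reconstruction, 1.0 = nothing recovered).
    """
    enc = tokenizer(sequence, return_tensors="pt")
    input_ids = enc["input_ids"][0]
    errors, total = 0, 0
    for pos in range(1, input_ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits
        predicted = logits[0, pos].argmax().item()
        errors += int(predicted != input_ids[pos].item())
        total += 1
    return errors / max(total, 1)

for path in kg_paths:
    print(f"{path!r}: reconstruction error = {reconstruction_error(path):.2f}")
```

A low error on such sequences would indicate only that the model can regenerate plausible surface patterns, not that it grasps the underlying entities and relations, which is the distinction the abstract draws.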