Lost-in-Distance: Impact of Contextual Proximity on LLM Performance in Graph Tasks
| Main authors: | , , , |
|---|---|
| Format: | Article |
| Language: | eng |
| Subjects: | |
| Online access: | Order full text |
Abstract: Despite significant advancements, Large Language Models (LLMs) exhibit blind spots that impair their ability to retrieve and process relevant contextual data effectively. We demonstrate that LLM performance in graph tasks whose complexity goes beyond the "needle-in-a-haystack" scenario, where solving the problem requires cross-referencing and reasoning across multiple subproblems jointly, is influenced by the proximity of relevant information within the context, a phenomenon we term "lost-in-distance". We examine two fundamental graph tasks, identifying common connections between two nodes and assessing similarity among three nodes, and show that the model's performance in these tasks significantly depends on the relative positioning of common edges. We evaluate three publicly available LLMs (Llama-3-8B, Llama-3-70B, and GPT-4) using various graph encoding techniques that represent graph structures for LLM input. We propose a formulation for the lost-in-distance phenomenon and demonstrate that the lost-in-distance and lost-in-the-middle phenomena occur independently. Results indicate that model accuracy can decline by up to 6x as the distance between node connections increases, independent of graph encoding and model size.
DOI: 10.48550/arxiv.2410.01985
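
To make the common-connections task concrete, below is a minimal Python sketch of how a graph might be serialized into text for an LLM prompt and how the separation between the relevant edges in that serialization could be measured. This is an illustrative assumption, not the paper's exact setup: the sentence-per-edge encoding, the function names, and the `mention_distance` proxy are hypothetical stand-ins for the encodings and distance formulation the paper studies.

```python
def encode_graph(edges):
    """Serialize a graph as one sentence per edge (a simple adjacency-style
    text encoding; the paper evaluates several encoding techniques)."""
    return [f"Node {u} is connected to node {v}." for u, v in edges]

def common_connections(edges, a, b):
    """Ground truth for the task: neighbors shared by nodes a and b."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs.get(a, set()) & nbrs.get(b, set())

def mention_distance(edges, a, b):
    """Hypothetical proxy for 'distance': the gap, in edge sentences, between
    the last edge incident to a and the first edge incident to b."""
    idx_a = [i for i, e in enumerate(edges) if a in e]
    idx_b = [i for i, e in enumerate(edges) if b in e]
    return abs(min(idx_b) - max(idx_a))

edges = [(1, 2), (1, 3), (4, 5), (6, 7), (2, 3), (3, 5)]
a, b = 1, 5
prompt = (" ".join(encode_graph(edges))
          + f" Which nodes are connected to both node {a} and node {b}?")
print(prompt)
print("ground truth:", common_connections(edges, a, b))   # {3}
print("mention distance:", mention_distance(edges, a, b))
```

Under this framing, the paper's finding is that reordering `edges` so that the sentences mentioning the two query nodes sit farther apart in the prompt degrades model accuracy, even though the ground truth answer is unchanged.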