Investigating Instruction Tuning Large Language Models on Graphs

Inspired by the recent advancements of Large Language Models (LLMs) in NLP tasks, there's growing interest in applying LLMs to graph-related tasks. This study delves into the capabilities of instruction-following LLMs for engaging with real-world graphs, aiming to offer empirical insights into how LLMs can effectively interact with graphs and generalize across graph tasks. We begin by constructing a dataset designed for instruction tuning, which comprises a diverse collection of 79 graph-related tasks from academic and e-commerce domains, featuring 44,240 training instances and 18,960 test samples. Utilizing this benchmark, our initial investigation focuses on identifying the optimal graph representation that serves as a conduit for LLMs to understand complex graph structures. Our findings indicate that JSON format for graph representation consistently outperforms natural language and code formats across various LLMs and graph types. Furthermore, we examine the key factors that influence the generalization abilities of instruction-tuned LLMs by evaluating their performance on both in-domain and out-of-domain graph tasks.
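
To make the abstract's comparison concrete, the sketch below shows one plausible way a small graph could be serialized as JSON versus natural language before being placed in an LLM prompt. The toy nodes, edge schema, and helper functions (to_json_prompt, to_natural_language_prompt) are illustrative assumptions, not the paper's actual dataset schema or code.

    import json

    # Toy graph: node ids mapped to attributes, plus a list of typed edges.
    # This schema is an illustrative assumption, not the dataset used in the paper.
    nodes = {
        "p1": {"title": "Paper A", "domain": "academic"},
        "p2": {"title": "Paper B", "domain": "academic"},
    }
    edges = [("p1", "cites", "p2")]

    def to_json_prompt(nodes, edges):
        # Serialize the graph as JSON, the representation the study reports as strongest.
        return json.dumps(
            {
                "nodes": nodes,
                "edges": [{"source": s, "relation": r, "target": t} for s, r, t in edges],
            },
            indent=2,
        )

    def to_natural_language_prompt(nodes, edges):
        # Render the same graph as plain sentences, one of the weaker-performing formats.
        sentences = [f'{nodes[s]["title"]} {r} {nodes[t]["title"]}.' for s, r, t in edges]
        return " ".join(sentences)

    print(to_json_prompt(nodes, edges))              # explicit nodes/edges structure
    print(to_natural_language_prompt(nodes, edges))  # "Paper A cites Paper B."

The JSON rendering keeps node attributes and edge relations explicit, which is consistent with the abstract's finding that JSON representations were the most reliable across LLMs and graph types.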

Bibliographic Details
Main Authors: Zhu, Kerui; Huang, Bo-Wei; Jin, Bowen; Jiao, Yizhu; Zhong, Ming; Chang, Kevin; Lin, Shou-De; Han, Jiawei
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Online Access: Order full text
DOI: 10.48550/arxiv.2408.05457
Source: arXiv.org
URL: https://arxiv.org/abs/2408.05457