Investigating Instruction Tuning Large Language Models on Graphs

Inspired by the recent advancements of Large Language Models (LLMs) in NLP tasks, there's growing interest in applying LLMs to graph-related tasks. This study delves into the capabilities of instruction-following LLMs for engaging with real-world graphs, aiming to offer empirical insights into how LLMs can effectively interact with graphs and generalize across graph tasks. We begin by constructing a dataset designed for instruction tuning, which comprises a diverse collection of 79 graph-related tasks from academic and e-commerce domains, featuring 44,240 training instances and 18,960 test samples. Utilizing this benchmark, our initial investigation focuses on identifying the optimal graph representation that serves as a conduit for LLMs to understand complex graph structures. Our findings indicate that JSON format for graph representation consistently outperforms natural language and code formats across various LLMs and graph types. Furthermore, we examine the key factors that influence the generalization abilities of instruction-tuned LLMs by evaluating their performance on both in-domain and out-of-domain graph tasks.
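
To make the abstract's comparison concrete, the sketch below shows one plausible way a small graph could be serialized as JSON versus natural language before being placed in an LLM prompt. The toy nodes, edge schema, and helper functions (to_json_prompt, to_natural_language_prompt) are illustrative assumptions, not the paper's actual dataset schema or code.

    import json

    # Toy graph: node ids mapped to attributes, plus a list of typed edges.
    # This schema is an illustrative assumption, not the dataset used in the paper.
    nodes = {
        "p1": {"title": "Paper A", "domain": "academic"},
        "p2": {"title": "Paper B", "domain": "academic"},
    }
    edges = [("p1", "cites", "p2")]

    def to_json_prompt(nodes, edges):
        # Serialize the graph as JSON, the representation the study reports as strongest.
        return json.dumps(
            {
                "nodes": nodes,
                "edges": [{"source": s, "relation": r, "target": t} for s, r, t in edges],
            },
            indent=2,
        )

    def to_natural_language_prompt(nodes, edges):
        # Render the same graph as plain sentences, one of the weaker-performing formats.
        sentences = [f'{nodes[s]["title"]} {r} {nodes[t]["title"]}.' for s, r, t in edges]
        return " ".join(sentences)

    print(to_json_prompt(nodes, edges))              # explicit nodes/edges structure
    print(to_natural_language_prompt(nodes, edges))  # "Paper A cites Paper B."

The JSON rendering keeps node attributes and edge relations explicit, which is consistent with the abstract's finding that JSON representations were the most reliable across LLMs and graph types.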

Bibliographic Details
Main Authors: Zhu, Kerui; Huang, Bo-Wei; Jin, Bowen; Jiao, Yizhu; Zhong, Ming; Chang, Kevin; Lin, Shou-De; Han, Jiawei
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Online Access: Order full text
DOI: 10.48550/arxiv.2408.05457
Source: arXiv.org
URL: https://arxiv.org/abs/2408.05457