Evaluating Morphological Compositional Generalization in Large Language Models

Large language models (LLMs) have demonstrated significant progress in various natural language generation and understanding tasks. However, their linguistic generalization capabilities remain questionable, raising doubts about whether these models learn language similarly to humans. While humans exhibit compositional generalization and linguistic creativity in language use, the extent to which LLMs replicate these abilities, particularly in morphology, is under-explored. In this work, we systematically investigate the morphological generalization abilities of LLMs through the lens of compositionality. We define morphemes as compositional primitives and design a novel suite of generative and discriminative tasks to assess morphological productivity and systematicity. Focusing on agglutinative languages such as Turkish and Finnish, we evaluate several state-of-the-art instruction-finetuned multilingual models, including GPT-4 and Gemini. Our analysis shows that LLMs struggle with morphological compositional generalization, particularly when applied to novel word roots, with performance declining sharply as morphological complexity increases. While models can identify individual morphological combinations better than chance, their performance lacks systematicity, leading to significant accuracy gaps compared to humans.
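
To make the notion of morphological composition concrete, the following is a minimal illustrative sketch in Python (with invented field names; it is not the authors' dataset format or evaluation code) of the kind of root-plus-suffix item such a generative test suite could contain for Turkish:

def compose(root: str, suffixes: list[str]) -> str:
    """Concatenate a root with a sequence of suffixes (ignoring vowel
    harmony and other morphophonological alternations for simplicity)."""
    return root + "".join(suffixes)

# Toy item: "ev" (house) + "-ler" (plural) + "-de" (locative) -> "evlerde" (in the houses)
item = {
    "root": "ev",
    "suffixes": ["ler", "de"],
    "gold": "evlerde",
}

prediction = compose(item["root"], item["suffixes"])
print(prediction, prediction == item["gold"])  # evlerde True

Real test items would also have to respect vowel harmony and other morphophonological rules, which this toy concatenation ignores.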


Bibliographic Details
Main authors: Ismayilzada, Mete; Circi, Defne; Sälevä, Jonne; Sirin, Hale; Köksal, Abdullatif; Dhingra, Bhuwan; Bosselut, Antoine; van der Plas, Lonneke; Ataman, Duygu
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Online access: Order full text
Published: 2024-10-16
DOI: 10.48550/arxiv.2410.12656
Source: arXiv.org