Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation

Automatic evaluation by large language models (LLMs) is a prominent topic today; however, judgment and evaluation tasks are often subjective and influenced by various factors, making adaptation challenging. While many studies demonstrate the capabilities of state-of-the-art proprietary LLMs in comparison to human evaluators, they often struggle to adapt to reference evaluators over time, a requirement for achieving personalized judgment. Additionally, numerous works have attempted to apply open LLMs as judges or evaluators, but these efforts frequently overlook the limitations of working with scarce data. Personalized judgment is inherently associated with limited-data scenarios, which are common in many real-world problems. Our work presents a data augmentation technique to select a more effective sample from limited data in order to align an open LLM with human preference. Our work achieves approximately 7% improvement in Pearson correlation with a reference judge over the baseline, and 30% improvement over the base model (Llama3.1-8B-Instruct) in the mathematical reasoning evaluation task, demonstrating that selecting more effective preference data through augmentation enables our approach to surpass baseline methods.

Full Description

Saved in:
Bibliographic Details
Main Authors: Seraj, Javad; Mohajeri, Mohammad Mahdi; Dousti, Mohammad Javad; Ahmadabadi, Majid Nili
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Seraj, Javad
Mohajeri, Mohammad Mahdi
Dousti, Mohammad Javad
Ahmadabadi, Majid Nili
description Automatic evaluation by large language models (LLMs) is a prominent topic today; however, judgment and evaluation tasks are often subjective and influenced by various factors, making adaptation challenging. While many studies demonstrate the capabilities of state-of-the-art proprietary LLMs in comparison to human evaluators, they often struggle to adapt to reference evaluators over time, a requirement for achieving personalized judgment. Additionally, numerous works have attempted to apply open LLMs as judges or evaluators, but these efforts frequently overlook the limitations of working with scarce data. Personalized judgment is inherently associated with limited-data scenarios, which are common in many real-world problems. Our work aims to present a data augmentation technique to select a more effective sample from limited data in order to align an open LLM with human preference. Our work achieves approximately 7% improvement in Pearson correlation with a reference judge over the baseline, and 30% improvement over the base model (Llama3.1-8B-Instruct) in the mathematical reasoning evaluation task, demonstrating that selecting more effective preference data through augmentation enables our approach to surpass baseline methods.
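The abstract reports alignment as Pearson correlation between a model judge and a reference (human) judge. As a minimal sketch of how such a metric is computed, the snippet below correlates two hypothetical score lists; the scores and the `pearson` helper are illustrative, not taken from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical 1-5 ratings from a reference judge and a model judge on the same items
reference_scores = [5, 3, 4, 2, 5, 1]
model_scores     = [4, 3, 5, 2, 4, 2]
print(round(pearson(reference_scores, model_scores), 3))  # → 0.843
```

A correlation near 1.0 would indicate the model judge tracks the reference judge closely; the paper's reported gains are relative improvements in this quantity after fine-tuning on augmented preference data.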
doi_str_mv 10.48550/arxiv.2412.07429
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2412.07429
language eng
recordid cdi_arxiv_primary_2412_07429
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computation and Language
title Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T18%3A27%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimizing%20Alignment%20with%20Less:%20Leveraging%20Data%20Augmentation%20for%20Personalized%20Evaluation&rft.au=Seraj,%20Javad&rft.date=2024-12-10&rft_id=info:doi/10.48550/arxiv.2412.07429&rft_dat=%3Carxiv_GOX%3E2412_07429%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true