Minimizing Factual Inconsistency and Hallucination in Large Language Models
Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that generates the rationale first, verifies and refines incorrect ones, and uses them as supporting references to generate the answer. The generated rationale enhances the transparency of the answer and our framework provides insights into how the model arrived at this answer, by using this rationale and the references to the context. In this paper, we demonstrate its effectiveness in improving the quality of responses to drug-related inquiries in the life sciences industry. Our framework improves traditional Retrieval Augmented Generation (RAG) by enabling OpenAI GPT-3.5-turbo to be 14-25% more faithful and 16-22% more accurate on two datasets. Furthermore, fine-tuning samples based on our framework improves the accuracy of smaller open-access LLMs by 33-42% and competes with RAG on commercial models.
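The abstract describes a multi-stage pipeline: draft a rationale from retrieved context, verify and refine it, then use the refined rationale as a supporting reference when generating the answer. The sketch below is only a rough illustration of how such a pipeline could be wired together; the `llm` callable, prompt wording, and function names are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of a rationale-then-answer pipeline as outlined in the
# abstract. The `llm` interface, prompts, and refinement loop are assumptions,
# not the paper's actual method.
from typing import Callable, List


def answer_with_verified_rationale(
    question: str,
    context_passages: List[str],
    llm: Callable[[str], str],   # any text-in/text-out LLM interface
    max_refine_rounds: int = 2,
) -> dict:
    context = "\n".join(context_passages)

    # Stage 1: draft a rationale grounded in the retrieved context.
    rationale = llm(
        f"Context:\n{context}\n\nQuestion: {question}\n"
        "Write a step-by-step rationale using only the context."
    )

    # Stage 2: verify the rationale against the context; refine if unsupported.
    for _ in range(max_refine_rounds):
        verdict = llm(
            f"Context:\n{context}\n\nRationale:\n{rationale}\n"
            "Is every claim supported by the context? "
            "Answer SUPPORTED or list the unsupported claims."
        )
        if verdict.strip().upper().startswith("SUPPORTED"):
            break
        rationale = llm(
            f"Context:\n{context}\n\nRationale:\n{rationale}\n"
            f"Issues found:\n{verdict}\n"
            "Rewrite the rationale so that every claim is supported by the context."
        )

    # Stage 3: answer using the (refined) rationale as a supporting reference.
    answer = llm(
        f"Context:\n{context}\n\nRationale:\n{rationale}\n\n"
        f"Question: {question}\n"
        "Answer the question, citing the rationale and the context."
    )
    return {"answer": answer, "rationale": rationale}
```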
Published in: | arXiv.org, 2023-11 |
---|---|
Main authors: | Muneeswaran, I; Saxena, Shreya; Prasad, Siva; Sai Prakash, M V; Shankar, Advaith; Varun, V; Vaddina, Vishal; Gopalakrishnan, Saisubramaniam |
Format: | Article |
Language: | English |
Subjects: | Large language models |
Online access: | Full text |
container_title | arXiv.org |
---|---|
creator | Muneeswaran, I; Saxena, Shreya; Prasad, Siva; Sai Prakash, M V; Shankar, Advaith; Varun, V; Vaddina, Vishal; Gopalakrishnan, Saisubramaniam |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-11 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2894157310 |
source | Free E-Journals |
subjects | Large language models |
title | Minimizing Factual Inconsistency and Hallucination in Large Language Models |