How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly when handling deceptive information in prompts, which can lead them to produce hallucinated responses. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. We provide a comprehensive analysis of popular MLLMs, ranging from GPT-4V, Reka, and Gemini-Pro to open-source models such as LLaVA-NeXT and MiniCPM-Llama3. Empirically, we observe significant performance gaps between GPT-4o and the other models, and previous robust instruction-tuned models are not effective on this new benchmark. While GPT-4o achieves 82.82% accuracy on MAD-Bench, the accuracy of every other model in our experiments ranges from 9% to 50%. We further propose a remedy that adds an additional paragraph to the deceptive prompts to encourage models to think twice before answering the question. Surprisingly, this simple method can even double the accuracy; however, the absolute numbers are still too low to be satisfactory. We hope MAD-Bench can serve as a valuable benchmark to stimulate further research to enhance model resilience against deceptive prompts.

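The remedy described in the abstract operates purely at the prompt level: a cautionary paragraph is prepended to the deceptive question so the model verifies the question's premises against the image before answering. The sketch below illustrates that idea; the wording of the cautionary paragraph and the helper name `harden_prompt` are illustrative assumptions, not the authors' exact prompt or released code.

```python
# Minimal sketch (not the authors' code) of the mitigation idea from the abstract:
# prepend a cautionary paragraph so the model checks whether the objects, counts,
# and spatial relationships mentioned in the question actually appear in the image.

CAUTION = (
    "Before answering, carefully check whether the premises in the question "
    "actually match the image. If an object, count, or spatial relationship "
    "mentioned in the question does not appear in the image, say so instead "
    "of describing something that is not there."
)

def harden_prompt(question: str) -> str:
    """Return the question with the cautionary paragraph prepended."""
    return f"{CAUTION}\n\n{question}"

if __name__ == "__main__":
    # Example deceptive prompt: the image may contain no cat at all.
    deceptive = "What color is the cat sitting on the sofa?"
    print(harden_prompt(deceptive))
```
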
Bibliographic Details
Published in: arXiv.org, 2024-07
Main Authors: Qian, Yusu; Zhang, Haotian; Yang, Yinfei; Gan, Zhe
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Accuracy; Benchmarks; Empirical analysis; Large language models
Online Access: Full text