How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly when handling deceptive information in prompts, which can lead them to produce hallucinated responses. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. We provide a comprehensive analysis of popular MLLMs, ranging from GPT-4V, Reka, and Gemini-Pro to open-source models such as LLaVA-NeXT and MiniCPM-Llama3. Empirically, we observe significant performance gaps between GPT-4o and the other models, and previous robust instruction-tuned models are not effective on this new benchmark. While GPT-4o achieves 82.82% accuracy on MAD-Bench, the accuracy of every other model in our experiments ranges from 9% to 50%. We further propose a remedy that adds an additional paragraph to the deceptive prompts to encourage models to think twice before answering the question. Surprisingly, this simple method can even double the accuracy; however, the absolute numbers are still too low to be satisfactory. We hope MAD-Bench can serve as a valuable benchmark to stimulate further research to enhance model resilience against deceptive prompts.

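The remedy described in the abstract operates purely at the prompt level: a cautionary paragraph is prepended to the deceptive question so the model verifies the question's premises against the image before answering. The sketch below illustrates that idea; the wording of the cautionary paragraph and the helper name `harden_prompt` are illustrative assumptions, not the authors' exact prompt or released code.

```python
# Minimal sketch (not the authors' code) of the mitigation idea from the abstract:
# prepend a cautionary paragraph so the model checks whether the objects, counts,
# and spatial relationships mentioned in the question actually appear in the image.

CAUTION = (
    "Before answering, carefully check whether the premises in the question "
    "actually match the image. If an object, count, or spatial relationship "
    "mentioned in the question does not appear in the image, say so instead "
    "of describing something that is not there."
)

def harden_prompt(question: str) -> str:
    """Return the question with the cautionary paragraph prepended."""
    return f"{CAUTION}\n\n{question}"

if __name__ == "__main__":
    # Example deceptive prompt: the image may contain no cat at all.
    deceptive = "What color is the cat sitting on the sofa?"
    print(harden_prompt(deceptive))
```
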
Bibliographic Details
Published in: arXiv.org, 2024-07
Main Authors: Qian, Yusu; Zhang, Haotian; Yang, Yinfei; Gan, Zhe
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Accuracy; Benchmarks; Empirical analysis; Large language models
Online Access: Full text