How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts
The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we...
Gespeichert in:
Veröffentlicht in: | arXiv.org 2024-07 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Qian, Yusu Zhang, Haotian Yang, Yinfei Gan, Zhe |
description | The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. We provide a comprehensive analysis of popular MLLMs, ranging from GPT-4v, Reka, Gemini-Pro, to open-sourced models, such as LLaVA-NeXT and MiniCPM-Llama3. Empirically, we observe significant performance gaps between GPT-4o and other models; and previous robust instruction-tuned models are not effective on this new benchmark. While GPT-4o achieves 82.82% accuracy on MAD-Bench, the accuracy of any other model in our experiments ranges from 9% to 50%. We further propose a remedy that adds an additional paragraph to the deceptive prompts to encourage models to think twice before answering the question. Surprisingly, this simple method can even double the accuracy; however, the absolute numbers are still too low to be satisfactory. We hope MAD-Bench can serve as a valuable benchmark to stimulate further research to enhance model resilience against deceptive prompts. |
format | Article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2929273655</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2929273655</sourcerecordid><originalsourceid>FETCH-proquest_journals_29292736553</originalsourceid><addsrcrecordid>eNqNTssKgkAUHYIgKf_hQmvBxtRahZRioNAiiFYiNsHI6LW5Y-HfN4s-IM7iwOG8ZszhQbDxdlvOF8wlan3f51HMwzBw2C3HD6Q1TSAJzgYMQoao4I6jhnJURnb4qBUURUkHSHpIu0Fq2Vgp6Ws1kY1hDyfRiMHIt4CLxm4wtGLzZ61IuD9esnWWXo-5N2h8jYJM1doF20AV31vEQWT__Of6AmP5QB4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2929273655</pqid></control><display><type>article</type><title>How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts</title><source>Free E- Journals</source><creator>Qian, Yusu ; Zhang, Haotian ; Yang, Yinfei ; Gan, Zhe</creator><creatorcontrib>Qian, Yusu ; Zhang, Haotian ; Yang, Yinfei ; Gan, Zhe</creatorcontrib><description>The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. We provide a comprehensive analysis of popular MLLMs, ranging from GPT-4v, Reka, Gemini-Pro, to open-sourced models, such as LLaVA-NeXT and MiniCPM-Llama3. Empirically, we observe significant performance gaps between GPT-4o and other models; and previous robust instruction-tuned models are not effective on this new benchmark. While GPT-4o achieves 82.82% accuracy on MAD-Bench, the accuracy of any other model in our experiments ranges from 9% to 50%. We further propose a remedy that adds an additional paragraph to the deceptive prompts to encourage models to think twice before answering the question. Surprisingly, this simple method can even double the accuracy; however, the absolute numbers are still too low to be satisfactory. We hope MAD-Bench can serve as a valuable benchmark to stimulate further research to enhance model resilience against deceptive prompts.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Accuracy ; Benchmarks ; Empirical analysis ; Large language models</subject><ispartof>arXiv.org, 2024-07</ispartof><rights>2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>778,782</link.rule.ids></links><search><creatorcontrib>Qian, Yusu</creatorcontrib><creatorcontrib>Zhang, Haotian</creatorcontrib><creatorcontrib>Yang, Yinfei</creatorcontrib><creatorcontrib>Gan, Zhe</creatorcontrib><title>How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts</title><title>arXiv.org</title><description>The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. We provide a comprehensive analysis of popular MLLMs, ranging from GPT-4v, Reka, Gemini-Pro, to open-sourced models, such as LLaVA-NeXT and MiniCPM-Llama3. Empirically, we observe significant performance gaps between GPT-4o and other models; and previous robust instruction-tuned models are not effective on this new benchmark. While GPT-4o achieves 82.82% accuracy on MAD-Bench, the accuracy of any other model in our experiments ranges from 9% to 50%. We further propose a remedy that adds an additional paragraph to the deceptive prompts to encourage models to think twice before answering the question. Surprisingly, this simple method can even double the accuracy; however, the absolute numbers are still too low to be satisfactory. We hope MAD-Bench can serve as a valuable benchmark to stimulate further research to enhance model resilience against deceptive prompts.</description><subject>Accuracy</subject><subject>Benchmarks</subject><subject>Empirical analysis</subject><subject>Large language models</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNTssKgkAUHYIgKf_hQmvBxtRahZRioNAiiFYiNsHI6LW5Y-HfN4s-IM7iwOG8ZszhQbDxdlvOF8wlan3f51HMwzBw2C3HD6Q1TSAJzgYMQoao4I6jhnJURnb4qBUURUkHSHpIu0Fq2Vgp6Ws1kY1hDyfRiMHIt4CLxm4wtGLzZ61IuD9esnWWXo-5N2h8jYJM1doF20AV31vEQWT__Of6AmP5QB4</recordid><startdate>20240723</startdate><enddate>20240723</enddate><creator>Qian, Yusu</creator><creator>Zhang, Haotian</creator><creator>Yang, Yinfei</creator><creator>Gan, Zhe</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240723</creationdate><title>How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts</title><author>Qian, Yusu ; Zhang, Haotian ; Yang, Yinfei ; Gan, Zhe</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_29292736553</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Accuracy</topic><topic>Benchmarks</topic><topic>Empirical analysis</topic><topic>Large language models</topic><toplevel>online_resources</toplevel><creatorcontrib>Qian, Yusu</creatorcontrib><creatorcontrib>Zhang, Haotian</creatorcontrib><creatorcontrib>Yang, Yinfei</creatorcontrib><creatorcontrib>Gan, Zhe</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Qian, Yusu</au><au>Zhang, Haotian</au><au>Yang, Yinfei</au><au>Gan, Zhe</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts</atitle><jtitle>arXiv.org</jtitle><date>2024-07-23</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 1000 test samples divided into 5 categories, such as non-existent objects, count of objects, and spatial relationship. We provide a comprehensive analysis of popular MLLMs, ranging from GPT-4v, Reka, Gemini-Pro, to open-sourced models, such as LLaVA-NeXT and MiniCPM-Llama3. Empirically, we observe significant performance gaps between GPT-4o and other models; and previous robust instruction-tuned models are not effective on this new benchmark. While GPT-4o achieves 82.82% accuracy on MAD-Bench, the accuracy of any other model in our experiments ranges from 9% to 50%. We further propose a remedy that adds an additional paragraph to the deceptive prompts to encourage models to think twice before answering the question. Surprisingly, this simple method can even double the accuracy; however, the absolute numbers are still too low to be satisfactory. We hope MAD-Bench can serve as a valuable benchmark to stimulate further research to enhance model resilience against deceptive prompts.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-07 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2929273655 |
source | Free E- Journals |
subjects | Accuracy Benchmarks Empirical analysis Large language models |
title | How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T00%3A22%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=How%20Easy%20is%20It%20to%20Fool%20Your%20Multimodal%20LLMs?%20An%20Empirical%20Analysis%20on%20Deceptive%20Prompts&rft.jtitle=arXiv.org&rft.au=Qian,%20Yusu&rft.date=2024-07-23&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2929273655%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2929273655&rft_id=info:pmid/&rfr_iscdi=true |