Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge

Multimodal Named Entity Recognition (MNER) on social media aims to enhance textual entity prediction by incorporating image-based clues. Existing studies mainly focus on maximizing the utilization of pertinent image information or incorporating external knowledge from explicit knowledge bases. However, these methods either neglect the necessity of providing the model with external knowledge, or encounter issues of high redundancy in the retrieved knowledge. In this paper, we present PGIM -- a two-stage framework that aims to leverage ChatGPT as an implicit knowledge base and enable it to heuristically generate auxiliary knowledge for more efficient entity prediction. Specifically, PGIM contains a Multimodal Similar Example Awareness module that selects suitable examples from a small number of predefined artificial samples. These examples are then integrated into a formatted prompt template tailored to the MNER task and guide ChatGPT to generate auxiliary refined knowledge. Finally, the acquired knowledge is integrated with the original text and fed into a downstream model for further processing. Extensive experiments show that PGIM outperforms state-of-the-art methods on two classic MNER datasets and exhibits stronger robustness and generalization capability.
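The abstract describes a two-stage pipeline: select demonstrations similar to the input, prompt ChatGPT with them to obtain auxiliary refined knowledge, then feed the original text together with that knowledge to a downstream NER model. The following is a minimal, illustrative Python sketch of that flow only; the example pool, the bag-of-words similarity, the prompt wording, the `[SEP]` concatenation, and all function names are placeholder assumptions for illustration, not the authors' actual PGIM implementation.

```python
# Illustrative sketch of a PGIM-style prompting pipeline (not the paper's code).
# All helpers here (select_similar_examples, build_prompt, query_llm) are
# hypothetical placeholders standing in for the components named in the abstract.

from collections import Counter
from math import sqrt
from typing import Callable, List, Tuple

# Tiny pool of hand-crafted (sentence, auxiliary-knowledge) demonstrations,
# standing in for PGIM's predefined artificial samples.
EXAMPLE_POOL: List[Tuple[str, str]] = [
    ("Messi scored twice for Argentina last night.",
     "Messi (PER) is a football player; Argentina refers to the national team."),
    ("Apple unveiled the new iPhone in Cupertino.",
     "Apple (ORG) is a technology company; Cupertino (LOC) is a city in California."),
]

def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity; a crude text-only stand-in for the
    Multimodal Similar Example Awareness scoring described in the abstract."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_similar_examples(text: str, k: int = 1) -> List[Tuple[str, str]]:
    """Pick the k pool demonstrations most similar to the input sentence."""
    return sorted(EXAMPLE_POOL, key=lambda ex: bow_cosine(text, ex[0]), reverse=True)[:k]

def build_prompt(text: str, image_caption: str, examples: List[Tuple[str, str]]) -> str:
    """Format demonstrations, the input text, and an image description into one prompt."""
    demo = "\n\n".join(f"Sentence: {s}\nAuxiliary knowledge: {k}" for s, k in examples)
    return (f"{demo}\n\nSentence: {text}\nImage: {image_caption}\n"
            "Auxiliary knowledge:")

def pgim_style_augment(text: str, image_caption: str,
                       query_llm: Callable[[str], str]) -> str:
    """Stage 1: obtain auxiliary knowledge from the LLM.
    Stage 2 (not shown): feed the concatenated input to a downstream NER model."""
    prompt = build_prompt(text, image_caption, select_similar_examples(text))
    knowledge = query_llm(prompt)          # e.g., a ChatGPT API call in practice
    return f"{text} [SEP] {knowledge}"     # combined input for the downstream model

if __name__ == "__main__":
    fake_llm = lambda p: "Kobe (PER) refers to Kobe Bryant, a basketball player."
    print(pgim_style_augment("Kobe dropped 60 in his final game.",
                             "A basketball player shooting on court.", fake_llm))
```

The sketch only illustrates how demonstration selection, prompt construction, and knowledge concatenation fit together; how the knowledge-augmented input is consumed by the downstream model is left to that model's own architecture.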


Bibliographic Details
Published in: arXiv.org, 2023-10
Main Authors: Li, Jinyuan; Li, Han; Pan, Zhuo; Sun, Di; Wang, Jiahao; Zhang, Wenkun; Pan, Gang
Format: Article
Language: English
EISSN: 2331-8422
Subjects: Chatbots; Explicit knowledge; Image enhancement; Knowledge acquisition; Knowledge bases (artificial intelligence); Recognition; Redundancy
Online Access: Full text