Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods

Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in...

Detailed description

Bibliographic details
Published in:	The Journal of artificial intelligence research 2021-08, Vol.71, p.1183-1317
Main authors:	Mogadala, Aditya; Kalimuthu, Marimuthu; Klakow, Dietrich
Format:	Article
Language:	eng
Subjects:	Artificial intelligence; Artificial neural networks; Computer vision; Corresponding states; Datasets; Deep learning; Machine learning; Natural language processing
Online access:	Full text

container_end_page	1317
container_issue
container_start_page	1183
container_title	The Journal of artificial intelligence research
container_volume	71
creator	Mogadala, Aditya; Kalimuthu, Marimuthu; Klakow, Dietrich
description	Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses artificial neural networks. This has created significant interest in the integration of vision and language. In this survey, we focus on ten prominent tasks that integrate language and vision by discussing their problem formulation, methods, existing datasets, evaluation measures, and compare the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys which are either task-specific or concentrate only on one type of visual content, i.e., image or video. Furthermore, we also provide some potential future directions in this field of research with an anticipation that this survey stimulates innovative thoughts and ideas to address the existing challenges and build new applications.
doi_str_mv	10.1613/jair.1.11688
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2567810754</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2567810754</sourcerecordid><originalsourceid>FETCH-LOGICAL-c244t-af7364bb1c7e694c9bb5f47c59e31931d08da22b72cd23a4c8ead229559f29593</originalsourceid><addsrcrecordid>eNpNkMtOwzAQRS0EEqWw4wMssW1K_IpjdhXPSkVIUNhaE9tpU8AptoPUvyehLNjMnMWZGc1F6JzkU1IQdrmBJkzJlJCiLA_QiOSyyJQU8vAfH6OTGDd5ThSn5QjpZXDeRtx4PPfJrQKkpvW4rfFbEwcCb_EC_KqDlcPPLjoIZn2FZ_ilC99uN5hLiO9xgm8gQXSpp2Hm0aV1a-MpOqrhI7qzvz5Gr3e3y-uHbPF0P7-eLTJDOU8Z1JIVvKqIka5Q3KiqEjWXRijHiGLE5qUFSitJjaUMuCkdWEqVEKruq2JjdLHfuw3tV-di0pu2C74_qakoZNn_L3hvTfaWCW2MwdV6G5pPCDtNcj1EqIcINdG_EbIfPxZjyw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2567810754</pqid></control><display><type>article</type><title>Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods</title><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Free E- Journals</source><creator>Mogadala, Aditya ; Kalimuthu, Marimuthu ; Klakow, Dietrich</creator><creatorcontrib>Mogadala, Aditya ; Kalimuthu, Marimuthu ; Klakow, Dietrich</creatorcontrib><description>Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses artificial neural networks. This has created significant interest in the integration of vision and language. 
In this survey, we focus on ten prominent tasks that integrate language and vision by discussing their problem formulation, methods, existing datasets, evaluation measures, and compare the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys which are either task-specific or concentrate only on one type of visual content, i.e., image or video. Furthermore, we also provide some potential future directions in this field of research with an anticipation that this survey stimulates innovative thoughts and ideas to address the existing challenges and build new applications.</description><identifier>ISSN: 1076-9757</identifier><identifier>EISSN: 1076-9757</identifier><identifier>EISSN: 1943-5037</identifier><identifier>DOI: 10.1613/jair.1.11688</identifier><language>eng</language><publisher>San Francisco: AI Access Foundation</publisher><subject>Artificial intelligence ; Artificial neural networks ; Computer vision ; Corresponding states ; Datasets ; Deep learning ; Machine learning ; Natural language processing</subject><ispartof>The Journal of artificial intelligence research, 2021-08, Vol.71, p.1183-1317</ispartof><rights>2021. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the associated terms available at https://www.jair.org/index.php/jair/about</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c244t-af7364bb1c7e694c9bb5f47c59e31931d08da22b72cd23a4c8ead229559f29593</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,864,27924,27925</link.rule.ids></links><search><creatorcontrib>Mogadala, Aditya</creatorcontrib><creatorcontrib>Kalimuthu, Marimuthu</creatorcontrib><creatorcontrib>Klakow, Dietrich</creatorcontrib><title>Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods</title><title>The Journal of artificial intelligence research</title><description>Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses artificial neural networks. This has created significant interest in the integration of vision and language. In this survey, we focus on ten prominent tasks that integrate language and vision by discussing their problem formulation, methods, existing datasets, evaluation measures, and compare the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys which are either task-specific or concentrate only on one type of visual content, i.e., image or video. 
Furthermore, we also provide some potential future directions in this field of research with an anticipation that this survey stimulates innovative thoughts and ideas to address the existing challenges and build new applications.</description><subject>Artificial intelligence</subject><subject>Artificial neural networks</subject><subject>Computer vision</subject><subject>Corresponding states</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Machine learning</subject><subject>Natural language processing</subject><issn>1076-9757</issn><issn>1076-9757</issn><issn>1943-5037</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNpNkMtOwzAQRS0EEqWw4wMssW1K_IpjdhXPSkVIUNhaE9tpU8AptoPUvyehLNjMnMWZGc1F6JzkU1IQdrmBJkzJlJCiLA_QiOSyyJQU8vAfH6OTGDd5ThSn5QjpZXDeRtx4PPfJrQKkpvW4rfFbEwcCb_EC_KqDlcPPLjoIZn2FZ_ilC99uN5hLiO9xgm8gQXSpp2Hm0aV1a-MpOqrhI7qzvz5Gr3e3y-uHbPF0P7-eLTJDOU8Z1JIVvKqIka5Q3KiqEjWXRijHiGLE5qUFSitJjaUMuCkdWEqVEKruq2JjdLHfuw3tV-di0pu2C74_qakoZNn_L3hvTfaWCW2MwdV6G5pPCDtNcj1EqIcINdG_EbIfPxZjyw</recordid><startdate>20210830</startdate><enddate>20210830</enddate><creator>Mogadala, Aditya</creator><creator>Kalimuthu, Marimuthu</creator><creator>Klakow, Dietrich</creator><general>AI Access 
Foundation</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope></search><sort><creationdate>20210830</creationdate><title>Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods</title><author>Mogadala, Aditya ; Kalimuthu, Marimuthu ; Klakow, Dietrich</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c244t-af7364bb1c7e694c9bb5f47c59e31931d08da22b72cd23a4c8ead229559f29593</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Artificial intelligence</topic><topic>Artificial neural networks</topic><topic>Computer vision</topic><topic>Corresponding states</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Machine learning</topic><topic>Natural language processing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mogadala, Aditya</creatorcontrib><creatorcontrib>Kalimuthu, Marimuthu</creatorcontrib><creatorcontrib>Klakow, Dietrich</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>The Journal of artificial intelligence research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mogadala, Aditya</au><au>Kalimuthu, Marimuthu</au><au>Klakow, Dietrich</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods</atitle><jtitle>The Journal of artificial intelligence research</jtitle><date>2021-08-30</date><risdate>2021</risdate><volume>71</volume><spage>1183</spage><epage>1317</epage><pages>1183-1317</pages><issn>1076-9757</issn><eissn>1076-9757</eissn><eissn>1943-5037</eissn><abstract>Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as machine learning, computer vision, and natural language processing. Much of the growth in these fields has been made possible with deep learning, a sub-area of machine learning that uses artificial neural networks. This has created significant interest in the integration of vision and language. 
In this survey, we focus on ten prominent tasks that integrate language and vision by discussing their problem formulation, methods, existing datasets, evaluation measures, and compare the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys which are either task-specific or concentrate only on one type of visual content, i.e., image or video. Furthermore, we also provide some potential future directions in this field of research with an anticipation that this survey stimulates innovative thoughts and ideas to address the existing challenges and build new applications.</abstract><cop>San Francisco</cop><pub>AI Access Foundation</pub><doi>10.1613/jair.1.11688</doi><tpages>135</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1076-9757
ispartof	The Journal of artificial intelligence research, 2021-08, Vol.71, p.1183-1317
issn	1076-9757 1076-9757 1943-5037
language	eng
recordid	cdi_proquest_journals_2567810754
source	DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; Free E- Journals
subjects	Artificial intelligence; Artificial neural networks; Computer vision; Corresponding states; Datasets; Deep learning; Machine learning; Natural language processing
title	Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T16%3A28%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Trends%20in%20Integration%20of%20Vision%20and%20Language%20Research:%20A%20Survey%20of%20Tasks,%20Datasets,%20and%20Methods&rft.jtitle=The%20Journal%20of%20artificial%20intelligence%20research&rft.au=Mogadala,%20Aditya&rft.date=2021-08-30&rft.volume=71&rft.spage=1183&rft.epage=1317&rft.pages=1183-1317&rft.issn=1076-9757&rft.eissn=1076-9757&rft_id=info:doi/10.1613/jair.1.11688&rft_dat=%3Cproquest_cross%3E2567810754%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2567810754&rft_id=info:pmid/&rfr_iscdi=true