Deep Learning for Visual Tracking: A Comprehensive Survey
Published in: | IEEE transactions on intelligent transportation systems 2022-05, Vol.23 (5), p.3943-3968 |
---|---|
Main authors: | Marvasti-Zadeh, Seyed Mojtaba ; Cheng, Li ; Ghanei-Yakhdan, Hossein ; Kasaei, Shohreh |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 3968 |
---|---|
container_issue | 5 |
container_start_page | 3943 |
container_title | IEEE transactions on intelligent transportation systems |
container_volume | 23 |
creator | Marvasti-Zadeh, Seyed Mojtaba ; Cheng, Li ; Ghanei-Yakhdan, Hossein ; Kasaei, Shohreh |
description | Visual target tracking is one of the most sought-after yet challenging research topics in computer vision. Given the ill-posed nature of the problem and its popularity in a broad range of real-world scenarios, a number of large-scale benchmark datasets have been established, on which considerable methods have been developed and demonstrated with significant progress in recent years - predominantly by recent deep learning (DL)-based methods. This survey aims to systematically investigate the current DL-based visual tracking methods, benchmark datasets, and evaluation metrics. It also extensively evaluates and analyzes the leading visual tracking methods. First, the fundamental characteristics, primary motivations, and contributions of DL-based methods are summarized from nine key aspects of: network architecture, network exploitation, network training for visual tracking, network objective, network output, exploitation of correlation filter advantages, aerial-view tracking, long-term tracking, and online tracking. Second, popular visual tracking benchmarks and their respective properties are compared, and their evaluation metrics are summarized. Third, the state-of-the-art DL-based methods are comprehensively examined on a set of well-established benchmarks of OTB2013, OTB2015, VOT2018, LaSOT, UAV123, UAVDT, and VisDrone2019. Finally, by conducting critical analyses of these state-of-the-art trackers quantitatively and qualitatively, their pros and cons under various common scenarios are investigated. It may serve as a gentle use guide for practitioners to weigh when and under what conditions to choose which method(s). It also facilitates a discussion on ongoing issues and sheds light on promising research directions. |
doi_str_mv | 10.1109/TITS.2020.3046478 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1524-9050 |
ispartof | IEEE transactions on intelligent transportation systems, 2022-05, Vol.23 (5), p.3943-3968 |
issn | 1524-9050 (ISSN) ; 1558-0016 (EISSN) |
language | eng |
recordid | cdi_proquest_journals_2659347107 |
source | IEEE/IET Electronic Library (IEL) |
subjects | appearance modeling ; Benchmark testing ; Benchmarks ; Computer architecture ; Computer vision ; Correlation ; Datasets ; Deep learning ; Evaluation ; Exploitation ; Feature extraction ; Optical tracking ; Target tracking ; Tracking ; Training ; Visual tracking ; Visualization |
title | Deep Learning for Visual Tracking: A Comprehensive Survey |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T18%3A55%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Learning%20for%20Visual%20Tracking:%20A%20Comprehensive%20Survey&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Marvasti-Zadeh,%20Seyed%20Mojtaba&rft.date=2022-05-01&rft.volume=23&rft.issue=5&rft.spage=3943&rft.epage=3968&rft.pages=3943-3968&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2020.3046478&rft_dat=%3Cproquest_RIE%3E2659347107%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2659347107&rft_id=info:pmid/&rft_ieee_id=9339950&rfr_iscdi=true |