Deep Learning in Visual Tracking: A Review

Deep learning (DL) has made breakthroughs in many computer vision tasks, including visual tracking. Beginning with research on automatically learning highly abstract feature representations, DL has since reached into every aspect of tracking, to name a few: similarity metrics, data association, and bounding box estimation. Pure DL-based trackers have also achieved state-of-the-art performance through the community's sustained research. We believe it is time to comprehensively review the development of DL research in visual tracking. In this article, we overview the critical improvements DL has brought to the field: deep feature representations, network architecture, and four crucial issues in visual tracking (spatiotemporal information integration, target-specific classification, target information update, and bounding box estimation). The scope of the survey covers, for the first time, the two primary subtasks of DL-based tracking: single-object tracking and multiple-object tracking. We also analyze the performance of DL-based approaches and draw meaningful conclusions. Finally, we outline several promising directions and tasks in visual tracking and related fields.
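
To ground the "similarity metric" aspect mentioned in the abstract: much DL-based single-object tracking scores candidate locations by cross-correlating an embedded target template with an embedded search region (the SiamFC family of trackers). The sketch below is a minimal, hypothetical PyTorch illustration, not code from the paper; the function name, tensor shapes, and the random features standing in for a real backbone's output are all assumptions.

```python
# Illustrative sketch of a Siamese-style similarity metric for tracking.
# Assumes a learned embedding network has already produced the feature maps;
# random tensors stand in for its output here.
import torch
import torch.nn.functional as F

def siamese_similarity(template_feat: torch.Tensor, search_feat: torch.Tensor) -> torch.Tensor:
    """Cross-correlate template features (1, C, h, w) with search features
    (1, C, H, W); the peak of the response map locates the target."""
    # Using the template as a convolution kernel performs dense similarity scoring
    # over every spatial position of the search region.
    return F.conv2d(search_feat, template_feat)

# Toy usage with hypothetical shapes.
template = torch.randn(1, 256, 6, 6)    # embedded target exemplar
search = torch.randn(1, 256, 22, 22)    # embedded search region
response = siamese_similarity(template, search)
print(response.shape)                    # torch.Size([1, 1, 17, 17])
```

In practice, trackers of this kind add a trained backbone, multi-scale search, online update mechanisms, and bounding box regression on top of this scoring step, which is exactly the design space the review surveys.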

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2023-09, Vol. 34 (9), pp. 5497-5516
Main Authors: Jiao, Licheng; Wang, Dan; Bai, Yidong; Chen, Puhua; Liu, Fang
Format: Article
Language: English
Subjects: Computer architecture; Computer vision; Deep learning; Deep learning (DL); Feature extraction; Multiple target tracking; multiple-object tracking (MOT); Nonhomogeneous media; Optical tracking; Representations; single-object tracking (SOT); Target tracking; Task analysis; Tracking; Trajectory; Visual discrimination learning; Visual fields; Visual tasks; Visualization
Online Access: Order full text
DOI: 10.1109/TNNLS.2021.3136907
ISSN: 2162-237X
EISSN: 2162-2388
Source: IEEE Electronic Library (IEL)