STASiamRPN: visual tracking based on spatiotemporal and attention



Bibliographic details
Published in: Multimedia systems 2022-10, Vol.28 (5), p.1543-1555
Main authors: Wu, Ruixu, Wen, Xianbin, Liu, Zhanlu, Yuan, Liming, Xu, Haixia
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1555
container_issue 5
container_start_page 1543
container_title Multimedia systems
container_volume 28
creator Wu, Ruixu
Wen, Xianbin
Liu, Zhanlu
Yuan, Liming
Xu, Haixia
description Visual tracking is an important research topic in computer vision. Siamese network trackers based on the region proposal network have achieved promising results in both speed and accuracy. However, for fast-moving objects, such trackers focus mainly on information about the object's appearance and ignore information about its motion and moment-to-moment change. The original 2D convolutional neural network cannot extract the spatiotemporal information of the tracking object and cannot attend to its features. In this research, a new tracking method is proposed that extracts the spatiotemporal features of tracking objects by constructing a 3D convolutional neural network and integrating a cascade attention mechanism, and that distinguishes similar objects through background suppression and highlighting techniques. To verify the effectiveness of the proposed tracker (STASiamRPN), experiments on the OTB2015, GOT-10K, and UAV123 benchmark datasets demonstrated that it is highly comparable to other state-of-the-art methods.
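
The description above gives the architecture only in prose. Purely for illustration, here is a minimal PyTorch-style sketch of that kind of pipeline: a small 3D-convolutional backbone for spatiotemporal features, a channel-attention stage that reweights features to suppress background, and a SiamRPN-style depth-wise cross-correlation head with classification and regression branches. Every module name, layer size, and the attention formulation below are assumptions made for this sketch, not the authors' implementation.

# Hypothetical sketch of a SiamRPN-style tracker with a 3D-convolutional
# backbone and channel attention. Layer sizes, module names, and the
# attention design are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatioTemporalBackbone(nn.Module):
    """3D convolutions over a short clip (B, C, T, H, W) -> 2D feature map."""
    def __init__(self, out_channels=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, out_channels, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, clip):
        feat = self.conv(clip)      # (B, C', T, H', W')
        return feat.mean(dim=2)     # collapse the time axis -> (B, C', H', W')


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style reweighting: highlight object channels,
    suppress background channels."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))        # (B, C) channel weights
        return x * w[:, :, None, None]


def xcorr(search_feat, template_feat):
    """Depth-wise cross-correlation of search features with the template kernel."""
    b, c, h, w = search_feat.shape
    out = F.conv2d(search_feat.reshape(1, b * c, h, w),
                   template_feat.reshape(b * c, 1, *template_feat.shape[2:]),
                   groups=b * c)
    return out.reshape(b, c, out.shape[2], out.shape[3])


class ToySTASiamTracker(nn.Module):
    def __init__(self, channels=64, num_anchors=5):
        super().__init__()
        self.backbone = SpatioTemporalBackbone(channels)
        self.attention = ChannelAttention(channels)
        self.cls_head = nn.Conv2d(channels, 2 * num_anchors, kernel_size=1)
        self.reg_head = nn.Conv2d(channels, 4 * num_anchors, kernel_size=1)

    def forward(self, template_clip, search_clip):
        z = self.attention(self.backbone(template_clip))   # template branch
        x = self.attention(self.backbone(search_clip))     # search branch
        response = xcorr(x, z)                              # similarity map
        return self.cls_head(response), self.reg_head(response)


if __name__ == "__main__":
    tracker = ToySTASiamTracker()
    template = torch.randn(1, 3, 4, 127, 127)   # (B, C, T, H, W) template clip
    search = torch.randn(1, 3, 4, 255, 255)     # larger search-region clip
    cls, reg = tracker(template, search)
    print(cls.shape, reg.shape)

The time axis is collapsed here by a simple mean purely to keep the sketch short; the paper's actual spatiotemporal fusion and its cascade attention are more elaborate than this toy reduction.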
doi_str_mv 10.1007/s00530-021-00845-y
format Article
fulltext fulltext
identifier ISSN: 0942-4962
ispartof Multimedia systems, 2022-10, Vol.28 (5), p.1543-1555
issn 0942-4962
1432-1882
language eng
recordid cdi_proquest_journals_2717709836
source SpringerLink Journals - AutoHoldings
subjects Artificial neural networks
Computer Communication Networks
Computer Graphics
Computer Science
Computer vision
Cryptology
Data Storage Representation
Feature extraction
Multimedia Information Systems
Neural networks
Object recognition
Operating Systems
Optical tracking
Regular Paper
Tracking systems
title STASiamRPN: visual tracking based on spatiotemporal and attention
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T07%3A48%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=STASiamRPN:%20visual%20tracking%20based%20on%20spatiotemporal%20and%20attention&rft.jtitle=Multimedia%20systems&rft.au=Wu,%20Ruixu&rft.date=2022-10-01&rft.volume=28&rft.issue=5&rft.spage=1543&rft.epage=1555&rft.pages=1543-1555&rft.issn=0942-4962&rft.eissn=1432-1882&rft_id=info:doi/10.1007/s00530-021-00845-y&rft_dat=%3Cproquest_cross%3E2717709836%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2717709836&rft_id=info:pmid/&rfr_iscdi=true