STASiamRPN: visual tracking based on spatiotemporal and attention



Bibliographic details
Published in: Multimedia systems 2022-10, Vol.28 (5), p.1543-1555
Main authors: Wu, Ruixu, Wen, Xianbin, Liu, Zhanlu, Yuan, Liming, Xu, Haixia
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1555
container_issue 5
container_start_page 1543
container_title Multimedia systems
container_volume 28
creator Wu, Ruixu
Wen, Xianbin
Liu, Zhanlu
Yuan, Liming
Xu, Haixia
description Visual tracking is an important research topic in computer vision. Siamese network trackers based on the region proposal network have achieved promising results in both speed and accuracy. However, for fast-moving objects, such trackers focus mainly on information about the object's appearance and ignore information about its motion and moment-to-moment change. The original 2D convolutional neural network cannot extract the spatiotemporal information of the tracking object and cannot attend to its features. In this research, a new tracking method is proposed that extracts the spatiotemporal features of tracking objects by constructing a 3D convolutional neural network and integrating a cascade attention mechanism, and that distinguishes similar objects through background suppression and highlighting techniques. To verify the effectiveness of the proposed tracker (STASiamRPN), experiments on the OTB2015, GOT-10K, and UAV123 benchmark datasets demonstrated that it is highly comparable to other state-of-the-art methods.
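
The description above gives the architecture only in prose. Purely for illustration, here is a minimal PyTorch-style sketch of that kind of pipeline: a small 3D-convolutional backbone for spatiotemporal features, a channel-attention stage that reweights features to suppress background, and a SiamRPN-style depth-wise cross-correlation head with classification and regression branches. Every module name, layer size, and the attention formulation below are assumptions made for this sketch, not the authors' implementation.

# Hypothetical sketch of a SiamRPN-style tracker with a 3D-convolutional
# backbone and channel attention. Layer sizes, module names, and the
# attention design are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatioTemporalBackbone(nn.Module):
    """3D convolutions over a short clip (B, C, T, H, W) -> 2D feature map."""
    def __init__(self, out_channels=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, out_channels, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, clip):
        feat = self.conv(clip)      # (B, C', T, H', W')
        return feat.mean(dim=2)     # collapse the time axis -> (B, C', H', W')


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style reweighting: highlight object channels,
    suppress background channels."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))        # (B, C) channel weights
        return x * w[:, :, None, None]


def xcorr(search_feat, template_feat):
    """Depth-wise cross-correlation of search features with the template kernel."""
    b, c, h, w = search_feat.shape
    out = F.conv2d(search_feat.reshape(1, b * c, h, w),
                   template_feat.reshape(b * c, 1, *template_feat.shape[2:]),
                   groups=b * c)
    return out.reshape(b, c, out.shape[2], out.shape[3])


class ToySTASiamTracker(nn.Module):
    def __init__(self, channels=64, num_anchors=5):
        super().__init__()
        self.backbone = SpatioTemporalBackbone(channels)
        self.attention = ChannelAttention(channels)
        self.cls_head = nn.Conv2d(channels, 2 * num_anchors, kernel_size=1)
        self.reg_head = nn.Conv2d(channels, 4 * num_anchors, kernel_size=1)

    def forward(self, template_clip, search_clip):
        z = self.attention(self.backbone(template_clip))   # template branch
        x = self.attention(self.backbone(search_clip))     # search branch
        response = xcorr(x, z)                              # similarity map
        return self.cls_head(response), self.reg_head(response)


if __name__ == "__main__":
    tracker = ToySTASiamTracker()
    template = torch.randn(1, 3, 4, 127, 127)   # (B, C, T, H, W) template clip
    search = torch.randn(1, 3, 4, 255, 255)     # larger search-region clip
    cls, reg = tracker(template, search)
    print(cls.shape, reg.shape)

The time axis is collapsed here by a simple mean purely to keep the sketch short; the paper's actual spatiotemporal fusion and its cascade attention are more elaborate than this toy reduction.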
doi_str_mv 10.1007/s00530-021-00845-y
format Article
fulltext fulltext
identifier ISSN: 0942-4962
ispartof Multimedia systems, 2022-10, Vol.28 (5), p.1543-1555
issn 0942-4962
1432-1882
language eng
recordid cdi_proquest_journals_2717709836
source SpringerLink Journals - AutoHoldings
subjects Artificial neural networks
Computer Communication Networks
Computer Graphics
Computer Science
Computer vision
Cryptology
Data Storage Representation
Feature extraction
Multimedia Information Systems
Neural networks
Object recognition
Operating Systems
Optical tracking
Regular Paper
Tracking systems
title STASiamRPN: visual tracking based on spatiotemporal and attention
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T07%3A48%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=STASiamRPN:%20visual%20tracking%20based%20on%20spatiotemporal%20and%20attention&rft.jtitle=Multimedia%20systems&rft.au=Wu,%20Ruixu&rft.date=2022-10-01&rft.volume=28&rft.issue=5&rft.spage=1543&rft.epage=1555&rft.pages=1543-1555&rft.issn=0942-4962&rft.eissn=1432-1882&rft_id=info:doi/10.1007/s00530-021-00845-y&rft_dat=%3Cproquest_cross%3E2717709836%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2717709836&rft_id=info:pmid/&rfr_iscdi=true