Mining Spatial-Temporal Similarity for Visual Tracking

Correlation filtering (CF) is a critical technique for improving accuracy and speed in visual object tracking. Despite extensive study, most existing CF methods fail to make the most of the inherent spatial-temporal prior of videos. To address this limitation, since consecutive frames are highly similar in most videos, we investigate a novel scheme that predicts a target's future state by exploiting previous observations. Specifically, in this paper we propose a prediction-based CF tracking framework that learns the spatial-temporal similarity of consecutive frames for sample management, template regularization, and training-response pre-weighting. We model the learning problem theoretically as a novel objective and provide effective optimization algorithms to solve the learning task. In addition, we implement two CF trackers with different features. Extensive experiments on three popular benchmarks validate our scheme. The encouraging results demonstrate that the proposed scheme significantly boosts the accuracy of CF tracking, and the two trackers achieve competitive performance against state-of-the-art trackers. We conclude with a comprehensive analysis of the efficacy of our method and the efficiency of our trackers, to facilitate real-world visual tracking applications.
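The abstract describes the approach only at a high level. As an illustrative companion, the sketch below shows one generic way an inter-frame similarity signal can drive a correlation filter. This is not the authors' algorithm: it is a minimal MOSSE-style CF (a standard baseline) whose online update rate is scaled by the normalized cross-correlation between consecutive target patches, so near-identical frames update the template confidently while abrupt appearance changes update it cautiously. All names (`SimilarityWeightedCF`, `frame_similarity`) and parameters (`base_lr`, `reg`) are hypothetical.

```python
import numpy as np

def gaussian_response(shape, sigma=2.0):
    """Desired correlation output: a Gaussian peak centered on the target."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

def frame_similarity(prev_patch, cur_patch):
    """Normalized cross-correlation between consecutive patches, clipped to [0, 1]."""
    a = prev_patch - prev_patch.mean()
    b = cur_patch - cur_patch.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return float(np.clip((a * b).sum() / denom, 0.0, 1.0))

class SimilarityWeightedCF:
    """MOSSE-style correlation filter whose online update rate is scaled by
    the similarity of consecutive frames. Preprocessing (cosine window,
    log transform) that a practical tracker needs is omitted for brevity."""

    def __init__(self, patch, base_lr=0.125, reg=1e-4):
        self.g_hat = np.fft.fft2(gaussian_response(patch.shape))
        f_hat = np.fft.fft2(patch)
        self.A = self.g_hat * np.conj(f_hat)   # filter numerator
        self.B = f_hat * np.conj(f_hat) + reg  # filter denominator (regularized)
        self.base_lr = base_lr
        self.reg = reg
        self.prev_patch = patch

    def track(self, patch):
        """Correlate the filter with the new patch, then update the template
        with a similarity-scaled learning rate."""
        f_hat = np.fft.fft2(patch)
        response = np.real(np.fft.ifft2((self.A / self.B) * f_hat))
        dy, dx = np.unravel_index(np.argmax(response), response.shape)

        # Near-identical consecutive frames -> confident update;
        # abrupt changes (occlusion, blur) -> cautious update.
        lr = self.base_lr * frame_similarity(self.prev_patch, patch)
        self.A = (1 - lr) * self.A + lr * (self.g_hat * np.conj(f_hat))
        self.B = (1 - lr) * self.B + lr * (f_hat * np.conj(f_hat) + self.reg)
        self.prev_patch = patch
        return (dy, dx), response

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    first = rng.standard_normal((64, 64))
    tracker = SimilarityWeightedCF(first)
    peak, _ = tracker.track(first + 0.05 * rng.standard_normal((64, 64)))
    print("peak response at", peak)
```

The paper's actual formulation poses this as a joint objective covering sample management, template regularization, and response pre-weighting; the sketch only captures the underlying intuition that inter-frame similarity should modulate how aggressively the filter adapts.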

Bibliographic details
Published in: IEEE Transactions on Image Processing, 2020-01, Vol. 29, pp. 8107-8119
Main authors: Zhang, Yu; Gao, Xingyu; Chen, Zhenyu; Zhong, Huicai; Xie, Hongtao; Yan, Chenggang
Format: Article
Language: English
Online access: Order full text
DOI: 10.1109/TIP.2020.2981813
ISSN: 1057-7149
EISSN: 1941-0042
PMID: 32746237
Publisher: IEEE, United States
Source: IEEE Electronic Library (IEL)
Subjects:
Algorithms
Ammonia
Cathodes
Cognitive tasks
correlation filter
Fuels
Hydrogen
Liquids
Machine learning
Marine vehicles
Optical tracking
Optimization
Propulsion
Regularization
Similarity
Spatial-temporal similarity
Visual fields
visual object tracking