Mining Spatial-Temporal Similarity for Visual Tracking

Correlation filtering (CF) is a critical technique for improving accuracy and speed in visual object tracking. Despite extensive study, most existing CF methods fail to make the most of the inherent spatial-temporal prior of videos. To address this limitation, since consecutive frames are highly similar in most videos, we investigate a novel scheme that predicts a target's future state by exploiting previous observations. Specifically, in this paper we propose a prediction-based CF tracking framework that learns the spatial-temporal similarity of consecutive frames for sample management, template regularization, and training-response pre-weighting. We model the learning problem theoretically as a novel objective and provide effective optimization algorithms to solve the learning task. In addition, we implement two CF trackers with different features. Extensive experiments on three popular benchmarks validate our scheme. The encouraging results demonstrate that the proposed scheme significantly boosts the accuracy of CF tracking, and the two trackers achieve competitive performance against state-of-the-art trackers. We conclude with a comprehensive analysis of the efficacy of our method and the efficiency of our trackers, to facilitate real-world visual tracking applications.
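The abstract describes the approach only at a high level. As an illustrative companion, the sketch below shows one generic way an inter-frame similarity signal can drive a correlation filter. This is not the authors' algorithm: it is a minimal MOSSE-style CF (a standard baseline) whose online update rate is scaled by the normalized cross-correlation between consecutive target patches, so near-identical frames update the template confidently while abrupt appearance changes update it cautiously. All names (`SimilarityWeightedCF`, `frame_similarity`) and parameters (`base_lr`, `reg`) are hypothetical.

```python
import numpy as np

def gaussian_response(shape, sigma=2.0):
    """Desired correlation output: a Gaussian peak centered on the target."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

def frame_similarity(prev_patch, cur_patch):
    """Normalized cross-correlation between consecutive patches, clipped to [0, 1]."""
    a = prev_patch - prev_patch.mean()
    b = cur_patch - cur_patch.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return float(np.clip((a * b).sum() / denom, 0.0, 1.0))

class SimilarityWeightedCF:
    """MOSSE-style correlation filter whose online update rate is scaled by
    the similarity of consecutive frames. Preprocessing (cosine window,
    log transform) that a practical tracker needs is omitted for brevity."""

    def __init__(self, patch, base_lr=0.125, reg=1e-4):
        self.g_hat = np.fft.fft2(gaussian_response(patch.shape))
        f_hat = np.fft.fft2(patch)
        self.A = self.g_hat * np.conj(f_hat)   # filter numerator
        self.B = f_hat * np.conj(f_hat) + reg  # filter denominator (regularized)
        self.base_lr = base_lr
        self.reg = reg
        self.prev_patch = patch

    def track(self, patch):
        """Correlate the filter with the new patch, then update the template
        with a similarity-scaled learning rate."""
        f_hat = np.fft.fft2(patch)
        response = np.real(np.fft.ifft2((self.A / self.B) * f_hat))
        dy, dx = np.unravel_index(np.argmax(response), response.shape)

        # Near-identical consecutive frames -> confident update;
        # abrupt changes (occlusion, blur) -> cautious update.
        lr = self.base_lr * frame_similarity(self.prev_patch, patch)
        self.A = (1 - lr) * self.A + lr * (self.g_hat * np.conj(f_hat))
        self.B = (1 - lr) * self.B + lr * (f_hat * np.conj(f_hat) + self.reg)
        self.prev_patch = patch
        return (dy, dx), response

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    first = rng.standard_normal((64, 64))
    tracker = SimilarityWeightedCF(first)
    peak, _ = tracker.track(first + 0.05 * rng.standard_normal((64, 64)))
    print("peak response at", peak)
```

The paper's actual formulation poses this as a joint objective covering sample management, template regularization, and response pre-weighting; the sketch only captures the underlying intuition that inter-frame similarity should modulate how aggressively the filter adapts.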

Bibliographic details
Published in: IEEE Transactions on Image Processing, 2020-01, Vol. 29, pp. 8107-8119
Main authors: Zhang, Yu; Gao, Xingyu; Chen, Zhenyu; Zhong, Huicai; Xie, Hongtao; Yan, Chenggang
Format: Article
Language: English
Online access: Order full text
DOI: 10.1109/TIP.2020.2981813
ISSN: 1057-7149
EISSN: 1941-0042
PMID: 32746237
Publisher: IEEE, United States
Source: IEEE Electronic Library (IEL)
Subjects:
Algorithms
Ammonia
Cathodes
Cognitive tasks
correlation filter
Fuels
Hydrogen
Liquids
Machine learning
Marine vehicles
Optical tracking
Optimization
Propulsion
Regularization
Similarity
Spatial-temporal similarity
Visual fields
visual object tracking