Robust Visual Tracking via Multi-Scale Spatio-Temporal Context Learning
To tackle the incompleteness and inaccuracy of the samples used in most tracking-by-detection algorithms, this paper presents an object tracking algorithm, termed multi-scale spatio-temporal context (MSTC) learning tracking. MSTC collaboratively explores three different types of spatio-temporal contexts...
Saved in:
Published in: | IEEE transactions on circuits and systems for video technology 2018-10, Vol.28 (10), p.2849-2860 |
---|---|
Main authors: | Xue, Wanli ; Xu, Chao ; Feng, Zhiyong |
Format: | Article |
Language: | eng |
container_end_page | 2860 |
---|---|
container_issue | 10 |
container_start_page | 2849 |
container_title | IEEE transactions on circuits and systems for video technology |
container_volume | 28 |
creator | Xue, Wanli ; Xu, Chao ; Feng, Zhiyong |
description | To tackle the incompleteness and inaccuracy of the samples used in most tracking-by-detection algorithms, this paper presents an object tracking algorithm, termed multi-scale spatio-temporal context (MSTC) learning tracking. MSTC collaboratively explores three different types of spatio-temporal contexts, namely the long-term historical targets, the medium-term stable scene (i.e., a short continuous and stable video sequence), and the short-term overall samples, to improve tracking efficiency and reduce drift. Unlike the conventional multi-timescale tracking paradigm, which chooses samples in a fixed manner, MSTC formulates a low-dimensional representation, named the fast perceptual hash algorithm, to dynamically update the long-term historical targets and the medium-term stable scene based on image similarity. MSTC also differs from most tracking-by-detection algorithms, which label samples as positive or negative; instead, it investigates a fusion salient sample detection scheme that weights samples not only by distance information but also by visual spatial attention, such as color, intensity, and texture. Numerous experimental evaluations against state-of-the-art algorithms on the standard 50-video benchmark demonstrate the superiority of the proposed algorithm. |
doi_str_mv | 10.1109/TCSVT.2017.2720749 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1051-8215 |
ispartof | IEEE transactions on circuits and systems for video technology, 2018-10, Vol.28 (10), p.2849-2860 |
issn | 1051-8215 1558-2205 |
language | eng |
recordid | cdi_proquest_journals_2126463217 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms Automobiles Context Feature extraction Image color analysis Image detection Machine learning multi-scale Optical tracking perceptual hash salient sample sample selection spatio-temporal State of the art Target tracking Visual tracking Visualization |
title | Robust Visual Tracking via Multi-Scale Spatio-Temporal Context Learning |
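The abstract above mentions a "fast perceptual hash algorithm" used to measure image similarity when updating the long-term and medium-term contexts. The record does not specify the exact hash, so the following is only an illustrative sketch of a generic average hash (aHash) on an 8x8 grayscale patch, with Hamming distance as the similarity measure; the patch size, threshold, and function names are assumptions for the example, not the paper's method.

```python
def average_hash(patch):
    """Compute a 64-bit average hash of an 8x8 grayscale patch.

    patch: 8x8 nested list of intensities (0-255). Each pixel at or
    above the patch mean contributes a 1 bit, otherwise a 0 bit.
    """
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming(h1, h2):
    """Number of differing bits between two hashes (0 = identical)."""
    return bin(h1 ^ h2).count("1")


def similar(h1, h2, threshold=10):
    # A small Hamming distance suggests visually similar patches;
    # the threshold of 10 bits is purely illustrative.
    return hamming(h1, h2) <= threshold
```

For example, a bright-over-dark patch and a slightly noisier version of it produce identical hashes (distance 0), while the inverted patch differs in all 64 bits, so only the former pair would pass the `similar` test.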