TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator

In generic object tracking, depth (D) information provides useful cues for foreground-background separation and target bounding box regression. However, few trackers so far have exploited depth information for these purposes, owing to the lack of a suitable model. In this pape...

Detailed description

Bibliographic details
Main authors: Zhao, Pengyao; Liu, Quanli; Wang, Wei; Guo, Qiang
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Zhao, Pengyao
Liu, Quanli
Wang, Wei
Guo, Qiang
description In generic object tracking, depth (D) information provides useful cues for foreground-background separation and target bounding box regression. However, few trackers so far have exploited depth information for these purposes, owing to the lack of a suitable model. In this paper, an RGB-D tracker named TSDM is proposed, which is composed of a Mask-generator (M-g), SiamRPN++, and a Depth-refiner (D-r). The M-g generates background masks and updates them as the target's 3D position changes. The D-r refines the target bounding box estimated by SiamRPN++, based on the difference in spatial depth distribution between the target and the surrounding background. Extensive evaluation on the Princeton Tracking Benchmark and the Visual Object Tracking challenge shows that our tracker outperforms the state of the art by a large margin while running at 23 FPS. In addition, a lightweight variant runs at 31 FPS, which makes it practical for real-world applications. Code and models of TSDM are available at https://github.com/lql-team/TSDM.
doi_str_mv 10.48550/arxiv.2005.04063
format Article
date 2020-05-08
linktorsrc https://arxiv.org/abs/2005.04063
oa free_for_read
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2005.04063
ispartof
issn
language eng
recordid cdi_arxiv_primary_2005_04063
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title TSDM: Tracking by SiamRPN++ with a Depth-refiner and a Mask-generator
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T10%3A27%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TSDM:%20Tracking%20by%20SiamRPN++%20with%20a%20Depth-refiner%20and%20a%20Mask-generator&rft.au=Zhao,%20Pengyao&rft.date=2020-05-08&rft_id=info:doi/10.48550/arxiv.2005.04063&rft_dat=%3Carxiv_GOX%3E2005_04063%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true