Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking

Multi-Object Tracking (MOT) is a challenging task in complex scenes such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method that cleaves and re-connects tracklets under crowding or long-term occlusion using a Siamese bidirectional Gated Recurrent Unit (Bi-GRU). The tracklet g...

Bibliographic details
Main authors: Ma, Cong, Yang, Changshui, Yang, Fan, Zhuang, Yueqing, Zhang, Ziwei, Jia, Huizhu, Xie, Xiaodong
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Ma, Cong
Yang, Changshui
Yang, Fan
Zhuang, Yueqing
Zhang, Ziwei
Jia, Huizhu
Xie, Xiaodong
description Multi-Object Tracking (MOT) is a challenging task in complex scenes such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method that cleaves and re-connects tracklets under crowding or long-term occlusion using a Siamese bidirectional Gated Recurrent Unit (Bi-GRU). Tracklet generation uses object features extracted by a CNN and an RNN to create high-confidence tracklet candidates in sparse scenarios. Because mis-tracking occurs during generation, tracklets that contain different objects are split into several sub-tracklets by a bidirectional GRU. A Siamese GRU based re-connection method then links the sub-tracklets that belong to the same object to form a whole trajectory. In addition, we extract tracklet images from existing MOT datasets and propose a novel dataset to train our networks. The proposed dataset contains more than 95160 pedestrian images of 793 different persons. On average, there are 120 images per person, annotated with positions and sizes. Experimental results demonstrate the advantages of our model over state-of-the-art methods on MOT16.
doi_str_mv 10.48550/arxiv.1804.04555
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1804_04555</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1804_04555</sourcerecordid><originalsourceid>FETCH-LOGICAL-a675-7e5e310a7a800a688136fadb228b591fb1128c467aedc5f80324dc9dfddd83be3</originalsourceid><addsrcrecordid>eNotj0FOwzAQRb1hgQoHYMVcIMFO4sRlB4GWSkWVSlhHY3uMTNMkckNFb0-asnrS19eTHmN3gseZkpI_YPj1x1gonsU8k1Jes10V8JvM0IUTLHDiI4yb2TU0QNkQHn37Bdha2FJkurYdz75rQZ_ghaiHD497OhA8-2i5_QTXBXj_aQbfNwQbfVZfdKPlhl05bA50-88ZqxavVfkWrTfLVfm0jjAvZFSQpFRwLFBxjrlSIs0dWp0kSsu5cFqIRJksL5CskU7xNMmsmVtnrVWppnTG7i_aqbbug99jONXn6nqqTv8AVyJTBg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking</title><source>arXiv.org</source><creator>Ma, Cong ; Yang, Changshui ; Yang, Fan ; Zhuang, Yueqing ; Zhang, Ziwei ; Jia, Huizhu ; Xie, Xiaodong</creator><creatorcontrib>Ma, Cong ; Yang, Changshui ; Yang, Fan ; Zhuang, Yueqing ; Zhang, Ziwei ; Jia, Huizhu ; Xie, Xiaodong</creatorcontrib><description>Multi-Object Tracking (MOT) is a challenging task in the complex scene such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method to cleave and re-connect tracklets on crowd or long-term occlusion by Siamese Bi-Gated Recurrent Unit (GRU). The tracklet generation utilizes object features extracted by CNN and RNN to create the high-confidence tracklet candidates in sparse scenario. Due to mis-tracking in the generation process, the tracklets from different objects are split into several sub-tracklets by a bidirectional GRU. After that, a Siamese GRU based tracklet re-connection method is applied to link the sub-tracklets which belong to the same object to form a whole trajectory. 
In addition, we extract the tracklet images from existing MOT datasets and propose a novel dataset to train our networks. The proposed dataset contains more than 95160 pedestrian images. It has 793 different persons in it. On average, there are 120 images for each person with positions and sizes. Experimental results demonstrate the advantages of our model over the state-of-the-art methods on MOT16.</description><identifier>DOI: 10.48550/arxiv.1804.04555</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2018-04</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1804.04555$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1804.04555$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Ma, Cong</creatorcontrib><creatorcontrib>Yang, Changshui</creatorcontrib><creatorcontrib>Yang, Fan</creatorcontrib><creatorcontrib>Zhuang, Yueqing</creatorcontrib><creatorcontrib>Zhang, Ziwei</creatorcontrib><creatorcontrib>Jia, Huizhu</creatorcontrib><creatorcontrib>Xie, Xiaodong</creatorcontrib><title>Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking</title><description>Multi-Object Tracking (MOT) is a challenging task in the complex scene such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method to cleave and re-connect tracklets on crowd or long-term occlusion by Siamese Bi-Gated Recurrent Unit (GRU). 
The tracklet generation utilizes object features extracted by CNN and RNN to create the high-confidence tracklet candidates in sparse scenario. Due to mis-tracking in the generation process, the tracklets from different objects are split into several sub-tracklets by a bidirectional GRU. After that, a Siamese GRU based tracklet re-connection method is applied to link the sub-tracklets which belong to the same object to form a whole trajectory. In addition, we extract the tracklet images from existing MOT datasets and propose a novel dataset to train our networks. The proposed dataset contains more than 95160 pedestrian images. It has 793 different persons in it. On average, there are 120 images for each person with positions and sizes. Experimental results demonstrate the advantages of our model over the state-of-the-art methods on MOT16.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj0FOwzAQRb1hgQoHYMVcIMFO4sRlB4GWSkWVSlhHY3uMTNMkckNFb0-asnrS19eTHmN3gseZkpI_YPj1x1gonsU8k1Jes10V8JvM0IUTLHDiI4yb2TU0QNkQHn37Bdha2FJkurYdz75rQZ_ghaiHD497OhA8-2i5_QTXBXj_aQbfNwQbfVZfdKPlhl05bA50-88ZqxavVfkWrTfLVfm0jjAvZFSQpFRwLFBxjrlSIs0dWp0kSsu5cFqIRJksL5CskU7xNMmsmVtnrVWppnTG7i_aqbbug99jONXn6nqqTv8AVyJTBg</recordid><startdate>20180412</startdate><enddate>20180412</enddate><creator>Ma, Cong</creator><creator>Yang, Changshui</creator><creator>Yang, Fan</creator><creator>Zhuang, Yueqing</creator><creator>Zhang, Ziwei</creator><creator>Jia, Huizhu</creator><creator>Xie, Xiaodong</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20180412</creationdate><title>Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking</title><author>Ma, Cong ; Yang, Changshui ; Yang, Fan ; Zhuang, Yueqing ; Zhang, Ziwei ; Jia, Huizhu ; Xie, 
Xiaodong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a675-7e5e310a7a800a688136fadb228b591fb1128c467aedc5f80324dc9dfddd83be3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Ma, Cong</creatorcontrib><creatorcontrib>Yang, Changshui</creatorcontrib><creatorcontrib>Yang, Fan</creatorcontrib><creatorcontrib>Zhuang, Yueqing</creatorcontrib><creatorcontrib>Zhang, Ziwei</creatorcontrib><creatorcontrib>Jia, Huizhu</creatorcontrib><creatorcontrib>Xie, Xiaodong</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ma, Cong</au><au>Yang, Changshui</au><au>Yang, Fan</au><au>Zhuang, Yueqing</au><au>Zhang, Ziwei</au><au>Jia, Huizhu</au><au>Xie, Xiaodong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking</atitle><date>2018-04-12</date><risdate>2018</risdate><abstract>Multi-Object Tracking (MOT) is a challenging task in the complex scene such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method to cleave and re-connect tracklets on crowd or long-term occlusion by Siamese Bi-Gated Recurrent Unit (GRU). The tracklet generation utilizes object features extracted by CNN and RNN to create the high-confidence tracklet candidates in sparse scenario. Due to mis-tracking in the generation process, the tracklets from different objects are split into several sub-tracklets by a bidirectional GRU. 
After that, a Siamese GRU based tracklet re-connection method is applied to link the sub-tracklets which belong to the same object to form a whole trajectory. In addition, we extract the tracklet images from existing MOT datasets and propose a novel dataset to train our networks. The proposed dataset contains more than 95160 pedestrian images. It has 793 different persons in it. On average, there are 120 images for each person with positions and sizes. Experimental results demonstrate the advantages of our model over the state-of-the-art methods on MOT16.</abstract><doi>10.48550/arxiv.1804.04555</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1804.04555
ispartof
issn
language eng
recordid cdi_arxiv_primary_1804_04555
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T00%3A44%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Trajectory%20Factory:%20Tracklet%20Cleaving%20and%20Re-connection%20by%20Deep%20Siamese%20Bi-GRU%20for%20Multiple%20Object%20Tracking&rft.au=Ma,%20Cong&rft.date=2018-04-12&rft_id=info:doi/10.48550/arxiv.1804.04555&rft_dat=%3Carxiv_GOX%3E1804_04555%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
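
The description field above outlines a two-step post-processing idea: cleave a tracklet that mixes identities, then re-connect sub-tracklets that belong to the same object. The following is a minimal sketch of that control flow only, not the authors' method. As a loudly stated assumption, it replaces both the bidirectional GRU and the Siamese GRU with plain cosine similarity on mean feature vectors, and it performs at most one split per tracklet. The function names (`cosine`, `mean`, `cleave`, `reconnect`) and both thresholds are hypothetical and chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cleave(tracklet, threshold=0.5):
    """Split a tracklet (list of per-frame feature vectors) at most once.

    Assumption: the split point is the frame where the mean feature before
    and after it are least similar, and a split happens only if that
    similarity is below `threshold`. This stands in for the learned Bi-GRU.
    """
    best_t, best_sim = None, threshold
    for t in range(1, len(tracklet)):
        sim = cosine(mean(tracklet[:t]), mean(tracklet[t:]))
        if sim < best_sim:
            best_t, best_sim = t, sim
    if best_t is None:
        return [tracklet]
    return [tracklet[:best_t], tracklet[best_t:]]


def reconnect(subtracklets, threshold=0.9):
    """Greedily merge sub-tracklets whose mean features are similar.

    Assumption: a simple cosine test replaces the learned Siamese GRU.
    """
    trajectories = []
    for sub in subtracklets:
        for traj in trajectories:
            if cosine(mean(traj), mean(sub)) >= threshold:
                traj.extend(sub)
                break
        else:
            trajectories.append(list(sub))
    return trajectories


# Toy data: a tracklet that switches identity after three frames,
# plus a later fragment that resembles the first identity.
mixed = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 2
fragment = [[0.9, 0.1]] * 2

pieces = cleave(mixed)
trajectories = reconnect(pieces + [fragment])
```

In this toy run the mixed tracklet is cut at the identity switch, and the later fragment is attached to the first identity. A learned model would replace both similarity tests and would typically also use position and timing cues, which this sketch ignores.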