Swin Transformer Based on Two-Fold Loss and Background Adaptation Re-Ranking for Person Re-Identification

Person re-identification (Re-ID) aims to identify the same pedestrian from a surveillance video in various scenarios. Existing Re-ID models are biased to learn background appearances when there are many background variations in the pedestrian training set. Thus, pedestrians with the same identity will appear with different backgrounds, which interferes with the Re-ID performance.

Detailed Description

Saved in:
Bibliographic Details
Published in: Electronics (Basel) 2022-07, Vol.11 (13), p.1941
Main authors: Wang, Qi, Huang, Hao, Zhong, Yuling, Min, Weidong, Han, Qing, Xu, Desheng, Xu, Changwen
Format: Article
Language: eng
Keywords:
Online access: Full text
container_end_page
container_issue 13
container_start_page 1941
container_title Electronics (Basel)
container_volume 11
creator Wang, Qi
Huang, Hao
Zhong, Yuling
Min, Weidong
Han, Qing
Xu, Desheng
Xu, Changwen
description Person re-identification (Re-ID) aims to identify the same pedestrian from a surveillance video in various scenarios. Existing Re-ID models are biased to learn background appearances when there are many background variations in the pedestrian training set. Thus, pedestrians with the same identity will appear with different backgrounds, which interferes with the Re-ID performance. This paper proposes a swin transformer based on two-fold loss (TL-TransNet) to pay more attention to the semantic information of a pedestrian’s body and preserve valuable background information, thereby reducing the interference of corresponding background appearance. TL-TransNet is supervised by two types of losses (i.e., circle loss and instance loss) during the training phase. In the retrieval phase, DeepLabV3+ as a pedestrian background segmentation model is applied to generate body masks in terms of query and gallery set. The background removal results are generated according to the mask and are used to filter out interfering background information. Subsequently, a background adaptation re-ranking is designed to combine the original information with the background-removed information, which digs out more positive samples with large background deviation. Extensive experiments on two public person Re-ID datasets testify that the proposed method achieves competitive robustness performance in terms of the background variation problem.
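The two stages in the description above can be sketched roughly in code. This is a minimal illustration, not the authors' implementation: `circle_loss` follows the standard circle-loss formulation with the usual margin `m` and scale `gamma` hyper-parameters, and `fused_distance` assumes a simple hypothetical linear blend (weight `lam`) of distances computed on the original images and on the DeepLabV3+-style background-removed images, standing in for the paper's background adaptation re-ranking.

```python
import numpy as np

def circle_loss(sp, sn, m=0.25, gamma=32.0):
    """Circle loss for one anchor, given its within-class similarities sp
    and between-class similarities sn (cosine similarities in [0, 1]).
    m is the relaxation margin, gamma the scale factor."""
    sp, sn = np.asarray(sp, float), np.asarray(sn, float)
    ap = np.clip(1.0 + m - sp, 0.0, None)   # adaptive weights for positives
    an = np.clip(sn + m, 0.0, None)         # adaptive weights for negatives
    delta_p, delta_n = 1.0 - m, m           # decision margins
    logit_p = -gamma * ap * (sp - delta_p)
    logit_n = gamma * an * (sn - delta_n)
    # log(1 + sum_n exp(logit_n) * sum_p exp(logit_p))
    return float(np.log1p(np.exp(logit_n).sum() * np.exp(logit_p).sum()))

def fused_distance(d_orig, d_masked, lam=0.5):
    """Hypothetical background-adaptation fusion: blend the query-gallery
    distance matrix from the original images with the one from the
    background-removed images, so matches with large background deviation
    can still rank highly. lam is an assumed mixing weight."""
    return lam * np.asarray(d_orig, float) + (1.0 - lam) * np.asarray(d_masked, float)
```

As a sanity check, a well-separated anchor (positives near 1, negatives near 0) should incur a much smaller circle loss than an anchor whose positive and negative similarities overlap, and the fusion is just a convex combination of the two distance matrices.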
doi_str_mv 10.3390/electronics11131941
format Article
fulltext fulltext
identifier ISSN: 2079-9292
ispartof Electronics (Basel), 2022-07, Vol.11 (13), p.1941
issn 2079-9292
2079-9292
language eng
recordid cdi_proquest_journals_2685983672
source MDPI - Multidisciplinary Digital Publishing Institute; EZB-FREE-00999 freely available EZB journals
subjects Adaptation
Algorithms
Deep learning
Electronic surveillance
Image processing
Image retrieval
Methods
Neural networks
Pedestrians
Ranking
Ratings & rankings
Segmentation
Training
Transformers
title Swin Transformer Based on Two-Fold Loss and Background Adaptation Re-Ranking for Person Re-Identification