Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark

Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and hard to satisfy. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn a general gait representation from massive unlabelled walking videos for practical applications by offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset, GaitLU-1M, consisting of 1.02M walking sequences, and propose a conceptually simple yet empirically powerful baseline model, GaitSSB. Experimentally, we evaluate the pre-trained model on four widely used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D, with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% of the annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for a gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks.
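
The abstract describes contrastive self-supervised pretraining on unlabelled silhouette sequences, followed by evaluation with or without transfer learning. The record does not reproduce the paper's actual GaitSSB architecture, augmentations, or loss, so the sketch below is only a minimal, generic SimCLR-style contrastive step in PyTorch (two augmented views of each sequence, NT-Xent loss); the encoder, the augmentation, and all hyperparameters are placeholder assumptions rather than the published method.

```python
# Illustrative sketch only: a generic SimCLR-style contrastive step, not the
# paper's GaitSSB model. Encoder, augmentation, and hyperparameters are
# placeholders; silhouette sequences are assumed to be [batch, frames, H, W].
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaitEncoder(nn.Module):
    """Toy encoder: per-frame conv features, temporal max pooling, projection head."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.projector = nn.Sequential(
            nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, embed_dim)
        )

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: [B, T, H, W] silhouettes -> [B, embed_dim] normalized sequence embedding
        b, t, h, w = seq.shape
        frame_feats = self.backbone(seq.reshape(b * t, 1, h, w)).reshape(b, t, -1)
        seq_feat = frame_feats.max(dim=1).values  # order-insensitive temporal pooling
        return F.normalize(self.projector(seq_feat), dim=1)


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """NT-Xent (InfoNCE) loss: (z1[i], z2[i]) is the only positive pair for sample i."""
    b = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                              # [2B, D], L2-normalized
    sim = (z @ z.t()) / temperature                             # pairwise cosine similarity
    mask = torch.eye(2 * b, dtype=torch.bool, device=z.device)  # drop self-similarity
    sim = sim.masked_fill(mask, float("-inf"))
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)]).to(z.device)
    return F.cross_entropy(sim, targets)


def augment(seq: torch.Tensor) -> torch.Tensor:
    """Placeholder view generation: random horizontal flip plus light pixel noise."""
    if torch.rand(1).item() < 0.5:
        seq = seq.flip(-1)
    return (seq + 0.05 * torch.randn_like(seq)).clamp(0.0, 1.0)


if __name__ == "__main__":
    encoder = GaitEncoder()
    optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
    silhouettes = torch.rand(8, 30, 64, 44)  # fake batch: 8 sequences of 30 frames
    optimizer.zero_grad()
    loss = nt_xent(encoder(augment(silhouettes)), encoder(augment(silhouettes)))
    loss.backward()
    optimizer.step()
    print(f"contrastive loss: {loss.item():.3f}")
```

In a setup like this, the transfer-learning evaluation mentioned in the abstract would correspond to discarding the projection head and fine-tuning (or linearly probing) the pre-trained backbone on a labelled benchmark such as CASIA-B or GREW.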

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-12, Vol. 45 (12), pp. 14920-14937
Authors: Fan, Chao; Hou, Saihui; Wang, Jilong; Huang, Yongzhen; Yu, Shiqi
Format: Article
Language: English
Keywords: Annotations; Benchmark testing; Benchmarks; contrastive learning; Datasets; Feature extraction; Gait recognition; GaitLU-1M; GaitSSB; Learning; Legged locomotion; Representations; self-supervised; Task analysis; Video; Videos; Walking
DOI: 10.1109/TPAMI.2023.3312419
ISSN: 0162-8828
EISSN: 2160-9292, 1939-3539
PMID: 37672380