Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark

Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and hard to satisfy. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn a general gait representation from massive unlabelled walking videos for practical applications by offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset, GaitLU-1M, consisting of 1.02M walking sequences, and propose a conceptually simple yet empirically powerful baseline model, GaitSSB. Experimentally, we evaluate the pre-trained model on four widely used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D, with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% of the annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for a gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks.
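
The abstract describes contrastive self-supervised pretraining on unlabelled silhouette sequences, followed by evaluation with or without transfer learning. The record does not reproduce the paper's actual GaitSSB architecture, augmentations, or loss, so the sketch below is only a minimal, generic SimCLR-style contrastive step in PyTorch (two augmented views of each sequence, NT-Xent loss); the encoder, the augmentation, and all hyperparameters are placeholder assumptions rather than the published method.

```python
# Illustrative sketch only: a generic SimCLR-style contrastive step, not the
# paper's GaitSSB model. Encoder, augmentation, and hyperparameters are
# placeholders; silhouette sequences are assumed to be [batch, frames, H, W].
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaitEncoder(nn.Module):
    """Toy encoder: per-frame conv features, temporal max pooling, projection head."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.projector = nn.Sequential(
            nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, embed_dim)
        )

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: [B, T, H, W] silhouettes -> [B, embed_dim] normalized sequence embedding
        b, t, h, w = seq.shape
        frame_feats = self.backbone(seq.reshape(b * t, 1, h, w)).reshape(b, t, -1)
        seq_feat = frame_feats.max(dim=1).values  # order-insensitive temporal pooling
        return F.normalize(self.projector(seq_feat), dim=1)


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """NT-Xent (InfoNCE) loss: (z1[i], z2[i]) is the only positive pair for sample i."""
    b = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                              # [2B, D], L2-normalized
    sim = (z @ z.t()) / temperature                             # pairwise cosine similarity
    mask = torch.eye(2 * b, dtype=torch.bool, device=z.device)  # drop self-similarity
    sim = sim.masked_fill(mask, float("-inf"))
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)]).to(z.device)
    return F.cross_entropy(sim, targets)


def augment(seq: torch.Tensor) -> torch.Tensor:
    """Placeholder view generation: random horizontal flip plus light pixel noise."""
    if torch.rand(1).item() < 0.5:
        seq = seq.flip(-1)
    return (seq + 0.05 * torch.randn_like(seq)).clamp(0.0, 1.0)


if __name__ == "__main__":
    encoder = GaitEncoder()
    optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
    silhouettes = torch.rand(8, 30, 64, 44)  # fake batch: 8 sequences of 30 frames
    optimizer.zero_grad()
    loss = nt_xent(encoder(augment(silhouettes)), encoder(augment(silhouettes)))
    loss.backward()
    optimizer.step()
    print(f"contrastive loss: {loss.item():.3f}")
```

In a setup like this, the transfer-learning evaluation mentioned in the abstract would correspond to discarding the projection head and fine-tuning (or linearly probing) the pre-trained backbone on a labelled benchmark such as CASIA-B or GREW.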

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-12, Vol. 45 (12), pp. 14920-14937
Authors: Fan, Chao; Hou, Saihui; Wang, Jilong; Huang, Yongzhen; Yu, Shiqi
Format: Article
Language: English
Keywords: Annotations; Benchmark testing; Benchmarks; contrastive learning; Datasets; Feature extraction; Gait recognition; GaitLU-1M; GaitSSB; Learning; Legged locomotion; Representations; self-supervised; Task analysis; Video; Videos; Walking
DOI: 10.1109/TPAMI.2023.3312419
ISSN: 0162-8828
EISSN: 2160-9292, 1939-3539
PMID: 37672380