Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark
Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely an...
Published in: | IEEE transactions on pattern analysis and machine intelligence 2023-12, Vol.45 (12), p.14920-14937 |
---|---|
Main authors: | Fan, Chao; Hou, Saihui; Wang, Jilong; Huang, Yongzhen; Yu, Shiqi |
Format: | Article |
Language: | eng |
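Subjects: | Annotations; Benchmark testing; Benchmarks; contrastive learning; Datasets; Feature extraction; Gait recognition; GaitLU-1M; GaitSSB; Learning; Legged locomotion; Representations; self-supervised; Task analysis; Video; Videos; Walking |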
Online access: | Order full text |
container_end_page | 14937 |
---|---|
container_issue | 12 |
container_start_page | 14920 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 45 |
creator | Fan, Chao; Hou, Saihui; Wang, Jilong; Huang, Yongzhen; Yu, Shiqi |
description | Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and insatiable. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications via offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases the superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks. |
doi_str_mv | 10.1109/TPAMI.2023.3312419 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2885654455</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10242019</ieee_id><sourcerecordid>2862201611</sourcerecordid><originalsourceid>FETCH-LOGICAL-c329t-8609925670e964354bd655dce0cc9bdd520797e6e6535fed00c52c548ca181f63</originalsourceid><addsrcrecordid>eNpdkMFOAjEQhhujEURfwHjYxIuXxXa67bbekAiSQCQG9Lgp3UEXl11sFxPf3iIcjKe5fP_MPx8hl4x2GaP6djbtTUZdoMC7nDNImD4ibWCSxho0HJM2ZRJipUC1yJn3K0pZIig_JS2eyhS4om0yHaNxVVG9RUNTNNEzbhx6rBrTFHUVDVy9jibG--ILo3lVmgWWJebRqyk_dpmXIsfa30W96B4r-7427uOcnCxN6fHiMDtkPniY9R_j8dNw1O-NY8tBN7GSVGsQMqWoZcJFssilELlFaq1e5LkAmuoUJUrBxRJzSq0AKxJlDVNsKXmH3Oz3blz9uUXfZOvC21DPVFhvfQZKAgQBjAX0-h-6qreuCu0CpYQUSSJEoGBPWVd773CZbVwRPvrOGM12vrNf39nOd3bwHUJX-1CBiH8CkITbmv8ApKp4cg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2885654455</pqid></control><display><type>article</type><title>Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark</title><source>IEEE Electronic Library (IEL)</source><creator>Fan, Chao ; Hou, Saihui ; Wang, Jilong ; Huang, Yongzhen ; Yu, Shiqi</creator><creatorcontrib>Fan, Chao ; Hou, Saihui ; Wang, Jilong ; Huang, Yongzhen ; Yu, Shiqi</creatorcontrib><description>Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and insatiable. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications via offering informative walking priors and diverse real-world variations. 
Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases the superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 2160-9292</identifier><identifier>EISSN: 1939-3539</identifier><identifier>DOI: 10.1109/TPAMI.2023.3312419</identifier><identifier>PMID: 37672380</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Annotations ; Benchmark testing ; Benchmarks ; contrastive learning ; Datasets ; Feature extraction ; Gait recognition ; GaitLU-1M ; GaitSSB ; Learning ; Legged locomotion ; Representations ; self-supervised ; Task analysis ; Video ; Videos ; Walking</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2023-12, Vol.45 (12), p.14920-14937</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c329t-8609925670e964354bd655dce0cc9bdd520797e6e6535fed00c52c548ca181f63</citedby><cites>FETCH-LOGICAL-c329t-8609925670e964354bd655dce0cc9bdd520797e6e6535fed00c52c548ca181f63</cites><orcidid>0000-0002-5213-5877 ; 0000-0002-3605-2705 ; 0009-0001-9668-2987 ; 0000-0003-4389-9805 ; 0000-0003-4689-2860</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10242019$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10242019$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Fan, Chao</creatorcontrib><creatorcontrib>Hou, Saihui</creatorcontrib><creatorcontrib>Wang, Jilong</creatorcontrib><creatorcontrib>Huang, Yongzhen</creatorcontrib><creatorcontrib>Yu, Shiqi</creatorcontrib><title>Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><description>Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and insatiable. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications via offering informative walking priors and diverse real-world variations. 
Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases the superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks.</description><subject>Annotations</subject><subject>Benchmark testing</subject><subject>Benchmarks</subject><subject>contrastive learning</subject><subject>Datasets</subject><subject>Feature extraction</subject><subject>Gait recognition</subject><subject>GaitLU-1M</subject><subject>GaitSSB</subject><subject>Learning</subject><subject>Legged locomotion</subject><subject>Representations</subject><subject>self-supervised</subject><subject>Task 
analysis</subject><subject>Video</subject><subject>Videos</subject><subject>Walking</subject><issn>0162-8828</issn><issn>2160-9292</issn><issn>1939-3539</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkMFOAjEQhhujEURfwHjYxIuXxXa67bbekAiSQCQG9Lgp3UEXl11sFxPf3iIcjKe5fP_MPx8hl4x2GaP6djbtTUZdoMC7nDNImD4ibWCSxho0HJM2ZRJipUC1yJn3K0pZIig_JS2eyhS4om0yHaNxVVG9RUNTNNEzbhx6rBrTFHUVDVy9jibG--ILo3lVmgWWJebRqyk_dpmXIsfa30W96B4r-7427uOcnCxN6fHiMDtkPniY9R_j8dNw1O-NY8tBN7GSVGsQMqWoZcJFssilELlFaq1e5LkAmuoUJUrBxRJzSq0AKxJlDVNsKXmH3Oz3blz9uUXfZOvC21DPVFhvfQZKAgQBjAX0-h-6qreuCu0CpYQUSSJEoGBPWVd773CZbVwRPvrOGM12vrNf39nOd3bwHUJX-1CBiH8CkITbmv8ApKp4cg</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Fan, Chao</creator><creator>Hou, Saihui</creator><creator>Wang, Jilong</creator><creator>Huang, Yongzhen</creator><creator>Yu, Shiqi</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-5213-5877</orcidid><orcidid>https://orcid.org/0000-0002-3605-2705</orcidid><orcidid>https://orcid.org/0009-0001-9668-2987</orcidid><orcidid>https://orcid.org/0000-0003-4389-9805</orcidid><orcidid>https://orcid.org/0000-0003-4689-2860</orcidid></search><sort><creationdate>20231201</creationdate><title>Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark</title><author>Fan, Chao ; Hou, Saihui ; Wang, Jilong ; Huang, Yongzhen ; Yu, Shiqi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c329t-8609925670e964354bd655dce0cc9bdd520797e6e6535fed00c52c548ca181f63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Annotations</topic><topic>Benchmark testing</topic><topic>Benchmarks</topic><topic>contrastive learning</topic><topic>Datasets</topic><topic>Feature extraction</topic><topic>Gait recognition</topic><topic>GaitLU-1M</topic><topic>GaitSSB</topic><topic>Learning</topic><topic>Legged locomotion</topic><topic>Representations</topic><topic>self-supervised</topic><topic>Task analysis</topic><topic>Video</topic><topic>Videos</topic><topic>Walking</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fan, Chao</creatorcontrib><creatorcontrib>Hou, Saihui</creatorcontrib><creatorcontrib>Wang, Jilong</creatorcontrib><creatorcontrib>Huang, Yongzhen</creatorcontrib><creatorcontrib>Yu, Shiqi</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library 
(IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Fan, Chao</au><au>Hou, Saihui</au><au>Wang, Jilong</au><au>Huang, Yongzhen</au><au>Yu, Shiqi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>45</volume><issue>12</issue><spage>14920</spage><epage>14937</epage><pages>14920-14937</pages><issn>0162-8828</issn><eissn>2160-9292</eissn><eissn>1939-3539</eissn><coden>ITPIDJ</coden><abstract>Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and insatiable. 
This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications via offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases the superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks.</abstract><cop>New York</cop><pub>IEEE</pub><pmid>37672380</pmid><doi>10.1109/TPAMI.2023.3312419</doi><tpages>18</tpages><orcidid>https://orcid.org/0000-0002-5213-5877</orcidid><orcidid>https://orcid.org/0000-0002-3605-2705</orcidid><orcidid>https://orcid.org/0009-0001-9668-2987</orcidid><orcidid>https://orcid.org/0000-0003-4389-9805</orcidid><orcidid>https://orcid.org/0000-0003-4689-2860</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2023-12, Vol.45 (12), p.14920-14937 |
issn | 0162-8828; 2160-9292; 1939-3539 |
language | eng |
recordid | cdi_proquest_journals_2885654455 |
source | IEEE Electronic Library (IEL) |
subjects | Annotations; Benchmark testing; Benchmarks; contrastive learning; Datasets; Feature extraction; Gait recognition; GaitLU-1M; GaitSSB; Learning; Legged locomotion; Representations; self-supervised; Task analysis; Video; Videos; Walking |
title | Learning Gait Representation From Massive Unlabelled Walking Videos: A Benchmark |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A43%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Gait%20Representation%20From%20Massive%20Unlabelled%20Walking%20Videos:%20A%20Benchmark&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Fan,%20Chao&rft.date=2023-12-01&rft.volume=45&rft.issue=12&rft.spage=14920&rft.epage=14937&rft.pages=14920-14937&rft.issn=0162-8828&rft.eissn=2160-9292&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2023.3312419&rft_dat=%3Cproquest_RIE%3E2862201611%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2885654455&rft_id=info:pmid/37672380&rft_ieee_id=10242019&rfr_iscdi=true |