Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking

Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human data is a key research challenge with significant implications for robotics and real-world applications. However, existing methodologies and benchmarks are constrained by the use of small-scale, manually collec...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Liu, Yun, Yang, Bowen, Zhong, Licheng, Wang, He, Yi, Li
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Liu, Yun
Yang, Bowen
Zhong, Licheng
Wang, He
Yi, Li
description Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human data is a key research challenge with significant implications for robotics and real-world applications. However, existing methodologies and benchmarks are constrained by the use of small-scale, manually collected demonstrations, lacking the general dataset and benchmark support necessary to explore scene geometry generalization effectively. To address this gap, we introduce Mimicking-Bench, the first comprehensive benchmark designed for generalizable humanoid-scene interaction learning through mimicking large-scale human animation references. Mimicking-Bench includes six household full-body humanoid-scene interaction tasks, covering 11K diverse object shapes, along with 20K synthetic and 3K real-world human interaction skill references. We construct a complete humanoid skill learning pipeline and benchmark approaches for motion retargeting, motion tracking, imitation learning, and their various combinations. Extensive experiments highlight the value of human mimicking for skill learning, revealing key challenges and research directions.
doi_str_mv 10.48550/arxiv.2412.17730
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2412_17730</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2412_17730</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2412_177303</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE00jM0Nzc24GRI883MzUzOzsxL13VKzUvOsFJwVAAzchOLshXS8osU3FPzUosSczKrEpNyUhU8SnMT8_IzU3SDk4HiCp55JUDJ5JLM_DwFn9TEojygQQplmYkQdQpww3kYWNMSc4pTeaE0N4O8m2uIs4cu2EXxBUWZQPsq40Euiwe7zJiwCgDDrkQj</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking</title><source>arXiv.org</source><creator>Liu, Yun ; Yang, Bowen ; Zhong, Licheng ; Wang, He ; Yi, Li</creator><creatorcontrib>Liu, Yun ; Yang, Bowen ; Zhong, Licheng ; Wang, He ; Yi, Li</creatorcontrib><description>Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human data is a key research challenge with significant implications for robotics and real-world applications. However, existing methodologies and benchmarks are constrained by the use of small-scale, manually collected demonstrations, lacking the general dataset and benchmark support necessary to explore scene geometry generalization effectively. To address this gap, we introduce Mimicking-Bench, the first comprehensive benchmark designed for generalizable humanoid-scene interaction learning through mimicking large-scale human animation references. Mimicking-Bench includes six household full-body humanoid-scene interaction tasks, covering 11K diverse object shapes, along with 20K synthetic and 3K real-world human interaction skill references. We construct a complete humanoid skill learning pipeline and benchmark approaches for motion retargeting, motion tracking, imitation learning, and their various combinations. Extensive experiments highlight the value of human mimicking for skill learning, revealing key challenges and research directions.</description><identifier>DOI: 10.48550/arxiv.2412.17730</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Robotics</subject><creationdate>2024-12</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,778,883</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2412.17730$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2412.17730$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Yun</creatorcontrib><creatorcontrib>Yang, Bowen</creatorcontrib><creatorcontrib>Zhong, Licheng</creatorcontrib><creatorcontrib>Wang, He</creatorcontrib><creatorcontrib>Yi, Li</creatorcontrib><title>Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking</title><description>Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human data is a key research challenge with significant implications for robotics and real-world applications. However, existing methodologies and benchmarks are constrained by the use of small-scale, manually collected demonstrations, lacking the general dataset and benchmark support necessary to explore scene geometry generalization effectively. To address this gap, we introduce Mimicking-Bench, the first comprehensive benchmark designed for generalizable humanoid-scene interaction learning through mimicking large-scale human animation references. Mimicking-Bench includes six household full-body humanoid-scene interaction tasks, covering 11K diverse object shapes, along with 20K synthetic and 3K real-world human interaction skill references. We construct a complete humanoid skill learning pipeline and benchmark approaches for motion retargeting, motion tracking, imitation learning, and their various combinations. Extensive experiments highlight the value of human mimicking for skill learning, revealing key challenges and research directions.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Robotics</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE00jM0Nzc24GRI883MzUzOzsxL13VKzUvOsFJwVAAzchOLshXS8osU3FPzUosSczKrEpNyUhU8SnMT8_IzU3SDk4HiCp55JUDJ5JLM_DwFn9TEojygQQplmYkQdQpww3kYWNMSc4pTeaE0N4O8m2uIs4cu2EXxBUWZQPsq40Euiwe7zJiwCgDDrkQj</recordid><startdate>20241223</startdate><enddate>20241223</enddate><creator>Liu, Yun</creator><creator>Yang, Bowen</creator><creator>Zhong, Licheng</creator><creator>Wang, He</creator><creator>Yi, Li</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241223</creationdate><title>Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking</title><author>Liu, Yun ; Yang, Bowen ; Zhong, Licheng ; Wang, He ; Yi, Li</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2412_177303</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Robotics</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Yun</creatorcontrib><creatorcontrib>Yang, Bowen</creatorcontrib><creatorcontrib>Zhong, Licheng</creatorcontrib><creatorcontrib>Wang, He</creatorcontrib><creatorcontrib>Yi, Li</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Yun</au><au>Yang, Bowen</au><au>Zhong, Licheng</au><au>Wang, He</au><au>Yi, Li</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking</atitle><date>2024-12-23</date><risdate>2024</risdate><abstract>Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human data is a key research challenge with significant implications for robotics and real-world applications. However, existing methodologies and benchmarks are constrained by the use of small-scale, manually collected demonstrations, lacking the general dataset and benchmark support necessary to explore scene geometry generalization effectively. To address this gap, we introduce Mimicking-Bench, the first comprehensive benchmark designed for generalizable humanoid-scene interaction learning through mimicking large-scale human animation references. Mimicking-Bench includes six household full-body humanoid-scene interaction tasks, covering 11K diverse object shapes, along with 20K synthetic and 3K real-world human interaction skill references. We construct a complete humanoid skill learning pipeline and benchmark approaches for motion retargeting, motion tracking, imitation learning, and their various combinations. Extensive experiments highlight the value of human mimicking for skill learning, revealing key challenges and research directions.</abstract><doi>10.48550/arxiv.2412.17730</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2412.17730
ispartof
issn
language eng
recordid cdi_arxiv_primary_2412_17730
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
Computer Science - Robotics
title Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T18%3A37%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mimicking-Bench:%20A%20Benchmark%20for%20Generalizable%20Humanoid-Scene%20Interaction%20Learning%20via%20Human%20Mimicking&rft.au=Liu,%20Yun&rft.date=2024-12-23&rft_id=info:doi/10.48550/arxiv.2412.17730&rft_dat=%3Carxiv_GOX%3E2412_17730%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true