Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking
Learning generic skills for humanoid robots interacting with 3D scenes by mimicking human data is a key research challenge with significant implications for robotics and real-world applications. However, existing methodologies and benchmarks are constrained by the use of small-scale, manually collected demonstrations, lacking the general dataset and benchmark support necessary to explore scene geometry generalization effectively.
Saved in:
Main authors: Liu, Yun; Yang, Bowen; Zhong, Licheng; Wang, He; Yi, Li
Format: Article
Language: eng
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Robotics
Online access: Order full text
container_end_page | |
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Liu, Yun; Yang, Bowen; Zhong, Licheng; Wang, He; Yi, Li |
description | Learning generic skills for humanoid robots interacting with 3D scenes by
mimicking human data is a key research challenge with significant implications
for robotics and real-world applications. However, existing methodologies and
benchmarks are constrained by the use of small-scale, manually collected
demonstrations, lacking the general dataset and benchmark support necessary to
explore scene geometry generalization effectively. To address this gap, we
introduce Mimicking-Bench, the first comprehensive benchmark designed for
generalizable humanoid-scene interaction learning through mimicking large-scale
human animation references. Mimicking-Bench includes six household full-body
humanoid-scene interaction tasks, covering 11K diverse object shapes, along
with 20K synthetic and 3K real-world human interaction skill references. We
construct a complete humanoid skill learning pipeline and benchmark approaches
for motion retargeting, motion tracking, imitation learning, and their various
combinations. Extensive experiments highlight the value of human mimicking for
skill learning, revealing key challenges and research directions. |
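The description above outlines a three-stage skill-learning pipeline: motion retargeting, motion tracking, and imitation learning, benchmarked individually and in combination. The sketch below only illustrates how such stages might be chained; every class and function name is hypothetical and is not the Mimicking-Bench API.

```python
# Hypothetical sketch of the three benchmarked stages chained end to end.
# All names below are illustrative placeholders, not the Mimicking-Bench API.
from dataclasses import dataclass
from typing import List


@dataclass
class HumanReference:
    """One human interaction clip: per-frame joint positions (placeholder)."""
    joint_positions: List[List[float]]


@dataclass
class HumanoidTrajectory:
    """A humanoid joint-space trajectory."""
    joint_angles: List[List[float]]


def retarget(ref: HumanReference) -> HumanoidTrajectory:
    """Motion retargeting: map human poses onto the humanoid's kinematics."""
    # Placeholder mapping; a real retargeter would solve for joint angles
    # that reproduce each human pose on the robot's skeleton.
    return HumanoidTrajectory(joint_angles=[frame[:] for frame in ref.joint_positions])


def track(traj: HumanoidTrajectory) -> HumanoidTrajectory:
    """Motion tracking: make the kinematic trajectory dynamically feasible."""
    # Placeholder; a physics-based tracking controller would run here.
    return traj


def imitate(trajectories: List[HumanoidTrajectory]) -> None:
    """Imitation learning: distill tracked trajectories into a policy."""
    # Placeholder for policy training on the tracked data.
    print(f"training a policy on {len(trajectories)} tracked trajectories")


if __name__ == "__main__":
    references = [HumanReference(joint_positions=[[0.0, 0.1, 0.2], [0.0, 0.2, 0.3]])]
    tracked = [track(retarget(ref)) for ref in references]
    imitate(tracked)
```

The benchmark evaluates these stages both separately and as combined pipelines, which is why the composition is shown as plain function chaining here.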
doi_str_mv | 10.48550/arxiv.2412.17730 |
format | Article |
date | 2024-12-23 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
backlink | https://arxiv.org/abs/2412.17730 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2412.17730 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2412_17730 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Robotics
title | Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T18%3A37%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mimicking-Bench:%20A%20Benchmark%20for%20Generalizable%20Humanoid-Scene%20Interaction%20Learning%20via%20Human%20Mimicking&rft.au=Liu,%20Yun&rft.date=2024-12-23&rft_id=info:doi/10.48550/arxiv.2412.17730&rft_dat=%3Carxiv_GOX%3E2412_17730%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
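Given the DOI and arXiv identifier recorded above, the paper's bibliographic data can also be fetched directly from arXiv. The snippet below is a minimal sketch, not part of this catalog record, assuming the public export.arxiv.org Atom endpoint and the arXiv ID embedded in the DOI (10.48550/arxiv.2412.17730 → 2412.17730).

```python
# Minimal sketch: fetch title and authors for this record from the public
# arXiv Atom API. Not part of the catalog record; the parsing is deliberately
# minimal.
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_ID = "2412.17730"
url = f"http://export.arxiv.org/api/query?id_list={ARXIV_ID}"

with urllib.request.urlopen(url) as resp:
    feed = ET.fromstring(resp.read())

ns = {"atom": "http://www.w3.org/2005/Atom"}
entry = feed.find("atom:entry", ns)
title = " ".join(entry.find("atom:title", ns).text.split())
authors = [a.find("atom:name", ns).text for a in entry.findall("atom:author", ns)]
print(title)
print(", ".join(authors))
```

Querying the Atom feed avoids relying on the link-resolver URL in the url field above, which embeds a request timestamp and is therefore not a stable identifier.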