ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics
creator | Yu, Qiaojun; Hao, Ce; Wang, Junbo; Liu, Wenhai; Liu, Liu; Mu, Yao; You, Yang; Yan, Hengxu; Lu, Cewu
description | Robotic manipulation in everyday scenarios, especially in unstructured environments, requires pose-aware object manipulation (POM) skills that adapt a robot's grasping and handling to an object's 6D pose. Recognizing an object's position and orientation is crucial for effective manipulation: if a mug is lying on its side, for example, it is more effective to grasp it by the rim than by the handle. Despite its importance, research on POM skills remains limited, because learning manipulation skills requires pose-varying simulation environments and datasets. This paper introduces ManiPose, a pioneering benchmark designed to advance the study of pose-varying manipulation tasks. ManiPose encompasses: 1) simulation environments for POM featuring tasks that range from 6D pose-specific pick-and-place of single objects to cluttered scenes, including interactions with articulated objects; 2) a comprehensive dataset with geometrically consistent, manipulation-oriented 6D pose labels for 2936 real-world scanned rigid objects and 100 articulated objects across 59 categories; and 3) a POM baseline that leverages the inference abilities of LLMs (e.g., ChatGPT) to analyze the relationship between 6D pose and task-specific requirements, offering enhanced pose-aware grasp prediction and motion planning. Our benchmark demonstrates notable advancements in pose estimation, pose-aware manipulation, and real-robot skill transfer, setting new standards for POM research. We will open-source the ManiPose benchmark with the final version of the paper, inviting the community to engage with our resources at https://sites.google.com/view/manipose.
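The mug example in the abstract can be made concrete. Below is a minimal sketch of pose-aware grasp selection under stated assumptions: a 6D pose given as a position plus an x-y-z-w quaternion, hypothetical grasp labels "rim" and "handle", and a hand-written 45-degree tilt threshold. It is illustrative only, not the ManiPose baseline, which instead queries an LLM to map the 6D pose and task requirements to a grasp strategy.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def choose_mug_grasp(position, quaternion):
    """Choose a grasp region for a mug from its 6D pose.

    Hypothetical heuristic: if the mug's symmetry axis (its local +z)
    has tilted far from world up, the mug is lying on its side and the
    rim is the better grasp; otherwise the handle is reachable. The
    position part of the pose would feed motion planning, not this choice.
    """
    # Rotate the mug's local z-axis into the world frame (quaternion is x, y, z, w).
    axis_world = Rotation.from_quat(quaternion).apply([0.0, 0.0, 1.0])
    upright = np.dot(axis_world, [0.0, 0.0, 1.0]) > np.cos(np.deg2rad(45.0))
    return "handle" if upright else "rim"


# Upright mug on a table -> grasp the handle.
print(choose_mug_grasp([0.3, 0.0, 0.10], [0.0, 0.0, 0.0, 1.0]))
# Mug rolled 90 degrees onto its side -> grasp the rim.
lying = Rotation.from_euler("x", 90, degrees=True).as_quat()
print(choose_mug_grasp([0.3, 0.0, 0.05], lying))
```

Per the abstract, the paper's baseline replaces such hand-written, category-specific rules with LLM inference over the object's 6D pose and the task description, so the pose-to-grasp reasoning generalizes beyond a single object category.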
doi | 10.48550/arxiv.2403.13365
format | Article |
creationdate | 2024-03-20
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0
language | eng |
recordid | cdi_arxiv_primary_2403_13365 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Robotics
title | ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T00%3A52%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ManiPose:%20A%20Comprehensive%20Benchmark%20for%20Pose-aware%20Object%20Manipulation%20in%20Robotics&rft.au=Yu,%20Qiaojun&rft.date=2024-03-20&rft_id=info:doi/10.48550/arxiv.2403.13365&rft_dat=%3Carxiv_GOX%3E2403_13365%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |