GraspLDM: Generative 6-DoF Grasp Synthesis using Latent Diffusion Models


Bibliographic Details
Published in: arXiv.org, 2024-11
Main authors: Barad, Kuldeep R; Orsula, Andrej; Antoine, Richard; Dentler, Jan; Olivares-Mendez, Miguel; Martinez, Carol
Format: Article
Language: English (eng)
Subjects: Computer Science - Robotics; Data transfer (computers); Grasping (robotics); Synthesis; Three dimensional models; Training
Online access: Full text
container_title: arXiv.org
creator: Barad, Kuldeep R; Orsula, Andrej; Antoine, Richard; Dentler, Jan; Olivares-Mendez, Miguel; Martinez, Carol
description: Vision-based grasping of unknown objects in unstructured environments is a key challenge for autonomous robotic manipulation. A practical grasp synthesis system must generate a diverse set of 6-DoF grasps from which a task-relevant grasp can be executed. Although generative models are well suited to learning such complex data distributions, existing models suffer from limited grasp quality, long training times, and a lack of flexibility for task-specific generation. In this work, we present GraspLDM, a modular generative framework for 6-DoF grasp synthesis that uses diffusion models as priors in the latent space of a VAE. GraspLDM learns a generative model of object-centric \(SE(3)\) grasp poses conditioned on point clouds. The GraspLDM architecture enables efficient training of task-specific models by re-training only a small denoising network in the low-dimensional latent space, whereas existing models require expensive full re-training. The framework yields robust and scalable models on both full and partial point clouds. GraspLDM models trained with simulation data transfer to the real world without further fine-tuning, achieving an 80% success rate over 80 grasp attempts on diverse test objects across two real-world robotic setups. The implementation is available at https://github.com/kuldeepbrd1/graspldm.
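The core idea described above, a diffusion prior acting in the low-dimensional latent space of a VAE, conditioned on a point-cloud embedding, can be sketched compactly. The following PyTorch sketch is illustrative only: all module names, dimensions, the pose parameterization (translation plus quaternion), and the simplified DDPM-style training step are assumptions, not the authors' implementation (see the linked GitHub repository for that).

```python
# Minimal, illustrative latent-diffusion sketch for grasp-pose generation.
# All names, shapes, and hyperparameters are assumptions for illustration;
# the actual implementation lives at https://github.com/kuldeepbrd1/graspldm
import torch
import torch.nn as nn

LATENT_DIM, COND_DIM, POSE_DIM = 16, 128, 7  # pose: 3-D translation + quaternion

class GraspVAE(nn.Module):
    """Encodes an SE(3) grasp pose (conditioned on a point-cloud embedding)
    into a low-dimensional latent, and decodes latents back to poses."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(POSE_DIM + COND_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * LATENT_DIM))  # mean and log-variance
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM + COND_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, POSE_DIM))

    def encode(self, pose, cond):
        mu, logvar = self.enc(torch.cat([pose, cond], -1)).chunk(2, -1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization

    def decode(self, z, cond):
        return self.dec(torch.cat([z, cond], -1))

class LatentDenoiser(nn.Module):
    """Small network that predicts the noise added to latents; because the
    latent is low-dimensional, task-specific retraining touches only this module."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM + COND_DIM + 1, 128), nn.ReLU(),
                                 nn.Linear(128, LATENT_DIM))

    def forward(self, z_t, t, cond):
        return self.net(torch.cat([z_t, t, cond], -1))

def diffusion_loss(denoiser, z0, cond, num_steps=1000):
    """One DDPM-style training step in latent space (epsilon prediction)."""
    t = torch.randint(1, num_steps + 1, (z0.shape[0], 1)).float() / num_steps
    alpha_bar = torch.cos(t * torch.pi / 2) ** 2                # toy cosine schedule
    eps = torch.randn_like(z0)
    z_t = alpha_bar.sqrt() * z0 + (1 - alpha_bar).sqrt() * eps  # forward noising
    return nn.functional.mse_loss(denoiser(z_t, t, cond), eps)

# Usage: pretrain the VAE on grasp data, then fit the denoiser on its latents.
vae, denoiser = GraspVAE(), LatentDenoiser()
cond = torch.randn(32, COND_DIM)  # stand-in for a point-cloud encoder output
pose = torch.randn(32, POSE_DIM)  # stand-in for ground-truth grasp poses
loss = diffusion_loss(denoiser, vae.encode(pose, cond).detach(), cond)
loss.backward()
```

Sampling would run the reverse diffusion in latent space and pass the resulting latent through vae.decode. The practical point of the design, as the abstract notes, is that adapting to a new task only requires re-training the small latent denoiser, not the VAE.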
doi: 10.48550/arxiv.2312.11243
format: Article
fulltext: fulltext
identifier: EISSN 2331-8422
ispartof: arXiv.org, 2024-11
issn: 2331-8422
language: eng
recordid: cdi_arxiv_primary_2312_11243
source: arXiv.org; Free E-Journals
subjects: Computer Science - Robotics; Data transfer (computers); Grasping (robotics); Synthesis; Three dimensional models; Training
title: GraspLDM: Generative 6-DoF Grasp Synthesis using Latent Diffusion Models