Graspability-Aware Object Pose Estimation in Cluttered Scenes

Bibliographic Details
Published in: IEEE Robotics and Automation Letters, 2024-04, Vol. 9 (4), pp. 3124-3130
Main Authors: Hoang, Dinh-Cuong; Nguyen, Anh-Nhat; Vu, Van-Duc; Nguyen, Thu-Uyen; Vu, Duy-Quang; Ngo, Phuc-Quan; Hoang, Ngoc-Anh; Phan, Khanh-Toan; Tran, Duc-Thanh; Nguyen, Van-Thiep; Duong, Quang-Tri; Ho, Ngoc-Trung; Tran, Cong-Trinh; Duong, Van-Hiep; Mai, Anh-Truong
Format: Article
Language: English
Description:
Object recognition and pose estimation are critical components in autonomous robot manipulation systems, playing a crucial role in enabling robots to interact effectively with the environment. During actual execution, the robot must recognize the object in the current scene, estimate its pose, and then select a feasible grasp pose from the pre-defined grasp configurations. While most existing methods primarily focus on pose estimation, they often neglect the graspability and reachability aspects. This oversight can lead to inefficiencies and failures during execution. In this study, we introduce an innovative graspability-aware object pose estimation framework. Our proposed approach not only estimates the poses of multiple objects in cluttered scenes but also identifies graspable areas. This enables the system to concentrate its efforts on specific points or regions of an object that are suitable for grasping. It leverages both depth and color images to extract geometric and appearance features. To effectively combine these diverse features, we have developed an adaptive fusion module. In addition, the fused features are further enhanced through a graspability-aware feature enhancement module. The key innovation of our method lies in improving the discriminability and robustness of the features used for object pose estimation. We have achieved state-of-the-art results on public datasets when compared to several baseline methods. In real robot experiments conducted on a Franka Emika robot arm equipped with an Intel Realsense camera and a two-finger gripper, we consistently achieved high success rates, even in cluttered scenes.
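
The description above outlines an architecture (RGB-D feature extraction, an adaptive fusion module, and a graspability-aware feature enhancement module) without implementation detail. Purely as a non-authoritative sketch, the PyTorch snippet below shows one common way such an adaptive fusion gate and a graspability-based reweighting could be structured; all module names, feature dimensions, and the gating form are assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn


class AdaptiveFusion(nn.Module):
    """Hypothetical sketch of an adaptive two-modality fusion gate.

    Not the paper's implementation: dimensions, names, and the gating
    form are illustrative assumptions.
    """

    def __init__(self, geo_dim: int = 128, app_dim: int = 128, out_dim: int = 256):
        super().__init__()
        self.geo_proj = nn.Linear(geo_dim, out_dim)  # geometric (depth) branch
        self.app_proj = nn.Linear(app_dim, out_dim)  # appearance (color) branch
        # The gate predicts per-channel weights balancing the two modalities.
        self.gate = nn.Sequential(nn.Linear(2 * out_dim, out_dim), nn.Sigmoid())

    def forward(self, geo_feat: torch.Tensor, app_feat: torch.Tensor) -> torch.Tensor:
        # geo_feat: (B, N, geo_dim) per-point geometric features from depth
        # app_feat: (B, N, app_dim) per-point appearance features from color
        g = self.geo_proj(geo_feat)
        a = self.app_proj(app_feat)
        w = self.gate(torch.cat([g, a], dim=-1))  # adaptive weights in (0, 1)
        return w * g + (1.0 - w) * a              # per-channel convex combination


class GraspabilityReweight(nn.Module):
    """Hypothetical sketch: predict a per-point graspability score and use
    it to emphasize features at graspable regions (one plausible reading of
    'graspability-aware feature enhancement'; also an assumption)."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, fused: torch.Tensor):
        s = self.score(fused)            # (B, N, 1) graspability in (0, 1)
        return fused * s, s.squeeze(-1)  # emphasized features, per-point scores


if __name__ == "__main__":
    geo = torch.randn(2, 1024, 128)  # dummy per-point geometric features
    app = torch.randn(2, 1024, 128)  # dummy per-point appearance features
    fused = AdaptiveFusion()(geo, app)
    enhanced, scores = GraspabilityReweight()(fused)
    print(enhanced.shape, scores.shape)  # (2, 1024, 256) and (2, 1024)
```

A gated convex combination and score-based reweighting are common RGB-D fusion patterns; the paper's actual modules may differ substantially.
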
DOI: 10.1109/LRA.2024.3364451
Publisher: IEEE (Piscataway)
ISSN: 2377-3766
EISSN: 2377-3766
Source: IEEE Electronic Library (IEL)
Subjects:
6D object pose estimation
Color imagery
Critical components
Feature extraction
Fingers
Geometry
grasp detection
Modules
Object recognition
Point cloud compression
Pose estimation
Robot arms
robot manipulation
Robot sensing systems
Robots
Solid modeling
System effectiveness
Three-dimensional displays