Generative model based robotic grasp pose prediction with limited dataset
Published in: | Applied intelligence (Dordrecht, Netherlands), 2022-07, Vol.52 (9), p.9952-9966 |
Main authors: | Shukla, Priya; Pramanik, Nilotpal; Mehta, Deepesh; Nandi, G. C. |
Format: | Article |
Language: | eng |
Subjects: | Artificial Intelligence; Coders; Cognition & reasoning; Computer Science; Datasets; Embedding; Grasping (robotics); Machines; Manufacturing; Mechanical Engineering; Neural networks; Object recognition; Processes |
Online access: | Full text |
container_end_page | 9966 |
container_issue | 9 |
container_start_page | 9952 |
container_title | Applied intelligence (Dordrecht, Netherlands) |
container_volume | 52 |
creator | Shukla, Priya; Pramanik, Nilotpal; Mehta, Deepesh; Nandi, G. C. |
description | In the present investigation, we propose an architecture, which we name Generative Inception Neural Network (GI-NNet), capable of intelligently predicting antipodal robotic grasps on seen as well as unseen objects. It is trained on the Cornell Grasping Dataset (CGD) and attains 98.87% grasp pose accuracy for detecting both regular and irregular shaped objects from RGB-Depth images, while requiring only one-third of the trainable network parameters of existing approaches. However, to attain this level of performance the model requires 90% of the available labelled data of CGD for training, keeping only 10% for testing, which makes it vulnerable to poor generalization. Furthermore, obtaining a sufficient quantity of high-quality labelled data for robot grasping is extremely difficult. To address these issues, we subsequently propose another architecture in which our GI-NNet model is attached as the decoder of a Vector Quantized Variational Auto-Encoder (VQ-VAE), which works more efficiently when trained with both the available labelled and unlabelled data. The proposed model, which we name Representation based GI-NNet (RGI-NNet), has been trained on various splits of the CGD to test the learning ability of the architecture, ranging from 10% labelled data with the latent embedding of the VQ-VAE to 90% labelled data with the latent embedding. Notably, when trained with only 50% of the labelled data of CGD together with the latent embedding, the proposed architecture produces its best results, which we believe is a remarkable accomplishment. The reasoning behind this, together with the other relevant technical details, is elaborated in the paper. The grasp pose accuracy of RGI-NNet varies from 92.1348% to 97.7528%, which is far better than several existing models trained with only labelled data. For performance verification of both proposed models, GI-NNet and RGI-NNet, we have performed rigorous experiments on the Anukul (Baxter) hardware cobot. |
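The description above outlines the RGI-NNet idea: a VQ-VAE compresses the RGB-D input into a discrete latent embedding, and a GI-NNet-style decoder predicts the grasp from that embedding. The PyTorch sketch below illustrates only this general wiring as read from the abstract; all module names, layer sizes, codebook dimensions, and the quality/angle/width output heads are hypothetical choices for illustration, not details taken from the paper.

```python
# Minimal, hypothetical sketch of a VQ-VAE latent feeding a grasp-map decoder.
# Wiring only; layer sizes and head design are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through gradient."""
    def __init__(self, num_codes=128, code_dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, z):                                  # z: (B, C, H, W) continuous latent
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)        # (B*H*W, C)
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        q = self.codebook(idx).view(b, h, w, c).permute(0, 3, 1, 2)
        return z + (q - z).detach()                        # straight-through estimator

class GraspDecoder(nn.Module):
    """Decodes the quantized latent into pixel-wise grasp maps (assumed output format)."""
    def __init__(self, code_dim=64):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose2d(code_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.quality = nn.Conv2d(16, 1, 1)   # grasp-quality map
        self.cos = nn.Conv2d(16, 1, 1)       # cos(2*theta) map
        self.sin = nn.Conv2d(16, 1, 1)       # sin(2*theta) map
        self.width = nn.Conv2d(16, 1, 1)     # gripper-width map

    def forward(self, z_q):
        f = self.up(z_q)
        return self.quality(f), self.cos(f), self.sin(f), self.width(f)

class RGINNetSketch(nn.Module):
    """Encoder -> vector quantizer -> grasp decoder, as loosely implied by the abstract."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                      # 4-channel RGB-D input
            nn.Conv2d(4, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.vq = VectorQuantizer(code_dim=64)
        self.decoder = GraspDecoder(code_dim=64)

    def forward(self, rgbd):
        return self.decoder(self.vq(self.encoder(rgbd)))

if __name__ == "__main__":
    quality, cos_map, sin_map, width = RGINNetSketch()(torch.randn(1, 4, 224, 224))
    print(quality.shape)                                   # torch.Size([1, 1, 224, 224])
```

In such a setup an antipodal grasp would typically be read off at the pixel of maximum quality as (x, y, θ = ½·atan2(sin, cos), width), and the VQ-VAE part could be pre-trained on unlabelled images with a reconstruction loss, which is consistent with the abstract's claim that unlabelled data helps when labelled CGD splits are small.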
doi_str_mv | 10.1007/s10489-021-03011-z |
format | Article |
eissn | 1573-7497 |
publisher | New York: Springer US |
orcid | https://orcid.org/0000-0002-4163-6238 |
fulltext | fulltext |
identifier | ISSN: 0924-669X |
ispartof | Applied intelligence (Dordrecht, Netherlands), 2022-07, Vol.52 (9), p.9952-9966 |
issn | 0924-669X; 1573-7497 |
language | eng |
recordid | cdi_proquest_journals_2678581210 |
source | Springer Nature - Complete Springer Journals |
subjects | Artificial Intelligence; Coders; Cognition & reasoning; Computer Science; Datasets; Embedding; Grasping (robotics); Machines; Manufacturing; Mechanical Engineering; Neural networks; Object recognition; Processes |
title | Generative model based robotic grasp pose prediction with limited dataset |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T11%3A46%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Generative%20model%20based%20robotic%20grasp%20pose%20prediction%20with%20limited%20dataset&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Shukla,%20Priya&rft.date=2022-07-01&rft.volume=52&rft.issue=9&rft.spage=9952&rft.epage=9966&rft.pages=9952-9966&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-021-03011-z&rft_dat=%3Cproquest_cross%3E2678581210%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2678581210&rft_id=info:pmid/&rfr_iscdi=true |