DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation

Object manipulation in cluttered scenes is a difficult and important problem in robotics. To efficiently manipulate objects, it is crucial to understand their surroundings, especially in cases where multiple objects are stacked one on top of the other, preventing effective grasping. We here present...

Full description

Bibliographic details
Published in:	arXiv.org 2022-07
Main authors:	Tchuiev, Vladimir ; Miron, Yakov ; Di-Castro, Dotan
Format:	Article
Language:	eng
Subjects:	Coders ; Decision making ; Encoders-Decoders ; Robotics ; Structural hierarchy
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Tchuiev, Vladimir ; Miron, Yakov ; Di-Castro, Dotan
description	Object manipulation in cluttered scenes is a difficult and important problem in robotics. To efficiently manipulate objects, it is crucial to understand their surroundings, especially in cases where multiple objects are stacked one on top of the other, preventing effective grasping. We here present DUQIM-Net, a decision-making approach for object manipulation in a setting of stacked objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using Adj-Net, a model that leverages existing Transformer Encoder-Decoder object detectors by adding an adjacency head. The output of this head probabilistically infers the underlying hierarchical structure of the objects in the scene. We utilize the properties of the adjacency matrix in DUQIM-Net to perform decision making and assist with object-grasping tasks. Our experimental results show that Adj-Net surpasses the state-of-the-art in object-relationship inference on the Visual Manipulation Relationship Dataset (VMRD), and that DUQIM-Net outperforms comparable approaches in bin clearing tasks.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2691900527</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2691900527</sourcerecordid><originalsourceid>FETCH-proquest_journals_26919005273</originalsourceid><addsrcrecordid>eNqNytsKgjAcgPERBEn5DoOuB3OmZrcdsAs7Yd3KlL80GZvtQPT2RfQAXX0Xv2-EAhbHEVkuGJug0NqeUsrSjCVJHKBqcz3vS3IAt8InoxveCCmsEy0-Nj20DhcCDDft_YUvMBiwoBx3QivcaYNLL50gNwFPXHIlBi-_NkPjjksL4a9TNN9tq3VBBqMfHqyre-2N-lDN0jzKKU1YFv93vQE1L0A0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2691900527</pqid></control><display><type>article</type><title>DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation</title><source>Free E- Journals</source><creator>Tchuiev, Vladimir ; Miron, Yakov ; Di-Castro, Dotan</creator><creatorcontrib>Tchuiev, Vladimir ; Miron, Yakov ; Di-Castro, Dotan</creatorcontrib><description>Object manipulation in cluttered scenes is a difficult and important problem in robotics. To efficiently manipulate objects, it is crucial to understand their surroundings, especially in cases where multiple objects are stacked one on top of the other, preventing effective grasping. We here present DUQIM-Net, a decision-making approach for object manipulation in a setting of stacked objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using Adj-Net, a model that leverages existing Transformer Encoder-Decoder object detectors by adding an adjacency head. The output of this head probabilistically infers the underlying hierarchical structure of the objects in the scene. We utilize the properties of the adjacency matrix in DUQIM-Net to perform decision making and assist with object-grasping tasks. 
Our experimental results show that Adj-Net surpasses the state-of-the-art in object-relationship inference on the Visual Manipulation Relationship Dataset (VMRD), and that DUQIM-Net outperforms comparable approaches in bin clearing tasks.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Coders ; Decision making ; Encoders-Decoders ; Robotics ; Structural hierarchy</subject><ispartof>arXiv.org, 2022-07</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>777,781</link.rule.ids></links><search><creatorcontrib>Tchuiev, Vladimir</creatorcontrib><creatorcontrib>Miron, Yakov</creatorcontrib><creatorcontrib>Di-Castro, Dotan</creatorcontrib><title>DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation</title><title>arXiv.org</title><description>Object manipulation in cluttered scenes is a difficult and important problem in robotics. To efficiently manipulate objects, it is crucial to understand their surroundings, especially in cases where multiple objects are stacked one on top of the other, preventing effective grasping. We here present DUQIM-Net, a decision-making approach for object manipulation in a setting of stacked objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using Adj-Net, a model that leverages existing Transformer Encoder-Decoder object detectors by adding an adjacency head. 
The output of this head probabilistically infers the underlying hierarchical structure of the objects in the scene. We utilize the properties of the adjacency matrix in DUQIM-Net to perform decision making and assist with object-grasping tasks. Our experimental results show that Adj-Net surpasses the state-of-the-art in object-relationship inference on the Visual Manipulation Relationship Dataset (VMRD), and that DUQIM-Net outperforms comparable approaches in bin clearing tasks.</description><subject>Coders</subject><subject>Decision making</subject><subject>Encoders-Decoders</subject><subject>Robotics</subject><subject>Structural hierarchy</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNytsKgjAcgPERBEn5DoOuB3OmZrcdsAs7Yd3KlL80GZvtQPT2RfQAXX0Xv2-EAhbHEVkuGJug0NqeUsrSjCVJHKBqcz3vS3IAt8InoxveCCmsEy0-Nj20DhcCDDft_YUvMBiwoBx3QivcaYNLL50gNwFPXHIlBi-_NkPjjksL4a9TNN9tq3VBBqMfHqyre-2N-lDN0jzKKU1YFv93vQE1L0A0</recordid><startdate>20220719</startdate><enddate>20220719</enddate><creator>Tchuiev, Vladimir</creator><creator>Miron, Yakov</creator><creator>Di-Castro, Dotan</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220719</creationdate><title>DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation</title><author>Tchuiev, Vladimir ; Miron, Yakov ; Di-Castro, 
Dotan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_26919005273</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Coders</topic><topic>Decision making</topic><topic>Encoders-Decoders</topic><topic>Robotics</topic><topic>Structural hierarchy</topic><toplevel>online_resources</toplevel><creatorcontrib>Tchuiev, Vladimir</creatorcontrib><creatorcontrib>Miron, Yakov</creatorcontrib><creatorcontrib>Di-Castro, Dotan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tchuiev, Vladimir</au><au>Miron, Yakov</au><au>Di-Castro, Dotan</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View 
Manipulation</atitle><jtitle>arXiv.org</jtitle><date>2022-07-19</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Object manipulation in cluttered scenes is a difficult and important problem in robotics. To efficiently manipulate objects, it is crucial to understand their surroundings, especially in cases where multiple objects are stacked one on top of the other, preventing effective grasping. We here present DUQIM-Net, a decision-making approach for object manipulation in a setting of stacked objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using Adj-Net, a model that leverages existing Transformer Encoder-Decoder object detectors by adding an adjacency head. The output of this head probabilistically infers the underlying hierarchical structure of the objects in the scene. We utilize the properties of the adjacency matrix in DUQIM-Net to perform decision making and assist with object-grasping tasks. Our experimental results show that Adj-Net surpasses the state-of-the-art in object-relationship inference on the Visual Manipulation Relationship Dataset (VMRD), and that DUQIM-Net outperforms comparable approaches in bin clearing tasks.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-07
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2691900527
source	Free E- Journals
subjects	Coders ; Decision making ; Encoders-Decoders ; Robotics ; Structural hierarchy
title	DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T03%3A34%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=DUQIM-Net:%20Probabilistic%20Object%20Hierarchy%20Representation%20for%20Multi-View%20Manipulation&rft.jtitle=arXiv.org&rft.au=Tchuiev,%20Vladimir&rft.date=2022-07-19&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2691900527%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2691900527&rft_id=info:pmid/&rfr_iscdi=true