Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images

This work proposes a process for efficiently training a point-wise object detector that enables localizing objects and computing their 6D poses in cluttered and occluded scenes. Accurate pose estimation is typically a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. To minimize the human labor required for annotation, the proposed object detector is first trained in simulation by using automatically annotated synthetic images. We then show that the performance of the detector can be substantially improved by using a small set of weakly annotated real images, where a human provides only a list of objects present in each image without indicating the location of the objects. To close the gap between real and synthetic images, we adopt a domain adaptation approach through adversarial training. The detector resulting from this training process can be used to localize objects by using its per-object activation maps. In this work, we use the activation maps to guide the search of 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluated our model on classification and localization in unsupervised and semi-supervised settings. The results clearly indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
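The weak supervision described above (a human lists only *which* objects appear in an image, never *where*) can be sketched as a multi-label loss over per-object activation maps. The function names and the choice of global max pooling as the pooling operator are illustrative assumptions, not the paper's exact formulation:

```python
import math

def global_max_pool(activation_map):
    # Reduce an HxW per-object activation map to a single image-level score.
    return max(max(row) for row in activation_map)

def weak_label_loss(activation_maps, present_objects):
    # activation_maps: dict object_name -> HxW map of detector activations
    # present_objects: set of object names a human listed for this image
    # Multi-label binary cross-entropy on pooled scores: the supervision
    # says only which objects are present, never their locations.
    loss = 0.0
    for name, amap in activation_maps.items():
        score = 1.0 / (1.0 + math.exp(-global_max_pool(amap)))  # sigmoid
        score = min(max(score, 1e-7), 1.0 - 1e-7)               # clamp for log
        target = 1.0 if name in present_objects else 0.0
        loss -= target * math.log(score) + (1.0 - target) * math.log(1.0 - score)
    return loss / len(activation_maps)
```

Because the loss only constrains the pooled image-level score, the network is free to place high activations wherever the object actually is, which is what later makes the maps usable for localization.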

Full description

Saved in:
Bibliographic details
Main authors: Mercier, Jean-Philippe, Mitash, Chaitanya, Giguère, Philippe, Boularias, Abdeslam
Format: Article
Language: eng
Subjects:
Online access: Order full text
creator Mercier, Jean-Philippe; Mitash, Chaitanya; Giguère, Philippe; Boularias, Abdeslam
description This work proposes a process for efficiently training a point-wise object detector that enables localizing objects and computing their 6D poses in cluttered and occluded scenes. Accurate pose estimation is typically a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. To minimize the human labor required for annotation, the proposed object detector is first trained in simulation by using automatically annotated synthetic images. We then show that the performance of the detector can be substantially improved by using a small set of weakly annotated real images, where a human provides only a list of objects present in each image without indicating the location of the objects. To close the gap between real and synthetic images, we adopt a domain adaptation approach through adversarial training. The detector resulting from this training process can be used to localize objects by using its per-object activation maps. In this work, we use the activation maps to guide the search of 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluated our model on classification and localization in unsupervised and semi-supervised settings. The results clearly indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
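The description states that the per-object activation maps guide the search for 6D poses. A minimal sketch of that idea, assuming (hypothetically) that each pose hypothesis carries the pixel location of its projected object center, is to seed and rank hypotheses by the detector's activation at that location:

```python
def activation_peak(activation_map):
    # Return (row, col, value) of the strongest activation for one object;
    # the peak can seed a local search over 6D pose hypotheses.
    best = (0, 0, float("-inf"))
    for r, row in enumerate(activation_map):
        for c, v in enumerate(row):
            if v > best[2]:
                best = (r, c, v)
    return best

def rank_pose_hypotheses(activation_map, hypotheses):
    # hypotheses: list of (row, col, pose_params); each candidate pose is
    # weighted by how strongly the detector fired near its projected center,
    # so high-activation regions are explored first.
    return sorted(hypotheses,
                  key=lambda h: activation_map[h[0]][h[1]],
                  reverse=True)
```

This is only a sketch of the guidance principle; the paper's actual pose-search procedure and scoring are not reproduced here.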
doi_str_mv 10.48550/arxiv.1806.06888
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1806.06888
language eng
recordid cdi_arxiv_primary_1806_06888
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T20%3A50%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Object%20Localization%20and%206D%20Pose%20Estimation%20from%20Simulation%20and%20Weakly%20Labeled%20Real%20Images&rft.au=Mercier,%20Jean-Philippe&rft.date=2018-06-18&rft_id=info:doi/10.48550/arxiv.1806.06888&rft_dat=%3Carxiv_GOX%3E1806_06888%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true