Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images

This work proposes a process for efficiently training a point-wise object detector that enables localizing objects and computing their 6D poses in cluttered and occluded scenes. Accurate pose estimation is typically a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. To minimize the human labor required for annotation, the proposed object detector is first trained in simulation by using automatically annotated synthetic images. We then show that the performance of the detector can be substantially improved by using a small set of weakly annotated real images, where a human provides only a list of objects present in each image without indicating the location of the objects. To close the gap between real and synthetic images, we adopt a domain adaptation approach through adversarial training. The detector resulting from this training process can be used to localize objects by using its per-object activation maps. In this work, we use the activation maps to guide the search of 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluated our model on classification and localization in unsupervised and semi-supervised settings. The results clearly indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
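The weak supervision described above (a human lists only *which* objects appear in an image, never *where*) can be sketched as a multi-label loss over per-object activation maps. The function names and the choice of global max pooling as the pooling operator are illustrative assumptions, not the paper's exact formulation:

```python
import math

def global_max_pool(activation_map):
    # Reduce an HxW per-object activation map to a single image-level score.
    return max(max(row) for row in activation_map)

def weak_label_loss(activation_maps, present_objects):
    # activation_maps: dict object_name -> HxW map of detector activations
    # present_objects: set of object names a human listed for this image
    # Multi-label binary cross-entropy on pooled scores: the supervision
    # says only which objects are present, never their locations.
    loss = 0.0
    for name, amap in activation_maps.items():
        score = 1.0 / (1.0 + math.exp(-global_max_pool(amap)))  # sigmoid
        score = min(max(score, 1e-7), 1.0 - 1e-7)               # clamp for log
        target = 1.0 if name in present_objects else 0.0
        loss -= target * math.log(score) + (1.0 - target) * math.log(1.0 - score)
    return loss / len(activation_maps)
```

Because the loss only constrains the pooled image-level score, the network is free to place high activations wherever the object actually is, which is what later makes the maps usable for localization.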

Full description

Saved in:
Bibliographic details
Main authors: Mercier, Jean-Philippe, Mitash, Chaitanya, Giguère, Philippe, Boularias, Abdeslam
Format: Article
Language: eng
Subjects:
Online access: Order full text
creator Mercier, Jean-Philippe; Mitash, Chaitanya; Giguère, Philippe; Boularias, Abdeslam
description This work proposes a process for efficiently training a point-wise object detector that enables localizing objects and computing their 6D poses in cluttered and occluded scenes. Accurate pose estimation is typically a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. To minimize the human labor required for annotation, the proposed object detector is first trained in simulation by using automatically annotated synthetic images. We then show that the performance of the detector can be substantially improved by using a small set of weakly annotated real images, where a human provides only a list of objects present in each image without indicating the location of the objects. To close the gap between real and synthetic images, we adopt a domain adaptation approach through adversarial training. The detector resulting from this training process can be used to localize objects by using its per-object activation maps. In this work, we use the activation maps to guide the search of 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluated our model on classification and localization in unsupervised and semi-supervised settings. The results clearly indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
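The description states that the per-object activation maps guide the search for 6D poses. A minimal sketch of that idea, assuming (hypothetically) that each pose hypothesis carries the pixel location of its projected object center, is to seed and rank hypotheses by the detector's activation at that location:

```python
def activation_peak(activation_map):
    # Return (row, col, value) of the strongest activation for one object;
    # the peak can seed a local search over 6D pose hypotheses.
    best = (0, 0, float("-inf"))
    for r, row in enumerate(activation_map):
        for c, v in enumerate(row):
            if v > best[2]:
                best = (r, c, v)
    return best

def rank_pose_hypotheses(activation_map, hypotheses):
    # hypotheses: list of (row, col, pose_params); each candidate pose is
    # weighted by how strongly the detector fired near its projected center,
    # so high-activation regions are explored first.
    return sorted(hypotheses,
                  key=lambda h: activation_map[h[0]][h[1]],
                  reverse=True)
```

This is only a sketch of the guidance principle; the paper's actual pose-search procedure and scoring are not reproduced here.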
doi_str_mv 10.48550/arxiv.1806.06888
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1806.06888
language eng
recordid cdi_arxiv_primary_1806_06888
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T20%3A50%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Object%20Localization%20and%206D%20Pose%20Estimation%20from%20Simulation%20and%20Weakly%20Labeled%20Real%20Images&rft.au=Mercier,%20Jean-Philippe&rft.date=2018-06-18&rft_id=info:doi/10.48550/arxiv.1806.06888&rft_dat=%3Carxiv_GOX%3E1806_06888%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true