A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation
Progress has been achieved recently in object detection given advancements in deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort to label objects. This limits their applicability in robotics, where solutions must scale to a large n...
Main authors: | Mitash, Chaitanya ; Bekris, Kostas E ; Boularias, Abdeslam |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Mitash, Chaitanya ; Bekris, Kostas E ; Boularias, Abdeslam |
description | Progress has been achieved recently in object detection given advancements in
deep learning. Nevertheless, such tools typically require a large amount of
training data and significant manual effort to label objects. This limits their
applicability in robotics, where solutions must scale to a large number of
objects and variety of conditions. This work proposes an autonomous process for
training a Convolutional Neural Network (CNN) for object detection and pose
estimation in robotic setups. The focus is on detecting objects placed in
cluttered, tight environments, such as a shelf with multiple objects. In
particular, given access to 3D object models, several aspects of the
environment are physically simulated. The models are placed in physically
realistic poses with respect to their environment to generate a labeled
synthetic dataset. To further improve object detection, the network self-trains
over real images that are labeled using a robust multi-view pose estimation
process. The proposed training process is evaluated on several existing
datasets and on a dataset collected for this paper with a Motoman robotic arm.
Results show that the proposed approach outperforms popular training processes
relying on synthetic - but not physically realistic - data and manual
annotation. The key contributions are the incorporation of physical reasoning
in the synthetic data generation process and the automation of the annotation
process over real images. |
doi_str_mv | 10.48550/arxiv.1703.03347 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1703_03347</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1703_03347</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-9b4c23eda7996886d75fa18e2cf97dd6a21153d03b1d32c3a4a52007699defb53</originalsourceid><addsrcrecordid>eNotj8tOwzAURL1hgQofwIr7Awl2HMfxsirlIQW1UrqPnPgajPKo7CTQv6dNWc3ijGZ0CHlgNE5zIeiT9r9ujpmkPKacp_KW9GsosbVRmI7oZxfQQIHa967_hPIURuzADh529Tc2IzzjeA439DCFS2P_dQquCVC6bmr1AnRv4GNqRxfNDn9gPwSEbRhdt-A7cmN1G_D-P1fk8LI9bN6iYvf6vlkXkc6kjFSdNglHo6VSWZ5nRgqrWY5JY5U0JtMJY4IbymtmeNJwnWqRUCozpQzaWvAVebzOLsLV0Z_v_am6iFeLOP8DN2BU1g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation</title><source>arXiv.org</source><creator>Mitash, Chaitanya ; Bekris, Kostas E ; Boularias, Abdeslam</creator><creatorcontrib>Mitash, Chaitanya ; Bekris, Kostas E ; Boularias, Abdeslam</creatorcontrib><description>Progress has been achieved recently in object detection given advancements in
deep learning. Nevertheless, such tools typically require a large amount of
training data and significant manual effort to label objects. This limits their
applicability in robotics, where solutions must scale to a large number of
objects and variety of conditions. This work proposes an autonomous process for
training a Convolutional Neural Network (CNN) for object detection and pose
estimation in robotic setups. The focus is on detecting objects placed in
cluttered, tight environments, such as a shelf with multiple objects. In
particular, given access to 3D object models, several aspects of the
environment are physically simulated. The models are placed in physically
realistic poses with respect to their environment to generate a labeled
synthetic dataset. To further improve object detection, the network self-trains
over real images that are labeled using a robust multi-view pose estimation
process. The proposed training process is evaluated on several existing
datasets and on a dataset collected for this paper with a Motoman robotic arm.
Results show that the proposed approach outperforms popular training processes
relying on synthetic - but not physically realistic - data and manual
annotation. The key contributions are the incorporation of physical reasoning
in the synthetic data generation process and the automation of the annotation
process over real images.</description><identifier>DOI: 10.48550/arxiv.1703.03347</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Robotics</subject><creationdate>2017-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1703.03347$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1703.03347$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mitash, Chaitanya</creatorcontrib><creatorcontrib>Bekris, Kostas E</creatorcontrib><creatorcontrib>Boularias, Abdeslam</creatorcontrib><title>A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation</title><description>Progress has been achieved recently in object detection given advancements in
deep learning. Nevertheless, such tools typically require a large amount of
training data and significant manual effort to label objects. This limits their
applicability in robotics, where solutions must scale to a large number of
objects and variety of conditions. This work proposes an autonomous process for
training a Convolutional Neural Network (CNN) for object detection and pose
estimation in robotic setups. The focus is on detecting objects placed in
cluttered, tight environments, such as a shelf with multiple objects. In
particular, given access to 3D object models, several aspects of the
environment are physically simulated. The models are placed in physically
realistic poses with respect to their environment to generate a labeled
synthetic dataset. To further improve object detection, the network self-trains
over real images that are labeled using a robust multi-view pose estimation
process. The proposed training process is evaluated on several existing
datasets and on a dataset collected for this paper with a Motoman robotic arm.
Results show that the proposed approach outperforms popular training processes
relying on synthetic - but not physically realistic - data and manual
annotation. The key contributions are the incorporation of physical reasoning
in the synthetic data generation process and the automation of the annotation
process over real images.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Robotics</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj8tOwzAURL1hgQofwIr7Awl2HMfxsirlIQW1UrqPnPgajPKo7CTQv6dNWc3ijGZ0CHlgNE5zIeiT9r9ujpmkPKacp_KW9GsosbVRmI7oZxfQQIHa967_hPIURuzADh529Tc2IzzjeA439DCFS2P_dQquCVC6bmr1AnRv4GNqRxfNDn9gPwSEbRhdt-A7cmN1G_D-P1fk8LI9bN6iYvf6vlkXkc6kjFSdNglHo6VSWZ5nRgqrWY5JY5U0JtMJY4IbymtmeNJwnWqRUCozpQzaWvAVebzOLsLV0Z_v_am6iFeLOP8DN2BU1g</recordid><startdate>20170309</startdate><enddate>20170309</enddate><creator>Mitash, Chaitanya</creator><creator>Bekris, Kostas E</creator><creator>Boularias, Abdeslam</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20170309</creationdate><title>A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation</title><author>Mitash, Chaitanya ; Bekris, Kostas E ; Boularias, Abdeslam</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-9b4c23eda7996886d75fa18e2cf97dd6a21153d03b1d32c3a4a52007699defb53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Robotics</topic><toplevel>online_resources</toplevel><creatorcontrib>Mitash, Chaitanya</creatorcontrib><creatorcontrib>Bekris, Kostas E</creatorcontrib><creatorcontrib>Boularias, Abdeslam</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mitash, Chaitanya</au><au>Bekris, Kostas E</au><au>Boularias, 
Abdeslam</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation</atitle><date>2017-03-09</date><risdate>2017</risdate><abstract>Progress has been achieved recently in object detection given advancements in
deep learning. Nevertheless, such tools typically require a large amount of
training data and significant manual effort to label objects. This limits their
applicability in robotics, where solutions must scale to a large number of
objects and variety of conditions. This work proposes an autonomous process for
training a Convolutional Neural Network (CNN) for object detection and pose
estimation in robotic setups. The focus is on detecting objects placed in
cluttered, tight environments, such as a shelf with multiple objects. In
particular, given access to 3D object models, several aspects of the
environment are physically simulated. The models are placed in physically
realistic poses with respect to their environment to generate a labeled
synthetic dataset. To further improve object detection, the network self-trains
over real images that are labeled using a robust multi-view pose estimation
process. The proposed training process is evaluated on several existing
datasets and on a dataset collected for this paper with a Motoman robotic arm.
Results show that the proposed approach outperforms popular training processes
relying on synthetic - but not physically realistic - data and manual
annotation. The key contributions are the incorporation of physical reasoning
in the synthetic data generation process and the automation of the annotation
process over real images.</abstract><doi>10.48550/arxiv.1703.03347</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.1703.03347 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_1703_03347 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Robotics |
title | A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T19%3A26%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Self-supervised%20Learning%20System%20for%20Object%20Detection%20using%20Physics%20Simulation%20and%20Multi-view%20Pose%20Estimation&rft.au=Mitash,%20Chaitanya&rft.date=2017-03-09&rft_id=info:doi/10.48550/arxiv.1703.03347&rft_dat=%3Carxiv_GOX%3E1703_03347%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |