Unsupervised deep learning based ego motion estimation with a downward facing camera

Bibliographic Details
Published in: The Visual Computer, 2023-03, Vol. 39 (3), pp. 785-798
Authors: Gilles, Maximilian; Ibrahimpasic, Sascha
Format: Article
Language: English
Online access: Full text
Abstract: Knowing the robot's pose is a crucial prerequisite for mobile robot tasks such as collision avoidance or autonomous navigation. Using powerful predictive models to estimate transformations for visual odometry via downward facing cameras is an understudied area of research. This work proposes a novel approach based on deep learning for estimating ego motion with a downward looking camera. The network can be trained completely unsupervised and is not restricted to a specific motion model. We propose two neural network architectures based on the Early Fusion and Slow Fusion design principles: "EarlyBird" and "SlowBird". Both networks share a Spatial Transformer layer for image warping and are trained with a modified structural similarity index (SSIM) loss function. Experiments carried out in simulation and on a real-world differential drive robot show similar, and partially better, results for our proposed deep learning based approaches compared to a state-of-the-art method based on the fast Fourier transform.
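
The abstract only outlines the training setup, but its core idea, differentiable image warping combined with an SSIM photometric loss, can be illustrated. The PyTorch sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the (dx, dy, yaw) motion parameterisation, the 3x3 SSIM window, and the function names `ssim`, `warp`, and `photometric_loss` are all assumptions, and the paper's "modified" SSIM loss will differ in its details.

```python
import torch
import torch.nn.functional as F

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM map for image batches (N, C, H, W), using a 3x3 average-pooling
    window as the local statistic (an assumption; window size is not given)."""
    mu_x = F.avg_pool2d(x, 3, 1, padding=1)
    mu_y = F.avg_pool2d(y, 3, 1, padding=1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, padding=1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, padding=1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, padding=1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return num / den

def warp(img, theta):
    """Spatial-Transformer-style warp of img (N, C, H, W) by a predicted
    planar rigid motion theta = (dx, dy, yaw); translations are assumed to be
    in affine_grid's normalized [-1, 1] image coordinates."""
    dx, dy, yaw = theta[:, 0], theta[:, 1], theta[:, 2]
    cos, sin = torch.cos(yaw), torch.sin(yaw)
    # Build per-sample 2x3 affine matrices of shape (N, 2, 3).
    mat = torch.stack([
        torch.stack([cos, -sin, dx], dim=1),
        torch.stack([sin, cos, dy], dim=1),
    ], dim=1)
    grid = F.affine_grid(mat, img.shape, align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)

def photometric_loss(prev_img, curr_img, theta):
    """Unsupervised objective: warp the previous frame with the predicted
    motion and penalise SSIM dissimilarity to the current frame."""
    warped = warp(prev_img, theta)
    return torch.mean((1.0 - ssim(warped, curr_img)) / 2.0)
```

Because both the warp and the SSIM term are differentiable, the loss gradient flows back into whichever network predicts theta. This is what makes the training fully unsupervised: only pairs of consecutive downward-facing frames are needed, never ground-truth poses.
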
DOI: 10.1007/s00371-021-02345-6
ISSN: 0178-2789
EISSN: 1432-2315
Source: ProQuest Central UK/Ireland; SpringerLink Journals - AutoHoldings; ProQuest Central
Subjects: Artificial Intelligence; Autonomous navigation; Cameras; Collision avoidance; Computer Graphics; Computer Science; Deep learning; Estimation; Fast Fourier transformations; Fourier transforms; Image Processing and Computer Vision; Image warping; Localization; Motion simulation; Neural networks; Original Article; Prediction models; Registration; Robot dynamics; Robots; Vehicles