Adversarial Learning for Joint Optimization of Depth and Ego-Motion

In recent years, supervised deep learning methods have shown great promise in dense depth estimation. However, massive high-quality training data are expensive and impractical to acquire. Alternatively, self-supervised depth estimators can learn the latent transformation from monocular or binocular video sequences by minimizing the photometric warp error between consecutive frames, but they suffer from the scale ambiguity problem or have difficulty estimating precise pose changes between frames. In this paper, we propose a joint self-supervised deep learning pipeline for depth and ego-motion estimation that exploits the advantages of adversarial learning and joint optimization with spatial-temporal geometric constraints. The stereo reconstruction error provides the spatial geometric constraint needed to estimate depth at absolute scale. Meanwhile, the absolute-scale depth map and a pre-trained pose network serve as a good starting point for direct visual odometry (DVO). DVO optimization based on spatial geometric constraints yields fine-grained ego-motion estimates and provides additional backpropagation signals to the depth estimation network. Finally, the spatially and temporally reconstructed views are concatenated, and an iterative coupling optimization process is combined with adversarial learning for accurate depth and precise ego-motion estimation. Experimental results show superior performance compared with state-of-the-art methods for monocular depth and ego-motion estimation on the KITTI dataset, as well as strong generalization ability of the proposed approach.
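The core self-supervision signal the abstract refers to is the photometric warp error: a predicted depth map and relative camera pose are used to synthesize one view from another, and the pixel-wise reconstruction error trains both the depth and the pose network. The following is a minimal PyTorch sketch of that view-synthesis loss under standard pinhole-camera assumptions; it is not the authors' implementation, and the helper names (`pixel_grid`, `inverse_warp`, `photometric_loss`) are illustrative only.

```python
# Minimal sketch of a photometric warp loss (not the paper's code).
import torch
import torch.nn.functional as F

def pixel_grid(b, h, w, device):
    """Homogeneous pixel coordinates, shape (B, 3, H*W)."""
    ys, xs = torch.meshgrid(
        torch.arange(h, device=device, dtype=torch.float32),
        torch.arange(w, device=device, dtype=torch.float32),
        indexing="ij",
    )
    grid = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(3, -1)
    return grid.unsqueeze(0).expand(b, -1, -1)

def inverse_warp(src_img, depth, T_tgt_to_src, K):
    """Synthesize the target view by sampling the source image.

    src_img:      (B, 3, H, W) source frame
    depth:        (B, 1, H, W) predicted target-view depth
    T_tgt_to_src: (B, 4, 4)    predicted relative pose
    K:            (B, 3, 3)    camera intrinsics
    """
    b, _, h, w = src_img.shape
    # Back-project target pixels to 3-D camera points: X = D * K^-1 * p
    cam = torch.inverse(K) @ pixel_grid(b, h, w, src_img.device)
    cam = cam * depth.view(b, 1, -1)
    cam = torch.cat([cam, torch.ones(b, 1, h * w, device=src_img.device)], 1)
    # Transform into the source frame and project back to pixel coordinates
    proj = K @ (T_tgt_to_src @ cam)[:, :3, :]
    # Clamp for numerical safety (ignores points behind the camera)
    uv = proj[:, :2, :] / proj[:, 2:3, :].clamp(min=1e-6)
    # Normalize to [-1, 1] for grid_sample
    u = 2.0 * uv[:, 0] / (w - 1) - 1.0
    v = 2.0 * uv[:, 1] / (h - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).view(b, h, w, 2)
    return F.grid_sample(src_img, grid, padding_mode="border",
                         align_corners=True)

def photometric_loss(tgt_img, src_img, depth, pose, K):
    """L1 photometric warp error between the target frame and the
    source frame warped into the target view."""
    warped = inverse_warp(src_img, depth, pose, K)
    return (tgt_img - warped).abs().mean()

if __name__ == "__main__":
    # Smoke test with random tensors, identity intrinsics and pose.
    b, h, w = 2, 32, 64
    K = torch.eye(3).expand(b, 3, 3)
    pose = torch.eye(4).expand(b, 4, 4)
    loss = photometric_loss(torch.rand(b, 3, h, w), torch.rand(b, 3, h, w),
                            torch.rand(b, 1, h, w) + 0.5, pose, K)
    print(loss.item())
```

The spatial (stereo) constraint from the abstract follows the same pattern: warping the other stereo view with the fixed, known baseline transform in place of the predicted pose ties the predicted depth to a known metric distance, which is what removes the monocular scale ambiguity.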

Bibliographic details
Published in: IEEE Transactions on Image Processing, 2020-01, Vol. 29, pp. 4130-4142
Authors: Wang, Anjie; Fang, Zhijun; Gao, Yongbin; Tan, Songchao; Wang, Shanshe; Ma, Siwei; Hwang, Jenq-Neng
Format: Article
Language: English
Online access: Order full text (see URL below)
DOI: 10.1109/TIP.2020.2968751
PMID: 32011252
ISSN: 1057-7149
EISSN: 1941-0042
Source: IEEE Electronic Library (IEL)
Subjects: adversarial learning; Back propagation; Cameras; Deep learning; Depth estimation; direct visual odometry; ego-motion; Estimation; Generators; Geometric constraints; Image reconstruction; Machine learning; Motion simulation; Odometers; Optimization; Photometry; self-supervised; Training
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T02%3A52%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adversarial%20Learning%20for%20Joint%20Optimization%20of%20Depth%20and%20Ego-Motion&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Wang,%20Anjie&rft.date=2020-01-01&rft.volume=29&rft.spage=4130&rft.epage=4142&rft.pages=4130-4142&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2020.2968751&rft_dat=%3Cproquest_RIE%3E2350340085%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2354605926&rft_id=info:pmid/32011252&rft_ieee_id=8972902&rfr_iscdi=true