Adversarial Learning for Joint Optimization of Depth and Ego-Motion
In recent years, supervised deep learning methods have shown great promise in dense depth estimation. However, massive high-quality training data are expensive and impractical to acquire. Alternatively, self-supervised depth estimators can learn the latent transformation from monocular or binocular video sequences by minimizing the photometric warp error between consecutive frames, but they suffer from scale ambiguity or have difficulty estimating precise pose changes between frames. In this paper, we propose a joint self-supervised deep learning pipeline for depth and ego-motion estimation that exploits the advantages of adversarial learning and joint optimization with spatial-temporal geometric constraints. The stereo reconstruction error provides the spatial geometric constraint needed to estimate depth at absolute scale. Meanwhile, the depth map with absolute scale and a pre-trained pose network serve as a good starting point for direct visual odometry (DVO). DVO optimization based on spatial geometric constraints yields fine-grained ego-motion estimates and provides additional backpropagation signals to the depth estimation network. Finally, the spatially and temporally reconstructed views are concatenated, and an iterative coupling optimization process is combined with adversarial learning for accurate depth and precise ego-motion estimation. Experimental results show superior performance compared with state-of-the-art methods for monocular depth and ego-motion estimation on the KITTI dataset, as well as strong generalization ability of the proposed approach.
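For context, the self-supervised signal the abstract describes is a photometric reconstruction loss. Below is a common formulation in our own notation, an illustrative sketch rather than the paper's exact objective: \(I_t\) and \(I_s\) are the target and source frames, \(D_t\) the predicted depth, \(T_{t\to s}\) the predicted relative pose, \(K\) the camera intrinsics, and \(\pi(\cdot)\) the projection to pixel coordinates.

```latex
% Temporal constraint: inverse-warp source frame I_s into the target view
% using predicted depth D_t and ego-motion T_{t->s}, then penalize the
% photometric residual (common formulation; the paper's exact loss may differ).
\mathcal{L}_{\mathrm{temp}} \;=\; \sum_{p}
  \bigl|\, I_t(p) \;-\; I_s\!\bigl(\pi\bigl(K\,T_{t\to s}\,D_t(p)\,K^{-1}p\bigr)\bigr) \,\bigr|

% Spatial constraint: for a rectified stereo pair with baseline b and focal
% length f, the disparity d(p) = f b / D(p) ties the predicted depth to an
% absolute scale, resolving the monocular scale ambiguity.
\mathcal{L}_{\mathrm{spat}} \;=\; \sum_{p}
  \bigl|\, I^{l}(p) \;-\; I^{r}\bigl(p - d(p)\bigr) \,\bigr|
```

Because \(b\) and \(f\) are known for the stereo rig, minimizing \(\mathcal{L}_{\mathrm{spat}}\) anchors the depth scale, which is what lets the pre-trained pose network plus DVO refinement described in the abstract start from a metrically meaningful depth map.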
Published in: | IEEE transactions on image processing 2020-01, Vol.29, p.4130-4142 |
---|---|
Main authors: | Wang, Anjie; Fang, Zhijun; Gao, Yongbin; Tan, Songchao; Wang, Shanshe; Ma, Siwei; Hwang, Jenq-Neng |
Format: | Article |
Language: | eng |
Subjects: | adversarial learning; back propagation; cameras; deep learning; depth estimation; direct visual odometry; ego-motion; estimation; generators; geometric constraints; image reconstruction; machine learning; motion simulation; odometers; optimization; photometry; self-supervised; training |
Online access: | Order full text |
container_end_page | 4142 |
---|---|
container_issue | |
container_start_page | 4130 |
container_title | IEEE transactions on image processing |
container_volume | 29 |
creator | Wang, Anjie; Fang, Zhijun; Gao, Yongbin; Tan, Songchao; Wang, Shanshe; Ma, Siwei; Hwang, Jenq-Neng |
description | In recent years, supervised deep learning methods have shown great promise in dense depth estimation. However, massive high-quality training data are expensive and impractical to acquire. Alternatively, self-supervised depth estimators can learn the latent transformation from monocular or binocular video sequences by minimizing the photometric warp error between consecutive frames, but they suffer from scale ambiguity or have difficulty estimating precise pose changes between frames. In this paper, we propose a joint self-supervised deep learning pipeline for depth and ego-motion estimation that exploits the advantages of adversarial learning and joint optimization with spatial-temporal geometric constraints. The stereo reconstruction error provides the spatial geometric constraint needed to estimate depth at absolute scale. Meanwhile, the depth map with absolute scale and a pre-trained pose network serve as a good starting point for direct visual odometry (DVO). DVO optimization based on spatial geometric constraints yields fine-grained ego-motion estimates and provides additional backpropagation signals to the depth estimation network. Finally, the spatially and temporally reconstructed views are concatenated, and an iterative coupling optimization process is combined with adversarial learning for accurate depth and precise ego-motion estimation. Experimental results show superior performance compared with state-of-the-art methods for monocular depth and ego-motion estimation on the KITTI dataset, as well as strong generalization ability of the proposed approach. (See the code sketch after this field list.) |
doi_str_mv | 10.1109/TIP.2020.2968751 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149; EISSN: 1941-0042; CODEN: IIPRE4; PMID: 32011252 |
ispartof | IEEE transactions on image processing, 2020-01, Vol.29, p.4130-4142 |
issn | 1057-7149; 1941-0042 |
language | eng |
recordid | cdi_proquest_miscellaneous_2350340085 |
source | IEEE Electronic Library (IEL) |
subjects | adversarial learning; Back propagation; Cameras; Deep learning; Depth estimation; direct visual odometry; ego-motion; Estimation; Generators; Geometric constraints; Image reconstruction; Machine learning; Motion simulation; Odometers; Optimization; Photometry; self-supervised; Training |
title | Adversarial Learning for Joint Optimization of Depth and Ego-Motion |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T02%3A52%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adversarial%20Learning%20for%20Joint%20Optimization%20of%20Depth%20and%20Ego-Motion&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Wang,%20Anjie&rft.date=2020-01-01&rft.volume=29&rft.spage=4130&rft.epage=4142&rft.pages=4130-4142&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2020.2968751&rft_dat=%3Cproquest_RIE%3E2350340085%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2354605926&rft_id=info:pmid/32011252&rft_ieee_id=8972902&rfr_iscdi=true |
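As a companion to the description field above, here is a minimal PyTorch-style sketch of how a spatial (stereo) reconstruction loss and an adversarial term on the reconstructed view might be combined. Everything here — module names, network sizes, the loss weights `w_spat`/`w_adv`, and the disparity sign convention — is an illustrative assumption, not the paper's actual architecture; the temporal warp and DVO refinement stages are omitted for brevity.

```python
# Hedged sketch (not the paper's code): combining a spatial stereo
# reconstruction loss with an adversarial term on the reconstructed view.
# All names, shapes, and weights below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def warp_horizontal(img, disparity):
    """Inverse-warp `img` along x by per-pixel `disparity` (in pixels)
    with bilinear sampling. img: (N, C, H, W); disparity: (N, 1, H, W).
    Sampling at x - d is one stereo convention; treat it as an assumption."""
    _, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=img.device),
        torch.linspace(-1.0, 1.0, w, device=img.device),
        indexing="ij",
    )
    # Convert pixel disparities to normalized [-1, 1] grid offsets.
    xs = xs.unsqueeze(0) - 2.0 * disparity.squeeze(1) / max(w - 1, 1)
    grid = torch.stack((xs, ys.unsqueeze(0).expand_as(xs)), dim=-1)
    return F.grid_sample(img, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)


class PatchDiscriminator(nn.Module):
    """Tiny discriminator that scores reconstructed views per patch."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=1, padding=1),  # real/fake logits
        )

    def forward(self, x):
        return self.net(x)


def joint_loss(left, right, disparity, disc, w_spat=1.0, w_adv=0.01):
    # Spatial geometric constraint: reconstruct the left view from the
    # right view via predicted disparity (absolute scale comes from the
    # known stereo baseline and focal length).
    left_rec = warp_horizontal(right, disparity)
    l_spat = F.l1_loss(left_rec, left)
    # Adversarial term: push reconstructions toward the discriminator's
    # "real" decision (non-saturating generator loss).
    logits = disc(left_rec)
    l_adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return w_spat * l_spat + w_adv * l_adv


# Smoke test on random tensors standing in for a rectified stereo pair.
left, right = torch.rand(2, 3, 64, 128), torch.rand(2, 3, 64, 128)
disparity = torch.rand(2, 1, 64, 128) * 5.0
disparity.requires_grad_(True)
disc = PatchDiscriminator()
loss = joint_loss(left, right, disparity, disc)
loss.backward()
print(f"joint loss: {loss.item():.4f}")
```

In a full pipeline along the lines the abstract sketches, the temporally warped views would be concatenated with the stereo reconstruction before the discriminator ("the spatial and temporal domain-based reconstructed views are concatenated"), and the discriminator would be trained in alternation with the depth and pose networks.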