Real-time monocular depth estimation with adaptive receptive fields
Monocular depth estimation is a popular research topic in autonomous driving. Many current models lead in accuracy but perform poorly in real-time scenarios. To improve depth estimation efficiency, we propose a novel model that combines a multi-scale pyramid architecture for depth estimation with adaptive receptive fields.
Saved in:
Published in: | Journal of real-time image processing 2021-08, Vol.18 (4), p.1369-1381 |
---|---|
Main authors: | Ji, Zhenyan; Song, Xiaojun; Guo, Xiaoxuan; Wang, Fangshi; Armendáriz-Iñigo, José Enrique |
Format: | Article |
Language: | English |
Subjects: | Accuracy; Algorithms; Artificial intelligence; Cameras; Computer Graphics; Computer Science; Deep learning; Dictionaries; Image Processing and Computer Vision; Machine learning; Model accuracy; Multimedia Information Systems; Optimization; Pattern Recognition; Real time; Signal, Image and Speech Processing; Special Issue Paper |
Online access: | Full text |
container_end_page | 1381 |
---|---|
container_issue | 4 |
container_start_page | 1369 |
container_title | Journal of real-time image processing |
container_volume | 18 |
creator | Ji, Zhenyan; Song, Xiaojun; Guo, Xiaoxuan; Wang, Fangshi; Armendáriz-Iñigo, José Enrique |
description | Monocular depth estimation is a popular research topic in autonomous driving. Many current models lead in accuracy but perform poorly in real-time scenarios. To improve depth estimation efficiency, we propose a novel model that combines a multi-scale pyramid architecture for depth estimation with adaptive receptive fields. The pyramid architecture reduces the trainable parameters from tens of millions to fewer than 10 million. Adaptive receptive fields are more sensitive to objects at different depths/distances in images, leading to better accuracy. We adopt stacked convolution kernels instead of raw kernels to compress the model. As a result, the proposed model performs well in both real-time performance and estimation accuracy. We present a set of experiments in which our model outperforms previously known models on the Eigen split. Furthermore, our model is faster at depth estimation than all compared models except Pyd-Net. In summary, our model is a lightweight depth estimation model with state-of-the-art accuracy. |
doi_str_mv | 10.1007/s11554-020-01036-0 |
format | Article |
publisher | Berlin/Heidelberg: Springer Berlin Heidelberg |
rights | Springer-Verlag GmbH Germany, part of Springer Nature 2020 |
orcidid | https://orcid.org/0000-0002-6566-9464 |
fulltext | fulltext |
identifier | ISSN: 1861-8200 |
ispartof | Journal of real-time image processing, 2021-08, Vol.18 (4), p.1369-1381 |
issn | 1861-8200; 1861-8219 |
language | eng |
recordid | cdi_proquest_journals_2918676085 |
source | SpringerLink Journals; ProQuest Central UK/Ireland; ProQuest Central |
subjects | Accuracy; Algorithms; Artificial intelligence; Cameras; Computer Graphics; Computer Science; Deep learning; Dictionaries; Image Processing and Computer Vision; Machine learning; Model accuracy; Multimedia Information Systems; Optimization; Pattern Recognition; Real time; Signal, Image and Speech Processing; Special Issue Paper |
title | Real-time monocular depth estimation with adaptive receptive fields |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T15%3A57%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Real-time%20monocular%20depth%20estimation%20with%20adaptive%20receptive%20fields&rft.jtitle=Journal%20of%20real-time%20image%20processing&rft.au=Ji,%20Zhenyan&rft.date=2021-08-01&rft.volume=18&rft.issue=4&rft.spage=1369&rft.epage=1381&rft.pages=1369-1381&rft.issn=1861-8200&rft.eissn=1861-8219&rft_id=info:doi/10.1007/s11554-020-01036-0&rft_dat=%3Cproquest_cross%3E2918676085%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2918676085&rft_id=info:pmid/&rfr_iscdi=true |
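The abstract's compression idea, replacing a single large "raw" convolution kernel with a stack of smaller ones, can be illustrated with a simple parameter count. This is a minimal sketch of the general factorization argument, not the paper's exact layer configuration; the channel count (64) is an illustrative assumption:

```python
def conv_params(c_in, c_out, k):
    """Parameters of one k x k convolution layer: weights plus one bias per output channel."""
    return c_out * (c_in * k * k + 1)

# Two stacked 3x3 convolutions cover the same 5x5 receptive field as one
# "raw" 5x5 convolution, but with fewer trainable parameters.
raw = conv_params(64, 64, 5)          # 102,464 parameters
stacked = 2 * conv_params(64, 64, 3)  # 73,856 parameters (~28% fewer)
print(raw, stacked)
```

The same VGG-style factorization applied across a network is one way such a model can shrink from tens of millions of parameters toward the sub-10-million range the abstract reports, though the paper's reduction also comes from its pyramid architecture.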