InstanceVO: Self-Supervised Semantic Visual Odometry by Using Metric Learning to Incorporate Geometrical Priors in Instance Objects
Visual odometry is one of the key technologies for unmanned ground vehicles. To improve the robustness of these systems and enable intelligent tasks, researchers have introduced learning-based recognition modules into visual odometry systems, but did not achieve tight coupling between visual odometry s...
Saved in:
Published in: | IEEE robotics and automation letters 2024-11, Vol.9 (11), p.10708-10715 |
---|---|
Main authors: | Xie, Yuanyan; Yang, Junzhe; Zhou, Huaidong; Sun, Fuchun |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | 10715 |
---|---|
container_issue | 11 |
container_start_page | 10708 |
container_title | IEEE robotics and automation letters |
container_volume | 9 |
creator | Xie, Yuanyan; Yang, Junzhe; Zhou, Huaidong; Sun, Fuchun |
description | Visual odometry is one of the key technologies for unmanned ground vehicles. To improve the robustness of these systems and enable intelligent tasks, researchers have introduced learning-based recognition modules into visual odometry systems, but did not achieve tight coupling between the visual odometry systems and the recognition modules. This letter proposes a self-supervised semantic visual odometry method that completes the tasks of ego-motion estimation, depth prediction, and instance segmentation with a shared encoder. Potential dynamic regions are removed and the image reconstruction loss is rectified by instance detection results. Moreover, an instance-guided triplet loss and cross-task self-attention modules are devised to learn the geometrical relationships among pixels that are implied in instance object priors. The proposed method is validated on the KITTI and ComplexUrban datasets. The experimental results show that our method outperforms baseline models in both pose estimation and depth prediction. We also discuss the efficacy of evaluation metrics for pose estimation, taking the accumulated errors of trajectories into account. |
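The abstract mentions an instance-guided triplet loss for learning geometric relationships among pixels belonging to the same instance. The paper's exact formulation is not reproduced in this record; the following is a minimal sketch under the assumption that positives are sampled from the same instance as the anchor pixel and negatives from a different instance (the function name, sampling scheme, and margin value are illustrative, not taken from the paper):

```python
import numpy as np

def instance_triplet_loss(emb, instance_ids, margin=0.2, rng=None):
    """Per-pixel triplet loss guided by instance masks (illustrative sketch).

    emb          : (N, D) array of pixel embeddings
    instance_ids : length-N instance label per pixel
    For each anchor pixel, a positive is sampled from the same instance
    and a negative from a different instance; the standard hinge
    max(0, d(a, p) - d(a, n) + margin) is averaged over valid anchors.
    """
    instance_ids = np.asarray(instance_ids)
    rng = np.random.default_rng(0) if rng is None else rng
    losses = []
    for i in range(len(emb)):
        same = np.flatnonzero(instance_ids == instance_ids[i])
        same = same[same != i]                # exclude the anchor itself
        diff = np.flatnonzero(instance_ids != instance_ids[i])
        if same.size == 0 or diff.size == 0:  # no valid triplet for this anchor
            continue
        pos = emb[rng.choice(same)]
        neg = emb[rng.choice(diff)]
        d_ap = np.linalg.norm(emb[i] - pos)
        d_an = np.linalg.norm(emb[i] - neg)
        losses.append(max(0.0, d_ap - d_an + margin))
    return float(np.mean(losses)) if losses else 0.0
```

With embeddings already clustered by instance the hinge is inactive and the loss is zero; when same-instance pixels are embedded far apart the loss is positive, pulling them together — which matches the stated goal of encoding instance-object geometric priors in the feature space.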
doi_str_mv | 10.1109/LRA.2024.3477292 |
format | Article |
publisher | Piscataway: IEEE |
coden | IRALC6 |
ieee_id | 10711206 |
orcidid | 0000-0002-7106-6555; 0000-0003-3546-6305; 0000-0002-0063-9782 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2377-3766 |
ispartof | IEEE robotics and automation letters, 2024-11, Vol.9 (11), p.10708-10715 |
issn | 2377-3766 (ISSN); 2377-3766 (EISSN) |
language | eng |
recordid | cdi_proquest_journals_3118090677 |
source | IEEE Electronic Library (IEL) |
subjects | Computer architecture; Depth prediction; ego-motion estimation; Feature extraction; Image reconstruction; Image segmentation; Instance segmentation; Learning; Measurement; Modules; Motion simulation; Odometry; Pose estimation; Robustness; self-supervised learning; semantic understanding; Semantics; Task complexity; Unmanned ground vehicles; Visual odometry; Visual tasks |
title | InstanceVO: Self-Supervised Semantic Visual Odometry by Using Metric Learning to Incorporate Geometrical Priors in Instance Objects |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T15%3A50%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=InstanceVO:%20Self-Supervised%20Semantic%20Visual%20Odometry%20by%20Using%20Metric%20Learning%20to%20Incorporate%20Geometrical%20Priors%20in%20Instance%20Objects&rft.jtitle=IEEE%20robotics%20and%20automation%20letters&rft.au=Xie,%20Yuanyan&rft.date=2024-11-01&rft.volume=9&rft.issue=11&rft.spage=10708&rft.epage=10715&rft.pages=10708-10715&rft.issn=2377-3766&rft.eissn=2377-3766&rft.coden=IRALC6&rft_id=info:doi/10.1109/LRA.2024.3477292&rft_dat=%3Cproquest_RIE%3E3118090677%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3118090677&rft_id=info:pmid/&rft_ieee_id=10711206&rfr_iscdi=true |