DNS SLAM: Dense Neural Semantic-Informed SLAM

Bibliographic details
Main authors: Li, Kunyi; Niemeyer, Michael; Navab, Nassir; Tombari, Federico
Format: Article
Language: eng
creator Li, Kunyi
Niemeyer, Michael
Navab, Nassir
Tombari, Federico
description In recent years, coordinate-based neural implicit representations have shown promising results for Simultaneous Localization and Mapping (SLAM). While these methods achieve impressive performance on small synthetic scenes, they often produce oversmoothed reconstructions, especially for complex real-world scenes. In this work, we introduce DNS SLAM, a novel neural RGB-D semantic SLAM approach featuring a hybrid representation. Relying only on 2D semantic priors, we propose the first semantic neural SLAM method that trains class-wise scene representations while providing stable camera tracking. Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details and to output color, density, and semantic class information, enabling many downstream applications. To further enable real-time tracking, we introduce a lightweight coarse scene representation that is trained in a self-supervised manner in latent space. Our experiments show state-of-the-art tracking performance on both synthetic and real-world data while maintaining a commendable operational speed on off-the-shelf hardware. Further, our method outputs class-wise decomposed reconstructions with better texture, capturing appearance and geometric details.
doi_str_mv 10.48550/arxiv.2312.00204
format Article
date 2023-11-30
rights http://creativecommons.org/licenses/by/4.0
oa free_for_read
linktorsrc https://arxiv.org/abs/2312.00204
identifier DOI: 10.48550/arxiv.2312.00204
language eng
recordid cdi_arxiv_primary_2312_00204
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
title DNS SLAM: Dense Neural Semantic-Informed SLAM