SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain

With the emergence of Gaussian Splats, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial space division, neglecting information in the semantic space. In this paper, we propose a novel method, n...

Bibliographic Details
Main authors:	Xiong, Butian ; Ye, Xiaoyu ; Tse, Tze Ho Elden ; Han, Kai ; Cui, Shuguang ; Li, Zhen
Format:	Article
Language:	eng
Subjects:	Computer Science - Computer Vision and Pattern Recognition

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Xiong, Butian ; Ye, Xiaoyu ; Tse, Tze Ho Elden ; Han, Kai ; Cui, Shuguang ; Li, Zhen
description	With the emergence of Gaussian Splats, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial space division, neglecting information in the semantic space. In this paper, we propose a novel method, named SA-GS, for fine-grained 3D geometry reconstruction using semantic-aware 3D Gaussian Splats. Specifically, we leverage prior information stored in large vision models such as SAM and DINO to generate semantic masks. We then introduce a geometric complexity measurement function to serve as soft regularization, guiding the shape of each Gaussian Splat within specific semantic areas. Additionally, we present a method that estimates the expected number of Gaussian Splats in different semantic areas, effectively providing a lower bound for Gaussian Splats in these areas. Subsequently, we extract the point cloud using a novel probability density-based extraction method, transforming Gaussian Splats into a point cloud crucial for downstream tasks. Our method also offers the potential for detailed semantic inquiries while maintaining high image-based reconstruction results. We provide extensive experiments on publicly available large-scale scene reconstruction datasets with highly accurate point clouds as ground truth and our novel dataset. Our results demonstrate the superiority of our method over current state-of-the-art Gaussian Splats reconstruction methods by a significant margin in terms of geometric-based measurement metrics. Code and additional results will soon be available on our project page.
doi_str_mv	10.48550/arxiv.2405.16923
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2405_16923</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2405_16923</sourcerecordid><originalsourceid>FETCH-LOGICAL-a673-a61ad681251fbb55886023764d18e9382546e08ec493c133cfc56be612aab3cc3</originalsourceid><addsrcrecordid>eNotz01OwzAUBGBvWKDCAVjhCyT4J3ad7qIIAlIkJNx99OK-tJYap3JcSm8PBDYzi5FG-gh54CwvjFLsCeKX_8xFwVTOdSnkLelslTV2Qy2OEJJ3WXWBiLSB8zx7CNSejpCSD3s6TJG2EPdIrcOA9APdFOYUzy75KdCLTwfa4DRiildaLxP4cEduBjjOeP_fK7J9ed7Wr1n73rzVVZuBXsuf4LDThgvFh75XyhjNhFzrYscNltIIVWhkBl1RSseldINTukfNBUAvnZMr8vh3uwi7U_QjxGv3K-0WqfwG3IlOUw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain</title><source>arXiv.org</source><creator>Xiong, Butian ; Ye, Xiaoyu ; Tse, Tze Ho Elden ; Han, Kai ; Cui, Shuguang ; Li, Zhen</creator><creatorcontrib>Xiong, Butian ; Ye, Xiaoyu ; Tse, Tze Ho Elden ; Han, Kai ; Cui, Shuguang ; Li, Zhen</creatorcontrib><description>With the emergence of Gaussian Splats, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial space division, neglecting information in the semantic space. In this paper, we propose a novel method, named SA-GS, for fine-grained 3D geometry reconstruction using semantic-aware 3D Gaussian Splats. Specifically, we leverage prior information stored in large vision models such as SAM and DINO to generate semantic masks. We then introduce a geometric complexity measurement function to serve as soft regularization, guiding the shape of each Gaussian Splat within specific semantic areas. 
Additionally, we present a method that estimates the expected number of Gaussian Splats in different semantic areas, effectively providing a lower bound for Gaussian Splats in these areas. Subsequently, we extract the point cloud using a novel probability density-based extraction method, transforming Gaussian Splats into a point cloud crucial for downstream tasks. Our method also offers the potential for detailed semantic inquiries while maintaining high image-based reconstruction results. We provide extensive experiments on publicly available large-scale scene reconstruction datasets with highly accurate point clouds as ground truth and our novel dataset. Our results demonstrate the superiority of our method over current state-of-the-art Gaussian Splats reconstruction methods by a significant margin in terms of geometric-based measurement metrics. Code and additional results will soon be available on our project page.</description><identifier>DOI: 10.48550/arxiv.2405.16923</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2024-05</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2405.16923$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2405.16923$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Xiong, Butian</creatorcontrib><creatorcontrib>Ye, Xiaoyu</creatorcontrib><creatorcontrib>Tse, Tze Ho Elden</creatorcontrib><creatorcontrib>Han, Kai</creatorcontrib><creatorcontrib>Cui, 
Shuguang</creatorcontrib><creatorcontrib>Li, Zhen</creatorcontrib><title>SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain</title><description>With the emergence of Gaussian Splats, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial space division, neglecting information in the semantic space. In this paper, we propose a novel method, named SA-GS, for fine-grained 3D geometry reconstruction using semantic-aware 3D Gaussian Splats. Specifically, we leverage prior information stored in large vision models such as SAM and DINO to generate semantic masks. We then introduce a geometric complexity measurement function to serve as soft regularization, guiding the shape of each Gaussian Splat within specific semantic areas. Additionally, we present a method that estimates the expected number of Gaussian Splats in different semantic areas, effectively providing a lower bound for Gaussian Splats in these areas. Subsequently, we extract the point cloud using a novel probability density-based extraction method, transforming Gaussian Splats into a point cloud crucial for downstream tasks. Our method also offers the potential for detailed semantic inquiries while maintaining high image-based reconstruction results. We provide extensive experiments on publicly available large-scale scene reconstruction datasets with highly accurate point clouds as ground truth and our novel dataset. Our results demonstrate the superiority of our method over current state-of-the-art Gaussian Splats reconstruction methods by a significant margin in terms of geometric-based measurement metrics. 
Code and additional results will soon be available on our project page.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz01OwzAUBGBvWKDCAVjhCyT4J3ad7qIIAlIkJNx99OK-tJYap3JcSm8PBDYzi5FG-gh54CwvjFLsCeKX_8xFwVTOdSnkLelslTV2Qy2OEJJ3WXWBiLSB8zx7CNSejpCSD3s6TJG2EPdIrcOA9APdFOYUzy75KdCLTwfa4DRiildaLxP4cEduBjjOeP_fK7J9ed7Wr1n73rzVVZuBXsuf4LDThgvFh75XyhjNhFzrYscNltIIVWhkBl1RSseldINTukfNBUAvnZMr8vh3uwi7U_QjxGv3K-0WqfwG3IlOUw</recordid><startdate>20240527</startdate><enddate>20240527</enddate><creator>Xiong, Butian</creator><creator>Ye, Xiaoyu</creator><creator>Tse, Tze Ho Elden</creator><creator>Han, Kai</creator><creator>Cui, Shuguang</creator><creator>Li, Zhen</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240527</creationdate><title>SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain</title><author>Xiong, Butian ; Ye, Xiaoyu ; Tse, Tze Ho Elden ; Han, Kai ; Cui, Shuguang ; Li, Zhen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a673-a61ad681251fbb55886023764d18e9382546e08ec493c133cfc56be612aab3cc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Xiong, Butian</creatorcontrib><creatorcontrib>Ye, Xiaoyu</creatorcontrib><creatorcontrib>Tse, Tze Ho Elden</creatorcontrib><creatorcontrib>Han, Kai</creatorcontrib><creatorcontrib>Cui, Shuguang</creatorcontrib><creatorcontrib>Li, Zhen</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Xiong, Butian</au><au>Ye, Xiaoyu</au><au>Tse, Tze Ho Elden</au><au>Han, Kai</au><au>Cui, Shuguang</au><au>Li, Zhen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain</atitle><date>2024-05-27</date><risdate>2024</risdate><abstract>With the emergence of Gaussian Splats, recent efforts have focused on large-scale scene geometric reconstruction. However, most of these efforts either concentrate on memory reduction or spatial space division, neglecting information in the semantic space. In this paper, we propose a novel method, named SA-GS, for fine-grained 3D geometry reconstruction using semantic-aware 3D Gaussian Splats. Specifically, we leverage prior information stored in large vision models such as SAM and DINO to generate semantic masks. We then introduce a geometric complexity measurement function to serve as soft regularization, guiding the shape of each Gaussian Splat within specific semantic areas. Additionally, we present a method that estimates the expected number of Gaussian Splats in different semantic areas, effectively providing a lower bound for Gaussian Splats in these areas. Subsequently, we extract the point cloud using a novel probability density-based extraction method, transforming Gaussian Splats into a point cloud crucial for downstream tasks. Our method also offers the potential for detailed semantic inquiries while maintaining high image-based reconstruction results. We provide extensive experiments on publicly available large-scale scene reconstruction datasets with highly accurate point clouds as ground truth and our novel dataset. Our results demonstrate the superiority of our method over current state-of-the-art Gaussian Splats reconstruction methods by a significant margin in terms of geometric-based measurement metrics. 
Code and additional results will soon be available on our project page.</abstract><doi>10.48550/arxiv.2405.16923</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2405.16923
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2405_16923
source	arXiv.org
subjects	Computer Science - Computer Vision and Pattern Recognition
title	SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T10%3A40%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SA-GS:%20Semantic-Aware%20Gaussian%20Splatting%20for%20Large%20Scene%20Reconstruction%20with%20Geometry%20Constrain&rft.au=Xiong,%20Butian&rft.date=2024-05-27&rft_id=info:doi/10.48550/arxiv.2405.16923&rft_dat=%3Carxiv_GOX%3E2405_16923%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true