Indoor and Outdoor 3D Scene Graph Generation Via Language-Enabled Spatial Ontologies

This paper proposes an approach to build 3D scene graphs in arbitrary indoor and outdoor environments. Such extension is challenging; the hierarchy of concepts that describe an outdoor environment is more complex than for indoors, and manually defining such hierarchy is time-consuming and does not scale. Furthermore, the lack of training data prevents the straightforward application of learning-based tools used in indoor settings. To address these challenges, we propose two novel extensions. First, we develop methods to build a spatial ontology defining concepts and relations relevant for indoor and outdoor robot operation. In particular, we use a Large Language Model (LLM) to build such an ontology, thus largely reducing the amount of manual effort required. Second, we leverage the spatial ontology for 3D scene graph construction using Logic Tensor Networks (LTN) to add logical rules, or axioms (e.g., "a beach contains sand"), which provide additional supervisory signals at training time, thus reducing the need for labelled data, providing better predictions, and even allowing predicting concepts unseen at training time. We test our approach on a variety of datasets, including indoor, rural, and coastal environments, and show that it leads to a significant increase in the quality of 3D scene graph generation with sparsely annotated data.
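The axiom-as-supervision idea in the abstract can be made concrete with a small sketch. The following is a hypothetical illustration, not the paper's implementation: the predicate names, network sizes, and the Reichenbach form of fuzzy implication are all assumptions. It shows how a rule like "a beach contains sand" can be scored as a fuzzy truth value and used as an extra loss term, in the spirit of Logic Tensor Networks:

```python
# Hypothetical sketch (not the authors' code): encoding the axiom
# "forall r. Beach(r) -> ContainsSand(r)" as a differentiable loss.
# Predicates are small neural nets that return truth values in [0, 1].
import torch
import torch.nn as nn

class Predicate(nn.Module):
    """Maps a region embedding to a fuzzy truth value in [0, 1]."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(),
                                 nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x).squeeze(-1)

is_beach = Predicate(dim=16)       # hypothetical predicate names
contains_sand = Predicate(dim=16)

def axiom_loss(regions):
    """Fuzzy truth of the axiom over a batch of regions.
    Implication uses the Reichenbach form: a -> b := 1 - a + a*b."""
    a = is_beach(regions)
    b = contains_sand(regions)
    truth = (1.0 - a + a * b).mean()  # aggregate "forall" over the batch
    return 1.0 - truth                # minimize = maximize satisfaction

regions = torch.randn(8, 16)          # placeholder region embeddings
loss = axiom_loss(regions)            # would be added to the supervised loss
loss.backward()
```

Minimizing this term alongside the usual supervised loss pushes the predicates toward satisfying the axiom even on regions without labels, which is how such rules can act as additional supervisory signals and reduce the need for annotated data.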

Bibliographic Details

Published in: IEEE Robotics and Automation Letters, 2024-06, Vol. 9 (6), p. 1-8
Authors: Strader, Jared; Hughes, Nathan; Chen, William; Speranzon, Alberto; Carlone, Luca
Format: Article
Language: English
DOI: 10.1109/LRA.2024.3384084
Publisher: IEEE (Piscataway)
CODEN: IRALC6
ISSN: 2377-3766
EISSN: 2377-3766
Source: IEEE Electronic Library (IEL)
Subjects:
3D scene graphs
AI-based methods
Axioms
Coastal environments
Large language models
Ontologies
Ontology
Predictions
Robots
Rural environments
Semantic scene understanding
Semantics
Solid modeling
spatial ontologies
Tensors
Three-dimensional displays
Training
Training data