THOR2: Topological Analysis for 3D Shape and Color‐Based Human‐Inspired Object Recognition in Unseen Environments

Visual object recognition in unseen and cluttered indoor environments is a challenging problem for mobile robots. This study presents a 3D shape and color‐based descriptor, TOPS2, for point clouds generated from red green blue‐depth (RGB‐D) images and an accompanying recognition framework, THOR2. The TOPS2 descriptor embodies object unity, a human cognition mechanism, by retaining the slicing‐based topological representation of 3D shape from the TOPS descriptor (IEEE Trans. Robot. 2024, 40, 886) while capturing object color information through slicing‐based color embeddings computed using a network of coarse color regions. These color regions, analogous to the MacAdam ellipses identified in human color perception, are obtained using the Mapper algorithm, a topological soft‐clustering technique. THOR2, trained using synthetic data, demonstrates markedly improved recognition accuracy compared to THOR, its 3D shape‐based predecessor, on two benchmark real‐world datasets: the OCID dataset, capturing cluttered scenes from different viewpoints, and the UW‐IS Occluded dataset, reflecting different environmental conditions and degrees of object occlusion recorded using commodity hardware. THOR2 also outperforms baseline deep learning networks and a widely used Vision Transformer adapted for RGB‐D inputs, trained using synthetic and limited real‐world data, on both datasets. Therefore, THOR2 is a promising step toward achieving robust recognition in low‐cost robots.
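As a rough illustration of the slicing‐based color embedding described in the abstract, the Python sketch below slices a colored point cloud along one axis and builds a per‐slice histogram over a handful of coarse color regions. The color centers, slice count, and array layout are illustrative assumptions, not the authors' actual TOPS2 construction.

```python
# Minimal sketch of a slicing-based color embedding, assuming a point cloud
# stored as an (N, 6) array of XYZ + RGB values in [0, 1]. The coarse color
# centers and slicing choices below are illustrative only; the paper derives
# its color regions with the Mapper algorithm rather than fixing them by hand.
import numpy as np

COLOR_CENTERS = np.array([
    [1.0, 0.0, 0.0],  # red
    [0.0, 1.0, 0.0],  # green
    [0.0, 0.0, 1.0],  # blue
    [1.0, 1.0, 0.0],  # yellow
    [0.0, 0.0, 0.0],  # black
    [1.0, 1.0, 1.0],  # white
])

def slice_color_embedding(cloud, n_slices=8):
    """Slice the cloud along z, histogram each slice's colors over the
    coarse color regions, and concatenate the per-slice histograms."""
    xyz, rgb = cloud[:, :3], cloud[:, 3:]
    z = xyz[:, 2]
    edges = np.linspace(z.min(), z.max() + 1e-9, n_slices + 1)
    embedding = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (z >= lo) & (z < hi)
        hist = np.zeros(len(COLOR_CENTERS))
        if mask.any():
            # Assign each point's color to its nearest coarse region.
            d = np.linalg.norm(rgb[mask, None, :] - COLOR_CENTERS[None], axis=2)
            counts = np.bincount(d.argmin(axis=1), minlength=len(COLOR_CENTERS))
            hist = counts / counts.sum()
        embedding.append(hist)
    return np.concatenate(embedding)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = np.hstack([rng.normal(size=(500, 3)), rng.random((500, 3))])
    print(slice_color_embedding(cloud).shape)  # (n_slices * n_regions,)
```

Concatenating per‐slice histograms keeps the embedding aligned with the slicing‐based shape representation, which is the spirit of combining the two cues in a single descriptor.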

Bibliographic Details
Published in: Advanced Intelligent Systems, 2024-12
Main Authors: Samani, Ekta U.; Banerjee, Ashis G.
Format: Article
Language: English
Online Access: Full text
container_title Advanced intelligent systems
creator Samani, Ekta U.
Banerjee, Ashis G.
description Visual object recognition in unseen and cluttered indoor environments is a challenging problem for mobile robots. This study presents a 3D shape and color‐based descriptor, TOPS2, for point clouds generated from red green blue‐depth (RGB‐D) images and an accompanying recognition framework, THOR2. The TOPS2 descriptor embodies object unity, a human cognition mechanism, by retaining the slicing‐based topological representation of 3D shape from the TOPS descriptor (IEEE Trans. Robot. 2024, 40, 886) while capturing object color information through slicing‐based color embeddings computed using a network of coarse color regions. These color regions, analogous to the MacAdam ellipses identified in human color perception, are obtained using the Mapper algorithm, a topological soft‐clustering technique. THOR2, trained using synthetic data, demonstrates markedly improved recognition accuracy compared to THOR, its 3D shape‐based predecessor, on two benchmark real‐world datasets: the OCID dataset, capturing cluttered scenes from different viewpoints, and the UW‐IS Occluded dataset, reflecting different environmental conditions and degrees of object occlusion recorded using commodity hardware. THOR2 also outperforms baseline deep learning networks and a widely used Vision Transformer adapted for RGB‐D inputs, trained using synthetic and limited real‐world data, on both datasets. Therefore, THOR2 is a promising step toward achieving robust recognition in low‐cost robots.
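The Mapper construction mentioned in the description can be approximated at toy scale: project color samples through a one‐dimensional lens, cover the lens range with overlapping intervals, and cluster within each interval, so that points in overlap zones may belong to several regions (soft clustering). The sketch below uses plain NumPy and scikit-learn rather than a dedicated Mapper library; the lens, cover, and clustering parameters are assumptions for illustration, not the values used in the paper.

```python
# Toy Mapper-style soft clustering of RGB samples into coarse color regions.
# This is a hand-rolled approximation (lens + overlapping cover + per-interval
# clustering), not the authors' pipeline; all parameters are illustrative.
import numpy as np
from sklearn.cluster import DBSCAN

def mapper_color_regions(colors, n_intervals=6, overlap=0.3, eps=0.15):
    """Return a list of index arrays (possibly overlapping), one per region."""
    # Lens: project RGB to a single value (here, a simple luminance weighting).
    lens = colors @ np.array([0.299, 0.587, 0.114])
    lo, hi = lens.min(), lens.max()
    width = (hi - lo) / n_intervals
    regions = []
    for i in range(n_intervals):
        # Overlapping cover interval along the lens.
        a = lo + i * width - overlap * width
        b = lo + (i + 1) * width + overlap * width
        idx = np.where((lens >= a) & (lens <= b))[0]
        if len(idx) == 0:
            continue
        # Cluster the pre-image of the interval in RGB space.
        labels = DBSCAN(eps=eps, min_samples=5).fit_predict(colors[idx])
        for lab in set(labels) - {-1}:  # drop DBSCAN noise points
            regions.append(idx[labels == lab])
    return regions

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    colors = rng.random((1000, 3))
    regions = mapper_color_regions(colors)
    print(f"{len(regions)} overlapping coarse color regions")
```

Because neighboring cover intervals overlap, a color sample near a boundary can fall into more than one region, which is what makes the clustering "soft" and, in the paper's analogy, lets the regions play a role similar to MacAdam ellipses in human color perception.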
doi_str_mv 10.1002/aisy.202400539
format Article
fulltext fulltext
identifier ISSN: 2640-4567
ispartof Advanced intelligent systems, 2024-12
issn 2640-4567
2640-4567
language eng
recordid cdi_crossref_primary_10_1002_aisy_202400539
source DOAJ Directory of Open Access Journals; Wiley Online Library Open Access; EZB-FREE-00999 freely available EZB journals; Wiley Online Library All Journals
title THOR2: Topological Analysis for 3D Shape and Color‐Based Human‐Inspired Object Recognition in Unseen Environments