3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training

3D Cross-Modal Retrieval (3DCMR) is the task of retrieving 3D objects across modalities such as images, meshes, and point clouds. One of the most prominent methods for 3DCMR is the Cross-Modal Center Loss Function (CLF), which applies the conventional center loss strategy to 3D cross-mod...

Bibliographic details
Published in:	KSII transactions on Internet and information systems, 2024-03, Vol.18 (3), p.670-684
Main authors:	Yeon-seung Choo, Boeun Kim, Hyun-sik Kim, Yong-suk Park
Format:	Article
Language:	kor
Subjects:	Center Loss; Cross-Modal; Object Retrieval; Representation Learning; Self-Supervised Learning; Supervised Learning
Online access:	Full text

container_end_page	684
container_issue	3
container_start_page	670
container_title	KSII transactions on Internet and information systems
container_volume	18
creator	Yeon-seung Choo ; Boeun Kim ; Hyun-sik Kim ; Yong-suk Park
description	3D Cross-Modal Retrieval (3DCMR) is the task of retrieving 3D objects across modalities such as images, meshes, and point clouds. One of the most prominent methods for 3DCMR is the Cross-Modal Center Loss Function (CLF), which applies the conventional center loss strategy to 3D cross-modal search and retrieval. Because CLF is based on center loss, its center features are susceptible to subtle changes in hyperparameters and to external interference. For instance, performance degrades when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF cannot adapt to changes in batch size and is vulnerable to data variations that occur during actual inference, because it relies on a simple Euclidean distance between multi-modal features. To address the problems that arise from small-batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach achieves better performance on the ModelNet40 dataset than conventional methods.
format	Article
fullrecord	<record><control><sourceid>kiss_kisti</sourceid><recordid>TN_cdi_kisti_ndsl_JAKO202412857665803</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><kiss_id>4084388</kiss_id><sourcerecordid>4084388</sourcerecordid><originalsourceid>FETCH-LOGICAL-k503-39599f91da2e6503b9adade3a4da703d525ae19e5f8d3394f9fe62137e7277943</originalsourceid><addsrcrecordid>eNpNj0tLxDAcxIMouKz7Cbzk4rGQ5p3jWt9WF2z3XP5rEg3bhyRF2G9vRBFPMwy_GZgjtCiNkoWiSh3_86dolVLYkZJqKrnWC7RlV7iKU0rF02Shxy9ujsF9ZrdNYXzDz1NIB1y5cXYR15nDMFrchKEJMGA_RdwM0Pf4EubXd9xGCGOunaETD31yq19dovbmuq3uinpze1-t62IvCCuYEcZ4U1qgTuZgZ8CCdQy4BUWYFVSAK40TXlvGDPfGO0lLptz3F8PZEl38zO5DmkM32tR3D-vHDSWU54tCSSk0YZk7_-NS9xHDAPHQcaI505p9AWpBVLo</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Yeon-seung Choo ; Boeun Kim ; Hyun-sik Kim ; Yong-suk Park</creator><creatorcontrib>Yeon-seung Choo ; Boeun Kim ; Hyun-sik Kim ; Yong-suk Park</creatorcontrib><description>3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D objects regardless of modalities, such as images, meshes, and point clouds. One of the most prominent methods used for 3DCMR is the Cross-Modal Center Loss Function (CLF) which applies the conventional center loss strategy for 3D cross-modal search and retrieval. Since CLF is based on center loss, the center features in CLF are also susceptible to subtle changes in hyperparameters and external inferences. For instance, performance degradation is observed when the batch size is too small. 
Furthermore, the Mean Squared Error (MSE) used in CLF is unable to adapt to changes in batch size and is vulnerable to data variations that occur during actual inference due to the use of simple Euclidean distance between multi-modal features. To address the problems that arise from small batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach demonstrates improved performance in ModelNet40 dataset compared to the conventional methods.</description><identifier>ISSN: 1976-7277</identifier><identifier>EISSN: 1976-7277</identifier><language>kor</language><publisher>한국인터넷정보학회</publisher><subject>Center Loss ; Cross-Modal ; Object Retrieval ; Representation Learning ; Self-Supervised Learning ; Supervised Learning</subject><ispartof>KSII transactions on Internet and information systems, 2024-03, Vol.18 (3), p.670-684</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885</link.rule.ids></links><search><creatorcontrib>Yeon-seung Choo</creatorcontrib><creatorcontrib>Boeun Kim</creatorcontrib><creatorcontrib>Hyun-sik Kim</creatorcontrib><creatorcontrib>Yong-suk Park</creatorcontrib><title>3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training</title><title>KSII transactions on Internet and information systems</title><addtitle>KSII Transactions on Internet and Information Systems (TIIS)</addtitle><description>3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D 
objects regardless of modalities, such as images, meshes, and point clouds. One of the most prominent methods used for 3DCMR is the Cross-Modal Center Loss Function (CLF) which applies the conventional center loss strategy for 3D cross-modal search and retrieval. Since CLF is based on center loss, the center features in CLF are also susceptible to subtle changes in hyperparameters and external inferences. For instance, performance degradation is observed when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF is unable to adapt to changes in batch size and is vulnerable to data variations that occur during actual inference due to the use of simple Euclidean distance between multi-modal features. To address the problems that arise from small batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. 
As a result, the proposed approach demonstrates improved performance in ModelNet40 dataset compared to the conventional methods.</description><subject>Center Loss</subject><subject>Cross-Modal</subject><subject>Object Retrieval</subject><subject>Representation Learning</subject><subject>Self-Supervised Learning</subject><subject>Supervised Learning</subject><issn>1976-7277</issn><issn>1976-7277</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>JDI</sourceid><recordid>eNpNj0tLxDAcxIMouKz7Cbzk4rGQ5p3jWt9WF2z3XP5rEg3bhyRF2G9vRBFPMwy_GZgjtCiNkoWiSh3_86dolVLYkZJqKrnWC7RlV7iKU0rF02Shxy9ujsF9ZrdNYXzDz1NIB1y5cXYR15nDMFrchKEJMGA_RdwM0Pf4EubXd9xGCGOunaETD31yq19dovbmuq3uinpze1-t62IvCCuYEcZ4U1qgTuZgZ8CCdQy4BUWYFVSAK40TXlvGDPfGO0lLptz3F8PZEl38zO5DmkM32tR3D-vHDSWU54tCSSk0YZk7_-NS9xHDAPHQcaI505p9AWpBVLo</recordid><startdate>20240330</startdate><enddate>20240330</enddate><creator>Yeon-seung Choo</creator><creator>Boeun Kim</creator><creator>Hyun-sik Kim</creator><creator>Yong-suk Park</creator><general>한국인터넷정보학회</general><scope>HZB</scope><scope>Q5X</scope><scope>JDI</scope></search><sort><creationdate>20240330</creationdate><title>3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training</title><author>Yeon-seung Choo ; Boeun Kim ; Hyun-sik Kim ; Yong-suk Park</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-k503-39599f91da2e6503b9adade3a4da703d525ae19e5f8d3394f9fe62137e7277943</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>kor</language><creationdate>2024</creationdate><topic>Center Loss</topic><topic>Cross-Modal</topic><topic>Object Retrieval</topic><topic>Representation Learning</topic><topic>Self-Supervised Learning</topic><topic>Supervised Learning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yeon-seung Choo</creatorcontrib><creatorcontrib>Boeun 
Kim</creatorcontrib><creatorcontrib>Hyun-sik Kim</creatorcontrib><creatorcontrib>Yong-suk Park</creatorcontrib><collection>Korean Studies Information Service System (KISS)</collection><collection>Korean Studies Information Service System (KISS) B-Type</collection><collection>KoreaScience</collection><jtitle>KSII transactions on Internet and information systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yeon-seung Choo</au><au>Boeun Kim</au><au>Hyun-sik Kim</au><au>Yong-suk Park</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training</atitle><jtitle>KSII transactions on Internet and information systems</jtitle><addtitle>KSII Transactions on Internet and Information Systems (TIIS)</addtitle><date>2024-03-30</date><risdate>2024</risdate><volume>18</volume><issue>3</issue><spage>670</spage><epage>684</epage><pages>670-684</pages><issn>1976-7277</issn><eissn>1976-7277</eissn><abstract>3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D objects regardless of modalities, such as images, meshes, and point clouds. One of the most prominent methods used for 3DCMR is the Cross-Modal Center Loss Function (CLF) which applies the conventional center loss strategy for 3D cross-modal search and retrieval. Since CLF is based on center loss, the center features in CLF are also susceptible to subtle changes in hyperparameters and external inferences. For instance, performance degradation is observed when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF is unable to adapt to changes in batch size and is vulnerable to data variations that occur during actual inference due to the use of simple Euclidean distance between multi-modal features. 
To address the problems that arise from small batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach demonstrates improved performance in ModelNet40 dataset compared to the conventional methods.</abstract><pub>한국인터넷정보학회</pub><tpages>15</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1976-7277
ispartof	KSII transactions on Internet and information systems, 2024-03, Vol.18 (3), p.670-684
issn	1976-7277 1976-7277
language	kor
recordid	cdi_kisti_ndsl_JAKO202412857665803
source	EZB-FREE-00999 freely available EZB journals
subjects	Center Loss ; Cross-Modal ; Object Retrieval ; Representation Learning ; Self-Supervised Learning ; Supervised Learning
title	3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T05%3A27%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-kiss_kisti&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=3D%20Cross-Modal%20Retrieval%20Using%20Noisy%20Center%20Loss%20and%20SimSiam%20for%20Small%20Batch%20Training&rft.jtitle=KSII%20transactions%20on%20Internet%20and%20information%20systems&rft.au=Yeon-seung%20Choo&rft.date=2024-03-30&rft.volume=18&rft.issue=3&rft.spage=670&rft.epage=684&rft.pages=670-684&rft.issn=1976-7277&rft.eissn=1976-7277&rft_id=info:doi/&rft_dat=%3Ckiss_kisti%3E4084388%3C/kiss_kisti%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_kiss_id=4084388&rfr_iscdi=true