Annotate and retrieve in vivo images using hybrid self-organizing map
Multimodal retrieval has gained much attention lately due to its effectiveness over uni-modal retrieval. For instance, visual features often under-constrain the description of an image in content-based retrieval; however, another modality, such as collateral text, can be introduced to abridge the semantic gap and make the retrieval process more efficient.
Published in: | The Visual computer 2024-08, Vol.40 (8), p.5619-5638 |
---|---|
Main authors: | Kaur, Parminder ; Malhi, Avleen ; Pannu, Husanbir |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 5638 |
---|---|
container_issue | 8 |
container_start_page | 5619 |
container_title | The Visual computer |
container_volume | 40 |
creator | Kaur, Parminder ; Malhi, Avleen ; Pannu, Husanbir |
description | Multimodal retrieval has gained much attention lately due to its effectiveness over uni-modal retrieval. For instance, visual features often under-constrain the description of an image in content-based retrieval; however, another modality, such as collateral text, can be introduced to abridge the semantic gap and make the retrieval process more efficient. This article proposes the application of cross-modal fusion and retrieval on real in vivo gastrointestinal images and linguistic cues, as the visual features alone are insufficient for image description and to assist gastroenterologists. So, a cross-modal information retrieval approach has been proposed to retrieve related images given text and vice versa while handling the heterogeneity gap issue among the modalities. The technique comprises two stages: (1) individual modality feature learning; and (2) fusion of two trained networks. In the first stage, two self-organizing maps (SOMs) are trained separately using images and texts, which are clustered in the respective SOMs based on their similarity. In the second (fusion) stage, the trained SOMs are integrated using an associative network to enable cross-modal retrieval. The underlying learning techniques of the associative network include Hebbian learning and Oja learning (Improved Hebbian learning). The introduced framework can annotate images with keywords and illustrate keywords with images, and it can also be extended to incorporate more diverse modalities. Extensive experimentation has been performed on real gastrointestinal images obtained from a known gastroenterologist that have collateral keywords with each image. The obtained results proved the efficacy of the algorithm and its significance in aiding gastroenterologists in quick and pertinent decision making. |
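The pipeline described in the abstract (per-modality SOM training followed by an associative fusion layer trained with Hebbian/Oja rules) can be illustrated with a short sketch. The code below is not the authors' implementation: the `SOM` class, grid sizes, feature dimensions, learning rates, and the hetero-associative form of the Oja update are assumptions made for the example, and random vectors stand in for real gastrointestinal image features and keyword embeddings.

```python
# Minimal sketch of the two-stage idea from the abstract; NOT the authors' code.
# Grid sizes, dimensions, learning rates and the Oja update form are assumptions.
import numpy as np

rng = np.random.default_rng(0)

class SOM:
    """Tiny rectangular self-organizing map trained with a Gaussian neighborhood."""

    def __init__(self, rows, cols, dim, lr=0.5, sigma=1.0):
        self.w = rng.normal(size=(rows, cols, dim))      # codebook vectors
        self.lr, self.sigma = lr, sigma
        # grid coordinates of every unit, used by the neighborhood function
        self.coords = np.stack(
            np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

    def winner(self, x):
        d = np.linalg.norm(self.w - x, axis=-1)          # distance to every unit
        return np.unravel_index(np.argmin(d), d.shape)   # best-matching unit (BMU)

    def train(self, data, epochs=20):
        for _ in range(epochs):
            for x in rng.permutation(data):
                bmu = np.array(self.winner(x))
                # Gaussian neighborhood centred on the BMU
                g = np.exp(-np.sum((self.coords - bmu) ** 2, axis=-1)
                           / (2 * self.sigma ** 2))
                self.w += self.lr * g[..., None] * (x - self.w)

    def activation(self, x):
        """Soft activation over all units; feeds the associative layer."""
        a = np.exp(-np.linalg.norm(self.w - x, axis=-1).ravel())
        return a / a.sum()

def hebbian_update(W, a_img, a_txt, eta=0.1):
    # Plain Hebbian learning: strengthen links between co-active units.
    return W + eta * np.outer(a_img, a_txt)

def oja_update(W, a_img, a_txt, eta=0.1):
    # Oja learning ("improved Hebbian"): Hebbian term plus a forgetting term that
    # keeps the weights bounded.  Treating the text activation as the
    # post-synaptic response is an assumption made for this sketch.
    return W + eta * (np.outer(a_img, a_txt) - (a_txt ** 2)[None, :] * W)

# ---- Stage 1: train one SOM per modality -----------------------------------
img_feats = rng.normal(size=(200, 64))   # placeholder image feature vectors
txt_feats = rng.normal(size=(200, 32))   # placeholder keyword/text vectors
som_img, som_txt = SOM(6, 6, 64), SOM(6, 6, 32)
som_img.train(img_feats)
som_txt.train(txt_feats)

# ---- Stage 2: associative network linking the two trained maps -------------
W = np.zeros((36, 36))                   # image units x text units
for xi, xt in zip(img_feats, txt_feats): # paired image/keyword samples
    W = oja_update(W, som_img.activation(xi), som_txt.activation(xt))

# Cross-modal retrieval: an image activates its map, the associative weights
# rank text units (use W.T to go from keywords back to images).
scores = som_img.activation(img_feats[0]) @ W
print("best-matching text unit:", np.argmax(scores))
```

In the paper's setting, the analogue of `scores` would be used to annotate an image with the keywords clustered at the highest-scoring text units, while the transposed weights support the reverse, keyword-to-image, direction.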
doi_str_mv | 10.1007/s00371-023-03126-z |
format | Article |
fullrecord | (raw export record omitted) |
publisher | Berlin/Heidelberg: Springer Berlin Heidelberg |
rights | The Author(s) 2023. Published under the Creative Commons Attribution 4.0 license (http://creativecommons.org/licenses/by/4.0/). |
eissn | 1432-2315 |
orcid | https://orcid.org/0000-0002-1306-3137 |
fulltext | fulltext |
identifier | ISSN: 0178-2789 |
ispartof | The Visual computer, 2024-08, Vol.40 (8), p.5619-5638 |
issn | 0178-2789 (print) ; 1432-2315 (electronic) |
language | eng |
recordid | cdi_proquest_journals_3084113825 |
source | SpringerLink Journals |
subjects | Algorithms ; Annotations ; Artificial Intelligence ; Brain research ; Comparative analysis ; Computer Graphics ; Computer Science ; Deep learning ; Effectiveness ; Endoscopy ; Heterogeneity ; Image Processing and Computer Vision ; Image retrieval ; In vivo methods and tests ; Information retrieval ; Keywords ; Linguistics ; Neural networks ; Neurosciences ; Original Article ; R&D ; Research & development ; Self organizing maps ; Semantics ; Support vector machines |
title | Annotate and retrieve in vivo images using hybrid self-organizing map |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T08%3A07%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Annotate%20and%20retrieve%20in%20vivo%20images%20using%20hybrid%20self-organizing%20map&rft.jtitle=The%20Visual%20computer&rft.au=Kaur,%20Parminder&rft.date=2024-08-01&rft.volume=40&rft.issue=8&rft.spage=5619&rft.epage=5638&rft.pages=5619-5638&rft.issn=0178-2789&rft.eissn=1432-2315&rft_id=info:doi/10.1007/s00371-023-03126-z&rft_dat=%3Cproquest_cross%3E3084113825%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3084113825&rft_id=info:pmid/&rfr_iscdi=true |