Integrating YOLO and WordNet for automated image object summarization

The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the thing...

Bibliographic Details
Published in:	Signal, image and video processing, 2024-12, Vol.18 (12), p.9465-9481
Main authors:	Saqib, Sheikh Muhammad; Aftab, Aamir; Mazhar, Tehseen; Iqbal, Muhammad; Shahazad, Tariq; Almogren, Ahmad; Hamam, Habib
Format:	Article
Language:	eng
Subjects:	Computer Imaging; Computer Science; Computer vision; Image Processing and Computer Vision; Multimedia Information Systems; Natural language processing; Object recognition; Original Paper; Pattern Recognition and Graphics; Search engines; Signal, Image and Speech Processing; Vision
Online access:	Full text

container_end_page	9481
container_issue	12
container_start_page	9465
container_title	Signal, image and video processing
container_volume	18
creator	Saqib, Sheikh Muhammad; Aftab, Aamir; Mazhar, Tehseen; Iqbal, Muhammad; Shahazad, Tariq; Almogren, Ahmad; Hamam, Habib
description	The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the things are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This new technique can have a big impact on computer vision and natural language processing. It can make understanding complicated images, filled with lots of things, much simpler. To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. The precision was 85%, the recall was 72%, and the F1-score was 74%.
doi_str_mv	10.1007/s11760-024-03560-z
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3124100978</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3124100978</sourcerecordid><originalsourceid>FETCH-LOGICAL-c200t-a3e50b044dcb2bcc90ab87d05fddaf3042f4537283f8f9ee534b7616be6690203</originalsourceid><addsrcrecordid>eNp9kD9PwzAQxS0EElXpF2CyxBw4_0nsjKgqUKmiCwgxWXZiR61IXGxnoJ8eQxBs3HJveO_d6YfQJYFrAiBuIiGiggIoL4CVWR1P0IzIihVEEHL6q4Gdo0WMe8jDqJCVnKHVeki2Czrthg6_bjdbrIcWv_jQPtqEnQ9Yj8n3OtkW73rdWezN3jYJx7Hvddgdc9IPF-jM6bdoFz97jp7vVk_Lh2KzvV8vbzdFQwFSoZktwQDnbWOoaZoatJGihdK1rXYMOHW8ZIJK5qSrrS0ZN6IilbFVVQMFNkdXU-8h-PfRxqT2fgxDPqkYoTzTqIXMLjq5muBjDNapQ8i_hw9FQH0RUxMxlYmpb2LqmENsCsVsHjob_qr_SX0CkfZuIA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3124100978</pqid></control><display><type>article</type><title>Integrating YOLO and WordNet for automated image object summarization</title><source>Springer Nature - Complete Springer Journals</source><creator>Saqib, Sheikh Muhammad ; Aftab, Aamir ; Mazhar, Tehseen ; Iqbal, Muhammad ; Shahazad, Tariq ; Almogren, Ahmad ; Hamam, Habib</creator><creatorcontrib>Saqib, Sheikh Muhammad ; Aftab, Aamir ; Mazhar, Tehseen ; Iqbal, Muhammad ; Shahazad, Tariq ; Almogren, Ahmad ; Hamam, Habib</creatorcontrib><description>The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the things are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This new technique can have a big impact on computer vision and natural language processing. It can make understanding complicated images, filled with lots of things, much simpler. 
To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. The precision was 85%, the recall was 72%, and the F1-score was 74%.</description><identifier>ISSN: 1863-1703</identifier><identifier>EISSN: 1863-1711</identifier><identifier>DOI: 10.1007/s11760-024-03560-z</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Computer Imaging ; Computer Science ; Computer vision ; Image Processing and Computer Vision ; Multimedia Information Systems ; Natural language processing ; Object recognition ; Original Paper ; Pattern Recognition and Graphics ; Search engines ; Signal,Image and Speech Processing ; Vision</subject><ispartof>Signal, image and video processing, 2024-12, Vol.18 (12), p.9465-9481</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c200t-a3e50b044dcb2bcc90ab87d05fddaf3042f4537283f8f9ee534b7616be6690203</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11760-024-03560-z$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11760-024-03560-z$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Saqib, Sheikh 
Muhammad</creatorcontrib><creatorcontrib>Aftab, Aamir</creatorcontrib><creatorcontrib>Mazhar, Tehseen</creatorcontrib><creatorcontrib>Iqbal, Muhammad</creatorcontrib><creatorcontrib>Shahazad, Tariq</creatorcontrib><creatorcontrib>Almogren, Ahmad</creatorcontrib><creatorcontrib>Hamam, Habib</creatorcontrib><title>Integrating YOLO and WordNet for automated image object summarization</title><title>Signal, image and video processing</title><addtitle>SIViP</addtitle><description>The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the things are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This new technique can have a big impact on computer vision and natural language processing. It can make understanding complicated images, filled with lots of things, much simpler. To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. 
The precision was 85%, the recall was 72%, and the F1-score was 74%.</description><subject>Computer Imaging</subject><subject>Computer Science</subject><subject>Computer vision</subject><subject>Image Processing and Computer Vision</subject><subject>Multimedia Information Systems</subject><subject>Natural language processing</subject><subject>Object recognition</subject><subject>Original Paper</subject><subject>Pattern Recognition and Graphics</subject><subject>Search engines</subject><subject>Signal,Image and Speech Processing</subject><subject>Vision</subject><issn>1863-1703</issn><issn>1863-1711</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kD9PwzAQxS0EElXpF2CyxBw4_0nsjKgqUKmiCwgxWXZiR61IXGxnoJ8eQxBs3HJveO_d6YfQJYFrAiBuIiGiggIoL4CVWR1P0IzIihVEEHL6q4Gdo0WMe8jDqJCVnKHVeki2Czrthg6_bjdbrIcWv_jQPtqEnQ9Yj8n3OtkW73rdWezN3jYJx7Hvddgdc9IPF-jM6bdoFz97jp7vVk_Lh2KzvV8vbzdFQwFSoZktwQDnbWOoaZoatJGihdK1rXYMOHW8ZIJK5qSrrS0ZN6IilbFVVQMFNkdXU-8h-PfRxqT2fgxDPqkYoTzTqIXMLjq5muBjDNapQ8i_hw9FQH0RUxMxlYmpb2LqmENsCsVsHjob_qr_SX0CkfZuIA</recordid><startdate>20241201</startdate><enddate>20241201</enddate><creator>Saqib, Sheikh Muhammad</creator><creator>Aftab, Aamir</creator><creator>Mazhar, Tehseen</creator><creator>Iqbal, Muhammad</creator><creator>Shahazad, Tariq</creator><creator>Almogren, Ahmad</creator><creator>Hamam, Habib</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20241201</creationdate><title>Integrating YOLO and WordNet for automated image object summarization</title><author>Saqib, Sheikh Muhammad ; Aftab, Aamir ; Mazhar, Tehseen ; Iqbal, Muhammad ; Shahazad, Tariq ; Almogren, Ahmad ; Hamam, 
Habib</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c200t-a3e50b044dcb2bcc90ab87d05fddaf3042f4537283f8f9ee534b7616be6690203</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Imaging</topic><topic>Computer Science</topic><topic>Computer vision</topic><topic>Image Processing and Computer Vision</topic><topic>Multimedia Information Systems</topic><topic>Natural language processing</topic><topic>Object recognition</topic><topic>Original Paper</topic><topic>Pattern Recognition and Graphics</topic><topic>Search engines</topic><topic>Signal,Image and Speech Processing</topic><topic>Vision</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Saqib, Sheikh Muhammad</creatorcontrib><creatorcontrib>Aftab, Aamir</creatorcontrib><creatorcontrib>Mazhar, Tehseen</creatorcontrib><creatorcontrib>Iqbal, Muhammad</creatorcontrib><creatorcontrib>Shahazad, Tariq</creatorcontrib><creatorcontrib>Almogren, Ahmad</creatorcontrib><creatorcontrib>Hamam, Habib</creatorcontrib><collection>CrossRef</collection><jtitle>Signal, image and video processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Saqib, Sheikh Muhammad</au><au>Aftab, Aamir</au><au>Mazhar, Tehseen</au><au>Iqbal, Muhammad</au><au>Shahazad, Tariq</au><au>Almogren, Ahmad</au><au>Hamam, Habib</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Integrating YOLO and WordNet for automated image object summarization</atitle><jtitle>Signal, image and video processing</jtitle><stitle>SIViP</stitle><date>2024-12-01</date><risdate>2024</risdate><volume>18</volume><issue>12</issue><spage>9465</spage><epage>9481</epage><pages>9465-9481</pages><issn>1863-1703</issn><eissn>1863-1711</eissn><abstract>The demand for methods that automatically create text summaries from 
images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the things are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This new technique can have a big impact on computer vision and natural language processing. It can make understanding complicated images, filled with lots of things, much simpler. To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. The precision was 85%, the recall was 72%, and the F1-score was 74%.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s11760-024-03560-z</doi><tpages>17</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1863-1703
ispartof	Signal, image and video processing, 2024-12, Vol.18 (12), p.9465-9481
issn	1863-1703 1863-1711
language	eng
recordid	cdi_proquest_journals_3124100978
source	Springer Nature - Complete Springer Journals
subjects	Computer Imaging; Computer Science; Computer vision; Image Processing and Computer Vision; Multimedia Information Systems; Natural language processing; Object recognition; Original Paper; Pattern Recognition and Graphics; Search engines; Signal, Image and Speech Processing; Vision
title	Integrating YOLO and WordNet for automated image object summarization
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T20%3A28%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrating%20YOLO%20and%20WordNet%20for%20automated%20image%20object%20summarization&rft.jtitle=Signal,%20image%20and%20video%20processing&rft.au=Saqib,%20Sheikh%20Muhammad&rft.date=2024-12-01&rft.volume=18&rft.issue=12&rft.spage=9465&rft.epage=9481&rft.pages=9465-9481&rft.issn=1863-1703&rft.eissn=1863-1711&rft_id=info:doi/10.1007/s11760-024-03560-z&rft_dat=%3Cproquest_cross%3E3124100978%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3124100978&rft_id=info:pmid/&rfr_iscdi=true