Integrating YOLO and WordNet for automated image object summarization
The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the thing...
Published in: | Signal, image and video processing, 2024-12, Vol.18 (12), p.9465-9481 |
---|---|
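Main authors: | Saqib, Sheikh Muhammad; Aftab, Aamir; Mazhar, Tehseen; Iqbal, Muhammad; Shahazad, Tariq; Almogren, Ahmad; Hamam, Habib |
Format: | Article |
Language: | eng |
Subjects: | Computer Imaging; Computer Science; Computer vision; Image Processing and Computer Vision; Multimedia Information Systems; Natural language processing; Object recognition; Original Paper; Pattern Recognition and Graphics; Search engines; Signal,Image and Speech Processing; Vision |
Online access: | Full text |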
container_end_page | 9481 |
---|---|
container_issue | 12 |
container_start_page | 9465 |
container_title | Signal, image and video processing |
container_volume | 18 |
creator | Saqib, Sheikh Muhammad; Aftab, Aamir; Mazhar, Tehseen; Iqbal, Muhammad; Shahazad, Tariq; Almogren, Ahmad; Hamam, Habib |
description | The demand for methods that automatically create text summaries from images containing many things has recently grown. Our research introduces a fresh and creative way to achieve this. We bring together the WordNet dictionary and the YOLO model to make this happen. YOLO helps us find where the things are in the images, while WordNet provides their meanings. Our process then crafts a summary for each object found. This new technique can have a big impact on computer vision and natural language processing. It can make understanding complicated images, filled with lots of things, much simpler. To test our approach, we used 1381 pictures from the Google Image search engine. Our results showed high accuracy, with 72% for object detection. The precision was 85%, the recall was 72%, and the F1-score was 74%. |
doi_str_mv | 10.1007/s11760-024-03560-z |
format | Article |
publisher | London: Springer London |
addtitle | SIViP |
eissn | 1863-1711 |
tpages | 17 |
toplevel | peer_reviewed; online_resources |
rights | The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. |
linktohtml | https://link.springer.com/10.1007/s11760-024-03560-z |
linktopdf | https://link.springer.com/content/pdf/10.1007/s11760-024-03560-z |
fulltext | fulltext |
identifier | ISSN: 1863-1703 |
ispartof | Signal, image and video processing, 2024-12, Vol.18 (12), p.9465-9481 |
issn | 1863-1703 1863-1711 |
language | eng |
recordid | cdi_proquest_journals_3124100978 |
source | Springer Nature - Complete Springer Journals |
subjects | Computer Imaging; Computer Science; Computer vision; Image Processing and Computer Vision; Multimedia Information Systems; Natural language processing; Object recognition; Original Paper; Pattern Recognition and Graphics; Search engines; Signal,Image and Speech Processing; Vision |
title | Integrating YOLO and WordNet for automated image object summarization |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T20%3A28%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrating%20YOLO%20and%20WordNet%20for%20automated%20image%20object%20summarization&rft.jtitle=Signal,%20image%20and%20video%20processing&rft.au=Saqib,%20Sheikh%20Muhammad&rft.date=2024-12-01&rft.volume=18&rft.issue=12&rft.spage=9465&rft.epage=9481&rft.pages=9465-9481&rft.issn=1863-1703&rft.eissn=1863-1711&rft_id=info:doi/10.1007/s11760-024-03560-z&rft_dat=%3Cproquest_cross%3E3124100978%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3124100978&rft_id=info:pmid/&rfr_iscdi=true |
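
The abstract describes a two-stage pipeline: a detector locates objects, a lexical resource supplies their meanings, and a summary line is produced for each object. The sketch below illustrates only that general shape and is not the authors' implementation. The `Detection` class, the `summarize` function, the `min_confidence` threshold, and the `GLOSSES` dictionary are all hypothetical names introduced here. The dictionary stands in for WordNet with made-up definitions, and the detections are hard-coded where a YOLO model would normally supply them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure, standing in for detector output)."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


# Hypothetical stand-in for WordNet: illustrative definitions written for this
# example, not actual WordNet glosses. With NLTK installed and its WordNet
# corpus downloaded, a real gloss could instead be fetched with
# nltk.corpus.wordnet.synsets(label)[0].definition().
GLOSSES: Dict[str, str] = {
    "dog": "a domesticated animal often kept as a pet",
    "bicycle": "a two-wheeled vehicle moved by pedals",
}


def summarize(detections: List[Detection],
              glosses: Dict[str, str],
              min_confidence: float = 0.5) -> List[str]:
    """Return one summary line per sufficiently confident detection."""
    lines = []
    for det in detections:
        # Skip detections below the (assumed) confidence threshold.
        if det.confidence < min_confidence:
            continue
        gloss = glosses.get(det.label)
        if gloss is None:
            lines.append(f"{det.label}: no definition found")
        else:
            lines.append(f"{det.label}: {gloss}")
    return lines
```

A call such as `summarize([Detection("dog", 0.9, (0, 0, 10, 10))], GLOSSES)` yields a single line pairing the label with its definition. Keeping the lexical lookup behind a plain dictionary lets the summary step be exercised without a detector or a corpus download.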