YOLOAL: Focusing on the Object Location for Detection on Drone Imagery

Bibliographic Details
Published in: IEEE Access, 2023, Vol. 11, pp. 128886-128897
Main Authors: Chen, Xinting; Yang, Wenzhu; Zeng, Shuang; Geng, Lei; Jiao, Yanyan
Format: Article
Language: English
Subjects: attention mechanism; Convolutional neural networks; Datasets; Drone; Drones; Feature extraction; Focusing; Image coding; loss function; Object recognition; Real-time systems; small dense objects detection
Online Access: Full text
DOI: 10.1109/ACCESS.2023.3332815
ISSN/EISSN: 2169-3536
Publisher: IEEE, Piscataway
Description

Object detection in drone-captured scenarios, which can be considered a task of detecting dense small objects, is still a challenge. Drones navigate at different altitudes, causing significant changes in the size of the detected objects and posing a challenge to the model. Additionally, the object detection model must be able to rapidly detect small, densely packed objects. To address these issues, we propose YOLOAL, a model that emphasizes the location information of the objects. It incorporates a new attention mechanism, the Convolution and Coordinate Attention Module (CCAM), which performs better than traditional attention mechanisms in dense small-object scenes because it adds coordinate information that helps identify attention regions in such scenarios. Furthermore, the model uses a new loss function that combines the Efficient IoU (EIoU) and Alpha-IoU methods and achieves better results than traditional approaches. The proposed model achieved state-of-the-art performance on the VisDrone and DOTA datasets: YOLOAL reaches an AP50 (average precision at an Intersection over Union threshold of 0.5) of 63.6% and an mAP (average precision over 10 IoU thresholds, ranging from 0.5 to 0.95) of 40.8% at a speed of 0.27 seconds on the VisDrone dataset, and its mAP on the DOTA dataset reaches 39% on an NVIDIA A4000.
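The record gives no implementation details for CCAM beyond its name, but the "coordinate" idea it references — pooling a feature map along the height and width axes separately so the attention weights retain positional information — is well documented (coordinate attention, Hou et al., CVPR 2021). The PyTorch sketch below illustrates that general mechanism only; the class name, bottleneck reduction factor, and activation are illustrative assumptions, not the paper's verified module.

```python
# Hypothetical sketch of a coordinate-attention block (after Hou et al., 2021).
# It illustrates the general mechanism the abstract alludes to; it is NOT
# the paper's CCAM, whose exact structure is not given in this record.
import torch
import torch.nn as nn


class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)  # bottleneck width (assumed)
        # Shared 1x1 conv applied to the concatenated pooled features.
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        # Separate 1x1 convs produce the per-axis attention maps.
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # Pool along each spatial axis separately, keeping the other axis:
        # this is what preserves coordinate (location) information.
        x_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        y = torch.cat([x_h, x_w], dim=2)                        # (n, c, h+w, 1)
        y = self.act(self.bn(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (n, c, 1, w)
        return x * a_h * a_w  # broadcast: each position weighted by row and column
```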
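Likewise, the abstract says the loss combines EIoU and Alpha-IoU without giving the formula. A common combination in the literature is Alpha-EIoU: each EIoU term (IoU, center-distance penalty, and the separate width and height penalties) is raised to a power alpha. The sketch below implements that reading under the assumption alpha = 3, the default suggested in the Alpha-IoU paper; it should not be taken as YOLOAL's exact loss.

```python
# Hypothetical sketch of an Alpha-EIoU loss: the EIoU penalty terms each
# raised to a power alpha, as in the Alpha-IoU paper. The record does not
# give YOLOAL's exact formulation, so treat this as an assumed reading.
import torch


def alpha_eiou_loss(pred: torch.Tensor, target: torch.Tensor,
                    alpha: float = 3.0, eps: float = 1e-7) -> torch.Tensor:
    """pred, target: (N, 4) boxes as (x1, y1, x2, y2). Returns per-box loss."""
    # Intersection over union.
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box, used by the distance and shape penalties.
    c_lt = torch.min(pred[:, :2], target[:, :2])
    c_rb = torch.max(pred[:, 2:], target[:, 2:])
    c_wh = (c_rb - c_lt).clamp(min=eps)
    c_diag2 = c_wh[:, 0] ** 2 + c_wh[:, 1] ** 2

    # Normalized center-distance penalty.
    center_p = (pred[:, :2] + pred[:, 2:]) / 2
    center_t = (target[:, :2] + target[:, 2:]) / 2
    rho2 = ((center_p - center_t) ** 2).sum(dim=1)

    # Width/height penalties: the part EIoU adds over CIoU.
    dw2 = (pred[:, 2] - pred[:, 0] - (target[:, 2] - target[:, 0])) ** 2
    dh2 = (pred[:, 3] - pred[:, 1] - (target[:, 3] - target[:, 1])) ** 2

    return (1 - iou ** alpha
            + (rho2 / c_diag2) ** alpha
            + (dw2 / c_wh[:, 0] ** 2) ** alpha
            + (dh2 / c_wh[:, 1] ** 2) ** alpha)
```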
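For the metrics quoted above: the mAP figure follows the COCO convention of averaging AP over ten IoU thresholds. A minimal snippet makes the threshold set explicit (ap_at is a hypothetical evaluator stub, not a real API):

```python
# COCO-style mAP@[.5:.95]: average AP over ten IoU thresholds.
# ap_at(t) is a hypothetical stand-in for a full AP evaluator.
thresholds = [0.50 + 0.05 * i for i in range(10)]  # 0.50, 0.55, ..., 0.95

def mean_ap(ap_at):
    return sum(ap_at(t) for t in thresholds) / len(thresholds)
```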