DM-YOLOX aerial object detection method with intensive attention mechanism

In aerial image detection, background interference, occlusion, and the presence of many small objects make feature extraction difficult and lower detection accuracy. This paper proposes DM-YOLOX, an aerial object detection method with an intensive attention mechanism. First, the...

Bibliographic details
Published in:	The Journal of supercomputing 2024-06, Vol.80 (9), p.12790-12812
Main authors:	Li, Xiangyu; Wang, Fengping; Wang, Wei; Han, Yanjiang; Zhang, Jianyang
Format:	Article
Language:	English
Subjects:	Compilers; Computer Science; Feature extraction; Frames per second; Image detection; Image enhancement; Interpreters; Object recognition; Occlusion; Processor Architectures; Programming Languages; Target detection
Online access:	Full text

container_end_page	12812
container_issue	9
container_start_page	12790
container_title	The Journal of supercomputing
container_volume	80
creator	Li, Xiangyu; Wang, Fengping; Wang, Wei; Han, Yanjiang; Zhang, Jianyang
description	In aerial image detection, background interference, occlusion, and the presence of many small objects make feature extraction difficult and lower detection accuracy. This paper proposes DM-YOLOX, an aerial object detection method with an intensive attention mechanism. First, the approach incorporates coordinate attention (CA) and a dense connection method into the backbone network, enabling adaptive channel weighting throughout feature extraction. This enhances significant features while suppressing less relevant ones, which strengthens the network’s capacity to represent object features and ensures that key features are retained and reinforced. Second, a multibranch extraction (MBE) module is incorporated into the feature fusion network to improve the extraction of multi-scale feature information from images with extensive coverage, thereby improving the detection accuracy and efficiency for small- and medium-sized objects in complex scenes. Finally, using SIoU instead of IoU as the bounding box loss function addresses the mismatch between ground-truth and predicted boxes, leading to faster network convergence and improved performance during model training. After training and testing on the VisDrone 2019 dataset, the method effectively detects small objects in complex environments. The DM-YOLOX model shows a significant improvement of 2.7% in mAP compared to the baseline network, while achieving an 8% increase in frames per second (FPS).
doi_str_mv	10.1007/s11227-024-05944-x
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3064674539</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3064674539</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-99aa89e6e5a485c2f12895c7b2bd49d73a7ca2463be2152cdf57dea410621dfe3</originalsourceid><addsrcrecordid>eNp9kEtPwzAQhC0EEqXwBzhF4mxYv-L4iMpbRb2ABCfLcTY0VZsU24Xy7wmkEjdOM9LOzEofIacMzhmAvoiMca4pcElBGSnpdo-MmNKCgizkPhmB4UALJfkhOYpxAQBSaDEiD1eP9HU2nb1kDkPjlllXLtCnrMLUS9O12QrTvKuyzybNs6ZN2MbmAzOXere7-7lrm7g6Jge1W0Y82emYPN9cP03u6HR2ez-5nFLPNSRqjHOFwRyVk4XyvGa8MMrrkpeVNJUWTnvHZS5K5ExxX9VKV-gkg5yzqkYxJmfD7jp07xuMyS66TWj7l1ZALnMtlTB9ig8pH7oYA9Z2HZqVC1-Wgf1hZgdmtmdmf5nZbV8SQyn24fYNw9_0P61vPWBvsw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3064674539</pqid></control><display><type>article</type><title>DM-YOLOX aerial object detection method with intensive attention mechanism</title><source>Springer Nature - Complete Springer Journals</source><creator>Li, Xiangyu ; Wang, Fengping ; Wang, Wei ; Han, Yanjiang ; Zhang, Jianyang</creator><creatorcontrib>Li, Xiangyu ; Wang, Fengping ; Wang, Wei ; Han, Yanjiang ; Zhang, Jianyang</creatorcontrib><description>In aerial image detection, difficulties in feature extraction and low detection accuracy arise due to background interference, occlusion, and the presence of multiple small objects. This paper proposes a DM-YOLOX aerial object target detection method with intensive attention mechanism. Firstly, the proposed approach incorporates coordinate attention (CA) and a dense connection method into the backbone network architecture, enabling adaptive channel weighting throughout the feature extraction process. 
This facilitates the enhancement of significant features while suppressing less relevant ones, thereby augmenting the network’s capacity to represent object features and ensuring retention and reinforcement of key features. Secondly, the multibranch extraction module (MBE) is incorporated into the feature fusion network to enhance the network’s ability in extracting multi-scale feature information from images with extensive coverage, thereby enhancing the detection accuracy and efficiency of small- and medium-sized objects in complex scenes. Finally, the utilization of SIoU instead of IoU as the bounding box loss function effectively addresses the issue of mismatch between real and predicted boxes, leading to accelerated network convergence and improved performance during model training. After training and testing on the VisDrone 2019 dataset, this method effectively detects small objects in complex environments. The DM-YOLOX model shows a significant improvement of 2.7% in mAP compared to the baseline network, while achieving an 8% increase in frames per second (FPS).</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-024-05944-x</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Compilers ; Computer Science ; Feature extraction ; Frames per second ; Image detection ; Image enhancement ; Interpreters ; Object recognition ; Occlusion ; Processor Architectures ; Programming Languages ; Target detection</subject><ispartof>The Journal of supercomputing, 2024-06, Vol.80 (9), p.12790-12812</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-99aa89e6e5a485c2f12895c7b2bd49d73a7ca2463be2152cdf57dea410621dfe3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-024-05944-x$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-024-05944-x$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Li, Xiangyu</creatorcontrib><creatorcontrib>Wang, Fengping</creatorcontrib><creatorcontrib>Wang, Wei</creatorcontrib><creatorcontrib>Han, Yanjiang</creatorcontrib><creatorcontrib>Zhang, Jianyang</creatorcontrib><title>DM-YOLOX aerial object detection method with intensive attention mechanism</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In aerial image detection, difficulties in feature extraction and low detection accuracy arise due to background interference, occlusion, and the presence of multiple small objects. This paper proposes a DM-YOLOX aerial object target detection method with intensive attention mechanism. Firstly, the proposed approach incorporates coordinate attention (CA) and a dense connection method into the backbone network architecture, enabling adaptive channel weighting throughout the feature extraction process. 
This facilitates the enhancement of significant features while suppressing less relevant ones, thereby augmenting the network’s capacity to represent object features and ensuring retention and reinforcement of key features. Secondly, the multibranch extraction module (MBE) is incorporated into the feature fusion network to enhance the network’s ability in extracting multi-scale feature information from images with extensive coverage, thereby enhancing the detection accuracy and efficiency of small- and medium-sized objects in complex scenes. Finally, the utilization of SIoU instead of IoU as the bounding box loss function effectively addresses the issue of mismatch between real and predicted boxes, leading to accelerated network convergence and improved performance during model training. After training and testing on the VisDrone 2019 dataset, this method effectively detects small objects in complex environments. The DM-YOLOX model shows a significant improvement of 2.7% in mAP compared to the baseline network, while achieving an 8% increase in frames per second (FPS).</description><subject>Compilers</subject><subject>Computer Science</subject><subject>Feature extraction</subject><subject>Frames per second</subject><subject>Image detection</subject><subject>Image enhancement</subject><subject>Interpreters</subject><subject>Object recognition</subject><subject>Occlusion</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Target 
detection</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kEtPwzAQhC0EEqXwBzhF4mxYv-L4iMpbRb2ABCfLcTY0VZsU24Xy7wmkEjdOM9LOzEofIacMzhmAvoiMca4pcElBGSnpdo-MmNKCgizkPhmB4UALJfkhOYpxAQBSaDEiD1eP9HU2nb1kDkPjlllXLtCnrMLUS9O12QrTvKuyzybNs6ZN2MbmAzOXere7-7lrm7g6Jge1W0Y82emYPN9cP03u6HR2ez-5nFLPNSRqjHOFwRyVk4XyvGa8MMrrkpeVNJUWTnvHZS5K5ExxX9VKV-gkg5yzqkYxJmfD7jp07xuMyS66TWj7l1ZALnMtlTB9ig8pH7oYA9Z2HZqVC1-Wgf1hZgdmtmdmf5nZbV8SQyn24fYNw9_0P61vPWBvsw</recordid><startdate>20240601</startdate><enddate>20240601</enddate><creator>Li, Xiangyu</creator><creator>Wang, Fengping</creator><creator>Wang, Wei</creator><creator>Han, Yanjiang</creator><creator>Zhang, Jianyang</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20240601</creationdate><title>DM-YOLOX aerial object detection method with intensive attention mechanism</title><author>Li, Xiangyu ; Wang, Fengping ; Wang, Wei ; Han, Yanjiang ; Zhang, Jianyang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-99aa89e6e5a485c2f12895c7b2bd49d73a7ca2463be2152cdf57dea410621dfe3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Compilers</topic><topic>Computer Science</topic><topic>Feature extraction</topic><topic>Frames per second</topic><topic>Image detection</topic><topic>Image enhancement</topic><topic>Interpreters</topic><topic>Object recognition</topic><topic>Occlusion</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Target detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Xiangyu</creatorcontrib><creatorcontrib>Wang, Fengping</creatorcontrib><creatorcontrib>Wang, 
Wei</creatorcontrib><creatorcontrib>Han, Yanjiang</creatorcontrib><creatorcontrib>Zhang, Jianyang</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Xiangyu</au><au>Wang, Fengping</au><au>Wang, Wei</au><au>Han, Yanjiang</au><au>Zhang, Jianyang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DM-YOLOX aerial object detection method with intensive attention mechanism</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2024-06-01</date><risdate>2024</risdate><volume>80</volume><issue>9</issue><spage>12790</spage><epage>12812</epage><pages>12790-12812</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In aerial image detection, difficulties in feature extraction and low detection accuracy arise due to background interference, occlusion, and the presence of multiple small objects. This paper proposes a DM-YOLOX aerial object target detection method with intensive attention mechanism. Firstly, the proposed approach incorporates coordinate attention (CA) and a dense connection method into the backbone network architecture, enabling adaptive channel weighting throughout the feature extraction process. This facilitates the enhancement of significant features while suppressing less relevant ones, thereby augmenting the network’s capacity to represent object features and ensuring retention and reinforcement of key features. Secondly, the multibranch extraction module (MBE) is incorporated into the feature fusion network to enhance the network’s ability in extracting multi-scale feature information from images with extensive coverage, thereby enhancing the detection accuracy and efficiency of small- and medium-sized objects in complex scenes. 
Finally, the utilization of SIoU instead of IoU as the bounding box loss function effectively addresses the issue of mismatch between real and predicted boxes, leading to accelerated network convergence and improved performance during model training. After training and testing on the VisDrone 2019 dataset, this method effectively detects small objects in complex environments. The DM-YOLOX model shows a significant improvement of 2.7% in mAP compared to the baseline network, while achieving an 8% increase in frames per second (FPS).</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-024-05944-x</doi><tpages>23</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0920-8542
ispartof	The Journal of supercomputing, 2024-06, Vol.80 (9), p.12790-12812
issn	0920-8542 1573-0484
language	eng
recordid	cdi_proquest_journals_3064674539
source	Springer Nature - Complete Springer Journals
subjects	Compilers; Computer Science; Feature extraction; Frames per second; Image detection; Image enhancement; Interpreters; Object recognition; Occlusion; Processor Architectures; Programming Languages; Target detection
title	DM-YOLOX aerial object detection method with intensive attention mechanism
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T21%3A24%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DM-YOLOX%20aerial%20object%20detection%20method%20with%20intensive%20attention%20mechanism&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Li,%20Xiangyu&rft.date=2024-06-01&rft.volume=80&rft.issue=9&rft.spage=12790&rft.epage=12812&rft.pages=12790-12812&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-024-05944-x&rft_dat=%3Cproquest_cross%3E3064674539%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3064674539&rft_id=info:pmid/&rfr_iscdi=true
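
Note on the bounding box loss named in the description: the record states that SIoU replaces IoU but gives no formulas. As a minimal sketch of the baseline quantity only, the Python below computes plain IoU and the corresponding loss 1 - IoU for axis-aligned boxes. The function names and the (x1, y1, x2, y2) corner format are assumptions made for illustration and are not taken from the article. SIoU itself, which as generally described adds angle, distance, and shape penalty terms, is not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    (an assumed format, chosen for this illustration).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    # Partially overlapping boxes.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    # Two disjoint predictions at very different distances from the target
    # receive the same loss value.
    print(iou_loss((0, 0, 1, 1), (5, 5, 6, 6)))
    print(iou_loss((0, 0, 1, 1), (50, 50, 51, 51)))
```

In this sketch, non-overlapping boxes always yield a loss of 1 regardless of how far apart they are, so the plain IoU loss carries no signal about which direction to move the prediction. That limitation is one common motivation for IoU variants with additional geometric terms, of which SIoU is an example.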