PANetW: PANet with wider receptive fields for object detection
PANet is widely used in various object detection tasks due to its powerful feature expression ability. However, PANet's performance in complex scenarios is subpar, with frequent object omission or misidentification. We find that the reason for this phenomenon is that the receptive field of PANet can't cover sufficient feature information to deal with drastic changes in source object size. In order to solve this problem, this paper adopts dilated convolution technology and applies it to each parallel branch directly following the PANet network. This method can effectively represent the feature information of objects at different scales by integrating the information from small and large receptive fields into a new feature output. We also introduce a residual structure to circumvent the network degradation caused by excessive convolutions. By combining the above methods, we build a new module named PANetW (PANet with Wider Receptive Fields). Taking YOLOX-S as the baseline, we comprehensively evaluated the proposed module PANetW on two datasets, VOC2007 and MS COCO 2017. The test results show that our PANetW achieves a high level of mean average precision (AP). On the VOC2007 dataset, the AP of our PANetW improves by 4.9% to 43.0%; on the MS COCO 2017 dataset, the AP of PANetW is as high as 44.3%, far exceeding the current mainstream modules. The experimental results fully demonstrate the effectiveness of our module.
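The abstract describes the architecture only at a block level: dilated convolutions with several rates attached to each parallel PANet branch, their outputs combined so that small and large receptive fields both contribute, and a residual connection added to limit degradation from the extra convolutions. The article's own code is not part of this record, so the sketch below is only an illustration of that general pattern in PyTorch; the module name, dilation rates, channel count, and the fusion by concatenation plus a 1x1 convolution are assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' code): a multi-dilation block of the kind the
# abstract describes, intended to sit after a PANet output branch. Dilation rates,
# channel counts and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn


class WideReceptiveFieldBlock(nn.Module):
    """Parallel dilated 3x3 convolutions (small and large receptive fields),
    fused by a 1x1 convolution, with a residual connection to limit the
    degradation caused by the extra convolutions."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.SiLU(inplace=True),
            )
            for d in dilations
        )
        # 1x1 conv fuses the concatenated multi-scale responses back to `channels`.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))
        return x + fused  # residual connection


if __name__ == "__main__":
    # One PANet output level of a YOLOX-S-like neck (e.g. 128 channels at stride 8).
    feat = torch.randn(1, 128, 80, 80)
    block = WideReceptiveFieldBlock(128)
    print(block(feat).shape)  # torch.Size([1, 128, 80, 80])
```

For a YOLOX-S-sized neck, one such block would be applied per PANet output level. The reported results (43.0 AP on VOC2007 and 44.3 AP on MS COCO 2017) refer to the paper's own configuration, not to this sketch.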
Saved in:
Published in: | Multimedia tools and applications, 2024-01, Vol. 83 (25), p. 66517-66538 |
---|---|
Main authors: | Chen, Ran; Xin, Dongjun; Wang, Chuanli; Wang, Peng; Tan, Junwen; Kang, Wenjie |
Format: | Article |
Language: | eng |
Subjects: | Accuracy; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Modules; Multimedia; Multimedia Information Systems; Neural networks; Object recognition; Semantics; Special Purpose and Application-Based Systems; Task complexity; Telematics |
Online access: | Full text |
container_end_page | 66538 |
---|---|
container_issue | 25 |
container_start_page | 66517 |
container_title | Multimedia tools and applications |
container_volume | 83 |
creator | Chen, Ran; Xin, Dongjun; Wang, Chuanli; Wang, Peng; Tan, Junwen; Kang, Wenjie |
description | PANet is widely used in various object detection tasks due to its powerful feature expression ability. However, PANet's performance in complex scenarios is subpar, with frequent object omission or misidentification. We find that the reason for this phenomenon is that the receptive field of PANet can't cover sufficient feature information to deal with drastic changes in source object size. In order to solve this problem, this paper adopts dilated convolution technology and applies it to each parallel branch directly following the PANet network. This method can effectively represent the feature information of objects at different scales by integrating the information from small and large receptive fields into a new feature output. We also introduce a residual structure to circumvent the network degradation caused by excessive convolutions. By combining the above methods, we build a new module named PANetW (PANet with Wider Receptive Fields). Taking YOLOX-S as the baseline, we comprehensively evaluated the proposed module PANetW on two datasets, VOC2007 and MS COCO 2017. The test results show that our PANetW achieves a high level of mean average precision (AP). On the VOC2007 dataset, the AP of our PANetW improves by 4.9% to 43.0%; on the MS COCO 2017 dataset, the AP of PANetW is as high as 44.3%, far exceeding the current mainstream modules. The experimental results fully demonstrate the effectiveness of our module. |
doi_str_mv | 10.1007/s11042-024-18219-7 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 1573-7721 |
ispartof | Multimedia tools and applications, 2024-01, Vol.83 (25), p.66517-66538 |
issn | 1573-7721 1380-7501 |
language | eng |
recordid | cdi_proquest_journals_3077577810 |
source | SpringerLink Journals - AutoHoldings |
subjects | Accuracy; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Modules; Multimedia; Multimedia Information Systems; Neural networks; Object recognition; Semantics; Special Purpose and Application-Based Systems; Task complexity; Telematics |
title | PANetW: PANet with wider receptive fields for object detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T21%3A43%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PANetW:%20PANet%20with%20wider%20receptive%20fields%20for%20object%20detection&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Chen,%20Ran&rft.date=2024-01-24&rft.volume=83&rft.issue=25&rft.spage=66517&rft.epage=66538&rft.pages=66517-66538&rft.issn=1573-7721&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-024-18219-7&rft_dat=%3Cproquest_cross%3E3077577810%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3077577810&rft_id=info:pmid/&rfr_iscdi=true |