PANetW: PANet with wider receptive fields for object detection

Bibliographic details
Published in: Multimedia Tools and Applications, 2024-01, Vol. 83 (25), pp. 66517-66538
Authors: Chen, Ran; Xin, Dongjun; Wang, Chuanli; Wang, Peng; Tan, Junwen; Kang, Wenjie
Format: Article
Language: English
Online access: Full text
Abstract: PANet is widely used in object detection tasks because of its strong feature representation ability. However, its performance in complex scenes is subpar, with objects frequently omitted or misidentified. We find that the cause is that PANet's receptive field cannot cover enough feature information to cope with drastic changes in object size. To solve this problem, this paper applies dilated convolution to parallel branches placed directly after the PANet network. By integrating information from small and large receptive fields into a new feature output, the method effectively represents the features of objects at different scales. We also introduce a residual structure to avoid the network degradation caused by stacking additional convolutions. Combining these components, we build a new module named PANetW (PANet with Wider Receptive Fields). Taking YOLOX-S as the baseline, we comprehensively evaluate PANetW on two datasets, VOC2007 and MS COCO2017. The results show that PANetW achieves a high mean average precision (AP): on VOC2007, the AP of PANetW improves by 4.9% to 43.0%; on MS COCO2017, the AP reaches 44.3%, far exceeding current mainstream modules. These results demonstrate the effectiveness of our module.
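
The abstract above describes the mechanism only in prose, so here is a minimal, hypothetical sketch of a parallel dilated-convolution block with a residual connection, in the spirit of that description. This is not the authors' PANetW implementation: the class name DilatedResidualBlock, the dilation rates (1, 2, 3), the channel count, the use of PyTorch, and the BatchNorm/SiLU choices are all assumptions made for illustration.

```python
# Illustrative only: a rough sketch of the idea in the abstract, NOT the
# authors' PANetW code. All names and hyperparameters here are assumptions.
import torch
import torch.nn as nn


class DilatedResidualBlock(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates, fused and
    added back to the input through a residual (skip) connection."""

    def __init__(self, channels: int, dilations=(1, 2, 3)):
        super().__init__()
        # One branch per dilation rate; padding=dilation keeps the spatial size.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.SiLU(),
            )
            for d in dilations
        ])
        # 1x1 convolution fuses the concatenated branch outputs back to `channels`.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate small- and large-receptive-field features, then fuse.
        fused = self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))
        # Residual connection: mitigates degradation from the extra convolutions.
        return x + fused


if __name__ == "__main__":
    feat = torch.randn(1, 128, 40, 40)   # dummy feature map at one PANet level
    out = DilatedResidualBlock(128)(feat)
    print(out.shape)                      # torch.Size([1, 128, 40, 40])
```

For a 3x3 kernel, dilation rate d gives an effective window of 2d + 1 pixels, so the d = 3 branch covers a 7x7 neighborhood while the d = 1 branch keeps the plain 3x3 view; fusing the branches is what widens the receptive field without shrinking the feature map, and the skip connection guards against degradation from the added convolutions.
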
DOI: 10.1007/s11042-024-18219-7
Publisher: Springer US, New York
Publication date: 2024-01-24
ISSN: 1380-7501 (print); 1573-7721 (electronic)
Source: SpringerLink Journals - AutoHoldings
Subjects:
Accuracy
Computer Communication Networks
Computer Science
Data Structures and Information Theory
Datasets
Modules
Multimedia
Multimedia Information Systems
Neural networks
Object recognition
Semantics
Special Purpose and Application-Based Systems
Task complexity
Telematics