PANetW: PANet with wider receptive fields for object detection
PANet is widely used in various object detection tasks due to its powerful feature expression ability. However, PANet's performance in complex scenarios is subpar, with frequent object omission or misidentification. We find that the reason for this phenomenon is that the receptive field of PANet can't cover sufficient feature information to deal with drastic changes in source object size. In order to solve this problem, this paper adopts dilated convolution technology and applies it to each parallel branch directly following the PANet network. This method can effectively represent the feature information of objects at different scales by integrating the information from small and large receptive fields into a new feature output. We also introduce a residual structure to circumvent the network degradation caused by excessive convolutions. By combining the above methods, we build a new module named PANetW (PANet with Wider Receptive Fields). Taking YOLOX-S as the baseline, we comprehensively evaluated the proposed module PANetW on two datasets, VOC2007 and MS COCO 2017. The test results show that our PANetW achieves a high level of mean average precision (AP). On the VOC2007 dataset, the AP of our PANetW improves by 4.9% to 43.0%; on the MS COCO 2017 dataset, the AP of PANetW is as high as 44.3%, far exceeding the current mainstream modules. The experimental results fully demonstrate the effectiveness of our module.
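The abstract describes the architecture only at a block level: dilated convolutions with several rates attached to each parallel PANet branch, their outputs combined so that small and large receptive fields both contribute, and a residual connection added to limit degradation from the extra convolutions. The article's own code is not part of this record, so the sketch below is only an illustration of that general pattern in PyTorch; the module name, dilation rates, channel count, and the fusion by concatenation plus a 1x1 convolution are assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' code): a multi-dilation block of the kind the
# abstract describes, intended to sit after a PANet output branch. Dilation rates,
# channel counts and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn


class WideReceptiveFieldBlock(nn.Module):
    """Parallel dilated 3x3 convolutions (small and large receptive fields),
    fused by a 1x1 convolution, with a residual connection to limit the
    degradation caused by the extra convolutions."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.SiLU(inplace=True),
            )
            for d in dilations
        )
        # 1x1 conv fuses the concatenated multi-scale responses back to `channels`.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))
        return x + fused  # residual connection


if __name__ == "__main__":
    # One PANet output level of a YOLOX-S-like neck (e.g. 128 channels at stride 8).
    feat = torch.randn(1, 128, 80, 80)
    block = WideReceptiveFieldBlock(128)
    print(block(feat).shape)  # torch.Size([1, 128, 80, 80])
```

For a YOLOX-S-sized neck, one such block would be applied per PANet output level. The reported results (43.0 AP on VOC2007 and 44.3 AP on MS COCO 2017) refer to the paper's own configuration, not to this sketch.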
Saved in:
Published in: | Multimedia tools and applications, 2024-01, Vol. 83 (25), p. 66517-66538 |
---|---|
Main authors: | Chen, Ran; Xin, Dongjun; Wang, Chuanli; Wang, Peng; Tan, Junwen; Kang, Wenjie |
Format: | Article |
Language: | eng |
Subjects: | Accuracy; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Modules; Multimedia; Multimedia Information Systems; Neural networks; Object recognition; Semantics; Special Purpose and Application-Based Systems; Task complexity; Telematics |
Online access: | Full text |
container_end_page | 66538 |
---|---|
container_issue | 25 |
container_start_page | 66517 |
container_title | Multimedia tools and applications |
container_volume | 83 |
creator | Chen, Ran; Xin, Dongjun; Wang, Chuanli; Wang, Peng; Tan, Junwen; Kang, Wenjie |
description | PANet is widely used in various object detection tasks due to its powerful feature expression ability. However, PANet's performance in complex scenarios is subpar, with frequent object omission or misidentification. We find that the reason for this phenomenon is that the receptive field of PANet can't cover sufficient feature information to deal with drastic changes in source object size. In order to solve this problem, this paper adopts dilated convolution technology and applies it to each parallel branch directly following the PANet network. This method can effectively represent the feature information of objects at different scales by integrating the information from small and large receptive fields into a new feature output. We also introduce a residual structure to circumvent the network degradation caused by excessive convolutions. By combining the above methods, we build a new module named PANetW (PANet with Wider Receptive Fields). Taking YOLOX-S as the baseline, we comprehensively evaluated the proposed module PANetW on two datasets, VOC2007 and MS COCO 2017. The test results show that our PANetW achieves a high level of mean average precision (AP). On the VOC2007 dataset, the AP of our PANetW improves by 4.9% to 43.0%; on the MS COCO 2017 dataset, the AP of PANetW is as high as 44.3%, far exceeding the current mainstream modules. The experimental results fully demonstrate the effectiveness of our module. |
doi_str_mv | 10.1007/s11042-024-18219-7 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 1573-7721 |
ispartof | Multimedia tools and applications, 2024-01, Vol.83 (25), p.66517-66538 |
issn | 1573-7721 1380-7501 |
language | eng |
recordid | cdi_proquest_journals_3077577810 |
source | SpringerLink Journals - AutoHoldings |
subjects | Accuracy; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Modules; Multimedia; Multimedia Information Systems; Neural networks; Object recognition; Semantics; Special Purpose and Application-Based Systems; Task complexity; Telematics |
title | PANetW: PANet with wider receptive fields for object detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T21%3A43%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PANetW:%20PANet%20with%20wider%20receptive%20fields%20for%20object%20detection&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Chen,%20Ran&rft.date=2024-01-24&rft.volume=83&rft.issue=25&rft.spage=66517&rft.epage=66538&rft.pages=66517-66538&rft.issn=1573-7721&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-024-18219-7&rft_dat=%3Cproquest_cross%3E3077577810%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3077577810&rft_id=info:pmid/&rfr_iscdi=true |