An approach to improve SSD through mask prediction of multi-scale feature maps
We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise pr...
Published in: | Pattern analysis and applications : PAA 2021-08, Vol.24 (3), p.1357-1366 |
---|---|
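Main authors: | Sun, Peng ; Zhao, Yaqin ; Zhu, Songhao |
Format: | Article |
Language: | eng |
Subjects: | Computer networks ; Computer Science ; Feature extraction ; Feature maps ; Object recognition ; Pattern Recognition ; Performance enhancement ; Semantics ; Short Paper |
Online access: | Full text |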
container_end_page | 1366 |
---|---|
container_issue | 3 |
container_start_page | 1357 |
container_title | Pattern analysis and applications : PAA |
container_volume | 24 |
creator | Sun, Peng ; Zhao, Yaqin ; Zhu, Songhao |
description | We propose a novel single-shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with a pixel-wise probability distribution of semantic information. In addition, an improved receptive field block is adopted to enlarge the receptive field of the backbone network without much extra computation. Our network significantly improves on the performance of SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with only a small drop in speed. We also discuss the relationship between effective receptive fields and theoretical receptive fields on the VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP with 512 × 512 inputs) at a speed of 58.4 FPS on a single Nvidia 1080Ti GPU. These results show that the proposed network achieves performance comparable to the state of the art. |
doi_str_mv | 10.1007/s10044-021-00993-x |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2557061121</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2557061121</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</originalsourceid><addsrcrecordid>eNp9kM1OwzAQhC0EEqXwApwscTb4L3F8rAoUpAoOBYmb5ThOm9LEwXZQeXsMQXDjsjuHb2ZXA8A5wZcEY3EV0uQcYUoQxlIytD8AE8IZQyLLXg5_NSfH4CSELcaMMVpMwMOsg7rvvdNmA6ODTZv0u4Wr1TWMG--G9Qa2OrzC3tuqMbFxHXQ1bIddbFAwemdhbXUcvE1YH07BUa13wZ797Cl4vr15mt-h5ePifj5bIsOIjIjSMqtlQaSlpjSclznhuuBGS67zXGY8z7QROTWiErq0pWaVkLLMJS8ME5VlU3Ax5qZv3wYbotq6wXfppKJZJnBOCCWJoiNlvAvB21r1vmm1_1AEq6_e1NibSr2p797UPpnYaAoJ7tbW_0X_4_oEOT1w2w</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2557061121</pqid></control><display><type>article</type><title>An approach to improve SSD through mask prediction of multi-scale feature maps</title><source>SpringerLink Journals - AutoHoldings</source><creator>Sun, Peng ; Zhao, Yaqin ; Zhu, Songhao</creator><creatorcontrib>Sun, Peng ; Zhao, Yaqin ; Zhu, Songhao</creatorcontrib><description>We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. 
In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.</description><identifier>ISSN: 1433-7541</identifier><identifier>EISSN: 1433-755X</identifier><identifier>DOI: 10.1007/s10044-021-00993-x</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Computer networks ; Computer Science ; Feature extraction ; Feature maps ; Object recognition ; Pattern Recognition ; Performance enhancement ; Semantics ; Short Paper</subject><ispartof>Pattern analysis and applications : PAA, 2021-08, Vol.24 (3), p.1357-1366</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 
2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</citedby><cites>FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</cites><orcidid>0000-0002-9891-5692</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10044-021-00993-x$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10044-021-00993-x$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27923,27924,41487,42556,51318</link.rule.ids></links><search><creatorcontrib>Sun, Peng</creatorcontrib><creatorcontrib>Zhao, Yaqin</creatorcontrib><creatorcontrib>Zhu, Songhao</creatorcontrib><title>An approach to improve SSD through mask prediction of multi-scale feature maps</title><title>Pattern analysis and applications : PAA</title><addtitle>Pattern Anal Applic</addtitle><description>We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. 
Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.</description><subject>Computer networks</subject><subject>Computer Science</subject><subject>Feature extraction</subject><subject>Feature maps</subject><subject>Object recognition</subject><subject>Pattern Recognition</subject><subject>Performance enhancement</subject><subject>Semantics</subject><subject>Short Paper</subject><issn>1433-7541</issn><issn>1433-755X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kM1OwzAQhC0EEqXwApwscTb4L3F8rAoUpAoOBYmb5ThOm9LEwXZQeXsMQXDjsjuHb2ZXA8A5wZcEY3EV0uQcYUoQxlIytD8AE8IZQyLLXg5_NSfH4CSELcaMMVpMwMOsg7rvvdNmA6ODTZv0u4Wr1TWMG--G9Qa2OrzC3tuqMbFxHXQ1bIddbFAwemdhbXUcvE1YH07BUa13wZ797Cl4vr15mt-h5ePifj5bIsOIjIjSMqtlQaSlpjSclznhuuBGS67zXGY8z7QROTWiErq0pWaVkLLMJS8ME5VlU3Ax5qZv3wYbotq6wXfppKJZJnBOCCWJoiNlvAvB21r1vmm1_1AEq6_e1NibSr2p797UPpnYaAoJ7tbW_0X_4_oEOT1w2w</recordid><startdate>20210801</startdate><enddate>20210801</enddate><creator>Sun, Peng</creator><creator>Zhao, Yaqin</creator><creator>Zhu, Songhao</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-9891-5692</orcidid></search><sort><creationdate>20210801</creationdate><title>An approach to improve SSD through mask prediction of multi-scale feature maps</title><author>Sun, Peng ; Zhao, Yaqin ; Zhu, 
Songhao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computer networks</topic><topic>Computer Science</topic><topic>Feature extraction</topic><topic>Feature maps</topic><topic>Object recognition</topic><topic>Pattern Recognition</topic><topic>Performance enhancement</topic><topic>Semantics</topic><topic>Short Paper</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sun, Peng</creatorcontrib><creatorcontrib>Zhao, Yaqin</creatorcontrib><creatorcontrib>Zhu, Songhao</creatorcontrib><collection>CrossRef</collection><jtitle>Pattern analysis and applications : PAA</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sun, Peng</au><au>Zhao, Yaqin</au><au>Zhu, Songhao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An approach to improve SSD through mask prediction of multi-scale feature maps</atitle><jtitle>Pattern analysis and applications : PAA</jtitle><stitle>Pattern Anal Applic</stitle><date>2021-08-01</date><risdate>2021</risdate><volume>24</volume><issue>3</issue><spage>1357</spage><epage>1366</epage><pages>1357-1366</pages><issn>1433-7541</issn><eissn>1433-755X</eissn><abstract>We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. 
Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s10044-021-00993-x</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-9891-5692</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1433-7541 |
ispartof | Pattern analysis and applications : PAA, 2021-08, Vol.24 (3), p.1357-1366 |
issn | 1433-7541 ; 1433-755X |
language | eng |
recordid | cdi_proquest_journals_2557061121 |
source | SpringerLink Journals - AutoHoldings |
subjects | Computer networks ; Computer Science ; Feature extraction ; Feature maps ; Object recognition ; Pattern Recognition ; Performance enhancement ; Semantics ; Short Paper |
title | An approach to improve SSD through mask prediction of multi-scale feature maps |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T19%3A50%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20approach%20to%20improve%20SSD%20through%20mask%20prediction%20of%20multi-scale%20feature%20maps&rft.jtitle=Pattern%20analysis%20and%20applications%20:%20PAA&rft.au=Sun,%20Peng&rft.date=2021-08-01&rft.volume=24&rft.issue=3&rft.spage=1357&rft.epage=1366&rft.pages=1357-1366&rft.issn=1433-7541&rft.eissn=1433-755X&rft_id=info:doi/10.1007/s10044-021-00993-x&rft_dat=%3Cproquest_cross%3E2557061121%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2557061121&rft_id=info:pmid/&rfr_iscdi=true |