An approach to improve SSD through mask prediction of multi-scale feature maps

We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise pr...

Bibliographic details
Published in: Pattern analysis and applications : PAA, 2021-08, Vol.24 (3), p.1357-1366
Main authors: Sun, Peng; Zhao, Yaqin; Zhu, Songhao
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1366
container_issue 3
container_start_page 1357
container_title Pattern analysis and applications : PAA
container_volume 24
creator Sun, Peng
Zhao, Yaqin
Zhu, Songhao
description We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.
doi_str_mv 10.1007/s10044-021-00993-x
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2557061121</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2557061121</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</originalsourceid><addsrcrecordid>eNp9kM1OwzAQhC0EEqXwApwscTb4L3F8rAoUpAoOBYmb5ThOm9LEwXZQeXsMQXDjsjuHb2ZXA8A5wZcEY3EV0uQcYUoQxlIytD8AE8IZQyLLXg5_NSfH4CSELcaMMVpMwMOsg7rvvdNmA6ODTZv0u4Wr1TWMG--G9Qa2OrzC3tuqMbFxHXQ1bIddbFAwemdhbXUcvE1YH07BUa13wZ797Cl4vr15mt-h5ePifj5bIsOIjIjSMqtlQaSlpjSclznhuuBGS67zXGY8z7QROTWiErq0pWaVkLLMJS8ME5VlU3Ax5qZv3wYbotq6wXfppKJZJnBOCCWJoiNlvAvB21r1vmm1_1AEq6_e1NibSr2p797UPpnYaAoJ7tbW_0X_4_oEOT1w2w</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2557061121</pqid></control><display><type>article</type><title>An approach to improve SSD through mask prediction of multi-scale feature maps</title><source>SpringerLink Journals - AutoHoldings</source><creator>Sun, Peng ; Zhao, Yaqin ; Zhu, Songhao</creator><creatorcontrib>Sun, Peng ; Zhao, Yaqin ; Zhu, Songhao</creatorcontrib><description>We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. 
In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.</description><identifier>ISSN: 1433-7541</identifier><identifier>EISSN: 1433-755X</identifier><identifier>DOI: 10.1007/s10044-021-00993-x</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Computer networks ; Computer Science ; Feature extraction ; Feature maps ; Object recognition ; Pattern Recognition ; Performance enhancement ; Semantics ; Short Paper</subject><ispartof>Pattern analysis and applications : PAA, 2021-08, Vol.24 (3), p.1357-1366</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 
2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</citedby><cites>FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</cites><orcidid>0000-0002-9891-5692</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10044-021-00993-x$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10044-021-00993-x$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27923,27924,41487,42556,51318</link.rule.ids></links><search><creatorcontrib>Sun, Peng</creatorcontrib><creatorcontrib>Zhao, Yaqin</creatorcontrib><creatorcontrib>Zhu, Songhao</creatorcontrib><title>An approach to improve SSD through mask prediction of multi-scale feature maps</title><title>Pattern analysis and applications : PAA</title><addtitle>Pattern Anal Applic</addtitle><description>We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. 
Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.</description><subject>Computer networks</subject><subject>Computer Science</subject><subject>Feature extraction</subject><subject>Feature maps</subject><subject>Object recognition</subject><subject>Pattern Recognition</subject><subject>Performance enhancement</subject><subject>Semantics</subject><subject>Short Paper</subject><issn>1433-7541</issn><issn>1433-755X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kM1OwzAQhC0EEqXwApwscTb4L3F8rAoUpAoOBYmb5ThOm9LEwXZQeXsMQXDjsjuHb2ZXA8A5wZcEY3EV0uQcYUoQxlIytD8AE8IZQyLLXg5_NSfH4CSELcaMMVpMwMOsg7rvvdNmA6ODTZv0u4Wr1TWMG--G9Qa2OrzC3tuqMbFxHXQ1bIddbFAwemdhbXUcvE1YH07BUa13wZ797Cl4vr15mt-h5ePifj5bIsOIjIjSMqtlQaSlpjSclznhuuBGS67zXGY8z7QROTWiErq0pWaVkLLMJS8ME5VlU3Ax5qZv3wYbotq6wXfppKJZJnBOCCWJoiNlvAvB21r1vmm1_1AEq6_e1NibSr2p797UPpnYaAoJ7tbW_0X_4_oEOT1w2w</recordid><startdate>20210801</startdate><enddate>20210801</enddate><creator>Sun, Peng</creator><creator>Zhao, Yaqin</creator><creator>Zhu, Songhao</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-9891-5692</orcidid></search><sort><creationdate>20210801</creationdate><title>An approach to improve SSD through mask prediction of multi-scale feature maps</title><author>Sun, Peng ; Zhao, Yaqin ; Zhu, 
Songhao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-22b5f9819e2cbc44b614a84ca94a6695465ac762c7d7abeba3d799b6948c37de3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computer networks</topic><topic>Computer Science</topic><topic>Feature extraction</topic><topic>Feature maps</topic><topic>Object recognition</topic><topic>Pattern Recognition</topic><topic>Performance enhancement</topic><topic>Semantics</topic><topic>Short Paper</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sun, Peng</creatorcontrib><creatorcontrib>Zhao, Yaqin</creatorcontrib><creatorcontrib>Zhu, Songhao</creatorcontrib><collection>CrossRef</collection><jtitle>Pattern analysis and applications : PAA</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sun, Peng</au><au>Zhao, Yaqin</au><au>Zhu, Songhao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An approach to improve SSD through mask prediction of multi-scale feature maps</atitle><jtitle>Pattern analysis and applications : PAA</jtitle><stitle>Pattern Anal Applic</stitle><date>2021-08-01</date><risdate>2021</risdate><volume>24</volume><issue>3</issue><spage>1357</spage><epage>1366</epage><pages>1357-1366</pages><issn>1433-7541</issn><eissn>1433-755X</eissn><abstract>We propose a novel single shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with pixel-wise probability distribution of semantic information. Meanwhile, an improved receptive field block is adopted to increase the scale of receptive field of backbone network without too much extra computing burden. 
Our network improves the performance significantly over SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with just a little speed drop. In addition, we discuss the relationship between effective receptive fields and theoretical receptive fields on VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method. We achieve a mAP of 79.8 with 300 × 300 input images (81.2 mAP by 512 × 512 inputs) at the speed of 58.4 FPS on a single Nvidia 1080Ti GPU. Experimental results demonstrate that the proposed network achieves a comparable performance with the state-of-the-arts.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s10044-021-00993-x</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-9891-5692</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1433-7541
ispartof Pattern analysis and applications : PAA, 2021-08, Vol.24 (3), p.1357-1366
issn 1433-7541
1433-755X
language eng
recordid cdi_proquest_journals_2557061121
source SpringerLink Journals - AutoHoldings
subjects Computer networks
Computer Science
Feature extraction
Feature maps
Object recognition
Pattern Recognition
Performance enhancement
Semantics
Short Paper
title An approach to improve SSD through mask prediction of multi-scale feature maps
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T19%3A50%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20approach%20to%20improve%20SSD%20through%20mask%20prediction%20of%20multi-scale%20feature%20maps&rft.jtitle=Pattern%20analysis%20and%20applications%20:%20PAA&rft.au=Sun,%20Peng&rft.date=2021-08-01&rft.volume=24&rft.issue=3&rft.spage=1357&rft.epage=1366&rft.pages=1357-1366&rft.issn=1433-7541&rft.eissn=1433-755X&rft_id=info:doi/10.1007/s10044-021-00993-x&rft_dat=%3Cproquest_cross%3E2557061121%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2557061121&rft_id=info:pmid/&rfr_iscdi=true