HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection

Decoupling the sibling head has recently shown great potential in relieving the inherent task-misalignment problem in two-stage object detectors. However, existing works design similar structures for the classification and regression, ignoring task-specific characteristics and feature demands. Besid...

Bibliographic details
Published in: IEEE transactions on image processing 2021, Vol. 30, p. 9456-9469
Main authors: Li, Wuyang; Chen, Zhen; Li, Baopu; Zhang, Dingwen; Yuan, Yixuan
Format: Article
Language: eng
container_end_page 9469
container_issue
container_start_page 9456
container_title IEEE transactions on image processing
container_volume 30
creator Li, Wuyang
Chen, Zhen
Li, Baopu
Zhang, Dingwen
Yuan, Yixuan
description Decoupling the sibling head has recently shown great potential in relieving the inherent task-misalignment problem in two-stage object detectors. However, existing works design similar structures for classification and regression, ignoring task-specific characteristics and feature demands. Moreover, the shared knowledge that may benefit the two branches is neglected, leading to potentially excessive decoupling and semantic inconsistency. To address these two issues, we propose a Heterogeneous Task Decoupling (HTD) framework for object detection, which utilizes a Progressive Graph (PGraph) module and a Border-aware Adaptation (BA) module for task decoupling. Specifically, we first devise a Semantic Feature Aggregation (SFA) module to aggregate global semantics with image-level supervision, serving as the shared knowledge for the task-decoupled framework. Then, the PGraph module performs progressive graph reasoning, including local spatial aggregation and global semantic interaction, to enhance the semantic representations of region proposals for classification. The proposed BA module integrates multi-level features adaptively, focusing on low-level border activation to obtain representations with spatial and border perception for regression. Finally, we utilize the aggregated knowledge from SFA to maintain the instance-level semantic consistency (ISC) of the decoupled framework. Extensive experiments demonstrate that HTD outperforms existing detection works by a large margin and achieves single-model 50.4% AP and 33.2% AP_S on the COCO test-dev set using a ResNet-101-DCN backbone, which is the best entry among state-of-the-art methods under the same configuration. Our code is available at https://github.com/CityU-AIM-Group/HTD.
doi_str_mv 10.1109/TIP.2021.3126423
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2599209575</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9615001</ieee_id><sourcerecordid>2598075777</sourcerecordid><originalsourceid>FETCH-LOGICAL-c324t-a76012a12b2679a9c0673d952fe0bcc7fe956e440e09f64f0ae590a8a2cfc56a3</originalsourceid><addsrcrecordid>eNpdkE1LAzEQhoMotlbvgpcFL162TrL5aLyVVm2hUMH1HNI4W7ZuN3Wzi_jvTW3x4OkdmOcdhoeQawpDSkHf5_OXIQNGhxllkrPshPSp5jQF4Ow0ziBUqijXPXIRwgaAckHlOellXI0gY7JPxrN8-pDMsMXGr7FG34Ukt-EjmaLz3a4q63VS-CbJv3z62to1JsvVBl0b922M0teX5KywVcCrYw7I29NjPpmli-XzfDJepC5jvE2tkkCZpWzFpNJWO5Aqe9eCFQgr51SBWkjkHBB0IXkBFoUGO7LMFU5Imw3I3eHurvGfHYbWbMvgsKrs79eGCT0CJZRSEb39h25819Txuz2lGWihRKTgQLnGh9BgYXZNubXNt6Fg9npN1Gv2es1Rb6zcHColIv7hWlIR3WY_M6NyBQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2599209575</pqid></control><display><type>article</type><title>HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection</title><source>IEEE Electronic Library (IEL)</source><creator>Li, Wuyang ; Chen, Zhen ; Li, Baopu ; Zhang, Dingwen ; Yuan, Yixuan</creator><creatorcontrib>Li, Wuyang ; Chen, Zhen ; Li, Baopu ; Zhang, Dingwen ; Yuan, Yixuan</creatorcontrib><description>Decoupling the sibling head has recently shown great potential in relieving the inherent task-misalignment problem in two-stage object detectors. However, existing works design similar structures for the classification and regression, ignoring task-specific characteristics and feature demands. Besides, the shared knowledge that may benefit the two branches is neglected, leading to potential excessive decoupling and semantic inconsistency. To address these two issues, we propose Heterogeneous task decoupling (HTD) framework for object detection, which utilizes a Progressive Graph (PGraph) module and a Border-aware Adaptation (BA) module for task-decoupling. 
Specifically, we first devise a Semantic Feature Aggregation (SFA) module to aggregate global semantics with image-level supervision, serving as the shared knowledge for the task-decoupled framework. Then, the PGraph module performs progressive graph reasoning, including local spatial aggregation and global semantic interaction, to enhance semantic representations of region proposals for classification. The proposed BA module integrates multi-level features adaptively, focusing on the low-level border activation to obtain representations with spatial and border perception for regression. Finally, we utilize the aggregated knowledge from SFA to keep the instance-level semantic consistency (ISC) of decoupled frameworks. Extensive experiments demonstrate that HTD outperforms existing detection works by a large margin, and achieves single-model 50.4%AP and 33.2% AP s on COCO test-dev set using ResNet-101-DCN backbone, which is the best entry among state-of-the-arts under the same configuration. Our code is available at https://github.com/CityU-AIM-Group/HTD .</description><identifier>ISSN: 1057-7149</identifier><identifier>EISSN: 1941-0042</identifier><identifier>DOI: 10.1109/TIP.2021.3126423</identifier><identifier>PMID: 34780326</identifier><identifier>CODEN: IIPRE4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Agglomeration ; Classification ; Cognition ; Decoupling ; Feature extraction ; Location awareness ; Misalignment ; Modules ; Object detection ; Object recognition ; Representations ; Semantics ; Task analysis ; task-decoupled framework</subject><ispartof>IEEE transactions on image processing, 2021, Vol.30, p.9456-9469</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c324t-a76012a12b2679a9c0673d952fe0bcc7fe956e440e09f64f0ae590a8a2cfc56a3</citedby><cites>FETCH-LOGICAL-c324t-a76012a12b2679a9c0673d952fe0bcc7fe956e440e09f64f0ae590a8a2cfc56a3</cites><orcidid>0000-0003-0255-6435 ; 0000-0002-7338-9251 ; 0000-0002-0853-6948 ; 0000-0002-9032-3991</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9615001$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,4010,27900,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9615001$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Li, Wuyang</creatorcontrib><creatorcontrib>Chen, Zhen</creatorcontrib><creatorcontrib>Li, Baopu</creatorcontrib><creatorcontrib>Zhang, Dingwen</creatorcontrib><creatorcontrib>Yuan, Yixuan</creatorcontrib><title>HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection</title><title>IEEE transactions on image processing</title><addtitle>TIP</addtitle><description>Decoupling the sibling head has recently shown great potential in relieving the inherent task-misalignment problem in two-stage object detectors. However, existing works design similar structures for the classification and regression, ignoring task-specific characteristics and feature demands. Besides, the shared knowledge that may benefit the two branches is neglected, leading to potential excessive decoupling and semantic inconsistency. To address these two issues, we propose Heterogeneous task decoupling (HTD) framework for object detection, which utilizes a Progressive Graph (PGraph) module and a Border-aware Adaptation (BA) module for task-decoupling. 
Specifically, we first devise a Semantic Feature Aggregation (SFA) module to aggregate global semantics with image-level supervision, serving as the shared knowledge for the task-decoupled framework. Then, the PGraph module performs progressive graph reasoning, including local spatial aggregation and global semantic interaction, to enhance semantic representations of region proposals for classification. The proposed BA module integrates multi-level features adaptively, focusing on the low-level border activation to obtain representations with spatial and border perception for regression. Finally, we utilize the aggregated knowledge from SFA to keep the instance-level semantic consistency (ISC) of decoupled frameworks. Extensive experiments demonstrate that HTD outperforms existing detection works by a large margin, and achieves single-model 50.4%AP and 33.2% AP s on COCO test-dev set using ResNet-101-DCN backbone, which is the best entry among state-of-the-arts under the same configuration. 
Our code is available at https://github.com/CityU-AIM-Group/HTD .</description><subject>Agglomeration</subject><subject>Classification</subject><subject>Cognition</subject><subject>Decoupling</subject><subject>Feature extraction</subject><subject>Location awareness</subject><subject>Misalignment</subject><subject>Modules</subject><subject>Object detection</subject><subject>Object recognition</subject><subject>Representations</subject><subject>Semantics</subject><subject>Task analysis</subject><subject>task-decoupled framework</subject><issn>1057-7149</issn><issn>1941-0042</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkE1LAzEQhoMotlbvgpcFL162TrL5aLyVVm2hUMH1HNI4W7ZuN3Wzi_jvTW3x4OkdmOcdhoeQawpDSkHf5_OXIQNGhxllkrPshPSp5jQF4Ow0ziBUqijXPXIRwgaAckHlOellXI0gY7JPxrN8-pDMsMXGr7FG34Ukt-EjmaLz3a4q63VS-CbJv3z62to1JsvVBl0b922M0teX5KywVcCrYw7I29NjPpmli-XzfDJepC5jvE2tkkCZpWzFpNJWO5Aqe9eCFQgr51SBWkjkHBB0IXkBFoUGO7LMFU5Imw3I3eHurvGfHYbWbMvgsKrs79eGCT0CJZRSEb39h25819Txuz2lGWihRKTgQLnGh9BgYXZNubXNt6Fg9npN1Gv2es1Rb6zcHColIv7hWlIR3WY_M6NyBQ</recordid><startdate>2021</startdate><enddate>2021</enddate><creator>Li, Wuyang</creator><creator>Chen, Zhen</creator><creator>Li, Baopu</creator><creator>Zhang, Dingwen</creator><creator>Yuan, Yixuan</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-0255-6435</orcidid><orcidid>https://orcid.org/0000-0002-7338-9251</orcidid><orcidid>https://orcid.org/0000-0002-0853-6948</orcidid><orcidid>https://orcid.org/0000-0002-9032-3991</orcidid></search><sort><creationdate>2021</creationdate><title>HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection</title><author>Li, Wuyang ; Chen, Zhen ; Li, Baopu ; Zhang, Dingwen ; Yuan, Yixuan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c324t-a76012a12b2679a9c0673d952fe0bcc7fe956e440e09f64f0ae590a8a2cfc56a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Agglomeration</topic><topic>Classification</topic><topic>Cognition</topic><topic>Decoupling</topic><topic>Feature extraction</topic><topic>Location awareness</topic><topic>Misalignment</topic><topic>Modules</topic><topic>Object detection</topic><topic>Object recognition</topic><topic>Representations</topic><topic>Semantics</topic><topic>Task analysis</topic><topic>task-decoupled framework</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Wuyang</creatorcontrib><creatorcontrib>Chen, Zhen</creatorcontrib><creatorcontrib>Li, Baopu</creatorcontrib><creatorcontrib>Zhang, Dingwen</creatorcontrib><creatorcontrib>Yuan, Yixuan</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics 
&amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on image processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Wuyang</au><au>Chen, Zhen</au><au>Li, Baopu</au><au>Zhang, Dingwen</au><au>Yuan, Yixuan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection</atitle><jtitle>IEEE transactions on image processing</jtitle><stitle>TIP</stitle><date>2021</date><risdate>2021</risdate><volume>30</volume><spage>9456</spage><epage>9469</epage><pages>9456-9469</pages><issn>1057-7149</issn><eissn>1941-0042</eissn><coden>IIPRE4</coden><abstract>Decoupling the sibling head has recently shown great potential in relieving the inherent task-misalignment problem in two-stage object detectors. However, existing works design similar structures for the classification and regression, ignoring task-specific characteristics and feature demands. Besides, the shared knowledge that may benefit the two branches is neglected, leading to potential excessive decoupling and semantic inconsistency. To address these two issues, we propose Heterogeneous task decoupling (HTD) framework for object detection, which utilizes a Progressive Graph (PGraph) module and a Border-aware Adaptation (BA) module for task-decoupling. Specifically, we first devise a Semantic Feature Aggregation (SFA) module to aggregate global semantics with image-level supervision, serving as the shared knowledge for the task-decoupled framework. 
Then, the PGraph module performs progressive graph reasoning, including local spatial aggregation and global semantic interaction, to enhance semantic representations of region proposals for classification. The proposed BA module integrates multi-level features adaptively, focusing on the low-level border activation to obtain representations with spatial and border perception for regression. Finally, we utilize the aggregated knowledge from SFA to keep the instance-level semantic consistency (ISC) of decoupled frameworks. Extensive experiments demonstrate that HTD outperforms existing detection works by a large margin, and achieves single-model 50.4%AP and 33.2% AP s on COCO test-dev set using ResNet-101-DCN backbone, which is the best entry among state-of-the-arts under the same configuration. Our code is available at https://github.com/CityU-AIM-Group/HTD .</abstract><cop>New York</cop><pub>IEEE</pub><pmid>34780326</pmid><doi>10.1109/TIP.2021.3126423</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-0255-6435</orcidid><orcidid>https://orcid.org/0000-0002-7338-9251</orcidid><orcidid>https://orcid.org/0000-0002-0853-6948</orcidid><orcidid>https://orcid.org/0000-0002-9032-3991</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1057-7149
ispartof IEEE transactions on image processing, 2021, Vol.30, p.9456-9469
issn 1057-7149
1941-0042
language eng
recordid cdi_proquest_journals_2599209575
source IEEE Electronic Library (IEL)
subjects Agglomeration
Classification
Cognition
Decoupling
Feature extraction
Location awareness
Misalignment
Modules
Object detection
Object recognition
Representations
Semantics
Task analysis
task-decoupled framework
title HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T03%3A09%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HTD:%20Heterogeneous%20Task%20Decoupling%20for%20Two-Stage%20Object%20Detection&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Li,%20Wuyang&rft.date=2021&rft.volume=30&rft.spage=9456&rft.epage=9469&rft.pages=9456-9469&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2021.3126423&rft_dat=%3Cproquest_RIE%3E2598075777%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2599209575&rft_id=info:pmid/34780326&rft_ieee_id=9615001&rfr_iscdi=true
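The url field above is an OpenURL (ctx_ver Z39.88-2004) link whose query string carries the citation metadata in rft.* keys. As a minimal sketch, assuming only Python's standard urllib.parse module, those keys can be read back into a dictionary. The helper name openurl_fields is hypothetical, and the example URL is a shortened copy of the one in this record, not the full link.

```python
from urllib.parse import urlsplit, parse_qs


def openurl_fields(url):
    """Return the rft.* citation fields of an OpenURL as a plain dict.

    Hypothetical helper for illustration; it is not part of any catalog API.
    """
    # parse_qs percent-decodes values and maps each key to a list of values.
    query = parse_qs(urlsplit(url).query)
    # Keep the first value of every key that starts with "rft." and drop the prefix.
    return {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }


# Shortened copy of the record's URL (most context parameters omitted).
EXAMPLE_URL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=HTD:%20Heterogeneous%20Task%20Decoupling%20for%20Two-Stage%20Object%20Detection"
    "&rft.jtitle=IEEE%20transactions%20on%20image%20processing"
    "&rft.au=Li,%20Wuyang&rft.date=2021&rft.volume=30"
    "&rft.spage=9456&rft.epage=9469&rft.issn=1057-7149"
    "&rft_id=info:doi/10.1109/TIP.2021.3126423"
)

fields = openurl_fields(EXAMPLE_URL)
print(fields["atitle"])
print(fields["jtitle"], fields["volume"], fields["spage"], fields["epage"])
```

Identifiers such as the DOI travel under rft_id rather than an rft.* key, so this sketch deliberately leaves them out; they would need separate handling.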