Task-Balanced Distillation for Object Detection

Mainstream object detectors are commonly constituted of two sub-tasks, including classification and regression tasks, implemented by two parallel heads. This classic design paradigm inevitably leads to inconsistent spatial distributions between classification score and localization quality (IOU). Th...


Bibliographic Details
Published in: arXiv.org 2022-08
Main authors: Tang, Ruining; Liu, Zhenyu; Li, Yangguang; Song, Yiguo; Liu, Hui; Wang, Qide; Shao, Jing; Duan, Guifang; Tan, Jianrong
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Tang, Ruining
Liu, Zhenyu
Li, Yangguang
Song, Yiguo
Liu, Hui
Wang, Qide
Shao, Jing
Duan, Guifang
Tan, Jianrong
description Mainstream object detectors are commonly constituted of two sub-tasks, including classification and regression tasks, implemented by two parallel heads. This classic design paradigm inevitably leads to inconsistent spatial distributions between classification score and localization quality (IOU). Therefore, this paper alleviates this misalignment in the view of knowledge distillation. First, we observe that the massive teacher achieves a higher proportion of harmonious predictions than the lightweight student. Based on this intriguing observation, a novel Harmony Score (HS) is devised to estimate the alignment of classification and regression qualities. HS models the relationship between two sub-tasks and is seen as prior knowledge to promote harmonious predictions for the student. Second, this spatial misalignment will result in inharmonious region selection when distilling features. To alleviate this problem, a novel Task-decoupled Feature Distillation (TFD) is proposed by flexibly balancing the contributions of classification and regression tasks. Eventually, HD and TFD constitute the proposed method, named Task-Balanced Distillation (TBD). Extensive experiments demonstrate the considerable potential and generalization of the proposed method. Specifically, when equipped with TBD, RetinaNet with ResNet-50 achieves 41.0 mAP under the COCO benchmark, outperforming the recent FGD and FRS.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2699825324</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2699825324</sourcerecordid><originalsourceid>FETCH-proquest_journals_26998253243</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mTQD0ksztZ1SsxJzEtOTVFwySwuyczJSSzJzM9TSMsvUvBPykpNLlFwSS0BUkBBHgbWtMSc4lReKM3NoOzmGuLsoVtQlF9YmlpcEp-VX1qUB5SKNzKztLQwMjU2MjEmThUAE1Iyjw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2699825324</pqid></control><display><type>article</type><title>Task-Balanced Distillation for Object Detection</title><source>Free E- Journals</source><creator>Tang, Ruining ; Liu, Zhenyu ; Li, Yangguang ; Song, Yiguo ; Liu, Hui ; Wang, Qide ; Shao, Jing ; Duan, Guifang ; Tan, Jianrong</creator><creatorcontrib>Tang, Ruining ; Liu, Zhenyu ; Li, Yangguang ; Song, Yiguo ; Liu, Hui ; Wang, Qide ; Shao, Jing ; Duan, Guifang ; Tan, Jianrong</creatorcontrib><description>Mainstream object detectors are commonly constituted of two sub-tasks, including classification and regression tasks, implemented by two parallel heads. This classic design paradigm inevitably leads to inconsistent spatial distributions between classification score and localization quality (IOU). Therefore, this paper alleviates this misalignment in the view of knowledge distillation. First, we observe that the massive teacher achieves a higher proportion of harmonious predictions than the lightweight student. Based on this intriguing observation, a novel Harmony Score (HS) is devised to estimate the alignment of classification and regression qualities. HS models the relationship between two sub-tasks and is seen as prior knowledge to promote harmonious predictions for the student. Second, this spatial misalignment will result in inharmonious region selection when distilling features. 
To alleviate this problem, a novel Task-decoupled Feature Distillation (TFD) is proposed by flexibly balancing the contributions of classification and regression tasks. Eventually, HD and TFD constitute the proposed method, named Task-Balanced Distillation (TBD). Extensive experiments demonstrate the considerable potential and generalization of the proposed method. Specifically, when equipped with TBD, RetinaNet with ResNet-50 achieves 41.0 mAP under the COCO benchmark, outperforming the recent FGD and FRS.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Classification ; Distillation ; Misalignment ; Object recognition ; Regression ; Spatial distribution</subject><ispartof>arXiv.org, 2022-08</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Tang, Ruining</creatorcontrib><creatorcontrib>Liu, Zhenyu</creatorcontrib><creatorcontrib>Li, Yangguang</creatorcontrib><creatorcontrib>Song, Yiguo</creatorcontrib><creatorcontrib>Liu, Hui</creatorcontrib><creatorcontrib>Wang, Qide</creatorcontrib><creatorcontrib>Shao, Jing</creatorcontrib><creatorcontrib>Duan, Guifang</creatorcontrib><creatorcontrib>Tan, Jianrong</creatorcontrib><title>Task-Balanced Distillation for Object Detection</title><title>arXiv.org</title><description>Mainstream object detectors are commonly constituted of two sub-tasks, including classification and regression tasks, implemented by two 
parallel heads. This classic design paradigm inevitably leads to inconsistent spatial distributions between classification score and localization quality (IOU). Therefore, this paper alleviates this misalignment in the view of knowledge distillation. First, we observe that the massive teacher achieves a higher proportion of harmonious predictions than the lightweight student. Based on this intriguing observation, a novel Harmony Score (HS) is devised to estimate the alignment of classification and regression qualities. HS models the relationship between two sub-tasks and is seen as prior knowledge to promote harmonious predictions for the student. Second, this spatial misalignment will result in inharmonious region selection when distilling features. To alleviate this problem, a novel Task-decoupled Feature Distillation (TFD) is proposed by flexibly balancing the contributions of classification and regression tasks. Eventually, HD and TFD constitute the proposed method, named Task-Balanced Distillation (TBD). Extensive experiments demonstrate the considerable potential and generalization of the proposed method. 
Specifically, when equipped with TBD, RetinaNet with ResNet-50 achieves 41.0 mAP under the COCO benchmark, outperforming the recent FGD and FRS.</description><subject>Classification</subject><subject>Distillation</subject><subject>Misalignment</subject><subject>Object recognition</subject><subject>Regression</subject><subject>Spatial distribution</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mTQD0ksztZ1SsxJzEtOTVFwySwuyczJSSzJzM9TSMsvUvBPykpNLlFwSS0BUkBBHgbWtMSc4lReKM3NoOzmGuLsoVtQlF9YmlpcEp-VX1qUB5SKNzKztLQwMjU2MjEmThUAE1Iyjw</recordid><startdate>20220805</startdate><enddate>20220805</enddate><creator>Tang, Ruining</creator><creator>Liu, Zhenyu</creator><creator>Li, Yangguang</creator><creator>Song, Yiguo</creator><creator>Liu, Hui</creator><creator>Wang, Qide</creator><creator>Shao, Jing</creator><creator>Duan, Guifang</creator><creator>Tan, Jianrong</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220805</creationdate><title>Task-Balanced Distillation for Object Detection</title><author>Tang, Ruining ; Liu, Zhenyu ; Li, Yangguang ; Song, Yiguo ; Liu, Hui ; Wang, Qide ; Shao, Jing ; Duan, Guifang ; Tan, 
Jianrong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_26998253243</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Classification</topic><topic>Distillation</topic><topic>Misalignment</topic><topic>Object recognition</topic><topic>Regression</topic><topic>Spatial distribution</topic><toplevel>online_resources</toplevel><creatorcontrib>Tang, Ruining</creatorcontrib><creatorcontrib>Liu, Zhenyu</creatorcontrib><creatorcontrib>Li, Yangguang</creatorcontrib><creatorcontrib>Song, Yiguo</creatorcontrib><creatorcontrib>Liu, Hui</creatorcontrib><creatorcontrib>Wang, Qide</creatorcontrib><creatorcontrib>Shao, Jing</creatorcontrib><creatorcontrib>Duan, Guifang</creatorcontrib><creatorcontrib>Tan, Jianrong</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tang, 
Ruining</au><au>Liu, Zhenyu</au><au>Li, Yangguang</au><au>Song, Yiguo</au><au>Liu, Hui</au><au>Wang, Qide</au><au>Shao, Jing</au><au>Duan, Guifang</au><au>Tan, Jianrong</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Task-Balanced Distillation for Object Detection</atitle><jtitle>arXiv.org</jtitle><date>2022-08-05</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Mainstream object detectors are commonly constituted of two sub-tasks, including classification and regression tasks, implemented by two parallel heads. This classic design paradigm inevitably leads to inconsistent spatial distributions between classification score and localization quality (IOU). Therefore, this paper alleviates this misalignment in the view of knowledge distillation. First, we observe that the massive teacher achieves a higher proportion of harmonious predictions than the lightweight student. Based on this intriguing observation, a novel Harmony Score (HS) is devised to estimate the alignment of classification and regression qualities. HS models the relationship between two sub-tasks and is seen as prior knowledge to promote harmonious predictions for the student. Second, this spatial misalignment will result in inharmonious region selection when distilling features. To alleviate this problem, a novel Task-decoupled Feature Distillation (TFD) is proposed by flexibly balancing the contributions of classification and regression tasks. Eventually, HD and TFD constitute the proposed method, named Task-Balanced Distillation (TBD). Extensive experiments demonstrate the considerable potential and generalization of the proposed method. Specifically, when equipped with TBD, RetinaNet with ResNet-50 achieves 41.0 mAP under the COCO benchmark, outperforming the recent FGD and FRS.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2022-08
issn 2331-8422
language eng
recordid cdi_proquest_journals_2699825324
source Free E- Journals
subjects Classification
Distillation
Misalignment
Object recognition
Regression
Spatial distribution
title Task-Balanced Distillation for Object Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T15%3A39%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Task-Balanced%20Distillation%20for%20Object%20Detection&rft.jtitle=arXiv.org&rft.au=Tang,%20Ruining&rft.date=2022-08-05&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2699825324%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2699825324&rft_id=info:pmid/&rfr_iscdi=true