Reciprocal Teacher-Student Learning via Forward and Feedback Knowledge Distillation

Knowledge distillation (KD) is a prevalent model compression technique in deep learning that aims to leverage knowledge from a large teacher model to enhance the training of a smaller student model. It has found success in deploying compact deep models in intelligent applications such as intelligent transportation, smart health, and distributed intelligence. Current knowledge distillation methods primarily fall into two categories: offline and online knowledge distillation. Offline methods involve a one-way distillation process that transfers unvaried knowledge from teacher to student, while online methods enable the simultaneous training of multiple peer students. However, existing knowledge distillation methods often face challenges where the student may not fully comprehend the teacher's knowledge due to model capacity gaps, and there may be knowledge incongruence among the outputs of multiple students trained without teacher guidance. To address these issues, we propose a novel reciprocal teacher-student learning scheme, inspired by human teaching and examination, realized through forward and feedback knowledge distillation (FFKD). Forward knowledge distillation operates offline, while feedback knowledge distillation follows an online scheme. The rationale is that feedback knowledge distillation enables the pre-trained teacher model to receive feedback from students, allowing the teacher to refine its teaching strategies accordingly. To achieve this, we introduce a new weighting constraint that gauges the extent of students' understanding of the teacher's knowledge, which is then utilized to enhance teaching strategies. Experimental results on five visual recognition datasets demonstrate that the proposed FFKD outperforms current state-of-the-art knowledge distillation methods.
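The abstract describes two coupled components: a conventional offline (forward) distillation term and an online feedback term in which the pre-trained teacher adjusts its teaching according to how well the student has absorbed its knowledge. The PyTorch sketch below illustrates that general shape only; the function names, the temperature and alpha values, and the exp(-KL) feedback weighting are illustrative assumptions, not the authors' actual weighting constraint, which is specified in the full text of the article.

```python
# Minimal sketch of a forward + feedback distillation setup, assuming standard
# temperature-scaled KD for the forward pass and a hypothetical exp(-KL) gauge
# of student understanding for the feedback pass.
import torch
import torch.nn.functional as F


def forward_kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Offline (forward) distillation: cross-entropy on labels plus
    temperature-scaled KL divergence toward the teacher's soft targets."""
    ce = F.cross_entropy(student_logits, targets)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * ce + alpha * kl


def feedback_weight(student_logits, teacher_logits, T=4.0):
    """Hypothetical per-batch gauge of student understanding: low divergence
    from the teacher gives a weight near 1, high divergence shrinks it."""
    with torch.no_grad():
        kl = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        )
    return torch.exp(-kl)


def feedback_kd_loss(teacher_logits, student_logits, targets, T=4.0):
    """Online feedback step: the pre-trained teacher's own supervised loss is
    scaled up when the student currently tracks the teacher poorly."""
    w = 1.0 - feedback_weight(student_logits, teacher_logits, T)
    return w * F.cross_entropy(teacher_logits, targets)


if __name__ == "__main__":
    # Toy usage with random logits standing in for real teacher/student networks.
    torch.manual_seed(0)
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10, requires_grad=True)
    targets = torch.randint(0, 10, (8,))
    print(forward_kd_loss(student_logits, teacher_logits.detach(), targets))
    print(feedback_kd_loss(teacher_logits, student_logits.detach(), targets))
```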

Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2024, Vol. 26, pp. 7901-7916
Main authors: Gou, Jianping; Chen, Yu; Yu, Baosheng; Liu, Jinhua; Du, Lan; Wan, Shaohua; Yi, Zhang
Format: Article
Language: English
container_end_page 7916
container_issue
container_start_page 7901
container_title IEEE transactions on multimedia
container_volume 26
creator Gou, Jianping
Chen, Yu
Yu, Baosheng
Liu, Jinhua
Du, Lan
Wan, Shaohua
Yi, Zhang
doi_str_mv 10.1109/TMM.2024.3372833
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1520-9210
ispartof IEEE transactions on multimedia, 2024, Vol.26, p.7901-7916
issn 1520-9210
1941-0077
language eng
recordid cdi_crossref_primary_10_1109_TMM_2024_3372833
source IEEE Electronic Library (IEL)
subjects Computational modeling
Correlation
Deep learning
Distillation
Feedback
feedback knowledge
Knowledge
knowledge distillation
Knowledge engineering
Knowledge transfer
Learning
Model compression
Reviews
Students
Teachers
Training
visual recognition
Visualization
title Reciprocal Teacher-Student Learning via Forward and Feedback Knowledge Distillation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T22%3A43%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reciprocal%20Teacher-Student%20Learning%20via%20Forward%20and%20Feedback%20Knowledge%20Distillation&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Gou,%20Jianping&rft.date=2024&rft.volume=26&rft.spage=7901&rft.epage=7916&rft.pages=7901-7916&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2024.3372833&rft_dat=%3Cproquest_RIE%3E3044651986%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3044651986&rft_id=info:pmid/&rft_ieee_id=10465265&rfr_iscdi=true