BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection
In recent years, text sentiment analysis has attracted wide attention and has promoted the rise and development of stance detection research. The purpose of stance detection is to determine the author's stance (favor or against) towards a specific target or proposition in a text. Pre-trained language models such as BERT have been shown to perform well on this task. However, in many real-world scenarios they are computationally expensive, because such heavy models are difficult to deploy with limited resources. To improve efficiency while preserving performance, we propose a knowledge distillation model, BERTtoCNN, which combines the classic distillation loss and a similarity-preserving loss in a joint knowledge distillation framework. On the one hand, BERTtoCNN provides an efficient distillation process to train a compact 'student' CNN from a much larger 'teacher' language model, BERT. On the other hand, based on the similarity-preserving loss function, BERTtoCNN guides the training of the student network so that input pairs with similar (dissimilar) activations in the teacher network also have similar (dissimilar) activations in the student network. We conduct experiments on open Chinese and English stance detection datasets, and the results show that our model clearly outperforms competitive baseline methods.
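The abstract describes a joint objective: the classic soft-target distillation loss (Hinton et al.) plus a similarity-preserving loss (Tung and Mori) that matches the batch-wise activation-similarity matrices of teacher and student. As a rough illustration of how such a combination is typically implemented, here is a minimal PyTorch sketch. It is not the paper's code: the function names, the temperature `T`, and the weights `alpha` and `beta` are illustrative assumptions, since the record does not give the paper's settings.

```python
import torch
import torch.nn.functional as F

def similarity_preserving_loss(f_s: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
    """Similarity-preserving KD loss: match the row-normalized batch Gram
    matrices of student (f_s) and teacher (f_t) activations."""
    b = f_s.size(0)
    g_s = F.normalize(f_s.view(b, -1) @ f_s.view(b, -1).t(), p=2, dim=1)
    g_t = F.normalize(f_t.view(b, -1) @ f_t.view(b, -1).t(), p=2, dim=1)
    # Frobenius-norm difference of the two similarity matrices, scaled by b^2.
    return ((g_s - g_t) ** 2).sum() / (b * b)

def joint_kd_loss(student_logits, teacher_logits, labels, f_s, f_t,
                  T: float = 4.0, alpha: float = 0.5, beta: float = 1.0):
    """Classic distillation (softened KL) + hard-label cross-entropy
    + similarity-preserving term. T, alpha, beta are hypothetical
    hyperparameters, not values taken from the paper."""
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    sp = similarity_preserving_loss(f_s, f_t)
    return (1.0 - alpha) * ce + alpha * kd + beta * sp
```

In a BERT-to-CNN setup of the kind the abstract describes, `teacher_logits` and `f_t` would come from the frozen fine-tuned BERT while `student_logits` and `f_s` come from the CNN being trained; only the student receives gradients.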
Published in: PloS one, 2021-09, Vol. 16 (9), p. e0257130
Main authors: Li, Yang; Sun, Yuqing; Zhu, Nana
Format: Article
Language: English
Online access: Full text
| Field | Value |
|---|---|
| container_end_page | e0257130 |
| container_issue | 9 |
| container_start_page | e0257130 |
| container_title | PloS one |
| container_volume | 16 |
| creator | Li, Yang; Sun, Yuqing; Zhu, Nana |
| doi_str_mv | 10.1371/journal.pone.0257130 |
| format | Article |
| contributor | Zhang, Weinan |
| publisher | Public Library of Science (San Francisco) |
| pmid | 34506549 |
| startdate | 2021-09-10 |
| orcidid | 0000-0002-0403-7287 |
| fulltext | fulltext |
| identifier | ISSN: 1932-6203; EISSN: 1932-6203 |
| ispartof | PloS one, 2021-09, Vol. 16 (9), p. e0257130 |
| issn | 1932-6203 |
| language | eng |
| recordid | cdi_plos_journals_2571447373 |
| source | DOAJ Directory of Open Access Journals; Public Library of Science (PLoS); EZB-FREE-00999 freely available EZB journals; PubMed Central; Free Full-Text Journals in Chemistry |
| subjects | Analysis; Biology and Life Sciences; Classification; Computational linguistics; Computer and Information Sciences; Computer engineering; Data mining; Distillation; Information management; Knowledge; Language; Language processing; Learning; Methods; Modelling; Natural language interfaces; Network management systems; Neural networks; Optimization; People and Places; Product reviews; Research and Analysis Methods; Sentiment analysis; Similarity; Social networks; Social Sciences; Teachers; Text categorization |
| title | BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection |
| url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T20%3A01%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=BERTtoCNN:%20Similarity-preserving%20enhanced%20knowledge%20distillation%20for%20stance%20detection&rft.jtitle=PloS%20one&rft.au=Li,%20Yang&rft.date=2021-09-10&rft.volume=16&rft.issue=9&rft.spage=e0257130&rft.epage=e0257130&rft.pages=e0257130-e0257130&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0257130&rft_dat=%3Cgale_plos_%3EA674968939%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2571447373&rft_id=info:pmid/34506549&rft_galeid=A674968939&rft_doaj_id=oai_doaj_org_article_20905e10c23048df9ce82dda4d80aa51&rfr_iscdi=true |