Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder

As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and...

Detailed description

Bibliographic details
Published in:	IEEE transactions on affective computing 2017-07, Vol.8 (3), p.328-339
Main authors:	Zhao, Rui; Mao, Kezhi
Format:	Article
Language:	eng
Subjects:	Adolescents; Adults; Analytical models; Bullying; Children; Cyberbullying; Cyberbullying detection; Digital media; Feature extraction; Machine learning; Mathematical models; Media; Messages; Noise reduction; Numerical models; representation learning; Representations; Robustness; Robustness (mathematics); Semantics; Short message service; Social networks; stacked denoising autoencoders; Teaching methods; text mining; word embedding
Online access:	Order full text

container_end_page	339
container_issue	3
container_start_page	328
container_title	IEEE transactions on affective computing
container_volume	8
creator	Zhao, Rui ; Mao, Kezhi
description	As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and safe social media environment. In this meaningful research area, one critical issue is robust and discriminative numerical representation learning of text messages. In this paper, we propose a new representation learning method to tackle this problem. Our method named semantic-enhanced marginalized denoising auto-encoder (smSDA) is developed via semantic extension of the popular deep learning model stacked denoising autoencoder (SDA). The semantic extension consists of semantic dropout noise and sparsity constraints, where the semantic dropout noise is designed based on domain knowledge and the word embedding technique. Our proposed method is able to exploit the hidden feature structure of bullying information and learn a robust and discriminative representation of text. Comprehensive experiments on two public cyberbullying corpora (Twitter and MySpace) are conducted, and the results show that our proposed approaches outperform other baseline text representation learning methods.
doi_str_mv	10.1109/TAFFC.2016.2531682
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2174429561</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7412690</ieee_id><sourcerecordid>2174429561</sourcerecordid><originalsourceid>FETCH-LOGICAL-c361t-352a4ba186c46edb4a6b80ba6d24a2ab7bcafb75a5813aaa6b240995eeb9e25b3</originalsourceid><addsrcrecordid>eNpNkE1PwkAQhjdGEwnyB_RC4rm43-0esYCaaDyIXpvZ7YAlpcXd9oC_3kWIcS7z9T6TyUvINaMTxqi5W04Xi3zCKdMTrgTTGT8jA2akSQSV6vxffUlGIWxoDCGE5umAfOR7i972db2vmvV4hh26rmqb8T0ELMexeMMtNF3lknnzCY2Lwxfw66qBuvqOzQybtgoHdtp3bRS5tkR_RS5WUAccnfKQvC_my_wxeX59eMqnz4kTmnWJUBykBZZpJzWWVoK2GbWgSy6Bg02tg5VNFaiMCYC45ZIaoxCtQa6sGJLb492db796DF2xaXsffwsFZ6mU3CjNooofVc63IXhcFTtfbcHvC0aLg4XFr4XFwcLiZGGEbo5QhYh_QCoZ14aKH0ndbZs</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2174429561</pqid></control><display><type>article</type><title>Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder</title><source>IEEE Electronic Library (IEL)</source><creator>Zhao, Rui ; Mao, Kezhi</creator><creatorcontrib>Zhao, Rui ; Mao, Kezhi</creatorcontrib><description>As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and safe social media environment. In this meaningful research area, one critical issue is robust and discriminative numerical representation learning of text messages. In this paper, we propose a new representation learning method to tackle this problem. Our method named semantic-enhanced marginalized denoising auto-encoder (smSDA) is developed via semantic extension of the popular deep learning model stacked denoising autoencoder (SDA). 
The semantic extension consists of semantic dropout noise and sparsity constraints, where the semantic dropout noise is designed based on domain knowledge and the word embedding technique. Our proposed method is able to exploit the hidden feature structure of bullying information and learn a robust and discriminative representation of text. Comprehensive experiments on two public cyberbullying corpora (Twitter and MySpace) are conducted, and the results show that our proposed approaches outperform other baseline text representation learning methods.</description><identifier>ISSN: 1949-3045</identifier><identifier>EISSN: 1949-3045</identifier><identifier>DOI: 10.1109/TAFFC.2016.2531682</identifier><identifier>CODEN: ITACBQ</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Adolescents ; Adults ; Analytical models ; Bullying ; Children ; Cyberbullying ; Cyberbullying detection ; Digital media ; Feature extraction ; Machine learning ; Mathematical models ; Media ; Messages ; Noise reduction ; Numerical models ; representation learning ; Representations ; Robustness ; Robustness (mathematics) ; Semantics ; Short message service ; Social networks ; stacked denoising autoencoders ; Teaching methods ; text mining ; word embedding</subject><ispartof>IEEE transactions on affective computing, 2017-07, Vol.8 (3), p.328-339</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2017</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c361t-352a4ba186c46edb4a6b80ba6d24a2ab7bcafb75a5813aaa6b240995eeb9e25b3</citedby><cites>FETCH-LOGICAL-c361t-352a4ba186c46edb4a6b80ba6d24a2ab7bcafb75a5813aaa6b240995eeb9e25b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7412690$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7412690$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhao, Rui</creatorcontrib><creatorcontrib>Mao, Kezhi</creatorcontrib><title>Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder</title><title>IEEE transactions on affective computing</title><addtitle>T-AFFC</addtitle><description>As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and safe social media environment. In this meaningful research area, one critical issue is robust and discriminative numerical representation learning of text messages. In this paper, we propose a new representation learning method to tackle this problem. Our method named semantic-enhanced marginalized denoising auto-encoder (smSDA) is developed via semantic extension of the popular deep learning model stacked denoising autoencoder (SDA). The semantic extension consists of semantic dropout noise and sparsity constraints, where the semantic dropout noise is designed based on domain knowledge and the word embedding technique. 
Our proposed method is able to exploit the hidden feature structure of bullying information and learn a robust and discriminative representation of text. Comprehensive experiments on two public cyberbullying corpora (Twitter and MySpace) are conducted, and the results show that our proposed approaches outperform other baseline text representation learning methods.</description><subject>Adolescents</subject><subject>Adults</subject><subject>Analytical models</subject><subject>Bullying</subject><subject>Children</subject><subject>Cyberbullying</subject><subject>Cyberbullying detection</subject><subject>Digital media</subject><subject>Feature extraction</subject><subject>Machine learning</subject><subject>Mathematical models</subject><subject>Media</subject><subject>Messages</subject><subject>Noise reduction</subject><subject>Numerical models</subject><subject>representation learning</subject><subject>Representations</subject><subject>Robustness</subject><subject>Robustness (mathematics)</subject><subject>Semantics</subject><subject>Short message service</subject><subject>Social networks</subject><subject>stacked denoising autoencoders</subject><subject>Teaching methods</subject><subject>text mining</subject><subject>word embedding</subject><issn>1949-3045</issn><issn>1949-3045</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE1PwkAQhjdGEwnyB_RC4rm43-0esYCaaDyIXpvZ7YAlpcXd9oC_3kWIcS7z9T6TyUvINaMTxqi5W04Xi3zCKdMTrgTTGT8jA2akSQSV6vxffUlGIWxoDCGE5umAfOR7i972db2vmvV4hh26rmqb8T0ELMexeMMtNF3lknnzCY2Lwxfw66qBuvqOzQybtgoHdtp3bRS5tkR_RS5WUAccnfKQvC_my_wxeX59eMqnz4kTmnWJUBykBZZpJzWWVoK2GbWgSy6Bg02tg5VNFaiMCYC45ZIaoxCtQa6sGJLb492db796DF2xaXsffwsFZ6mU3CjNooofVc63IXhcFTtfbcHvC0aLg4XFr4XFwcLiZGGEbo5QhYh_QCoZ14aKH0ndbZs</recordid><startdate>20170701</startdate><enddate>20170701</enddate><creator>Zhao, Rui</creator><creator>Mao, 
Kezhi</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20170701</creationdate><title>Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder</title><author>Zhao, Rui ; Mao, Kezhi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c361t-352a4ba186c46edb4a6b80ba6d24a2ab7bcafb75a5813aaa6b240995eeb9e25b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Adolescents</topic><topic>Adults</topic><topic>Analytical models</topic><topic>Bullying</topic><topic>Children</topic><topic>Cyberbullying</topic><topic>Cyberbullying detection</topic><topic>Digital media</topic><topic>Feature extraction</topic><topic>Machine learning</topic><topic>Mathematical models</topic><topic>Media</topic><topic>Messages</topic><topic>Noise reduction</topic><topic>Numerical models</topic><topic>representation learning</topic><topic>Representations</topic><topic>Robustness</topic><topic>Robustness (mathematics)</topic><topic>Semantics</topic><topic>Short message service</topic><topic>Social networks</topic><topic>stacked denoising autoencoders</topic><topic>Teaching methods</topic><topic>text mining</topic><topic>word embedding</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhao, Rui</creatorcontrib><creatorcontrib>Mao, Kezhi</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and 
Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on affective computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhao, Rui</au><au>Mao, Kezhi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder</atitle><jtitle>IEEE transactions on affective computing</jtitle><stitle>T-AFFC</stitle><date>2017-07-01</date><risdate>2017</risdate><volume>8</volume><issue>3</issue><spage>328</spage><epage>339</epage><pages>328-339</pages><issn>1949-3045</issn><eissn>1949-3045</eissn><coden>ITACBQ</coden><abstract>As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and safe social media environment. In this meaningful research area, one critical issue is robust and discriminative numerical representation learning of text messages. In this paper, we propose a new representation learning method to tackle this problem. Our method named semantic-enhanced marginalized denoising auto-encoder (smSDA) is developed via semantic extension of the popular deep learning model stacked denoising autoencoder (SDA). The semantic extension consists of semantic dropout noise and sparsity constraints, where the semantic dropout noise is designed based on domain knowledge and the word embedding technique. 
Our proposed method is able to exploit the hidden feature structure of bullying information and learn a robust and discriminative representation of text. Comprehensive experiments on two public cyberbullying corpora (Twitter and MySpace) are conducted, and the results show that our proposed approaches outperform other baseline text representation learning methods.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TAFFC.2016.2531682</doi><tpages>12</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1949-3045
ispartof	IEEE transactions on affective computing, 2017-07, Vol.8 (3), p.328-339
issn	1949-3045 1949-3045
language	eng
recordid	cdi_proquest_journals_2174429561
source	IEEE Electronic Library (IEL)
subjects	Adolescents ; Adults ; Analytical models ; Bullying ; Children ; Cyberbullying ; Cyberbullying detection ; Digital media ; Feature extraction ; Machine learning ; Mathematical models ; Media ; Messages ; Noise reduction ; Numerical models ; representation learning ; Representations ; Robustness ; Robustness (mathematics) ; Semantics ; Short message service ; Social networks ; stacked denoising autoencoders ; Teaching methods ; text mining ; word embedding
title	Cyberbullying Detection Based on Semantic-Enhanced Marginalized Denoising Auto-Encoder
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T18%3A05%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cyberbullying%20Detection%20Based%20on%20Semantic-Enhanced%20Marginalized%20Denoising%20Auto-Encoder&rft.jtitle=IEEE%20transactions%20on%20affective%20computing&rft.au=Zhao,%20Rui&rft.date=2017-07-01&rft.volume=8&rft.issue=3&rft.spage=328&rft.epage=339&rft.pages=328-339&rft.issn=1949-3045&rft.eissn=1949-3045&rft.coden=ITACBQ&rft_id=info:doi/10.1109/TAFFC.2016.2531682&rft_dat=%3Cproquest_RIE%3E2174429561%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2174429561&rft_id=info:pmid/&rft_ieee_id=7412690&rfr_iscdi=true