Equivalence between dropout and data augmentation: A mathematical check
The great achievements of deep learning can be attributed to its tremendous power of feature representation, where the representation ability comes from the nonlinear activation function and the large number of network nodes. However, deep neural networks suffer from serious issues such as slow convergence, and dropout is an outstanding method to improve the network's generalization ability and test performance. Many explanations have been given for why dropout works so well, among which the equivalence between dropout and data augmentation is a newly proposed and stimulating explanation. In this article, we discuss the exact conditions for this equivalence to hold. Our main result guarantees that the equivalence relation almost surely holds if the dimension of the input space is equal to or higher than that of the output space. Furthermore, if the commonly used rectified linear unit activation function is replaced by some newly proposed activation function whose value lies in R, then our results can be extended to multilayer neural networks. For comparison, some counterexamples are given for the inequivalent case. Finally, a series of experiments on the MNIST dataset are conducted to illustrate and help understand the theoretical results.
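The abstract's central condition — that the equivalence almost surely holds when the input dimension is at least the output dimension — can be illustrated with a small numerical check. The numpy sketch below is not the authors' code: the single ReLU layer, the Bernoulli dropout mask, and the pseudoinverse solve are illustrative assumptions about how a dropout-perturbed activation could be reproduced by a transformed ("augmented") input.

```python
import numpy as np

rng = np.random.default_rng(0)

d, h = 8, 5                      # input dimension d >= layer output dimension h
W = rng.normal(size=(h, d))      # random weights: full row rank almost surely
x = rng.normal(size=d)

def relu(z):
    return np.maximum(z, 0.0)

# Dropout view: zero each hidden unit independently with probability p.
p = 0.5
mask = (rng.random(h) > p).astype(float)
dropped = mask * relu(W @ x)

# Data-augmentation view: solve W @ x_aug = mask * (W @ x) for a transformed
# input x_aug. Since h <= d and W has full row rank, the pseudoinverse gives
# an exact solution, so the clean forward pass reproduces the dropped activation.
x_aug = np.linalg.pinv(W) @ (mask * (W @ x))
augmented = relu(W @ x_aug)

print(np.allclose(dropped, augmented))   # expected: True
```

When the output dimension exceeds the input dimension (h > d), the linear system above is overdetermined and generally has no solution, which is consistent with the abstract's remark that counterexamples exist for the inequivalent case.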
Saved in:
Published in: | Neural networks 2019-07, Vol.115, p.82-89 |
---|---|
Main authors: | Zhao, Dazhi; Yu, Guozhu; Xu, Peng; Luo, Maokang |
Format: | Article |
Language: | eng |
Subjects: | Data augmentation; Deep learning; Dropout; Mathematical check; Neural network |
Online access: | Full text |
container_end_page | 89 |
---|---|
container_issue | |
container_start_page | 82 |
container_title | Neural networks |
container_volume | 115 |
creator | Zhao, Dazhi; Yu, Guozhu; Xu, Peng; Luo, Maokang |
description | The great achievements of deep learning can be attributed to its tremendous power of feature representation, where the representation ability comes from the nonlinear activation function and the large number of network nodes. However, deep neural networks suffer from serious issues such as slow convergence, and dropout is an outstanding method to improve the network’s generalization ability and test performance. Many explanations have been given for why dropout works so well, among which the equivalence between dropout and data augmentation is a newly proposed and stimulating explanation. In this article, we discuss the exact conditions for this equivalence to hold. Our main result guarantees that the equivalence relation almost surely holds if the dimension of the input space is equal to or higher than that of the output space. Furthermore, if the commonly used rectified linear unit activation function is replaced by some newly proposed activation function whose value lies in R, then our results can be extended to multilayer neural networks. For comparison, some counterexamples are given for the inequivalent case. Finally, a series of experiments on the MNIST dataset are conducted to illustrate and help understand the theoretical results. |
doi_str_mv | 10.1016/j.neunet.2019.03.013 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0893-6080 |
ispartof | Neural networks, 2019-07, Vol.115, p.82-89 |
issn | 0893-6080; 1879-2782 |
language | eng |
recordid | cdi_proquest_miscellaneous_2209599612 |
source | Elsevier ScienceDirect Journals |
subjects | Data augmentation; Deep learning; Dropout; Mathematical check; Neural network |
title | Equivalence between dropout and data augmentation: A mathematical check |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T12%3A27%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Equivalence%20between%20dropout%20and%20data%20augmentation:%20A%20mathematical%20check&rft.jtitle=Neural%20networks&rft.au=Zhao,%20Dazhi&rft.date=2019-07-01&rft.volume=115&rft.spage=82&rft.epage=89&rft.pages=82-89&rft.issn=0893-6080&rft.eissn=1879-2782&rft_id=info:doi/10.1016/j.neunet.2019.03.013&rft_dat=%3Cproquest_cross%3E2209599612%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2209599612&rft_id=info:pmid/30978610&rft_els_id=S0893608019300942&rfr_iscdi=true |