Equivalence between dropout and data augmentation: A mathematical check
The great achievements of deep learning can be attributed to its tremendous power of feature representation, where the representation ability comes from the nonlinear activation function and the large number of network nodes. However, deep neural networks suffer from serious issues such as slow convergence, and dropout is an outstanding method to improve the network's generalization ability and test performance. Many explanations have been given for why dropout works so well, among which the equivalence between dropout and data augmentation is a newly proposed and stimulating explanation. In this article, we discuss the exact conditions for this equivalence to hold. Our main result guarantees that the equivalence relation almost surely holds if the dimension of the input space is equal to or higher than that of the output space. Furthermore, if the commonly used rectified linear unit activation function is replaced by some newly proposed activation function whose value lies in R, then our results can be extended to multilayer neural networks. For comparison, some counterexamples are given for the inequivalent case. Finally, a series of experiments on the MNIST dataset are conducted to illustrate and help understand the theoretical results.
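The abstract's central condition — that the equivalence almost surely holds when the input dimension is at least the output dimension — can be illustrated with a small numerical check. The numpy sketch below is not the authors' code: the single ReLU layer, the Bernoulli dropout mask, and the pseudoinverse solve are illustrative assumptions about how a dropout-perturbed activation could be reproduced by a transformed ("augmented") input.

```python
import numpy as np

rng = np.random.default_rng(0)

d, h = 8, 5                      # input dimension d >= layer output dimension h
W = rng.normal(size=(h, d))      # random weights: full row rank almost surely
x = rng.normal(size=d)

def relu(z):
    return np.maximum(z, 0.0)

# Dropout view: zero each hidden unit independently with probability p.
p = 0.5
mask = (rng.random(h) > p).astype(float)
dropped = mask * relu(W @ x)

# Data-augmentation view: solve W @ x_aug = mask * (W @ x) for a transformed
# input x_aug. Since h <= d and W has full row rank, the pseudoinverse gives
# an exact solution, so the clean forward pass reproduces the dropped activation.
x_aug = np.linalg.pinv(W) @ (mask * (W @ x))
augmented = relu(W @ x_aug)

print(np.allclose(dropped, augmented))   # expected: True
```

When the output dimension exceeds the input dimension (h > d), the linear system above is overdetermined and generally has no solution, which is consistent with the abstract's remark that counterexamples exist for the inequivalent case.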
Saved in:
Published in: | Neural networks 2019-07, Vol.115, p.82-89 |
---|---|
Main authors: | Zhao, Dazhi; Yu, Guozhu; Xu, Peng; Luo, Maokang |
Format: | Article |
Language: | eng |
Subjects: | Data augmentation; Deep learning; Dropout; Mathematical check; Neural network |
Online access: | Full text |
container_end_page | 89 |
---|---|
container_issue | |
container_start_page | 82 |
container_title | Neural networks |
container_volume | 115 |
creator | Zhao, Dazhi; Yu, Guozhu; Xu, Peng; Luo, Maokang |
description | The great achievements of deep learning can be attributed to its tremendous power of feature representation, where the representation ability comes from the nonlinear activation function and the large number of network nodes. However, deep neural networks suffer from serious issues such as slow convergence, and dropout is an outstanding method to improve the network’s generalization ability and test performance. Many explanations have been given for why dropout works so well, among which the equivalence between dropout and data augmentation is a newly proposed and stimulating explanation. In this article, we discuss the exact conditions for this equivalence to hold. Our main result guarantees that the equivalence relation almost surely holds if the dimension of the input space is equal to or higher than that of the output space. Furthermore, if the commonly used rectified linear unit activation function is replaced by some newly proposed activation function whose value lies in R, then our results can be extended to multilayer neural networks. For comparison, some counterexamples are given for the inequivalent case. Finally, a series of experiments on the MNIST dataset are conducted to illustrate and help understand the theoretical results. |
doi_str_mv | 10.1016/j.neunet.2019.03.013 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0893-6080 |
ispartof | Neural networks, 2019-07, Vol.115, p.82-89 |
issn | 0893-6080; 1879-2782 |
language | eng |
recordid | cdi_proquest_miscellaneous_2209599612 |
source | Elsevier ScienceDirect Journals |
subjects | Data augmentation; Deep learning; Dropout; Mathematical check; Neural network |
title | Equivalence between dropout and data augmentation: A mathematical check |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T12%3A27%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Equivalence%20between%20dropout%20and%20data%20augmentation:%20A%20mathematical%20check&rft.jtitle=Neural%20networks&rft.au=Zhao,%20Dazhi&rft.date=2019-07-01&rft.volume=115&rft.spage=82&rft.epage=89&rft.pages=82-89&rft.issn=0893-6080&rft.eissn=1879-2782&rft_id=info:doi/10.1016/j.neunet.2019.03.013&rft_dat=%3Cproquest_cross%3E2209599612%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2209599612&rft_id=info:pmid/30978610&rft_els_id=S0893608019300942&rfr_iscdi=true |