Impact of deep learning-based dropout on shallow neural networks applied to stream temperature modelling
Published in: Earth-Science Reviews, 2020-02, Vol. 201, p. 103076, Article 103076
Main authors: , ,
Format: Article
Language: English
Online access: Full text
Abstract: Although the applicability of deep learning in various fields of the earth sciences is rapidly increasing, shallow multilayer-perceptron neural networks remain widely used for regression problems. Despite many clear distinctions between deep and shallow neural networks, some techniques developed for deep learning may help improve shallow models. Dropout, a simple approach to avoiding overfitting by randomly skipping some nodes in a network during each training iteration, is among the methodological features that made deep learning networks successful. In this study we review dropout methods and show empirically that, when used together with early stopping, dropout and its variant dropconnect can improve the performance of shallow multilayer-perceptron neural networks. Shallow neural networks are applied to streamwater temperature modelling problems in six catchments, based on air temperature, river discharge and the declination of the Sun. We found that when training of a particular neural network architecture that includes at least a few hidden nodes is repeated many times, dropout reduces the number of models that perform poorly on testing data, and hence improves the mean performance. If the number of inputs or hidden nodes is very low, dropout only disturbs training. However, nodes need to be dropped out with a much lower probability than in deep neural networks (about 1%, instead of the 10–50% typical for deep learning), because the network contains far fewer nodes. Larger dropout probabilities hinder convergence of the training algorithm and lead to poor results on both calibration and testing data. Dropconnect turned out to be slightly more effective than dropout.
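To make the abstract's distinction concrete, the sketch below (not the authors' code) shows a forward pass of a single-hidden-layer MLP in which either whole hidden nodes (dropout) or individual weights (dropconnect) are zeroed with a small probability during training. The layer sizes, variable names and the value p = 0.01 are illustrative assumptions loosely based on the abstract.

```python
# Minimal sketch: dropout vs. dropconnect in a shallow (one-hidden-layer) MLP.
# All names, shapes and p = 0.01 are illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, b1, W2, b2, p=0.01, mode="dropout", training=True):
    """One forward pass with optional dropout or dropconnect.

    x  : (n_inputs,)  input vector, e.g. air temperature, discharge, solar declination
    W1 : (n_hidden, n_inputs)  input-to-hidden weights
    W2 : (1, n_hidden)         hidden-to-output weights
    p  : probability of dropping a node (dropout) or a single weight (dropconnect)
    """
    if training and mode == "dropconnect":
        # Dropconnect: zero individual weights with probability p,
        # rescaling the survivors so the expected activation is unchanged.
        mask = (rng.random(W1.shape) >= p).astype(float)
        W1 = W1 * mask / (1.0 - p)

    h = np.tanh(W1 @ x + b1)             # hidden-layer activations

    if training and mode == "dropout":
        # Dropout: zero whole hidden nodes with probability p (inverted dropout).
        mask = (rng.random(h.shape) >= p).astype(float)
        h = h * mask / (1.0 - p)

    return W2 @ h + b2                    # linear output, e.g. water temperature estimate


# Tiny usage example with made-up shapes (3 inputs, 8 hidden nodes).
n_in, n_hid = 3, 8
W1 = rng.normal(scale=0.5, size=(n_hid, n_in))
b1 = np.zeros(n_hid)
W2 = rng.normal(scale=0.5, size=(1, n_hid))
b2 = np.zeros(1)
x = np.array([12.3, 45.0, 0.2])

print(forward(x, W1, b1, W2, b2, p=0.01, mode="dropout"))
print(forward(x, W1, b1, W2, b2, p=0.01, mode="dropconnect"))
```

With only a handful of hidden nodes, even p = 0.01 removes a node in a noticeable fraction of iterations, which is consistent with the abstract's point that the much larger probabilities common in deep learning would cripple training of such small networks.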
ISSN: 0012-8252, 1872-6828
DOI: 10.1016/j.earscirev.2019.103076