Information and disinformation in hydrological data across space: The case of streamflow predictions using machine learning
Published in: | Journal of hydrology. Regional studies 2024-02, Vol.51, p.101607, Article 101607 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Study Region:
A total of 461 watersheds across the USA.
Study Focus:
This study aimed to assess the usefulness of data from donor watersheds for predicting streamflow in parent watersheds. For this purpose, a Long Short-Term Memory (LSTM) network was used as an information extraction algorithm because of its state-of-the-art performance in predicting streamflow. Of the 461 watersheds used in this study, 57 were selected as parent watersheds. The quantities ‘optimal number of donor watersheds (NT)’ and ‘changes in NSE’ were used as practical measures of the information content in donor watersheds. Several LSTM models were developed, each trained on data from a different number of donor watersheds, varying from 1 to 128.
New Hydrological Insights for the Region:
Increasing the number of donor watersheds beyond some optimal NT resulted in a statistically insignificant and, in several cases, hydrologically irrelevant gain in accuracy. In some cases, the Nash-Sutcliffe Efficiency (NSE) slightly decreased when NT was increased beyond the optimal value. In several watersheds, using a large number of donor watersheds might result in excessively rainfall-sensitive LSTM models. Further, data from donor watersheds do not seem to provide information for low-flow predictions. Thus, this study offers a nuanced and sobering perspective on the usefulness of data from multiple donor watersheds for streamflow prediction in any given watershed.
• Large library of hydrological data is useful to model streamflow dynamics.
• Information from a few nearby watersheds is typically sufficient for any specific watershed.
• The prevalent idea is that an LSTM must be trained with data from several watersheds.
• Based on the results of this study, the above idea appears to be only partially valid. |
---|---|
ISSN: | 2214-5818 |
DOI: | 10.1016/j.ejrh.2023.101607 |
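
Note on the metrics named in the summary: the Nash-Sutcliffe Efficiency (NSE) compares the squared error of a simulated streamflow series with the spread of the observations around their mean, so a perfect simulation scores 1 and a simulation no better than the observed mean scores 0. The sketch below shows that standard formula in Python. The second function is only an illustration of how NSE values for different donor counts might be compared: its name, the `min_gain` threshold, and the example scores are assumptions made here, and it is not the statistical procedure used in the article to determine NT.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - squared_error / spread


def smallest_sufficient_donor_count(nse_by_count, min_gain=0.01):
    """Hypothetical rule (not from the article): return the smallest donor
    count whose NSE is within `min_gain` of the best NSE observed."""
    best = max(nse_by_count.values())
    for count in sorted(nse_by_count):
        if best - nse_by_count[count] < min_gain:
            return count


# Made-up NSE values for models trained with 1 to 16 donor watersheds.
example_scores = {1: 0.60, 2: 0.70, 4: 0.745, 8: 0.75, 16: 0.748}
chosen = smallest_sufficient_donor_count(example_scores)
```

With these invented scores, the rule picks the smallest donor count whose NSE is already close to the best value, which mirrors the summary's point that adding donors beyond some number brings little further gain.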