Assessing methods for multiple imputation of systematic missing data in marine fisheries time series with a new validation algorithm

Bibliographic details
Published in: Aquaculture and fisheries, 2023-09, Vol. 8 (5), p. 587-599
Main authors: Benavides, Iván F.; Santacruz, Marlon; Romero-Leiton, Jhoana P.; Barreto, Carlos; Selvaraj, John Josephraj
Format: Article
Language: English
Description
Summary: Time series from fisheries often contain many missing values. This is a severe limitation that prevents using the data for research on population dynamics, stock assessment, forecasting, and, hence, decision-making around marine resources. Several methods have been proposed to impute missing data in univariate time series. Still, their performance depends not only on the amount of missing data but also on the data structure. This study compares the performance of twelve imputation methods on time series of marine fishery landings for six species in the Colombian Pacific Ocean. Unlike other studies, we validate the precision of the imputations in the same target time series that contain the missing data, using the Known Sub-Sequence Algorithm (KSSA), a novel validation approach that simulates missing data in known sub-sequences of the target time series. The results show that the best methods for imputation are Seasonal Decomposition with Kalman filters and Structural Models with Kalman filters fitted by maximum likelihood. The results also show that validating imputation methods on time series other than the target time series leads to the wrong choice of imputation method. It is noteworthy that these methods and the validation framework are mainly suited to time series with a non-random distribution of missing data, that is, missing data produced systematically in chunks or clusters with predictable frequency, which are common in marine sciences.
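The validation idea in the summary (hide values in known sub-sequences of the target series, impute them, and compare against the truth) can be illustrated with a minimal Python sketch. This is not the authors' KSSA implementation: the function names (`impute_mean`, `impute_linear`, `score_methods`), the gap length, the number of trials, and the synthetic series are all assumptions made for illustration, and the two simple imputers stand in for the twelve methods compared in the article.

```python
import numpy as np

def impute_mean(y):
    """Fill NaNs with the mean of the observed values (hypothetical baseline)."""
    out = y.copy()
    out[np.isnan(out)] = np.nanmean(y)
    return out

def impute_linear(y):
    """Fill NaNs by linear interpolation between observed neighbours."""
    out = y.copy()
    idx = np.arange(len(y))
    known = ~np.isnan(y)
    out[~known] = np.interp(idx[~known], idx[known], y[known])
    return out

def score_methods(y, methods, gap_len=3, n_trials=200, seed=0):
    """Hide known chunks of y, impute them, and return the mean RMSE per method."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(y)
    # Candidate start positions: windows of gap_len fully observed values,
    # kept away from the two ends of the series.
    starts = [s for s in range(1, len(y) - gap_len)
              if observed[s:s + gap_len].all()]
    errors = {name: [] for name in methods}
    for _ in range(n_trials):
        s = rng.choice(starts)
        masked = y.copy()
        masked[s:s + gap_len] = np.nan  # simulate a chunk of missing data
        truth = y[s:s + gap_len]
        for name, fn in methods.items():
            filled = fn(masked)
            rmse = np.sqrt(np.mean((filled[s:s + gap_len] - truth) ** 2))
            errors[name].append(rmse)
    return {name: float(np.mean(e)) for name, e in errors.items()}

# Synthetic seasonal series with two real gaps (illustrative data only).
t = np.arange(120)
series = 10 + 3 * np.sin(2 * np.pi * t / 12)
series[[20, 21, 22, 60, 61, 62]] = np.nan

scores = score_methods(series, {"mean": impute_mean, "linear": impute_linear})
best = min(scores, key=scores.get)  # method with the lowest mean RMSE
```

Because the simulated gaps are drawn from the target series itself, the ranking in `scores` reflects that series' own structure, which is the point the summary makes about validating on the target rather than on a different time series.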
ISSN:2468-550X
DOI:10.1016/j.aaf.2021.12.013