A hybrid system based on ensemble learning to model residuals for time series forecasting

Bibliographic details
Published in: Information Sciences 2023-11, Vol. 649, p. 119614, Article 119614
Main authors: Santos Júnior, Domingos S. de O.; de Mattos Neto, Paulo S.G.; de Oliveira, João F.L.; Cavalcanti, George D.C.
Format: Article
Language: English
Description
Abstract: The time series forecasting literature has highlighted the accuracy of hybrid systems that combine statistical linear models and Machine Learning (ML) models by modeling the residuals. These systems model linear and nonlinear patterns separately, aiming to overcome the limitations of relying on a single model. Such a system comprises three phases: linear modeling of the time series, forecasting the residuals with an ML model, and producing the final forecast by combining the outputs of the two previous phases. Modeling the residuals is challenging because they may exhibit heteroscedasticity, complex nonlinear patterns, and random fluctuations. Hence, specifying a single ML model is a complex task. This work proposes a hybrid system that combines a linear statistical model with an ensemble of ML models to forecast real-world time series. The proposed method employs an ensemble in the residual-modeling phase, aiming to improve the generalization capacity of the system, reduce the risk of selecting an incorrect model, expand the function space, and increase the system's accuracy. Moreover, for each time series, a data-driven search selects the ensemble parameters most suitable for that series. The experimental results show that the proposal attains superior performance and is statistically better than the related systems in the literature.
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2023.119614
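
The three phases named in the abstract (linear modeling, ensemble forecasting of the residuals, and combination of the two) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: an AR(1) model fitted by least squares stands in for the linear statistical model, k-nearest-neighbor regressors stand in for the ML ensemble members, the final forecast is a simple sum, and the "data-driven search" is reduced to picking the ensemble configuration with the lowest validation error. All function names are hypothetical.

```python
import math
from statistics import mean, median

def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] (assumed stand-in linear model)."""
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - phi * mx, phi

def lagged(values, lags):
    """Build (window of `lags` past values, next value) training pairs."""
    return [(values[i - lags:i], values[i]) for i in range(lags, len(values))]

def knn_predict(train_pairs, window, k):
    """Average the targets of the k training windows closest to `window`."""
    nearest = sorted(train_pairs, key=lambda p: math.dist(p[0], window))[:k]
    return mean(target for _, target in nearest)

def ensemble_predict(train_pairs, window, ks, combine):
    """Combine the forecasts of several k-NN members (one member per k)."""
    return combine([knn_predict(train_pairs, window, k) for k in ks])

def hybrid_forecast(series, lags=2, n_val=20):
    """One-step-ahead forecast; returns (hybrid, linear part, residual part)."""
    # Phase 1: linear modeling of the series and its in-sample residuals.
    c, phi = fit_ar1(series)
    residuals = [y - (c + phi * prev) for prev, y in zip(series[:-1], series[1:])]

    # Phase 2: ensemble on the residuals, with a per-series search over
    # candidate configurations (hypothetical grid) scored on a validation split.
    pairs = lagged(residuals, lags)
    train, val = pairs[:-n_val], pairs[-n_val:]
    candidates = [(ks, comb)
                  for ks in ([1, 3], [1, 3, 5], [3, 5, 7])
                  for comb in (mean, median)]

    def val_error(config):
        ks, comb = config
        return mean((ensemble_predict(train, w, ks, comb) - t) ** 2 for w, t in val)

    best_ks, best_comb = min(candidates, key=val_error)

    # Phase 3: final forecast = linear forecast + forecast residual.
    linear_next = c + phi * series[-1]
    resid_next = ensemble_predict(pairs, residuals[-lags:], best_ks, best_comb)
    return linear_next + resid_next, linear_next, resid_next

if __name__ == "__main__":
    # Synthetic series, used only to exercise the sketch.
    demo = [math.sin(0.3 * t) + 0.1 * math.sin(1.7 * t) ** 2 for t in range(120)]
    print(hybrid_forecast(demo))
```

Selecting the configuration on held-out residuals mirrors, in a very reduced form, the idea of tailoring the ensemble to each series; the article's actual models, search space, and combination rule may differ.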