Selection of time instants and intervals with Support Vector Regression for multivariate functional data


Bibliographic details
Published in: Computers & Operations Research 2020-11, Vol. 123, p. 105050, Article 105050
Main authors: Blanquero, Rafael; Carrizosa, Emilio; Jiménez-Cordero, Asunción; Martín-Barragán, Belén
Format: Article
Language: English
Online access: Full text
Description
Abstract:
• An approach for feature selection in functional data regression is developed.
• The methodology allows time instants and intervals to be selected in the same manner.
• The higher-order information exploits the functional nature of the data.
• A continuous optimization methodology is applied as the solving strategy.

When processes are monitored continuously over time, data are collected over a whole period, of which only certain time instants and certain time intervals may play a crucial role in the data analysis. We develop a method that addresses the problem of selecting a small, finite set of short intervals (or instants) able to capture the information needed to predict a response variable from multivariate functional data using Support Vector Regression (SVR). In addition to improving interpretability, storage requirements, and monitoring cost, feature selection can potentially reduce overfitting by mitigating data autocorrelation. We propose a continuous optimization algorithm to fit the SVR parameters and select intervals and instants. Our approach takes advantage of the functional nature of the data by formulating a new bilevel optimization problem that integrates the selection of intervals and instants, the tuning of some key SVR parameters, and the fitting of the SVR. We illustrate the usefulness of our proposal on some benchmark data sets.
ISSN:0305-0548
DOI:10.1016/j.cor.2020.105050
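
Illustrative note: the abstract describes predicting a response from a small set of time instants and short intervals of multivariate functional data. The sketch below is not the authors' algorithm. It only shows, under stated assumptions, how a sampled multivariate curve could be reduced to a feature vector once instants and intervals have been chosen. Here they are fixed by hand, whereas the paper selects them through a bilevel continuous optimization problem. The function names (`value_at`, `interval_mean`, `extract_features`), the linear interpolation, and the use of interval means are all assumptions made for illustration. The resulting vector could then be passed to any SVR implementation.

```python
from bisect import bisect_right


def value_at(times, values, t):
    """Linearly interpolate a sampled curve at time t (clamped to the grid)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    j = bisect_right(times, t)  # times[j - 1] <= t < times[j]
    t0, t1 = times[j - 1], times[j]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[j - 1] + w * values[j]


def interval_mean(times, values, a, b):
    """Mean of the piecewise-linear curve over [a, b] (trapezoid rule on grid knots)."""
    if b <= a:
        return value_at(times, values, a)
    knots = [a] + [t for t in times if a < t < b] + [b]
    ys = [value_at(times, values, t) for t in knots]
    area = sum(
        (knots[i + 1] - knots[i]) * (ys[i] + ys[i + 1]) / 2
        for i in range(len(knots) - 1)
    )
    return area / (b - a)


def extract_features(times, curves, instants, intervals):
    """Build one feature vector from a multivariate curve.

    curves: one list of sampled values per functional component,
            all observed on the common grid `times`.
    instants, intervals: hand-picked here (assumption); the paper optimizes them.
    """
    features = []
    for values in curves:
        features += [value_at(times, values, t) for t in instants]
        features += [interval_mean(times, values, a, b) for a, b in intervals]
    return features


# Toy data (made up): two components observed on a common grid of five points.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
curves = [
    [0.0, 1.0, 2.0, 3.0, 4.0],  # component 1: x(t) = t
    [4.0, 3.0, 2.0, 1.0, 0.0],  # component 2: x(t) = 4 - t
]
features = extract_features(
    times, curves, instants=[0.5, 3.0], intervals=[(1.0, 3.0)]
)
print(features)
```

Each component contributes one value per selected instant and one mean per selected interval, so the feature vector has length (number of components) x (number of instants + number of intervals), regardless of how densely the curves were sampled.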