Arborescent Orthogonal Least Squares Regression for NARMAX-Based Black-Box Fitting

Bibliographic details
Published in: IEEE Access 2024, Vol. 12, p. 155578-155597
Main authors: Thunus, Stephane J. P. S.; Parker, Julian D.; Weinzierl, Stefan
Format: Article
Language: English
Description
Summary: This paper proposes a linear algebra-based supervised machine learning algorithm for the symbolic representation of arbitrarily non-linear and recursive systems. It introduces multiple extensions to the algorithmic class of "Forward Orthogonal Least Squares Regressions" (FOrLSR), which performs dictionary-based sparse symbolic regressions. The regression, provided only with the system's input and output, performs variable combinations and non-linear transformations from a given dictionary of analytic expressions and selects the optimal ones to represent the unknown system. This yields a "symbolic" system representation that has the minimum number of terms, to enforce sparsity (L_0-norm), while keeping the highest possible precision. The first proposed algorithm (rFOrLSR) restructures the FOrLSR into matrix form (for large-scale GPU and BLAS-like optimizations), makes it recursive (to reduce the computational complexity from quadratic in model length to linear), and allows regressors to be imposed (to include user expertise and perform tree searches). Furthermore, the dictionary search is restructured into a breadth-first arborescence traversal kept sparse by five proposed theorems, four corollaries and one pruning mechanism, while adding a validation procedure for the final model selection. The proposed arborescence (AOrLSR) scans large search-space segments, significantly increasing the probability of finding an optimal system representation, while only computing a marginal fraction of the search space. The regression and arborescence are solvers for arbitrarily determined linear equation systems which maximize sparsity in the solution vectors.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3450808
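
The dictionary-based selection described in the summary can be illustrated with a minimal sketch of a classical greedy forward orthogonal least squares regression. This is not the paper's rFOrLSR or AOrLSR: it has no matrix-form recursion, no imposed regressors and no arborescence search. The function name `forward_ols`, the error-reduction-ratio selection criterion, the tolerance and the toy system in `demo` are all assumptions chosen for illustration.

```python
import numpy as np


def forward_ols(D, y, n_terms):
    """Greedy forward orthogonal least squares (illustrative sketch).

    D: (n_samples, n_candidates) dictionary of candidate regressors.
    y: (n_samples,) system output.
    Returns the selected column indices, their least-squares
    coefficients, and the error reduction ratio (ERR) of each pick.
    """
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    yy = y @ y
    selected, basis, err = [], [], []
    for _ in range(n_terms):
        best = None
        for j in range(D.shape[1]):
            if j in selected:
                continue
            col = D[:, j]
            w = col.copy()
            # Remove the components along the already selected terms
            # (modified Gram-Schmidt against the orthogonal basis).
            for q in basis:
                w -= (q @ w) / (q @ q) * q
            ww = w @ w
            # Skip candidates that are (numerically) linearly dependent
            # on the current selection; the tolerance is an assumption.
            if ww <= 1e-12 * (col @ col):
                continue
            # Fraction of the output energy explained by this candidate.
            e = (w @ y) ** 2 / (ww * yy)
            if best is None or e > best[0]:
                best = (e, j, w)
        if best is None:
            break
        err.append(best[0])
        selected.append(best[1])
        basis.append(best[2])
    # Coefficients with respect to the original (non-orthogonal) columns.
    theta, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
    return selected, theta, err


def demo():
    # Hypothetical recursive system: y[k] = 0.5*y[k-1] + 0.8*u[k-1]^2
    rng = np.random.default_rng(0)
    n = 500
    u = rng.uniform(-1.0, 1.0, n)
    y = np.zeros(n)
    for k in range(1, n):
        y[k] = 0.5 * y[k - 1] + 0.8 * u[k - 1] ** 2
    u1, y1, target = u[:-1], y[:-1], y[1:]
    names = ["u[k-1]", "y[k-1]", "u[k-1]^2", "y[k-1]^2", "u[k-1]*y[k-1]"]
    D = np.column_stack([u1, y1, u1 ** 2, y1 ** 2, u1 * y1])
    selected, theta, err = forward_ols(D, target, n_terms=2)
    print([names[j] for j in selected], theta, err)


if __name__ == "__main__":
    demo()
```

A greedy selection of this kind commits to each term as soon as it is picked, so it is not guaranteed to return the terms that generated the data. According to the summary, this is the limitation that the arborescence search is meant to mitigate.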