An efficient hyper-parameter optimization method for supervised learning

Bibliographic Details
Published in: Applied Soft Computing, 2022-09, Vol. 126, p. 109266, Article 109266
Authors: Shi, Ying; Qi, Hui; Qi, Xiaobo; Mu, Xiaofang
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Supervised learning is an important tool for data mining and knowledge discovery. The hyper-parameters of a learning model usually have a significant impact on the generalization performance of the resulting supervised learning model. Although state-of-the-art hyper-parameter optimizers such as cross-validation (CV) and its refinements are widely used in many applications, they still suffer from limitations, including low efficiency and error variation introduced by data partitioning. To address these issues, we propose the sign similarity to distinguish between over-fitting and under-fitting in supervised learning. On this basis, the minimal symmetric similarity criterion (MSSC) is proposed to optimize hyper-parameters. It provides an equivalent condition of well-fitting: a well-fitted hyper-parameter should have minimal symmetric similarity. Compared with CV, the criterion is more efficient and avoids error variation, as the symmetric similarity is computed on the whole data set without partitioning. The reasonableness of the proposed MSSC is proved for well-posed learning problems, and the corresponding assumptions are justified by theoretical analysis and empirical verification. Both theoretical analysis and experimental results indicate that MSSC is significantly more efficient than popular optimizers such as 10-fold CV (10-CV). Experimental results further demonstrate that, from a statistical perspective, MSSC matches or outperforms state-of-the-art optimizers on three kinds of learning tasks.
ISSN: 1568-4946, 1872-9681
DOI: 10.1016/j.asoc.2022.109266
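
For readers unfamiliar with the baseline, the following minimal Python sketch shows a 10-fold cross-validation (10-CV) hyper-parameter search of the kind the abstract uses as its comparison point. It is a generic scikit-learn example with illustrative choices (an RBF-kernel SVC and a hypothetical `candidate_gammas` grid); it does not implement the authors' MSSC, whose sign-similarity measure is not defined in this record.

```python
# Sketch of the 10-fold cross-validation (10-CV) baseline that the abstract
# compares MSSC against. Generic CV grid search, NOT the authors' MSSC:
# the sign/symmetric similarity measure is not defined in this record.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Illustrative data and candidate grid (hypothetical, for demonstration only).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
candidate_gammas = [1e-3, 1e-2, 1e-1, 1.0, 10.0]

mean_scores = []
for gamma in candidate_gammas:
    # Each candidate costs k = 10 model fits; the abstract's efficiency claim
    # for MSSC rests on scoring the whole data set without such partitioning.
    scores = cross_val_score(SVC(kernel="rbf", gamma=gamma), X, y, cv=10)
    mean_scores.append(scores.mean())

best_gamma = candidate_gammas[int(np.argmax(mean_scores))]
print(f"10-CV selected gamma = {best_gamma}")
```

The per-candidate cost of ten model fits, and the score variance across random partitions, are exactly the two drawbacks the abstract claims MSSC avoids by evaluating on the whole data set.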