A semi-supervised adaptive discriminative discretization method improving discrimination power of regularized naive Bayes

Bibliographic details
Published in: Expert Systems with Applications, 2023-09, Vol. 225, p. 120094, Article 120094
Main authors: Wang, Shihe; Ren, Jianfeng; Bai, Ruibin
Format: Article
Language: English
Description
Abstract: Recently, many improved naive Bayes methods have been developed with enhanced discrimination capabilities. Among them, regularized naive Bayes (RNB) produces excellent performance by balancing discrimination power and generalization capability. Data discretization is important in naive Bayes: by grouping similar values into one interval, the data distribution can be estimated more accurately. However, existing methods, including RNB, often discretize the data into too few intervals, which may result in significant information loss. To address this problem, we propose a semi-supervised adaptive discriminative discretization framework for naive Bayes, which better estimates the data distribution by utilizing both labeled and unlabeled data through pseudo-labeling techniques. The proposed method also significantly reduces the information loss during discretization by utilizing an adaptive discriminative discretization scheme, and hence greatly improves the discrimination power of classifiers. The proposed RNB+, i.e., regularized naive Bayes utilizing the proposed discretization framework, is systematically evaluated on a wide range of machine-learning datasets. It significantly and consistently outperforms state-of-the-art NB classifiers.
• We identify the significant information loss in previous discretization methods.
• We propose a Semi-supervised Adaptive Discriminative Discretization (SADD) method.
• The proposed SADD is integrated with regularized naive Bayes, namely RNB+.
• The proposed SADD effectively enhances the discrimination power of NB classifiers.
• Experimental results on 31 UCI datasets validate the effectiveness of the proposed SADD.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2023.120094
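
The abstract relies on two general ideas: discretizing continuous features into intervals before estimating naive Bayes likelihoods, and using pseudo-labels so that unlabeled data also contributes to those estimates. The sketch below illustrates only those generic ideas and is not the authors' SADD or RNB+ method. It assumes plain equal-width binning (a stand-in for the adaptive discriminative scheme, which the record does not specify), Laplace smoothing, and a single round of pseudo-labeling. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def equal_width_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points of an equal-width grid."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
    return [lo + width * k for k in range(1, n_bins)]


def to_bin(x, edges):
    """Index of the interval that x falls into (0 .. len(edges))."""
    return sum(x >= e for e in edges)


def fit_nb(rows, labels, n_bins):
    """Fit naive Bayes counts on discretized features."""
    n_features = len(rows[0])
    edges = [equal_width_edges([r[j] for r in rows], n_bins)
             for j in range(n_features)]
    class_counts = Counter(labels)
    bin_counts = defaultdict(int)  # (class, feature index, bin) -> count
    for row, y in zip(rows, labels):
        for j, x in enumerate(row):
            bin_counts[(y, j, to_bin(x, edges[j]))] += 1
    return edges, class_counts, bin_counts, n_bins


def predict(model, row):
    """Return the class with the highest log posterior under the model."""
    edges, class_counts, bin_counts, n_bins = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, nc in class_counts.items():
        score = math.log(nc / total)  # log prior
        for j, x in enumerate(row):
            b = to_bin(x, edges[j])
            # Laplace-smoothed likelihood of bin b given class c
            score += math.log((bin_counts.get((c, j, b), 0) + 1) / (nc + n_bins))
        if score > best_score:
            best, best_score = c, score
    return best


def fit_with_pseudo_labels(rows, labels, unlabeled, n_bins):
    """One round of pseudo-labeling (an assumption, not the paper's procedure):
    fit on labeled data, label the unlabeled rows, then refit on everything."""
    base = fit_nb(rows, labels, n_bins)
    pseudo = [predict(base, r) for r in unlabeled]
    return fit_nb(rows + unlabeled, labels + pseudo, n_bins)


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    rows = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
    labels = ["a", "a", "a", "b", "b", "b"]
    unlabeled = [[0.9], [1.1], [5.1], [4.9]]
    model = fit_with_pseudo_labels(rows, labels, unlabeled, n_bins=4)
    print(predict(model, [1.0]), predict(model, [5.0]))
```

In this sketch the number of intervals, `n_bins`, is fixed by hand. The abstract's point is that too few intervals lose information, so an adaptive method would choose the intervals from the data instead.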