Causal Feature Selection With Dual Correction
Published in: | IEEE Transactions on Neural Networks and Learning Systems 2024-01, Vol.PP (1), p.1-14 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Abstract: | Causal feature selection methods aim to identify a Markov boundary (MB) of a class variable, and almost all existing causal feature selection algorithms use conditional independence (CI) tests to learn the MB. However, in real-world applications, data issues (e.g., noisy or small samples) can make CI tests unreliable; thus, causal feature selection algorithms relying on CI tests encounter two types of errors: false positives (i.e., selecting false MB features) and false negatives (i.e., discarding true MB features). Existing algorithms tackle either false positives or false negatives, but not both at the same time, leading to unsatisfactory results. To address this issue, we propose a dual-correction-strategy-based MB learning (DCMB) algorithm to correct the two types of errors simultaneously. Specifically, DCMB selectively removes false positives from the MB features currently selected, while selectively retrieving false negatives from the features currently discarded. To automatically determine the optimal number of selected features for the selective removal and retrieval in the dual correction strategy, we design the simulated-annealing-based DCMB (SA-DCMB) algorithm. On benchmark Bayesian network (BN) datasets, the experimental results demonstrate that DCMB achieves substantial improvements in MB learning accuracy compared with existing MB learning methods. Empirical studies on real-world datasets validate the effectiveness of SA-DCMB for classification against state-of-the-art causal and traditional feature selection algorithms. |
---|---|
ISSN: | 2162-237X 2162-2388 |
DOI: | 10.1109/TNNLS.2022.3178075 |
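
To make the abstract's terminology concrete, the following is a minimal sketch of MB learning driven by CI tests. It is not the DCMB or SA-DCMB algorithm, whose details are not given in this record. It assumes a Fisher z-test on partial correlations (appropriate only for roughly linear-Gaussian data) and a simplified grow-shrink style search; the function names `fisher_z_pvalue`, `grow_shrink_mb`, and `make_chain_data`, and the threshold `alpha`, are hypothetical names chosen for illustration.

```python
import math
import numpy as np


def fisher_z_pvalue(data, x, y, cond):
    """Two-sided p-value for 'column x is independent of column y given cond'.

    Assumption: Fisher z-transform of the partial correlation, which is
    only appropriate for roughly linear-Gaussian data.
    """
    idx = [x, y] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)  # precision matrix of the selected columns
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep the log below finite
    dof = data.shape[0] - len(cond) - 3
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(dof)
    return math.erfc(abs(z) / math.sqrt(2))


def grow_shrink_mb(data, target, alpha=0.01):
    """Simplified grow-shrink style Markov boundary search (illustrative only)."""
    n_vars = data.shape[1]
    mb = []
    changed = True
    while changed:
        # Grow: admit any feature still dependent on the target given the
        # current candidate set. Features admitted early may be false positives.
        changed = False
        for v in range(n_vars):
            if v == target or v in mb:
                continue
            if fisher_z_pvalue(data, target, v, mb) < alpha:
                mb.append(v)
                changed = True
    for v in list(mb):
        # Shrink: drop features that become independent of the target
        # once the remaining candidates are conditioned on.
        rest = [u for u in mb if u != v]
        if fisher_z_pvalue(data, target, v, rest) >= alpha:
            mb.remove(v)
    return sorted(mb)


def make_chain_data(n, seed=0):
    """Synthetic chain X0 -> X1 -> T -> X3 plus an unrelated X4 (T is column 2)."""
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n)
    x1 = 0.8 * x0 + rng.standard_normal(n)
    t = 0.8 * x1 + rng.standard_normal(n)
    x3 = 0.8 * t + rng.standard_normal(n)
    x4 = rng.standard_normal(n)
    return np.column_stack([x0, x1, t, x3, x4])


if __name__ == "__main__":
    data = make_chain_data(5000)
    print(grow_shrink_mb(data, target=2, alpha=1e-4))
```

In the simulated chain, the true MB of T consists of its parent X1 and its child X3. The grow phase can also admit X0, which is marginally dependent on T, and the shrink phase is what can discard it once X1 is in the candidate set. With small or noisy samples either phase can err, which is the false-positive and false-negative problem the abstract describes.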