Causality-Based Feature Importance Quantifying Methods: PN-FI, PS-FI and PNS-FI
Format: Article
Language: English
Abstract: In the current ML field, models are getting larger and more complex, and the data used for training are growing in both quantity and dimensionality. Therefore, in order to train better models and to save training time and computational resources, a good Feature Selection (FS) method in the preprocessing stage is necessary. Feature Importance (FI) is central here, since it is the basis of feature selection. This paper therefore introduces the calculation of PN (the Probability of Necessity), PS (the Probability of Sufficiency), and PNS (the Probability of Necessity and Sufficiency) from causal analysis into the quantification of feature importance, and proposes three new FI measures: PN-FI, which captures how important a feature is for image recognition tasks; PS-FI, which captures how important a feature is for image generation tasks; and PNS-FI, which measures both. The main body of the paper consists of three RCTs (randomized controlled trials), whose results show how the PS-FI, PN-FI, and PNS-FI of three features (dog nose, dog eyes, and dog mouth) are calculated. The experiments show that, first, the FI values are intervals with tight upper and lower bounds; second, the dog-eyes feature is the most important, while the other two are of roughly equal importance; and third, the bounds of PNS and PN are tighter than the bounds of PS.
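The interval form of these FI values reflects the fact that PN, PS, and PNS are in general not point-identified and can only be bounded. As a minimal sketch of where such intervals come from, assuming the standard Tian-Pearl bounds on the probabilities of causation for a binary feature X and a binary outcome Y (this is not the paper's own code; the function name and the input numbers are illustrative):

```python
# Interval bounds on the probabilities of causation (Tian & Pearl, 2000),
# sketched for a binary feature X (e.g. "dog eyes present") and a binary
# outcome Y (e.g. "image recognized/generated as a dog").
#
# Inputs (all hypothetical, for illustration only):
#   p_y_do_x   = P(y | do(X=1))   -- experimental / RCT probability
#   p_y_do_nx  = P(y | do(X=0))
#   p_xy, p_xny, p_nxy, p_nxny    -- observational joint distribution P(X, Y)

def causation_bounds(p_y_do_x, p_y_do_nx, p_xy, p_xny, p_nxy, p_nxny):
    p_y = p_xy + p_nxy                      # observational P(y)
    p_ny_do_nx = 1.0 - p_y_do_nx            # P(y' | do(X=0))

    # PNS: probability that X is both necessary and sufficient for Y
    pns_lo = max(0.0, p_y_do_x - p_y_do_nx, p_y - p_y_do_nx, p_y_do_x - p_y)
    pns_hi = min(p_y_do_x, p_ny_do_nx, p_xy + p_nxny,
                 p_y_do_x - p_y_do_nx + p_xny + p_nxy)

    # PN: probability of necessity, defined for units with X=1, Y=1
    pn_lo = max(0.0, (p_y - p_y_do_nx) / p_xy)
    pn_hi = min(1.0, (p_ny_do_nx - p_nxny) / p_xy)

    # PS: probability of sufficiency, defined for units with X=0, Y=0
    ps_lo = max(0.0, (p_y_do_x - p_y) / p_nxny)
    ps_hi = min(1.0, (p_y_do_x - p_xy) / p_nxny)

    return {"PNS": (pns_lo, pns_hi), "PN": (pn_lo, pn_hi), "PS": (ps_lo, ps_hi)}


if __name__ == "__main__":
    # Purely illustrative numbers, not taken from the paper's experiments.
    print(causation_bounds(p_y_do_x=0.9, p_y_do_nx=0.3,
                           p_xy=0.45, p_xny=0.05, p_nxy=0.15, p_nxny=0.35))
```

Under this reading, the PN-FI, PS-FI, and PNS-FI of a feature would each be reported as the corresponding (lower, upper) interval rather than a single number.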
DOI: 10.48550/arxiv.2308.14474