Dynamic Instance-Wise Classification in Correlated Feature Spaces
Published in: IEEE Transactions on Artificial Intelligence, Dec. 2021, Vol. 2, No. 6, pp. 537-548
Format: Article
Language: English
Abstract: In a typical supervised machine learning setting, the predictions on all test instances are based on a common subset of features discovered during model training. However, using a different subset of features that is most informative for each test instance individually may improve not only the prediction accuracy but also the overall interpretability of the model. At the same time, feature selection methods for classification are known to be most effective when many features are irrelevant and/or uncorrelated. In fact, feature selection that ignores correlations between features can lead to poor classification performance. In this work, a Bayesian network is utilized to model feature dependencies. Using the dependence network, a new method is proposed that sequentially selects the best feature to evaluate for each test instance individually and stops the selection process to make a prediction once it determines that no further improvement can be achieved with respect to classification accuracy. The optimum number of features to acquire and the optimum classification strategy are derived for each test instance. The theoretical properties of the optimum solution are analyzed, and a new algorithm is proposed that takes advantage of these properties to implement a robust and scalable solution for high-dimensional settings. The effectiveness, generalizability, and scalability of the proposed method are illustrated on a variety of real-world datasets from diverse application domains.
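The selection loop described in the abstract, which greedily acquires the single most informative feature per test instance and stops once no further gain is expected, can be sketched as follows. This is an illustrative sketch only, not the paper's algorithm: it assumes conditional independence (naive Bayes) instead of the paper's Bayesian network, uses an expected-entropy-reduction stopping rule rather than the derived optimal policy, and all names (`sequential_acquire`, `gain_threshold`, the toy probabilities) are invented for the example.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def expected_entropy_after(j, post, likelihoods):
    """Expected posterior entropy after observing feature j,
    averaged over j's possible values under the current posterior."""
    h = 0.0
    for v in likelihoods[0][j]:
        pv_c = np.array([likelihoods[c][j][v] for c in range(len(post))])
        pv = float((post * pv_c).sum())          # marginal P(x_j = v)
        if pv > 0:
            h += pv * entropy(post * pv_c / pv)  # weighted posterior entropy
    return h

def sequential_acquire(x, priors, likelihoods, gain_threshold=0.05):
    """Instance-wise sequential feature acquisition (sketch).

    likelihoods[c][j] maps feature j's value to P(x_j = value | class c);
    features are treated as conditionally independent for simplicity.
    Returns (predicted class, list of acquired feature indices).
    """
    log_post = np.log(np.asarray(priors, dtype=float))
    remaining = list(range(len(x)))
    acquired = []
    while remaining:
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        h_now = entropy(post)
        # Pick the feature with the largest expected entropy reduction.
        best_j, best_gain = None, 0.0
        for j in remaining:
            gain = h_now - expected_entropy_after(j, post, likelihoods)
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None or best_gain < gain_threshold:
            break  # stop: no remaining feature is worth acquiring
        # "Acquire" the feature: observe its value, update the posterior.
        v = x[best_j]
        log_post = log_post + np.log(
            np.array([likelihoods[c][best_j][v] for c in range(len(log_post))])
        )
        acquired.append(best_j)
        remaining.remove(best_j)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return int(np.argmax(post)), acquired

# Toy setup (hypothetical numbers): two classes, binary features;
# feature 0 is informative, feature 1 is pure noise.
PRIORS = [0.5, 0.5]
LIKELIHOODS = [
    [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}],  # class 0: P(x_j = v | y = 0)
    [{0: 0.1, 1: 0.9}, {0: 0.5, 1: 0.5}],  # class 1: P(x_j = v | y = 1)
]
```

On this toy instance the loop acquires only the informative feature and then stops, since observing the noise feature cannot change the posterior; this mirrors the instance-level sparsity the paper targets, though the real method also accounts for dependencies between features.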
Impact statement: The ability to rationalize which features are used to classify each data instance is of paramount importance in a wide range of application domains, including but not limited to medicine, criminal justice, and cybersecurity. In such domains, correlations between features, together with the need to perform variable selection at the same stage as classification, pose additional machine learning challenges related to classification accuracy and computational intractability. The proposed framework presents, to the best of our knowledge, the first practical solution that balances classification accuracy and sparsity at the instance level, by dynamically choosing the most informative features, relative to each instance, from a set of potentially correlated features. The proposed framework achieves reductions of up to 82% in the average number of features used by state-of-the-art methods without sacrificing accuracy.
ISSN: 2691-4581
DOI: 10.1109/TAI.2021.3109858