Variational Information Pursuit for Interpretable Predictions
Format: Article
Language: English
Abstract:
There is a growing interest in the machine learning community in developing predictive algorithms that are "interpretable by design". Towards this end, recent work proposes to make interpretable decisions by sequentially asking interpretable queries about data until a prediction can be made with high confidence based on the answers obtained (the history). To promote short query-answer chains, a greedy procedure called Information Pursuit (IP) is used, which adaptively chooses queries in order of information gain. Generative models are employed to learn the distribution of query-answers and labels, which is in turn used to estimate the most informative query. However, learning and inference with a full generative model of the data is often intractable for complex tasks. In this work, we propose Variational Information Pursuit (V-IP), a variational characterization of IP which bypasses the need for learning generative models. V-IP is based on finding a query selection strategy and a classifier that minimize the expected cross-entropy between true and predicted labels. We then demonstrate that the IP strategy is the optimal solution to this problem. Therefore, instead of learning generative models, we can use our optimal strategy to directly pick the most informative query given any history. We then develop a practical algorithm by defining a finite-dimensional parameterization of our strategy and classifier using deep networks and training them end-to-end with our objective. Empirically, V-IP is 10-100x faster than IP on various vision and NLP tasks, with competitive performance. Moreover, V-IP finds much shorter query chains than reinforcement learning, which is typically used in sequential decision-making problems. Finally, we demonstrate the utility of V-IP on challenging tasks like medical diagnosis, where its performance is far superior to the generative modelling approach.
DOI: 10.48550/arxiv.2302.02876
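
To make the variational objective concrete, the following is a minimal, hypothetical PyTorch-style sketch of how a query selection network ("querier") and a classifier could be trained end-to-end to minimize the expected cross-entropy over randomly sampled histories, as the abstract describes. The architecture, the ±1 answer encoding, the masking scheme, and the softmax relaxation of the query choice are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch of the V-IP objective: a querier picks the next query
# given the current history, a classifier predicts the label from the updated
# history, and both are trained jointly with cross-entropy.
NUM_QUERIES, NUM_CLASSES = 312, 200  # placeholder sizes for a concept-style task

querier = nn.Sequential(nn.Linear(NUM_QUERIES, 512), nn.ReLU(),
                        nn.Linear(512, NUM_QUERIES))     # scores candidate queries
classifier = nn.Sequential(nn.Linear(NUM_QUERIES, 512), nn.ReLU(),
                           nn.Linear(512, NUM_CLASSES))  # predicts the label
opt = torch.optim.Adam(list(querier.parameters()) + list(classifier.parameters()),
                       lr=1e-4)

def vip_step(answers, labels, tau=1.0):
    """One training step on a batch of query-answers (B x NUM_QUERIES, encoded
    as +/-1 so that 0 means 'not yet asked') and integer labels (B)."""
    B, Q = answers.shape

    # Sample a random history: a random subset of queries marked as already asked.
    asked = (torch.rand_like(answers) < torch.rand(B, 1, device=answers.device)).float()
    asked[:, 0] = 0.0  # keep at least one query unasked so the softmax below is defined
    history = answers * asked

    # The querier scores the remaining queries; a softmax relaxation stands in
    # for the argmax used at test time so the selection stays differentiable.
    scores = querier(history).masked_fill(asked.bool(), float('-inf'))
    choice = F.softmax(scores / tau, dim=-1)

    # Append the (softly) chosen query's answer and minimize the cross-entropy
    # between the true labels and the classifier's prediction from the history.
    loss = F.cross_entropy(classifier(history + choice * answers), labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

At test time one would replace the softmax with an argmax and ask queries one at a time, stopping once the classifier's prediction is sufficiently confident; the sketch above only illustrates how the two networks can share a single cross-entropy objective.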