But What Did You Actually Learn? Improving Inference for Non-Identifiable Deep Latent Variable Models
Format: Dissertation
Language: English
Online Access: Order full text
Abstract: Deep probabilistic and Bayesian latent variable models allow one to infer variables that have not been observed in the data in order to accurately model the data density. They provide an intuitive and flexible modeling paradigm, making them particularly well-suited for safety-critical applications, in which the data is often noisy and complex, requiring expressivity from deep learning, and assumptions need to be well understood and easily interrogated to ensure safety, requiring an intuitive model specification process. However, this model class comes with several challenges. First, as model complexity grows, so does the possibility of model non-identifiability: the existence of multiple models (or parameterizations of the same model) that explain the observed data equally well. This means that, while the model specification process is intuitive, the results of inference may not be. Second, inference for such models is intractable, requiring approximations. Unfortunately, these approximations require us to make additional assumptions. But unlike the assumptions made in the original model specification, these inference assumptions are highly unintuitive. As such, in safety-critical domains, they present a significant barrier to use. Moreover, these inference assumptions may impact downstream decision-making in unpredictable (and undesirable) ways. This dissertation focuses on (a) understanding how non-identifiability compromises the quality of approximate inference for deep probabilistic and Bayesian latent variable models, and (b) developing novel approximate inference methods that mitigate or accommodate non-identifiability to achieve better performance on downstream tasks.
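To make the notion of non-identifiability concrete, the following is a minimal sketch, not taken from the dissertation, using the classic rotational non-identifiability of factor analysis: two different loading matrices (named W and W_rot here purely for illustration) induce exactly the same marginal data density, so no amount of observed data can distinguish between them.

```python
# Illustrative sketch of non-identifiability in a linear-Gaussian latent
# variable model (factor analysis): x = W z + eps, with z ~ N(0, I) and
# eps ~ N(0, sigma2 * I), so the marginal density of x is N(0, W W^T + sigma2 I).
# Rotating the latent space by any orthogonal matrix R gives a different
# parameterization W R with an identical marginal data density.
import numpy as np

rng = np.random.default_rng(0)
d, k, sigma2 = 5, 2, 0.1                       # observed dim, latent dim, noise variance

W = rng.normal(size=(d, k))                    # one parameterization of the loadings
R, _ = np.linalg.qr(rng.normal(size=(k, k)))   # a random orthogonal rotation
W_rot = W @ R                                  # a different parameterization

cov = W @ W.T + sigma2 * np.eye(d)             # marginal covariance under W
cov_rot = W_rot @ W_rot.T + sigma2 * np.eye(d) # marginal covariance under W R

# Both parameterizations explain the observed data equally well:
print(np.allclose(cov, cov_rot))               # True
```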