Statistical divergences in high-dimensional hypothesis testing and a modern technique for estimating them
Format: Article
Language: English
Online access: Order full text
Abstract: Hypothesis testing in high-dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between the population distributions of observed data and competing models, justifying their use as the basis of a hypothesis test. We go on to point out how modern techniques for functional optimization let us estimate many divergences, without the need for population likelihood functions, using samples from the two distributions alone. We use a physics-based example to show how the proposed two-sample test can be implemented in practice, and discuss the steps required to mature the ideas presented here into an experimental framework. The code used has been made publicly available.
DOI: 10.48550/arxiv.2405.06397
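The abstract's "modern techniques for functional optimization" can be illustrated with a variational divergence estimator. The sketch below (a minimal illustration, not the authors' released code) estimates the KL divergence between two distributions from samples alone, by training a small neural "critic" T to maximize the Donsker-Varadhan lower bound KL(P || Q) >= E_P[T(x)] - log E_Q[exp(T(x))]. The network size, sample counts, and training settings are assumptions made for this example.

```python
# Illustrative sketch of sample-based divergence estimation via the
# Donsker-Varadhan variational bound; not the paper's released code.
import math
import torch
import torch.nn as nn

def dv_kl_estimate(x_p, x_q, epochs=500, lr=1e-3):
    """Estimate KL(P || Q) from samples x_p ~ P, x_q ~ Q by maximizing
    E_P[T(x)] - log E_Q[exp(T(x))] over a neural critic T."""
    critic = nn.Sequential(                  # architecture is an assumption
        nn.Linear(x_p.shape[1], 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),
    )
    opt = torch.optim.Adam(critic.parameters(), lr=lr)

    def dv_bound():
        # log E_Q[exp(T)] computed stably as logsumexp(T) - log N
        log_mean_exp_q = (torch.logsumexp(critic(x_q), dim=0).squeeze()
                          - math.log(len(x_q)))
        return critic(x_p).mean() - log_mean_exp_q

    for _ in range(epochs):
        opt.zero_grad()
        (-dv_bound()).backward()             # ascend the lower bound
        opt.step()
    with torch.no_grad():
        return dv_bound().item()

if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy check: 10-dimensional Gaussians differing in mean, where the
    # true KL divergence is 10 * 0.5 * 0.5**2 = 1.25.
    x_p = torch.randn(2000, 10) + 0.5        # samples from P
    x_q = torch.randn(2000, 10)              # samples from Q
    print("Estimated KL(P || Q):", dv_kl_estimate(x_p, x_q))
```

A two-sample test statistic can then be built from the estimated divergence, with its null distribution obtained, for example, by recomputing the estimate under random permutations of the pooled samples.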