Statistical divergences in high-dimensional hypothesis testing and a modern technique for estimating them
Format: Article
Language: English
Online access: Order full text
Abstract: Hypothesis testing in high-dimensional data is a notoriously difficult problem without direct access to competing models' likelihood functions. This paper argues that statistical divergences can be used to quantify the difference between the population distributions of observed data and competing models, justifying their use as the basis of a hypothesis test. We go on to point out how modern techniques for functional optimization let us estimate many divergences, without the need for population likelihood functions, using samples from the two distributions alone. We use a physics-based example to show how the proposed two-sample test can be implemented in practice, and discuss the steps required to mature the ideas presented here into an experimental framework. The code used has been made publicly available.
DOI: 10.48550/arxiv.2405.06397
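The abstract's "modern techniques for functional optimization" can be illustrated with a variational divergence estimator. The sketch below (a minimal illustration, not the authors' released code) estimates the KL divergence between two distributions from samples alone, by training a small neural "critic" T to maximize the Donsker-Varadhan lower bound KL(P || Q) >= E_P[T(x)] - log E_Q[exp(T(x))]. The network size, sample counts, and training settings are assumptions made for this example.

```python
# Illustrative sketch of sample-based divergence estimation via the
# Donsker-Varadhan variational bound; not the paper's released code.
import math
import torch
import torch.nn as nn

def dv_kl_estimate(x_p, x_q, epochs=500, lr=1e-3):
    """Estimate KL(P || Q) from samples x_p ~ P, x_q ~ Q by maximizing
    E_P[T(x)] - log E_Q[exp(T(x))] over a neural critic T."""
    critic = nn.Sequential(                  # architecture is an assumption
        nn.Linear(x_p.shape[1], 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),
    )
    opt = torch.optim.Adam(critic.parameters(), lr=lr)

    def dv_bound():
        # log E_Q[exp(T)] computed stably as logsumexp(T) - log N
        log_mean_exp_q = (torch.logsumexp(critic(x_q), dim=0).squeeze()
                          - math.log(len(x_q)))
        return critic(x_p).mean() - log_mean_exp_q

    for _ in range(epochs):
        opt.zero_grad()
        (-dv_bound()).backward()             # ascend the lower bound
        opt.step()
    with torch.no_grad():
        return dv_bound().item()

if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy check: 10-dimensional Gaussians differing in mean, where the
    # true KL divergence is 10 * 0.5 * 0.5**2 = 1.25.
    x_p = torch.randn(2000, 10) + 0.5        # samples from P
    x_q = torch.randn(2000, 10)              # samples from Q
    print("Estimated KL(P || Q):", dv_kl_estimate(x_p, x_q))
```

A two-sample test statistic can then be built from the estimated divergence, with its null distribution obtained, for example, by recomputing the estimate under random permutations of the pooled samples.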