AutoML Two-Sample Test
Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery as well as to detect distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard supervised learning frameworks, whose usage can require spec...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Two-sample tests are important in statistics and machine learning, both as
tools for scientific discovery as well as to detect distribution shifts. This
led to the development of many sophisticated test procedures going beyond the
standard supervised learning frameworks, whose usage can require specialized
knowledge about two-sample testing. We use a simple test that takes the mean
discrepancy of a witness function as the test statistic and prove that
minimizing a squared loss leads to a witness with optimal testing power. This
allows us to leverage recent advancements in AutoML. Without any user input
about the problems at hand, and using the same method for all our experiments,
our AutoML two-sample test achieves competitive performance on a diverse
distribution shift benchmark as well as on challenging two-sample testing
problems.
We provide an implementation of the AutoML two-sample test in the Python
package autotst. |
---|---|
DOI: | 10.48550/arxiv.2206.08843 |