Early Stopping Based on Repeated Significance
Format: | Article |
---|---|
Language: | English |
Abstract: | For a bucket test with a single criterion for success and a fixed number of
samples or testing period, requiring a $p$-value less than a specified value of
$\alpha$ for the success criterion produces statistical confidence at level $1
- \alpha$. For multiple criteria, a Bonferroni correction that partitions
$\alpha$ among the criteria produces statistical confidence, at the cost of
requiring lower $p$-values for each criterion. The same concept can be applied
to decisions about early stopping, but that can lead to strict requirements for
$p$-values. We show how to address that challenge by requiring criteria to be
successful at multiple decision points. |
---|---|
DOI: | 10.48550/arxiv.2408.00908 |
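
The abstract notes that a Bonferroni correction partitions $\alpha$ among the criteria, and that applying the same idea to early-stopping decisions can make the required $p$-values strict. The following is a minimal sketch of that arithmetic only, assuming an even split of $\alpha$ over every (criterion, decision point) pair. It does not implement the article's proposed method, and the function names `bonferroni_threshold` and `all_criteria_pass` are hypothetical, chosen for illustration.

```python
def bonferroni_threshold(alpha, num_criteria, num_decision_points=1):
    """Per-test p-value threshold under an even Bonferroni split.

    Assumption: alpha is divided equally among all
    (criterion, decision point) pairs. Uneven splits are also
    valid Bonferroni partitions but are not shown here.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if num_criteria < 1 or num_decision_points < 1:
        raise ValueError("counts must be positive integers")
    return alpha / (num_criteria * num_decision_points)


def all_criteria_pass(p_values, threshold):
    """True only if every criterion's p-value is below the threshold."""
    return all(p < threshold for p in p_values)


# Fixed-length test, one criterion: the threshold is alpha itself.
single = bonferroni_threshold(0.05, num_criteria=1)

# Two criteria checked at five decision points: the threshold shrinks
# by a factor of ten, which is the strictness the abstract refers to.
strict = bonferroni_threshold(0.05, num_criteria=2, num_decision_points=5)
```

With two criteria and five decision points, each individual $p$-value must fall below roughly one tenth of the original $\alpha$, which illustrates why a plain partition becomes demanding as decision points are added.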