Can expected error costs justify testing a hypothesis at multiple alpha levels rather than searching for an elusive optimal alpha?


Bibliographic Details
Published in: PLoS ONE 2024-09, Vol. 19 (9), p. e0304675
Author: Aisbett, Janet
Format: Article
Language: English
Online Access: Full text
Description
Summary: Simultaneous testing of one hypothesis at multiple alpha levels can be performed within a conventional Neyman-Pearson framework. This is achieved by treating the hypothesis as a family of hypotheses, each member of which explicitly concerns test level as well as effect size. Such testing encourages researchers to think about error rates and strength of evidence in both the statistical design and reporting stages of a study. Here, we show that these multi-alpha level tests can deliver acceptable expected total error costs. We first present formulas for expected error costs from single alpha and multiple alpha level tests, given prior probabilities of effect sizes that have either dichotomous or continuous distributions. Error costs are tied to decisions, with different decisions assumed for each of the potential outcomes in the multi-alpha level case. Expected total costs for tests at single and multiple alpha levels are then compared with optimal costs. This comparison highlights how sensitive optimization is to estimated error costs and to assumptions about prevalence. Testing at multiple default thresholds removes the need to formally identify decisions, or to model costs and prevalence as required in optimization approaches. Although total expected error costs with this approach will not be optimal, our results suggest they may be lower, on average, than when "optimal" test levels are based on mis-specified models.
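The expected-cost comparison the summary describes can be sketched with a standard dichotomous-prior formulation: E[cost] = P(H0)·c1·α + P(H1)·c2·β(α), where β(α) is the Type II error rate at level α. The sketch below is illustrative only and not the paper's actual formulas; the one-sided z-test setting, the equal unit costs, the 50% prior, and the trio of default alpha levels are all assumptions chosen for the example.

```python
from statistics import NormalDist  # stdlib normal distribution

def expected_cost(alpha, prior_null, c_type1, c_type2, effect, n):
    """Expected total error cost for a one-sided z-test of H0: mu = 0
    against H1: mu = effect (known sd = 1), under a dichotomous prior.
    All parameter values here are hypothetical, for illustration."""
    z_crit = NormalDist().inv_cdf(1 - alpha)            # critical value at level alpha
    beta = NormalDist().cdf(z_crit - effect * n ** 0.5) # Type II error rate
    return prior_null * c_type1 * alpha + (1 - prior_null) * c_type2 * beta

# Single conventional level versus an average over multiple default levels
single = expected_cost(0.05, prior_null=0.5, c_type1=1.0, c_type2=1.0,
                       effect=0.5, n=30)
multi = sum(expected_cost(a, 0.5, 1.0, 1.0, 0.5, 30)
            for a in (0.005, 0.05, 0.5)) / 3
```

Varying `prior_null` or the cost ratio `c_type2 / c_type1` shows how quickly the cost-minimizing alpha moves, which is the sensitivity to mis-specified models that the summary highlights.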
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0304675