Permutation tests are robust and powerful at 0.5% and 5% significance levels
Published in: | Behavior Research Methods 2021-12, Vol.53 (6), p.2712-2724 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Recent replication crisis has led to a number of ad hoc suggestions to decrease the chance of making false positive findings. Among them, Johnson (Proceedings of the National Academy of Sciences, 110, 19313–19317, 2013) and Benjamin et al. (Nature Human Behaviour, 2, 6–10, 2018) recommend using the significance level of α = 0.005 (0.5%) as opposed to the conventional 0.05 (5%) level. Even though their suggestion is easy to implement, it is unclear whether or not the commonly used statistical tests are robust and/or powerful at such a small significance level. Therefore, the main aim of our study is to investigate the robustness and power curve behaviors of independent (unpaired) two-sample tests for metric and ordinal data at nominal significance levels of α = 0.005 and α = 0.05. Through an extensive simulation study, it is found that the permutation versions of the Welch t-test and the Brunner-Munzel test are particularly robust and powerful while the commonly used two-sample tests which utilize t-distribution tend to be either liberal or conservative, and have peculiar power curve behaviors under skewed distributions with variance heterogeneity. |
---|---|
ISSN: | 1554-3528 |
DOI: | 10.3758/s13428-021-01595-5 |