Type I and Type II error concerns in fMRI research: re-balancing the scale

Statistical thresholding (i.e. P-values) in fMRI research has become increasingly conservative over the past decade in an attempt to diminish Type I errors (i.e. false alarms) to a level traditionally allowed in behavioral science research. In this article, we examine the unintended negative consequ...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Social cognitive and affective neuroscience 2009-12, Vol.4 (4), p.423-428
Hauptverfasser:	Lieberman, Matthew D., Cunningham, William A.
Format:	Artikel
Sprache:	eng
Schlagworte:	Brain - blood supply Brain Mapping Data Interpretation, Statistical False Positive Reactions Humans Image Interpretation, Computer-Assisted Magnetic Resonance Imaging Meta-Analysis as Topic Oxygen - blood Reproducibility of Results Research Design Sample Size Tools of the Trade
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Statistical thresholding (i.e. P-values) in fMRI research has become increasingly conservative over the past decade in an attempt to diminish Type I errors (i.e. false alarms) to a level traditionally allowed in behavioral science research. In this article, we examine the unintended negative consequences of this single-minded devotion to Type I errors: increased Type II errors (i.e. missing true effects), a bias toward studying large rather than small effects, a bias toward observing sensory and motor processes rather than complex cognitive and affective processes and deficient meta-analyses. Power analyses indicate that the reductions in acceptable P-values over time are producing dramatic increases in the Type II error rate. Moreover, the push for a mapwide false discovery rate (FDR) of 0.05 is based on the assumption that this is the FDR in most behavioral research; however, this is an inaccurate assessment of the conventions in actual behavioral research. We report simulations demonstrating that combined intensity and cluster size thresholds such as P < 0.005 with a 10 voxel extent produce a desirable balance between Types I and II error rates. This joint threshold produces high but acceptable Type II error rates and produces a FDR that is comparable to the effective FDR in typical behavioral science articles (while a 20 voxel extent threshold produces an actual FDR of 0.05 with relatively common imaging parameters). We recommend a greater focus on replication and meta-analysis rather than emphasizing single studies as the unit of analysis for establishing scientific truth. From this perspective, Type I errors are self-erasing because they will not replicate, thus allowing for more lenient thresholding to avoid Type II errors.
ISSN:	1749-5016 1749-5024
DOI:	10.1093/scan/nsp052