A common misapplication of statistical inference: Nuisance control with null-hypothesis significance tests

Bibliographic details
Published in:	Brain and Language, 2016-11, Vol. 162, p. 42-45
Main authors:	Sassenhagen, Jona; Alday, Phillip M.
Format:	Article
Language:	English
Subjects:	Behavioral Research - methods; Cognition; Confounding Factors (Epidemiology); Humans; Inference; Language; Miscommunication; Models, Statistical; Statistical inference; Word frequency; Word length
Online access:	Full text

Description
Summary:
• Researchers use statistical tests of stimulus or subject characteristics to “control for confounds”.
• This practice is conceptually misguided and pragmatically useless.
• We discuss the problem and alternatives.

Experimental research on behavior and cognition frequently rests on stimulus or subject selection where not all characteristics can be fully controlled, even when attempting strict matching. For example, when contrasting patients with controls, variables such as intelligence or socioeconomic status are often correlated with patient status. Similarly, when presenting word stimuli, variables such as word frequency are often correlated with the primary variables of interest. One procedure very commonly employed to control for such nuisance effects is to conduct inferential tests on confounding stimulus or subject characteristics. For example, if word length does not differ significantly between two stimulus sets, the sets are considered matched for word length. Such a test has high error rates and is conceptually misguided. It reflects a common misunderstanding of statistical tests: interpreting significance as referring not to inference about a particular population parameter, but to (1) the sample in question and (2) the practical relevance of a sample difference (so that a nonsignificant test is taken as evidence for the absence of relevant differences). We show inferential testing for assessing nuisance effects to be inappropriate both pragmatically and philosophically, present a survey showing its high prevalence, and briefly discuss an alternative in the form of regression including nuisance variables.
ISSN:	0093-934X 1090-2155
DOI:	10.1016/j.bandl.2016.08.001
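
Illustration (not part of the catalog record or the article): the summary names regression including nuisance variables as an alternative to testing whether stimulus sets differ on a confound. The following is a minimal sketch of that idea on simulated data, using only the Python standard library. All numbers, variable names (`word_length`, `rt`), and the helper functions are assumptions made up for illustration; the simulated reaction time depends on word length only, with a true condition effect of zero, and the sample is deliberately large so the estimates are stable.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y_i for r, y_i in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Hypothetical simulation: two stimulus sets whose words differ slightly in
# length. Reaction time depends on length, NOT on condition (true effect = 0).
random.seed(1)
n_per_set = 2000
condition, length, rt = [], [], []
for cond in (0, 1):
    for _ in range(n_per_set):
        word_length = random.gauss(5.0 + 0.5 * cond, 1.5)
        condition.append(cond)
        length.append(word_length)
        rt.append(400.0 + 20.0 * word_length + random.gauss(0.0, 10.0))

# Naive estimate: raw difference in mean reaction time between the two sets.
naive_effect = (
    mean([t for t, c in zip(rt, condition) if c == 1])
    - mean([t for t, c in zip(rt, condition) if c == 0])
)

# Adjusted estimate: regress rt on condition AND the nuisance variable.
design = [[1.0, float(c), l] for c, l in zip(condition, length)]
intercept, adjusted_effect, length_slope = ols(design, rt)

print(f"naive condition effect:    {naive_effect:.2f}")
print(f"adjusted condition effect: {adjusted_effect:.2f}")
print(f"estimated length slope:    {length_slope:.2f}")
```

In this sketch the naive difference absorbs the word-length imbalance, while the coefficient for condition in the regression is estimated with word length held constant. In practice a statistics package would replace the hand-written solver; it is written out here only to keep the example dependency-free.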