Précis of Statistical significance: Rationale, validity, and utility

Siu L. Chow

doi:10.1017/S0140525X98001162

Abstract

The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.
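
The contrast the abstract draws between one and two sampling distributions can be illustrated with a minimal sketch. It is not taken from the article: it assumes a one-tailed, one-sample z-test with known unit variance, and the function names (`critical_value`, `is_significant`, `power`) are hypothetical. The significance decision consults only the distribution under H0, whereas a power figure cannot be computed until a second distribution, centered on an assumed effect size, is supplied.

```python
from statistics import NormalDist

# Sampling distribution of the z statistic under H0 (an assumption of this
# sketch: standard normal, i.e. known unit variance).
H0 = NormalDist(0.0, 1.0)


def critical_value(alpha: float = 0.05) -> float:
    """One-tailed critical z. Depends only on the H0 distribution."""
    return H0.inv_cdf(1.0 - alpha)


def is_significant(z_observed: float, alpha: float = 0.05) -> bool:
    """Significance decision: uses the single H0 distribution and nothing else."""
    return z_observed >= critical_value(alpha)


def power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Power needs a second distribution, centered at effect_size * sqrt(n).

    effect_size is a hypothetical standardized mean difference that the
    analyst must assume; it plays no role in is_significant().
    """
    h1 = NormalDist(effect_size * n ** 0.5, 1.0)
    return 1.0 - h1.cdf(critical_value(alpha))


if __name__ == "__main__":
    print(is_significant(2.0))      # decision made without any H1
    print(power(0.5, 25))           # requires an assumed effect size and n
```

Under these assumptions, `power(0.0, n)` reduces to `alpha` for any `n`, which is one way to see that the power value is a property of the assumed alternative rather than of the significance decision itself.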
