
Assessing the reliability of textbook data in syntax: Adger's Core Syntax



There has been a consistent pattern of criticism of the reliability of acceptability judgment data in syntax for at least 50 years (e.g., Hill 1961), culminating in several high-profile criticisms within the past ten years (Edelman & Christiansen 2003, Ferreira 2005, Wasow & Arnold 2005, Gibson & Fedorenko 2010, in press). The fundamental claim of these critics is that traditional acceptability judgment collection methods, which tend to be relatively informal compared to methods from experimental psychology, lead to an intolerably high number of false positive results. In this paper we empirically assess this claim by formally testing all 469 (unique, US-English) data points from a popular syntax textbook (Adger 2003) using 440 naïve participants, two judgment tasks (magnitude estimation and yes–no), and three different types of statistical analyses (standard frequentist tests, linear mixed effects models, and Bayes factor analyses). The results suggest that the maximum discrepancy between traditional methods and formal experimental methods is 2%. This suggests that even under the (likely unwarranted) assumption that the discrepant results are all false positives that have found their way into the syntactic literature due to the shortcomings of traditional methods, the minimum replication rate of these 469 data points is 98%. We discuss the implications of these results for questions about the reliability of syntactic data, as well as the practical consequences of these results for the methodological options available to syntacticians.
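The arithmetic linking the 2% maximum discrepancy to the 98% minimum replication rate can be sketched as follows. This is an illustrative Python sketch, not the authors' analysis: the function names are hypothetical, and the worst-case item count is only what a 2% ceiling implies for 469 data points, not a count reported in the abstract.

```python
import math


def max_discrepant_items(n_items: int, max_rate: float) -> int:
    """Largest whole number of items whose share does not exceed max_rate."""
    return math.floor(n_items * max_rate)


def min_replication_rate(n_items: int, n_discrepant: int) -> float:
    """Replication rate if every discrepant item is counted as a false positive."""
    if not 0 <= n_discrepant <= n_items:
        raise ValueError("n_discrepant must lie between 0 and n_items")
    return 1.0 - n_discrepant / n_items


N_ITEMS = 469            # data points tested, as stated in the abstract
MAX_DISCREPANCY = 0.02   # maximum discrepancy, as stated in the abstract

# Assumption: derived from the two figures above, not a reported count.
worst_case = max_discrepant_items(N_ITEMS, MAX_DISCREPANCY)
rate = min_replication_rate(N_ITEMS, worst_case)

print(f"worst-case discrepant items: {worst_case}")
print(f"minimum replication rate: {rate:.1%}")
```

Treating every discrepancy as a false positive is the most pessimistic reading, so the resulting figure is a lower bound rather than an estimate.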



Authors' addresses: (Sprouse) Department of Cognitive Sciences, University of California, 3151 Social Science Plaza A, Irvine, CA 92697-5100
(Almeida) Department of Linguistics and Languages, Michigan State University, A-621 Wells Hall, East Lansing, MI 48824-1027



This research was supported in part by National Science Foundation grant BCS-0843896 to Jon Sprouse. We would like to thank Carson Schütze, Colin Phillips, James Myers and three anonymous JL referees for helpful comments on earlier drafts. We would also like to thank Andrew Angeles, Melody Chen, and Kevin Proff for their assistance constructing materials. All errors remain our own.



Adger, David. 2003. Core syntax: A Minimalist approach. Oxford: Oxford University Press.
Alexopoulou, Theodora & Keller, Frank. 2007. Locality, cyclicity and resumption: At the interface between the grammar and the human sentence processor. Language 83, 110–160.
Baayen, R. Harald. 2007. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
Baayen, R. Harald, Davidson, Douglas J. & Bates, Douglas M. 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59, 390–412.
Bader, Marcus & Häussler, Jana. 2010. Toward a model of grammaticality judgments. Journal of Linguistics 46, 273–330.
Bard, Ellen G., Robertson, Dan & Sorace, Antonella. 1996. Magnitude estimation of linguistic acceptability. Language 72, 32–68.
Chomsky, Noam. 1955/1957. The logical structure of linguistic theory. New York: Plenum Press.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.
Cohen, Jacob. 1962. The statistical power of abnormal social psychological research: A review. Journal of Abnormal and Social Psychology 65, 145–153.
Cohen, Jacob. 1988. Statistical power analysis for the behavioral sciences, 2nd edn. Hillsdale, NJ: Erlbaum.
Cohen, Jacob. 1992. Statistical power analysis. Current Directions in Psychological Science 1, 98–101.
Cowart, Wayne. 1997. Experimental syntax: Applying objective methods to sentence judgments. Thousand Oaks, CA: Sage.
Culbertson, Jennifer & Gross, Steven. 2009. Are linguists better subjects? British Journal for the Philosophy of Science 60, 721–736.
Culicover, Peter W. & Jackendoff, Ray. 2010. Quantitative methods alone are not enough: Response to Gibson & Fedorenko. Trends in Cognitive Sciences 14, 234–235.
Dąbrowska, Ewa. 2010. Naïve v. expert intuitions: An empirical study of acceptability judgments. The Linguistic Review 27, 1–23.
den Dikken, Marcel, Bernstein, Judy, Tortora, Christina & Zanuttini, Rafaella. 2007. Data and grammar: Means and individuals. Theoretical Linguistics 33, 335–352.
Edelman, Shimon & Christiansen, Morten. 2003. How seriously should we take Minimalist syntax? Trends in Cognitive Sciences 7, 60–61.
Fanselow, Gisbert. 2007. Carrots – perfect as vegetables, but please not as a main dish. Theoretical Linguistics 33, 353–367.
Featherston, Sam. 2005a. Magnitude estimation and what it can do for your syntax: Some wh-constraints in German. Lingua 115, 1525–1550.
Featherston, Sam. 2005b. Universals and grammaticality: Wh-constraints in German and English. Linguistics 43, 667–711.
Featherston, Sam. 2007. Data in generative grammar: The stick and the carrot. Theoretical Linguistics 33, 269–318.
Featherston, Sam. 2008. Thermometer judgments as linguistic evidence. In Riehl, Claudia Maria & Rothe, Astrid (eds.), Was ist linguistische evidenz?, 69–90. Aachen: Shaker Verlag.
Featherston, Sam. 2009. Relax, lean back, and be a linguist. Zeitschrift für Sprachwissenschaft 28, 127–132.
Ferreira, Fernanda. 2005. Psycholinguistics, formal grammars, and cognitive science. The Linguistic Review 22, 365–380.
Gallistel, Randy. 2009. The importance of proving the null. Psychological Review 116, 439–453.
Gibson, Edward. 1991. A computational theory of human linguistic processing: Memory limitations and processing breakdown. Ph.D. dissertation, Carnegie Mellon University.
Gibson, Edward & Fedorenko, Evelina. 2010. Weak quantitative standards in linguistics research. Trends in Cognitive Sciences 14, 233–234.
Gibson, Edward & Fedorenko, Evelina. In press. The need for quantitative methods in syntax and semantics research. Language and Cognitive Processes, doi:10.1080/01690965.2010.515080. Published online by Taylor & Francis, 4 May 2011.
Gibson, Edward, Piantadosi, Steve & Fedorenko, Kristina. 2011. Using Mechanical Turk to obtain and analyze English acceptability judgments. Language and Linguistics Compass 5, 509–524.
Grewendorf, Günter. 2007. Empirical evidence and theoretical reasoning in generative grammar. Theoretical Linguistics 33, 369–381.
Gross, Steven & Culbertson, Jennifer. 2011. Revisited linguistic intuitions. British Journal for the Philosophy of Science 62, 639–656.
Haider, Hubert. 2007. As a matter of facts – comments on Featherston's sticks and carrots. Theoretical Linguistics 33, 381–395.
Hill, Archibald A. 1961. Grammaticality. Word 17, 1–10.
Hofmeister, Philip, Jaeger, T. Florian, Arnon, Inbal, Sag, Ivan A. & Snider, Neal. In press. The source ambiguity problem: Distinguishing the effects of grammar and processing on acceptability judgments. Language and Cognitive Processes, doi: 10.1080/01690965.2011.572401. Published online by Taylor & Francis, 18 October 2011.
Jeffreys, Harold. 1961. Theory of probability. Oxford: Oxford University Press.
Kayne, Richard S. 1983. Connectedness. Linguistic Inquiry 14, 223–249.
Keller, Frank. 2000. Gradience in grammar: Experimental and computational aspects of degrees of grammaticality. Ph.D. dissertation, University of Edinburgh.
Keller, Frank. 2003. A psychophysical law for linguistic judgments. In Alterman, Richard & Kirsh, David (eds.), The 25th Annual Conference of the Cognitive Science Society, 652–657. Boston.
Myers, James. 2009. Syntactic judgment experiments. Language and Linguistics Compass 3, 406–423.
Newmeyer, Frederick J. 2007. Commentary on Sam Featherston, ‘Data in generative grammar: The stick and the carrot’. Theoretical Linguistics 33, 395–399.
Nickerson, Raymond. 2000. Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods 5, 241–301.
Pesetsky, David. 1987. WH-in-situ: Movement and unselective binding. In Reuland, Eric & ter Meulen, Alice G. B. (eds.), The linguistic representation of (in)definiteness, 98–129. Cambridge, MA: MIT Press.
Phillips, Colin. 2009. Should we impeach armchair linguists? In Iwasaki, Shoishi, Hoji, Hajime, Clancy, Patricia & Sohn, Sung-Ock (eds.), Japanese/Korean Linguistics 17. Stanford, CA: CSLI Publications.
Phillips, Colin & Lasnik, Howard. 2003. Linguistics and empirical evidence: Reply to Edelman and Christiansen. Trends in Cognitive Sciences 7, 61–62.
Rouder, Jeffrey N., Speckman, Paul L., Sun, Dongchu, Morey, Richard D. & Iverson, Geoffrey. 2009. Bayesian t-tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review 16, 225–237.
Schütze, Carson. 1996. The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: University of Chicago Press.
Schütze, Carson & Sprouse, Jon. In press. Judgment data. In Sharma, Devyani & Podesva, Rob (eds.), Research methods in linguistics. Cambridge: Cambridge University Press.
Sorace, Antonella & Keller, Frank. 2005. Gradience in linguistic data. Lingua 115, 1497–1524.
Spencer, N. J. 1973. Differences between linguists and nonlinguists in intuitions of grammaticality-acceptability. Journal of Psycholinguistic Research 2, 83–98.
Sprouse, Jon. 2007a. A program for experimental syntax. Ph.D. dissertation, University of Maryland.
Sprouse, Jon. 2007b. Continuous acceptability, categorical grammaticality, and experimental syntax. Biolinguistics 1, 118–129.
Sprouse, Jon. 2008. The differential sensitivity of acceptability to processing effects. Linguistic Inquiry 39, 686–694.
Sprouse, Jon. 2009. Revisiting satiation: Evidence for an equalization response strategy. Linguistic Inquiry 40, 329–341.
Sprouse, Jon. 2011a. A validation of Amazon Mechanical Turk for the collection of acceptability judgments in linguistic theory. Behavior Research Methods 43, 155–167.
Sprouse, Jon. 2011b. A test of the cognitive assumptions of Magnitude Estimation: Commutativity does not hold for acceptability judgments. Language 87, 274–288.
Sprouse, Jon & Almeida, Diogo. 2012. The role of experimental syntax in an integrated cognitive science of language. In Grohmann, Kleanthes & Boeckx, Cedric (eds.), The Cambridge handbook of biolinguistics. Cambridge: Cambridge University Press.
Sprouse, Jon & Almeida, Diogo. 2011. Power in acceptability judgment experiments and the reliability of data in syntax. Ms., University of California, Irvine & Michigan State University.
Sprouse, Jon, Fukuda, Shin, Ono, Hajime & Kluender, Robert. 2011. Grammatical operations, parsing processes, and the nature of wh-dependencies in English and Japanese. Syntax 14, 179–203.
Sprouse, Jon, Schütze, Carson & Almeida, Diogo. 2011. Assessing the reliability of journal data in syntax: Linguistic Inquiry 2001–2010. Ms., University of California, Irvine; University of California, Los Angeles & Michigan State University.
Sprouse, Jon, Wagers, Matt & Phillips, Colin. 2012. A test of the relation between working memory capacity and island effects. Language 88.1.
Stevens, Stanley Smith. 1957. On the psychophysical law. Psychological Review 64, 153–181.
Wasow, Thomas & Arnold, Jennifer. 2005. Intuitions in linguistic argumentation. Lingua 115, 1481–1496.
Weskott, Thomas & Fanselow, Gisbert. 2011. On the informativity of different measures of linguistic acceptability. Language 87, 249–273.
Wetzels, Ruud, Raaijmakers, Jeroen G. W., Jakab, Emöke & Wagenmakers, Eric-Jan. 2009. How to quantify support for and against the null hypothesis: A flexible WinBUGS implementation of a default Bayesian t-test. Psychonomic Bulletin & Review 16, 752–760.
