An Assessment of Designs, Analyses, and Reporting Practices in Quantitative L2 Research

Published online by Cambridge University Press:  23 August 2013

Luke Plonsky*
Northern Arizona University
*Correspondence concerning this article should be addressed to Luke Plonsky, PO Box 6032, Flagstaff, AZ 86011. E-mail:


This study assesses research and reporting practices in quantitative second language (L2) research. A sample of 606 primary studies, published from 1990 to 2010 in Language Learning and Studies in Second Language Acquisition, was collected and coded for designs, statistical analyses, reporting practices, and outcomes (i.e., effect sizes). The results point to several systematic strengths as well as many flaws, such as a lack of control in experimental designs, incomplete and inconsistent reporting practices, and low statistical power. I discuss these trends, strengths, and weaknesses in comparison with methodological reviews of L2 research (e.g., Plonsky & Gass, 2011) as well as reviews from other fields (e.g., education, Skidmore & Thompson, 2010). On the basis of the findings, I also make a number of suggestions for methodological reforms in applied linguistics.

Copyright © Cambridge University Press 2013 


Aguinis, H., Pierce, C. A., Bosco, F. A., & Muslin, I. S. (2009). First decade of Organizational Research Methods: Trends in design, measurement, and data-analysis topics. Organizational Research Methods, 12, 69–112.
American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.
Bangert, A. W., & Baumberger, J. P. (2005). Research and statistical techniques used in the Journal of Counseling & Development. Journal of Counseling & Development, 83, 480–487.
Brutus, S., Gill, H., & Duniewicz, K. (2010). State-of-science in industrial and organizational psychology: A review of self-reported limitations. Personnel Psychology, 63, 907–936.
Campbell, D., & Stanley, J. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand-McNally.
Cashen, L. H., & Geiger, S. W. (2004). Statistical power and the testing of null hypotheses: A review of contemporary management research and recommendations for future studies. Organizational Research Methods, 7, 151–167.
Chan, A.-W., Hróbjartsson, A., Haahr, M. T., Gøtzsche, P. C., & Altman, D. G. (2004). Empirical evidence for selective reporting of outcomes in randomized trials. Journal of the American Medical Association, 291, 2457–2465.
Chaudron, C. (1986). The interaction of quantitative and qualitative approaches to research: A view of the second language classroom. TESOL Quarterly, 20, 709–717.
Chaudron, C. (2001). Progress in language classroom research: Evidence from The Modern Language Journal, 1916–2000. Modern Language Journal, 85, 57–76.
Cohen, J. (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70, 426–443.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
Crookes, G. (1991). Power, effect size, and second language research: Another researcher comments. TESOL Quarterly, 25, 762–765.
DeKeyser, R., & Schoonen, R. (2007). Editors’ announcement. Language Learning, 57, ix–x.
DeVaney, T. A. (2001). Statistical significance, effect size, and replication: What do the journals say? The Journal of Experimental Education, 69, 310–320.
Dinsmore, T. H. (2006). Principles, parameters, and SLA: A retrospective meta-analytic investigation into adult L2 learners’ access to Universal Grammar. In Norris, J. M. & Ortega, L. (Eds.), Synthesizing research on language learning and teaching (pp. 53–90). Amsterdam: Benjamins.
Downs, S. H., & Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomized and nonrandomized studies of health care interventions. Journal of Epidemiology & Community Health, 52, 377–384.
Egbert, J. (2007). Quality analysis of journals in TESOL and applied linguistics. TESOL Quarterly, 41, 157–171.
Ellis, N. C. (2000). Editorial statement. Language Learning, 50, xi–xiii.
Fish, L. J. (1988). Why multivariate methods are usually vital. Measurement and Evaluation in Counseling and Development, 21, 130–137.
Flahive, D., & Ehlers-Zavala, F. (2010, March). Power analysis in applied linguistics research. Paper presented at the meeting of the American Association for Applied Linguistics, Atlanta, GA.
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378–382.
Gass, S. M. (1993). Second language acquisition: Cross-disciplinary perspectives. Second Language Research, 9, 95–98.
Gass, S. (2009). A survey of SLA research. In Ritchie, W. & Bhatia, T. (Eds.), Handbook of second language acquisition (pp. 3–28). Bingley, UK: Emerald.
Gass, S., Fleck, C., Leder, N., & Svetics, I. (1998). Ahistoricity revisited: Does SLA have a history? Studies in Second Language Acquisition, 20, 407–421.
Gelman, A., Hill, J., & Yajima, M. (2012). Why we (usually) don’t have to worry about multiple comparisons. Journal of Research on Educational Effectiveness, 5, 189–211.
Gelman, A., & Weakliem, D. (2009). Of beauty, sex and power: Too little attention has been paid to the statistical challenges in estimating small effects. American Scientist, 97, 310–316.
Goodwin, L. D., & Goodwin, W. L. (1985). An analysis of statistical techniques used in the Journal of Educational Psychology, 1979–1983. Educational Psychologist, 20, 13–21.
Hatch, E. (1978). Apply with caution. Studies in Second Language Acquisition, 2, 123–143.
Hatch, E., & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. Boston: Heinle & Heinle.
Hauser, E. (2001, October). The statistical power of second language acquisition research: A review. Paper presented at the Pacific Second Language Research Forum, University of Hawai‘i at Mānoa.
Henning, G. (1986). Quantitative methods in language acquisition research. TESOL Quarterly, 20, 701–708.
Humphreys, L. G. (1978). Doing research the hard way: Substituting analysis of variance for a problem in correlational analysis. Journal of Educational Psychology, 70, 873–876.
Journal Article Reporting Standards Working Group. (2008). Reporting standards for research in psychology: Why do we need them? What might they be? American Psychologist, 63, 839–851.
Keselman, H. J., Huberty, C. J., Lix, L. M., Olejnik, S., Cribbie, R. A., Donahue, B., . . . Levin, J. R. (1998). Statistical practices of educational researchers: An analysis of their ANOVA, MANOVA, and ANCOVA analyses. Review of Educational Research, 68, 350–386.
Kieffer, K. M., Reese, R. J., & Thompson, B. (2001). Statistical techniques employed in AERJ and JCP articles from 1988 to 1997: A methodological review. The Journal of Experimental Education, 69, 280–309.
Kubanyiova, M. (2008). Rethinking research ethics in contemporary applied linguistics: The tension between macroethical and microethical perspectives in situated research. Modern Language Journal, 92, 503–518.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. London: Routledge.
Larson-Hall, J., & Herrington, R. (2010). Improving data analysis in second language acquisition by utilizing modern developments in applied statistics. Applied Linguistics, 31, 368–390.
Lazaraton, A. (1991). Power, effect size, and second language research: A researcher comments. TESOL Quarterly, 25, 759–762.
Lazaraton, A. (2000). Current trends in research methodology and statistics in applied linguistics. TESOL Quarterly, 34, 175–181.
Lazaraton, A. (2005). Quantitative research methods. In Hinkel, E. (Ed.), Handbook of research in second language teaching and learning (pp. 209–224). Mahwah, NJ: Erlbaum.
Lazaraton, A., Riggenbach, H., & Ediger, A. (1987). Forming a discipline: Applied linguists’ literacy in research methodology and statistics. TESOL Quarterly, 21, 263–277.
Lee, J. (2010). Integrating second language empirical evidence in theory construction: Unaccusativity as a dichotomy versus a continuum. Eoneohag, 56, 67–86.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60, 309–365.
Lightbown, P. M. (2000). Anniversary article: Classroom second language research and second language teaching. Applied Linguistics, 21, 431–462.
Loewen, S. (2005). Incidental focus on form and second language learning. Studies in Second Language Acquisition, 27, 361–386.
Loewen, S., & Gass, S. (2009). The use of statistics in L2 acquisition research. Language Teaching, 42, 181–196.
Loewen, S., Lavolette, E., Spino, L., Papi, M., Schmidtke, J., Sterling, S., & Wolff, D. (in press). A discipline formed?: An update on applied linguists’ statistical literacy. TESOL Quarterly.
Lykken, D. E. (1968). Statistical significance in psychological research. Psychological Bulletin, 70, 151–159.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language Learning, 59, 453–498.
Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and design. Mahwah, NJ: Erlbaum.
Mackey, A., & Gass, S. M. (Eds.). (2012). Research methods in second language acquisition: A practical guide. Oxford: Wiley-Blackwell.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In Mackey, A. (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 407–449). Oxford: Oxford University Press.
Magnan, S. S. (1994). From the editor: The MLJ tradition and the challenges ahead. Modern Language Journal, 78, 79.
Magnan, S. S. (2007). Commentary: The promise of digital scholarship in SLA research and language pedagogy. Language Learning & Technology, 11, 152–155.
Matrixx Initiatives, Inc. v. Siracusano, No. 09–1156 (9th Cir. Mar. 22, 2011).
Matthews, M. S., Gentry, M., McCoach, D. B., Worrell, F. C., Matthews, D., & Dixon, F. (2008). Evaluating the state of a field: Effect size reporting in gifted education. The Journal of Experimental Education, 77, 55–65.
Meier, S. T., & Davis, S. R. (1990). Trends in reporting psychometric properties of scales used in counseling psychology research. Journal of Counseling Psychology, 37, 113–115.
Mone, M. A., Mueller, G. C., & Mauland, W. (1996). The perceptions and usage of statistical power in applied psychology and management research. Personnel Psychology, 49, 103–120.
Nassaji, H. (2012). Significance tests and generalizability of research results: A case for replication. In Porte, G. (Ed.), Replication research in applied linguistics (pp. 92–115). New York: Cambridge University Press.
Nekrasova, T., & Becker, T. (2009). Effectiveness of practice: A research synthesis and quantitative meta-analysis. Manuscript in preparation.
Nicoladis, E., & Krott, A. (2007). Word family size and French-speaking children’s segmentation of existing compounds. Language Learning, 57, 201–228.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.
Norris, J. M., & Ortega, L. (2003). Defining and measuring SLA. In Doughty, C. J. & Long, M. H. (Eds.), The handbook of second language acquisition (pp. 717–761). Oxford: Blackwell.
Norris, J. M., & Ortega, L. (2006). The value and practice of research synthesis for language learning and teaching. In Norris, J. M. & Ortega, L. (Eds.), Synthesizing research on language learning and teaching (pp. 3–50). Amsterdam: Benjamins.
Norris, J. M., & Ortega, L. (2012). Assessing learner knowledge. In Gass, S. M. & Mackey, A. (Eds.), The Routledge handbook of second language acquisition (pp. 573–589). London: Routledge.
Nunan, D. (1991). Methods in second language classroom-oriented research: A critical review. Studies in Second Language Acquisition, 13, 249–274.
Nunan, D. (1996). Issues in second language acquisition research: Examining substance and procedure. In Ritchie, W. C. & Bhatia, T. K. (Eds.), The handbook of second language acquisition (pp. 349–374). San Diego, CA: Academic Press.
Ortega, L. (2005). Methodology, epistemology, and ethics in instructed SLA research: An introduction. Modern Language Journal, 89, 317–327.
Ortega, L. (2009). Understanding second language acquisition. London: Hodder.
Ortega, L. (2012). Language acquisition research for language teaching: Choosing between application and relevance. In Hinger, B., Newby, D., & Unterrainer, E. M. (Eds.), Sprachen lernen: Kompetenzen entwickeln? Performanzen (über)prüfen [Language learning: Developing competency? (Re)assessing performances] (pp. 24–38). Vienna: Präsens Verlag.
Oswald, F. L., & Plonsky, L. (2010). Meta-analysis in second language research: Choices and challenges. Annual Review of Applied Linguistics, 30, 85–110.
Pica, T. (1997). Second language teaching and research relationships: A North American view. Language Teaching Research, 1, 48–72.
Pigott, T. D. (2009). Handling missing data. In Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.), The handbook of research synthesis (2nd ed., pp. 399–416). New York: Russell Sage Foundation.
Plonsky, L. (2009, October). “Nix the null”: Why statistical significance is overrated. Paper presented at the Second Language Research Forum, East Lansing, MI.
Plonsky, L. (2011a). The effectiveness of second language strategy instruction: A meta-analysis. Language Learning, 61, 993–1038.
Plonsky, L. (2011b). Study quality in SLA: A cumulative and developmental assessment of designs, analyses, reporting practices, and outcomes in quantitative L2 research (Unpublished doctoral dissertation). Michigan State University, East Lansing.
Plonsky, L. (2012). Replication, meta-analysis, and generalizability. In Porte, G. (Ed.), Replication research in applied linguistics (pp. 116–132). New York: Cambridge University Press.
Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: The case of interaction research. Language Learning, 61, 325–366.
Plonsky, L., & Oswald, F. L. (2012). How to do a meta-analysis. In Mackey, A. & Gass, S. (Eds.), Research methods in second language acquisition: A practical guide (pp. 275–295). Oxford: Wiley-Blackwell.
Polio, C. (1997). Measures of linguistic accuracy in second language writing research. Language Learning, 47, 101–143.
Polio, C. (2012). Replication in published applied linguistics research: An historical perspective. In Porte, G. (Ed.), Replication research in applied linguistics (pp. 47–91). New York: Cambridge University Press.
Porte, G. (2010). Appraising research in second language learning: A practical approach to critical analysis of quantitative research (2nd ed.). Amsterdam: Benjamins.
Pulido, D. (2004). The relationship between text comprehension and second language incidental vocabulary acquisition: A matter of topic familiarity? Language Learning, 54, 469–523.
Raykov, T., & Marcoulides, G. A. (2008). An introduction to applied multivariate analysis. New York: Taylor & Francis.
Read, J. (2007). Towards a new collaboration: Research in SLA and language testing. New Zealand Studies in Applied Linguistics, 13, 22–35.
Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acquisition of L2 grammar: A meta-analysis of the research. In Norris, J. M. & Ortega, L. (Eds.), Synthesizing research on language learning and teaching (pp. 133–164). Amsterdam: Benjamins.
Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: Implications for training researchers. Psychological Methods, 1, 115–129.
Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105, 309–316.
Selinker, L., & Lakshmanan, U. (2001). How do we know what we know? Why do we believe what we believe? Second Language Research, 17, 323–325.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Skidmore, S. T., & Thompson, B. (2010). Statistical techniques used in published articles: A historical review of reviews. Educational and Psychological Measurement, 70, 777–795.
Smith, B., & Lafford, B. A. (2009). The evaluation of scholarly activity in computer-assisted language learning. Modern Language Journal, 93, 868–883.
Sun, S., Pan, W., & Wang, L. L. (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102, 989–1004.
Teleni, V., & Baldauf, R. B. (1989). Statistical techniques used in three applied linguistics journals: Language Learning, Applied Linguistics, and TESOL Quarterly, 1980–1986: Implications for readers and researchers. Retrieved from ERIC database. (ED312905).
Thompson, B. (2001). Significance, effect sizes, stepwise methods, and other issues: Strong arguments move the field. The Journal of Experimental Education, 70, 80–93.
Thompson, B., & Snyder, P. A. (1998). Statistical significance and reliability analyses in recent JCD research articles. Journal of Counseling and Development, 76, 436–441.
Vacha-Haase, T., Ness, C., Nilsson, J., & Reetz, D. (1999). Practices regarding reporting of reliability coefficients: A review of three journals. The Journal of Experimental Education, 67, 335–341.
Vacha-Haase, T., & Thompson, B. (2004). How to estimate and interpret various effect sizes. Journal of Counseling Psychology, 51, 473–481.
Valdman, A. (1998). A note from the editor: 20th anniversary of SSLA. Studies in Second Language Acquisition, 20, 463–470.
Valentine, J. C., & Cooper, H. (2008). A systematic and transparent approach for assessing the methodological quality of intervention effectiveness research: The study design and implementation assessment device (Study DIAD). Psychological Methods, 13, 130–149.
VanPatten, B., & Williams, J. (2002). Research criteria for tenure in second language acquisition: Results from a survey of the field. Unpublished manuscript, University of Illinois at Chicago.
Wa-Mbaleka, S. (2006). A meta-analysis investigating the effects of reading on second language vocabulary learning (Unpublished doctoral dissertation). Northern Arizona University, Flagstaff.
Waring, H. Z. (2009). Moving out of IRF (initiation-response-feedback): A single case analysis. Language Learning, 59, 796–824.
Wells, C. S., & Hintze, J. M. (2007). Dealing with assumptions underlying statistical tests. Psychology in the Schools, 44, 495–502.
Wells, K., & Littell, J. H. (2009). Study quality assessment in systematic reviews of research on intervention effects. Research on Social Work Practice, 19, 52–62.
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.
Willson, V. L. (1980). Research techniques in AERJ articles: 1969 to 1978. Educational Researcher, 9, 5–10.