Abbott, M. L. (2005). English reading strategies: Differences in Arabic and Mandarin speaker performance on the CLBA reading assessment (doctoral dissertation). Retrieved from Theses Canada (32659077).
Abbott, M. L. (2007). A confirmatory approach to differential item functioning on an ESL reading assessment. Language Testing 24.1, 1–30.
Alderson, C. (2007). The challenge of (diagnostic) testing: Do we know what we are measuring? In Fox, J., Wesche, M., Bayliss, D., Cheng, L., Turner, C. & Doe, C. (eds.), Language testing reconsidered. Ottawa, ON: University of Ottawa Press, 21–39.
Baba, K. (2007). Dimensions of lexical proficiency in writing summaries for an English as a foreign language test (doctoral dissertation). Retrieved from Theses Canada (33748472).
Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing 17.1, 1–42.
Bachman, L. F. (2007). What is the construct? The dialectic of abilities and contexts in defining constructs in language assessment. In Fox, J., Wesche, M., Bayliss, D., Cheng, L., Turner, C. & Doe, C. (eds.), Language testing reconsidered. Ottawa, ON: University of Ottawa Press, 41–71.
Bachman, L. F. & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Baker, B. A. (2010). In the service of the stakeholder: A critical, mixed methods program of research in high-stakes language assessment (doctoral dissertation). Retrieved from ProQuest (NR74366).
Barkaoui, K. (2007). Participants, texts, and processes in second language writing assessment: A narrative review of the literature. The Canadian Modern Language Review 64, 97–132.
Barkaoui, K. (2008). Effects of scoring method and rater experience on ESL essay rating processes and outcomes (doctoral dissertation). Retrieved from Theses Canada (35096546).
Bogdan, R. C. & Biklen, S. K. (1998). Qualitative research in education. Boston, MA: Allyn and Bacon.
Brumfit, C. (1997). How applied linguistics is the same as any other science. International Journal of Applied Linguistics 7.1, 86–94.
Cervatiuc, A. (2007). Highly proficient adult non-native English speakers’ perceptions of their second language vocabulary learning process (doctoral dissertation). Retrieved from ProQuest (NR33791).
Cheng, L. (2005). Changing language teaching through language testing: A washback study. Cambridge: Cambridge University Press.
Cheng, L. (2008). Washback, impact and consequences. In Shohamy, E. & Hornberger, N. H. (eds.), Encyclopedia of language and education, Vol. 7: Language testing and assessment. New York: Springer, 349–364.
Cheng, L. & DeLuca, C. (2011). Voices from test-takers: Further evidence for test validation and test use. Educational Assessment 16.2, 104–122.
Cheng, L., Watanabe, Y. & Curtis, A. (eds.) (2004). Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum.
Clandinin, J. & Connelly, M. (2000). Narrative inquiry: Experience and story in qualitative research. San Francisco, CA: Jossey-Bass.
Colby-Kelly, C. & Turner, C. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. The Canadian Modern Language Review 64.1, 9–37.
Colby, D. C. (2010). Using ‘Assessment for learning’ practices with pre-university level ESL students: A mixed methods study of teacher and student performance and beliefs (doctoral dissertation). Retrieved from ProQuest (NR61979).
Creswell, J. W. (1998). Qualitative inquiry and research design: Choosing among five traditions. Thousand Oaks, CA: Sage.
Creswell, J. W. & Plano-Clark, V. L. (2007). Designing and conducting mixed methods research. Thousand Oaks, CA: Sage.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing 7, 31–51.
Doe, C. (2011). The integration of diagnostic assessment into classroom instruction. In Tsagari, D. & Csepes, I. (eds.), Classroom-based language assessment: Language testing and evaluation. Frankfurt: Peter Lang, 63–76.
Douglas, S. R. (2010). Non-native English speaking students at university: Lexical richness and academic success (doctoral dissertation). Retrieved from ProQuest (NR69496).
Eggins, S. (2004). An introduction to systemic functional linguistics (2nd edn). New York: Continuum.
Farnia, F. (2006). Modeling growth in reading fluency and reading comprehension in EL1 and ESL children: A longitudinal individual growth curve analysis from first to sixth grade (doctoral dissertation). Retrieved from Theses Canada (33265070).
Fleming, D. J. (2007). Becoming Canadian: Punjabi ESL learners, national language policy and the Canadian language benchmarks (doctoral dissertation). Retrieved from Theses Canada (33664937).
Fox, J. (2003). From products to process: An ecological approach to bias detection. International Journal of Testing 3.1, 21–48.
Fox, J. (2005). Rethinking second language admission requirements: Problems with language-residency criteria and the need for language assessment and support. Language Assessment Quarterly 2.2, 85–115.
Fox, J. (2009). Moderating top-down policy impact and supporting EAP curricular renewal: Exploring the potential of diagnostic assessment. Journal of English for Academic Purposes 8, 26–42.
Fox, J. & Cheng, L. (2007). Did we take the same test? Differing accounts of the Ontario Secondary School Literacy Test by first and second language test takers. Assessment in Education: Principles, Policy & Practice 14.1, 9–26.
Fox, J. & Hartwick, P. (2011). Taking a diagnostic turn: Reinventing the portfolio in EAP classrooms. In Tsagari, D. & Csepes, I. (eds.), Classroom-based language assessment. Frankfurt: Peter Lang, 47–61.
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating scale construction. Language Testing 13.2, 208–238.
Gao, L. (2007). Cognitive-psychometric modeling of the MELAB reading items (doctoral dissertation). Retrieved from Theses Canada (33905480).
Gao, L. & Rogers, W. T. (2011). Use of tree-based regression in the analyses of L2 reading test items. Language Testing 28, 77–104.
Glaser, B. G. & Strauss, A. L. (1967). The discovery of grounded theory. Chicago, IL: Aldine.
Grabe, W. (2009). Reading in a second language: Moving from theory to practice. New York: Cambridge University Press.
Grabe, W. & Stoller, F. L. (2011). Teaching and researching reading. Harlow, UK: Pearson Education Limited.
Greene, J. C., Caracelli, V. J. & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis 11, 255–274.
Gunning, P. (2011). ESL strategy use and instruction at the elementary school level: A mixed methods investigation (doctoral dissertation). Retrieved from ProQuest (NR77521).
Hamp-Lyons, L. (1990). Second language writing: Assessment issues. In Kroll, B. (ed.), Second language writing: Research insights for the classroom. Cambridge: Cambridge University Press, 69–87.
Hamp-Lyons, L. (1995). Rating non-native writing: The trouble with holistic scoring. TESOL Quarterly 29, 759–762.
Hatch, E. & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. Rowley, MA: Newbury House.
Hutchinson, S. A. (1997). Education and grounded theory. In Sherman, R. & Webb, R. (eds.), Qualitative research in education: Focus and methods. Philadelphia, PA: Falmer Press, 123–140.
Isaacs, T. (2010). Towards defining a valid assessment criterion of pronunciation proficiency in non-native English-speaking graduate students (doctoral dissertation). Retrieved from ProQuest (MR24877).
Isaacs, T. & Trofimovich, P. (2011). Phonological memory, attention control, and musical ability: Effects of individual differences on rater judgments of L2 speech. Applied Psycholinguistics 32, 113–140.
Ishii, D. N. (2009). Language dia-logs: A collaborative approach for providing effective feedback on ESL learners’ verb errors in writing (doctoral dissertation). Retrieved from Theses Canada (37943444).
Kane, M. T. (2002). Validating high-stakes testing programs. Educational Measurement: Issues and Practice 21.1, 31–41.
Kim, Y. (2010). An argument-based validity inquiry into the empirically-derived descriptor-based diagnostic assessment in ESL academic writing. Unpublished doctoral dissertation. University of Toronto.
Kwan, A. B. (2005). Impact of systematic phonics instruction on young children learning English as a second language (doctoral dissertation). Retrieved from Theses Canada (32659359).
Lado, R. (1961). Language testing: The construction and use of foreign language tests. London: Longman.
Leedy, P. (1997). Practical research: Planning and design (6th edn). Upper Saddle River, NJ: Prentice Hall.
Limbos, M. (2005). Early identification of second-language students at risk for reading disability (doctoral dissertation). Retrieved from Theses Canada (32659383).
McKay, P. (2006). Assessing young language learners. Cambridge: Cambridge University Press.
McNamara, T. & Roever, C. (2006). Language testing: The social dimension. Malden, MA: Blackwell Publishing.
Messick, S. (1989). Validity. In Linn, R. L. (ed.), Educational measurement (3rd edn). New York: Macmillan, 13–103.
Messick, S. (1996). Validity and washback in language testing. Language Testing 13, 243–256.
Mislevy, R. J., Steinberg, L. S. & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives 1.1, 3–62.
Morris, L. & Cobb, T. (2004). Vocabulary profiles as predictors of TESL student performance. System 32.1, 75–87.
Moss, P. A., Girard, B. J. & Haniford, L. C. (2006). Validity in educational assessment. Review of Research in Education 30, 109–162.
Mullen, A. (2009). The impact of using a proficiency test as a placement tool: The case of Test of English for International Communication (TOEIC) (doctoral dissertation). Université Laval, QC, Canada.
Neumann, H. (2010). What's in a grade? A mixed methods investigation of teacher assessment of grammatical ability in L2 academic writing (doctoral dissertation). Retrieved from ProQuest (NR77532).
Qi, L. (2007). Is testing an efficient agent for pedagogical change? Examining the intended washback of the writing task in a high-stakes English test in China. Assessment in Education: Principles, Policy & Practice 14.1, 51–74.
Rampton, B. (1997). Retuning in applied linguistics. International Journal of Applied Linguistics 7.1, 3–25.
Samson, M. (2012). What applied linguists do: An investigation of research practices in the field (Master's research essay). Carleton University, Ottawa, Canada.
Seror, J. (2008). Socialization in the margins: Second language writers and feedback practices in university content courses (doctoral dissertation). University of British Columbia, Canada.
Shih, C. M. (2006). Perceptions of the General English Proficiency Test and its washback: A case study of two Taiwan technological institutes (doctoral dissertation). Retrieved from ProQuest (NR16000).
Shohamy, E. & McNamara, T. (2009). Language tests for citizenship, immigration, and asylum. Language Assessment Quarterly 6, 1–5.
Song, Y. H. (2007). A narrative inquiry into classroom assessment: Stories of six Chinese adult learners of English as a second language (doctoral dissertation). Retrieved from Theses Canada (33969044).
Sterzuk, A. (2007). Dialect speakers, academic achievement, and power: First Nations and Métis children in Standard English classrooms (doctoral dissertation). Retrieved from Theses Canada (34491819).
Suzuki, W. (2009). Languaging, direct correction, and second language writing: Japanese university students of English (doctoral dissertation). Retrieved from Theses Canada (37943556).
Tan, H. M. (2009). Changing the language of instruction of mathematics and science in Malaysia: The PPSMI policy and washback effect of bilingual high-stakes secondary school exit exams (doctoral dissertation). Retrieved from Theses Canada (39290869).
Teddlie, C. & Tashakkori, A. (2009). Foundations of mixed methods research: Integrating quantitative and qualitative approaches in the social and behavioral sciences. Thousand Oaks, CA: Sage.
Tesch, R. (1994). The contribution of a qualitative method: Phenomenological research. In Langenbach, M., Vaughan, C. & Aagaard, L. (eds.), An introduction to educational research. Needham Heights, MA: Allyn & Bacon, 143–157.
Turner, C. & Upshur, J. (2002). Rating scales derived from student samples: Effects of the scale maker and student sample on scale content and student scores. TESOL Quarterly 36.1, 49–70.
Wakamoto, N. (2007). The impact of extroversion/introversion and associated learner strategies on English language comprehension in a Japanese EFL setting (doctoral dissertation). Retrieved from Theses Canada (33748547).
Wall, D. (2005). The impact of high-stakes examinations on classroom teaching: A case study using insights from testing and innovation theory. Cambridge, UK: Cambridge University Press.
Wang, J. (2010). A study of the role of the ‘teacher factor’ in washback (doctoral dissertation). Retrieved from ProQuest (NR74872).
Watanabe, Y. (2004). Teacher factors mediating washback. In Cheng, L., Watanabe, Y. & Curtis, A. (eds.), Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum, 129–146.
Weigle, S. C. (1994). Effects of training on raters of ESL compositions. Language Testing 11, 197–223.
Widdowson, H. G. (1998). Retuning, calling the tune, and paying the piper: A reaction to Rampton. International Journal of Applied Linguistics 8.1, 147–151.
Yang, Y. (2008). Corrective feedback and Chinese learners’ acquisition of English past tense (doctoral dissertation). Retrieved from Theses Canada (38060335).
Zheng, Y. (2010). Chinese university students’ motivation, anxiety, global awareness, linguistic confidence, and English test performance: A causal and correlational investigation (doctoral dissertation). Retrieved from Theses Canada (39291111).