
Review of recent empirical research (2011–2018) on language assessment in China

Published online by Cambridge University Press: 15 May 2020

Shangchao Min
Affiliation:
Institute of Applied Linguistics, Zhejiang University, Hangzhou, China
Lianzhen He*
Affiliation:
Institute of Applied Linguistics, Zhejiang University, Hangzhou, China
Jie Zhang
Affiliation:
School of Foreign Language Studies at Shanghai University of Finance and Economics, China
*Corresponding author. E-mail: hlz@zju.edu.cn

Abstract

This article reviews a selected sample of 70 empirical studies on language assessment in China, published as journal articles and doctoral dissertations between 2011 and 2018. Following a brief introduction to the history and current state of language assessment in China, the article presents a critical review of research on the six themes that have attracted the greatest interest among researchers in the country: (1) test reliability and validity; (2) factors affecting test performance; (3) rating and rating scales; (4) technology and language testing; (5) test washback; and (6) classroom-based assessment. In addition to situating the commentary on the studies within the social, cultural and historical contexts of China, the article outlines the scholarly contributions of these studies to the wider international field of language learning, teaching and assessment. It concludes with recommendations on areas in need of further development over the coming decades.

Type
A Country in Focus
Copyright
Copyright © The Author(s), 2020. Published by Cambridge University Press

