Classroom-based assessment

The assessment of students in the classroom has been going on since time immemorial. What is comparatively recent, however, is the systematic study of classroom-based assessment (CBA). The term ‘CBA’ has been putatively linked to Michael Scriven's (1967) work on formative and summative evaluation. However, current interest in such assessment and how it is enacted has, to a large extent, been prompted by shifts in educational policy in various contexts and evolving education systems. This, in turn, has led to the increase in research activity that is detailed in the timeline that follows. At the same time, considerable effort has been exerted by various governments and professional associations into the development of CBA frameworks, but as publications related to these are not strictly research documents, a separate list of examples is provided as supplementary material.


Introduction
Initially, research interest in the nature and enactment of formative CBA came from the field of general education. For example, in the late 1980s, as part of a radical overhaul of the school education system in England that placed emphasis on central government control of the curriculum and of assessment, the Task Group on Assessment and Testing (TGAT) drew up a report that reflected an attempt to, inter alia, reconcile the conflicting demands of high-stakes public reporting of student performance in accordance with statutory curriculum specifications, and educationally-oriented assessment that reports student progression in the light of teaching (and therefore learning) experience (TGAT, 1987). Paragraph 5 of the report stated that the results of assessment 'should provide a basis for decisions about pupils' further learning needs: they should be formative'.
The chair of TGAT, Paul Black, later published a seminal paper on formative assessment with Dylan Wiliam (Black & Wiliam, 1998). This 'state of play' review of research on classroom-based formative assessment covered 681 publications reporting on studies with diverse research designs and methodologies, which investigated formative assessment schemes and practices at different educational levels and in different settings internationally. Some of the studies had experimental and control groups; others were classroom-based and teacher-led. The analysis and commentary in the Black and Wiliam paper foregrounded issues such as the quality of teacher-student interaction, feedback, the role of the student in assessment, and self and peer assessment. Many of these issues resonate with current work in formative assessment, and the authors' list of issues for researchers to take into account when investigating formative assessment is as relevant today as it was then. Many of these issues have been taken up by those researching CBA within the field of languages education, and some are reflected in the present timeline.
Educational reform in England in the 1980s provided the impetus for investigations into how teachers assess in situations where English is used as an additional language, particularly at the primary and secondary school levels. But England was not the only country experiencing educational reform around the turn of the century: notable examples include the introduction of the National Certificate of Educational Achievement in New Zealand in 2002 (see East & Scott, 2011, for a historical account), the introduction of school-based assessment (SBA) as an element of secondary school assessment in Hong Kong from 2001, and, more recently, the introduction of Scottish National Standardised Assessments from 2017. What all these initiatives have in common is a commitment to assessment for learning (AfL) as well as a growing interest in understanding how such assessment is implemented at the classroom level. This, in turn, has exposed tensions between national policy recommendations and the challenges of classroom implementation. It has also drawn attention to the need for improved assessment literacy among teachers, many of whom have continued to rely heavily on summative assessment practices even though the limitations of summative assessment had been unequivocally demonstrated. Wiliam (2001), for example, showed mathematically that high-stakes tests may produce highly unreliable results for individual students even when they are deemed reliable for the group as a whole.
The formative-summative assessment dichotomy was for some time portrayed over-simplistically. Case study research of individual classrooms has shed light on the complexities of formative CBA in general and, more specifically, in language education, which has to work with a variety of pedagogic concepts and learning theories. This kind of research has highlighted the limited capacity of the psychometric paradigm dominant within summative, standardised assessment to account for the unplanned, often context- and content-embedded, implicit assessment that is part and parcel of CBA. Such research has provided a basis for the development of CBA frameworks that take account of teacher and student interactions. Unlike psychometrically oriented assessment, CBA is predominantly located within the classroom, which makes it important to take account of participants' perspectives and understandings in relation to curriculum and assessment requirements.
CBA has developed along two main trajectories: theory building to better understand its conceptual basis (construct) and analysing on a moment-by-moment basis how it is enacted in classrooms at different educational levels and in different world locations. These, therefore, form the main themes of our timeline. In this timeline our working definition of CBA is: Any teacher-led classroom activity designed to find out about students' performance on curriculum tasks that would yield information regarding their understanding as well as their need for further support and scaffolding with reference to their situated learning needs. (We recognise that not all CBA is teacher-led and that self and peer assessment may play an important role, but these aspects of CBA would warrant a timeline in their own right.) We further recognise that there have been shifts in the use of terms such as 'foreign language', 'second language' and 'additional language' in recent times. We will use 'additional/second language' as a catch-all term, but we will use 'foreign' and 'second' where it is representationally important to signal historical accuracy. Our timeline contains works from additional/second language research, and from the field of education more generally where appropriate.
Thus, our selection of themes is as follows:
A understanding of the conceptual basis (construct) of CBA
B implementation and enactment of CBA
Perrenoud, P. (1998). From formative evaluation to a controlled regulation of learning processes: Towards a wider conceptual field. Assessment in Education, 5(1), 85-102.
Perrenoud, engaging the English language literature from a French research perspective, argues that much scholarly work (up to the time of writing) in formative CBA focussed on evaluative feedback, assuming that feedback would promote learning. Perrenoud argues that such an assumption is not necessarily 'safe' as student responses cannot be assumed. While this paper is concerned with educational assessment generally, many of the issues discussed have conceptually foreshadowed a good deal of the more recent critically minded work in the field of additional/second language education.
Rea-Dickins and Gardner were among the first to focus attention on the construct of formative assessment. On the basis of feedback from teachers, they demonstrated the complex nature of formative assessment and showed that the demarcation between summative and formative assessment was not as clear-cut as previously believed.
With the increasing debate on the educational merits of, and advocacy for, classroom-based formative assessment in the 1990s (also known as alternative assessment at the time), there was a need, also acknowledged by REA-DICKINS & GARDNER (2000), to reconsider the established conceptual and operational assumptions in second/foreign language testing in terms of validity and reliability. In this paper Teasdale and Leung provide an account of the conceptual and theoretical difficulties in assuming that the psychometric assumptions and principles underlying standardised (usually summative) testing can be carried over unproblematically into classroom-based formative assessment. The implications of the incongruences for assessment, pedagogy and policy are discussed.
Rea-Dickins draws on the growing CBA literature within general education (see BLACK & WILIAM, 1998, for a comprehensive overview) to investigate classroom-based teacher assessment in a mainstream English elementary school setting with particular reference to students from an English as an additional language (EAL) background. The discussion provides a context-sensitive account of the complex curricular and pedagogic processes and practices involved in classroom-based teacher assessment.
The 1990s witnessed the emergence of diverse approaches to educational assessment that were sensitive to local contexts and social practices; formative approaches were part of these developments. There was, at the same time, a hardening of policy support for standardised assessment for public accountability. This authoritative retrospective paper provides an informed view on the various 'movements' within the field of educational assessment. In the sections on formative assessment Broadfoot and Black highlight the pitfalls in formative practices in predominantly summative assessment-oriented educational environments, indicating the growth points in conceptualising and developing formative assessments.
TEASDALE & LEUNG (2000)
In contrast to her earlier work (REA-DICKINS, 2001) relying on teacher self-report, in this paper Rea-Dickins looks at learners in dialogue with their teachers and peers as they are being assessed in the classroom. She draws on longitudinal data from the EAL primary classroom in the UK to show how different teaching agendas shape assessments variably. She argues that within instruction there should be a balance of summative and formative assessment so as to provide learners with sufficient opportunities to engage in assessment and develop their language and language learning awareness.
In this paper Leung and Rea-Dickins explore policy and practice of assessment at a time of curriculum change in the UK. They highlight the rift between policy makers in search of demonstrating rising educational attainment and the realities of the classroom. Echoing REA-DICKINS & GARDNER (2000), the authors argue for the need to distinguish not only between summative and formative classroom assessment but also between assessing English as a first language and EAL.

) and the tensions it may create between teachers and management is well illustrated by McKay and Brindley with reference to the Australian context. The authors consider how an outcomes-based system of assessment introduced for accountability purposes impacted on school and adult education. In the ESL school sector, the change appeared to have a negative impact as teachers focussed more on the high-stakes standardised assessment and less on the curriculum and individual learner needs (sentiments echoed by INBAR-LOURIE & DONITSA-SCHMIDT, 2009). In the adult sector, where there was no standardised final test, reporting on learner outcomes appeared to be variable and inconsistent across teachers, but teachers remained more learner-centred in their approach.
Inbar-Lourie and Donitsa-Schmidt report on a number of internal as well as external factors that impinge on teachers' employment of CBA and the tensions that exist between the highly centralised, top-down Israeli education system that prizes high-stakes testing and more recent attempts to introduce alternative assessment within the language classroom. Their observations resonate with those of BRINDLEY (2001).
Poehner discusses the implementation of group dynamic assessment (DA) and shows how individual learners working in groups may be supported through DA in the classroom. This discussion extends the conceptual frame of formative CBA (see LANTOLF & POEHNER, 2004) that tends to focus on one-to-one teacher-student interaction.
This paper reports on a survey investigating students' and parents' perceptions of SBA introduced as part of the Hong Kong Certificate of Education Examinations for secondary school students. Interestingly, it found that students viewed SBA in the same way as the more formal parts of the examination, and that this view was in part determined by their parents' perceptions of SBA.
Based on the results of a large-scale study of school-based classroom assessment of English proficiency in the US, Llosa questions the usefulness of such assessments for classroom-based, formative purposes. She demonstrates that teachers are good judges of students' overall language ability for summative purposes, but their ability to judge students' mastery of individual standards is much less consistent. Llosa argues that teachers need a better understanding of specified standards to be able to help students achieve these standards. The challenges teachers face in operationalising standards discussed by Llosa echo those raised in earlier work by DAVISON (2004) and CHENG ET AL. (2004).
Acknowledging the influence of sociocultural theories of learning on assessment practices, Scarino argues for an expansion of the knowledge-base of teacher-assessors. On the basis of examples from in-service training, she shows the importance of acknowledging and working with trainees' existing knowledge and preconceptions of language, and providing the tools for them to critically assess their own assessment practices to develop their assessment literacy. This paper picks up on many of the issues discussed by REA-DICKINS & GARDNER (2000) and DAVISON (2004).
In a similar vein to VOGT & TSAGARI (2014), this large-scale study investigated assessment literacy of female primary school teachers of English in Kuwait. Teachers reported that although they perceived themselves as both knowledgeable and skilful in using alternative assessment, they felt more confident about, and were more favourably disposed to, traditional, summative testing.
This survey explores teacher assessment literacy across seven European countries and shows that training in the field is limited and, where received, tends to be restricted to traditional assessments. The results, which resonate with SCARINO (2013), point to a need for more focus to be placed on language testing and assessment in pre- and in-service training courses.
Saito and Inoi investigated the differential use of formative assessment among high school English-as-a-foreign-language teachers in Japan. They identified three levels of formative assessment use (high, mid and low), differentiated by four variables of formative assessment strategy use: intentions, methods, purposes and feedback. Unlike previous studies (e.g. CHENG ET AL., 2004), Saito and Inoi found that the differences in use of formative assessment were individual and did not depend on the educational level of teachers or the type of school and did not impact on the quality of teaching. Some of the discussion in this paper touches on issues raised by HILL & MCNAMARA (2012).
Scarino argues that, with the onset of globalisation, there has been a shift away from Communicative Language Teaching to an intercultural orientation which has brought with it a need for language teachers to reconceptualise the construct to be assessed and adjust the nature of the assessment process. This has brought conceptual and interpretive challenges for teachers that have implications for teacher development of assessment literacy.
East investigates the impact of the introduction of a new foreign language speaking assessment in New Zealand as part of the National Certificate of Educational Achievement. The aim of the reform is to move towards locally-based and teacher-created assessment. Although the teacher interviewees perceived the assessment as providing learning potential for students, they recognised a tension between the way they operationalise the assessments and the accountability necessitated by the high-stakes nature of the assessments.
This chapter discusses four innovative assessment practices all linked by a common theme of 'for-learning assessment' that is embedded in pedagogical practice and situated in a particular educational and sociocultural context. It highlights again the tension between teachers addressing students' needs and having to adhere to curriculum requirements, as well as the interlocking nature of the various issues that impact on the assessment such as the context, construct, nature of learning and the linguistic focus of teaching/learning.
Rather than relying on teacher self-report of language assessment competencies (e.g. the study by VOGT & TSAGARI, 2014), Levi and Inbar-Lourie investigate language teachers' assessment literacy through examining the application of their assessment knowledge following a generic course on assessment literacy. Content analysis of the assessments produced by language teachers of both English and Hebrew revealed their unique needs to address the multi-componential complexity of assessing language, particularly in formative situations. Their work reaffirms many earlier studies highlighting the complex nature of language assessment literacy.
This chapter explores how the teacher assessment literacy resource developed by HILL (2017) can inform as well as be informed by written feedback practices employed by an experienced teacher of L2 Spanish in the Australian context. The teacher-researcher collaboration proved to be of mutual benefit, enabling the teacher to reflect in a more systematic way on her own feedback practices, and the researcher to assess and refine the resource.