It has now become commonplace in corpus research published across the globe for authors to reflect on and investigate the applications of corpus linguistics to a range of contexts and practices. While its terminological polysemy and reflexivity render the notion of application conceptually accessible, the act of applying corpus linguistics research and methodologies is far from homogeneous. Indeed, when the applications of corpus linguistics are discussed on the global stage, we are unlikely to share a single explanation of what application actually means. Nevertheless, a number of volumes and texts have endeavoured to delineate the scope and remit of the applications of corpus linguistics as the field has evolved. It is upon these contributions to the literature that this volume aims to build.
In Connor and Upton (2004), corpus linguistics is presented as an applied discipline that offers a broad array of research methodologies and tools for analysing both spoken and written language in various contexts. For them, corpus application pertains to the systematic use of language corpora to explore language features, patterns, and functions within so-called real-life contexts. Much of the work presented in their volume pertains to corpus applications in language pedagogy, building on the burgeoning corpus revolution of the 1990s, which promised to change the face of contemporary language teaching forever (Rundell & Stock, 1992a; 1992b; 1992c). According to Connor and Upton, corpus linguistics approaches can provide valuable evidence for informing language teaching, and for conducting discourse and register analyses, as through their application, corpus-based methodologies offer rich insights into both micro-level language features and macro-level communicative patterns. Hyland et al. (2012) build on the notion of corpus applications, expanding our understanding of its remit within and beyond the academy. For Hyland et al., the applications of corpus linguistics had expanded such that they had come to be seen as means of advancing our theoretical understanding of language and language analysis, pedagogical practices in language education, and approaches to the design and development of interdisciplinary studies. Many of these same foci are echoed in Cortes and Csomay (2015), who conceive of the applications of corpus linguistics in terms of the methodological contributions of Doug Biber to the study of language variation, register, and genre across a wide range of contexts.
In Cortes and Csomay, corpus application appears to centre around the affordances of corpus approaches for studying linguistic phenomena in both spoken and written discourse, and for uncovering patterns of linguistic behaviour that may not be easily accessible through intuition or small-scale analyses. In this way, the notion of corpus applications becomes intertwined with methodology, above all else. As these volumes attest, already at this time, the notion of corpus applications had become wide-reaching and complex.
This complexity was further enriched by McIntyre and Price’s (2018) edited volume, wherein the demonstrable sociocultural and economic impact of applied linguistics research is critically appraised. As opposed to application, the notion of impact governs the discourses in McIntyre and Price (2018), as the editors contextualise applied linguistic research within the wider impact agenda and Research Excellence Framework (still) shaping academic research and practice within the UK. While the volume is not solely dedicated to the study of corpus linguistics and its applications, the notion of corpus applications appears at the core of several of the studies presented therein. When corpus approaches are addressed, discussions signal a shift in focus from the methodological applications of corpus linguistics seen in earlier volumes towards a reflection on how corpus-based research, among other kinds of linguistics research, can directly affect and influence public policy, education, media, and social justice initiatives beyond the academy. Reflecting on advances within the field, in her second edition of Corpora in Applied Linguistics, Hunston (2022; first edition 2002) demonstrates how the applications of corpus linguistics can achieve such ends by drawing on advanced computational tools and corpus data to identify patterns of language that would otherwise be difficult to detect, such as collocations, phraseology, and discourse features, reiterating arguments not only from the first edition of the book but also articulated in other early works on the applications of corpus linguistics (discussed earlier). However, crucially, Hunston also underscores that such corpus applications go beyond merely counting linguistic features.
Importantly, they offer a way to explore the deeper meanings and functions of language in specific settings and, in her view, these applications bridge the gap between theoretical knowledge and practical, real-world language use, making corpus linguistics a powerful tool for applied linguistics research.
The story of the applications of corpus linguistics may not be linear, as we move back and forth between a focus on enhancing methodologies and shaping professional practices. However, there is a variable that could act as a common denominator for these views of application: context. The affordances of corpus linguistics for enhancing methodologies in applied linguistics are premised on the capacity of corpora to offer insight into contextualised language use. Likewise, when informing professional practices, the role and scope of corpus linguistics are context-dependent, differing as it is applied, variously, to develop language pedagogy, to inform policy, or to support social justice initiatives, inter alia. As this brief review of the literature on the applications of corpus linguistics demonstrates, certain contexts of application appear to be more established than others. Language pedagogy, for example, could be seen as an established context, as its sustained and centralised presence in discussions of the applications of corpus linguistics attests. Conversely, more emergent contexts could include the role of corpora in informing practices in educational contexts beyond the language classroom, the affordances of corpora for developing research on less-studied languages and language varieties, the use of corpus-assisted discourse studies for understanding and informing media practices, and the use of corpus linguistics research to effect social change and inform governmental practices. Acknowledging the centrality of context, this volume presents a collection of chapters that move from established to more emergent contexts in which corpus linguistics is currently being applied.
In terms of established contexts, the application of corpus linguistics arose as part of the so-called corpus revolution that emerged in the 1990s (Rundell & Stock, 1992a; 1992b; 1992c). Building on the development of the Collins COBUILD English Language Dictionary (Sinclair, 1987), language education sought to move away from intuition-based approaches and towards a more empirical means of informing language teaching by integrating examples of attested language use derived from corpora into classroom materials and resources. Within this context, the notion of application has gained specific currency following Leech’s (1997) categorisation of application as either direct or indirect. The former pertains to learners’ direct engagement with corpora, corpus consultation software, and corpus-based language learning technologies through data-driven learning (Johns, 1991), while the latter subsumes a range of contexts of application including the development of learner dictionaries, pedagogical grammars, classroom materials, and language assessments. The application of corpus linguistics in these established contexts is explored in the first half of this volume, as the chapters within it move, in focus, from direct to indirect uses of corpora in language education.
For more emergent contexts, the notion of application appears more varied. Beyond language learning and teaching, corpora are increasingly used to produce impact in a wide range of disciplines, and linguistics sub-disciplines, demonstrating the ‘outward’ impacts of corpus linguistics in ‘affecting aspects of life that are dependent on language’ (Hunston, 2022: 2). In some cases, the act of applying corpus linguistics echoes the view of Cortes and Csomay (2015), whereby application is largely seen as methodological application. However, there is a growing body of corpus linguistics research that is concerned with working with practitioners and stakeholders to enhance policing practices (e.g., Wright, 2017), communication practices of healthcare professionals (e.g., Brookes & Baker, 2021), and curriculum development in mainstream national education (e.g., Durrant et al., 2021), for example.
Overall, it is clear that the applications of corpus linguistics have greatly expanded over time, and especially since the 2000s. Thus, one may wonder what remains to be said about the applications of corpus linguistics. While we know that great strides have been made in the context of corpus applications to language pedagogy, arguably, there is still substantial progress to be made. Curry and McEnery (2025) note that the future of research in corpus applications to language teaching and learning must follow many complementary paths, with the need for more research on teaching contexts beyond university language classrooms, the application of corpus research in the development of national curricula, the development of digital pedagogies to support the use of corpora and corpus technologies for language learning, and engagement with a range of stakeholders beyond teachers and learners. Likewise, while the application of corpus linguistics in social contexts is promising, such an integration has developed more slowly within and beyond corpus linguistics, potentially owing to epistemological differences that govern research design in different disciplinary areas (McEnery & Brookes, 2024; Pérez-Paredes & Curry, 2024). As such, there remains much to be done to advance the application of corpus linguistics across both established and emergent contexts.
This volume is designed to continue the exploration of the applications of corpus linguistics across a range of contexts. However, to do so, there are a number of challenges that must be addressed. First, it is important to recognise that the notion of application is culturally situated. For example, in McIntyre and Price (2018), the notion of impact is shaped by the definition provided by the UK’s Research Excellence Framework. This is logical, as much of the research discussed within that volume engages with and responds to that particular definition of impact within the UK context. The same can be said of several chapters in this volume. However, the relevance of the Research Excellence Framework is limited beyond the UK, and for national research cultures in Australia, Hong Kong, Italy, Ireland, South Korea, Spain, Switzerland, and the UK – the research cultures represented in this volume – it is not possible to employ – impose, even – any single understanding of application. Second, while national cultures can shape approaches to application, disciplinary differences can also influence engagement with corpus linguistics applications (Pérez-Paredes & Curry, 2024). For example, in education contexts, movements towards participatory design (e.g., Le Foll, 2021) and action research (Brydon-Miller & Maguire, 2008) demonstrate the importance of inclusive research design within the disciplinary area, while from a discourse analytic perspective, studies such as Koteyko et al. (2022) and Collins et al. (2024) exemplify how stakeholders and their perspectives can be incorporated, to varying degrees, into projects through the likes of project co-design, interview question development, presence at advisory board meetings, and dedicated workshops for reviewing corpus linguistics research.
Overall, there are various interpretations of what it means to ‘apply’ corpus linguistics, many of which are represented by the chapters that follow. On the one hand, corpus linguistics can be said to be ‘applied’ to (i.e., combined with) other approaches to form hybrid or interdisciplinary methodologies. On the other hand, corpus linguistic methods may be ‘applied’ to a particular linguistic context (whether or not in combination with other approaches) to inform research that has some impact on practices and policies outside of academia. Likewise, this same challenge of terminological ambiguity that envelops ‘application’ also characterises the notion of ‘impact’, as this is similarly shaped by cultural and disciplinary norms. In applied linguistics, impact is typically associated with creating ‘positive influence’ (Cook, 2012, p. 33), demonstrating ‘practical usefulness’ (McIntyre & Price, 2018, p. 3), and ‘bringing about change’ (McEnery, 2018, p. 29) in contexts outside of academia. As part of this activity, researchers may attempt to communicate work to the public or engage with relevant stakeholders to inform their practices (for reflections on the potential of these approaches for facilitating impact in the context of corpus linguistics, see, e.g., Brookes et al., 2018). For our purposes, we see impact as a facet of application, with application subsuming a range of related activities. In compiling this volume, we seek to demonstrate that just as the scope and reach of corpus linguistics have grown, so too has the range of ways in which its applications can be recognised.
By organising this volume so that the chapters move from applications of corpus linguistics in established contexts to those in more emergent contexts, we aim to demonstrate how corpus methods and findings have been applied in contexts within and beyond the academy. Each chapter provides a contextualisation of the work at hand within the wider relevant literature. The goal of this contextualisation is twofold. First, as the review of applications of corpus linguistics herein demonstrates, application may pertain to a methodological application advancing a field of study or application to a so-called real-world context (i.e., beyond the academy) wherein practices or policies may be changed or influenced. For the former, it is necessary that each chapter is situated within a wider literature so that its application of corpus linguistics as a form of academic contribution can be made clear. For the latter, the contextualisation serves as a means to signal that which aligns the authors’ approaches with, or otherwise differentiates them from, those adopted elsewhere in the wider literature. Second, the scope of this volume is notably broad, with studies addressing a range of topics that refract across the continuum of applied linguistics. So as to equip readers unfamiliar with any of the areas addressed in this volume with the required knowledge of the field, the initial contextualisation in each chapter is also designed to address the main facets of the literature to which the chapters respond and upon which they build. This, in turn, allows the authors to highlight the novelty of their contributions, both within and beyond the academy.
Building on the literature, subsequently, each chapter presents a case study, or case studies, which have achieved, or perhaps are on the way to achieving, some demonstrable impact or real-world application. In inviting contributions to this volume, we have not imposed a particular view of what counts as application or impact, but we have invited the authors to conceive of these notions in the way that is most pertinent to their own research. The case studies presented across the chapters represent a mixture of original and already-published research. Original case studies are used particularly in those chapters for which the main focus is on methodological innovation. Meanwhile, for chapters reporting on innovative approaches to achieving and demonstrating impact, the case studies described are mostly already published, with the focus instead devoted to describing and discussing the activity that underpinned or followed the research, and that helped to maximise its application and impact. However they conceive of these notions, all of the authors contributing to this volume were asked to reflect critically on their understanding of application and impact, as well as on the contribution of corpora and corpus linguistics to doing impactful and applied research, both in terms of their own experience and with regard to the wider area of linguistics within which they are situated. As part of this critical reflection, each chapter considers how this contribution could be enhanced and developed in the future, and what might be needed to ensure this progress.
In what follows, Chapter 2 opens the discussion of established contexts and presents Crosthwaite and Gazmuri Sanhueza’s reflection on the implementation of data-driven learning in secondary school contexts. Chapter 3, by Rees, discusses user perceptions of ColloCaid, signalling pathways for developing data-driven learning tools. In Chapter 4, Apavaloae and Farr apply corpus-based data-driven learning in the context of teacher education, reflecting on the barriers and affordances of doing so. Lin, in Chapter 5, reports on the application of corpus approaches in formulaic language learning. She describes the corpus-informed development of an innovative web app – IdiomsTube – designed to support learners to acquire formulaic expressions using YouTube video data. In a final foray into established contexts, in Chapter 6, Curry et al. reflect on the application of corpus linguistics in the development of language teaching materials and assessments, discussing research conducted with teachers, materials developers, publishers, assessment developers, and governmental bodies.
Moving towards emergent contexts for corpus application, in Chapter 7, Durrant and Myhill discuss their Growth in Grammar project, reflecting on the pathways for impact paved within the paradigm of the UK’s Research Excellence Framework. For Chapter 8, Battisti and Ebling address the affordances of corpus linguistics for informing Swiss German sign language learning. In particular, they explore how innovative corpus design can help to broaden the disciplinary scope of sign language research, while also supporting industry development of learner resources for what are under-served languages. Baker and McEnery, in Chapter 9, demonstrate how corpus-assisted discourse studies can not only contribute to research identifying and challenging Islamophobia in UK news reporting, but also help researchers to assess the impacts of interventions designed to influence reporting practices in this regard. Offering a complementary focus on news media, in Chapter 10, Bednarek et al. examine the uptake of media guidelines in Australian and UK news through a corpus analysis of obesity representation. They explore the role that such approaches could play in collaborative efforts, with journalists and advocacy groups alike, to continually monitor the success of proposed guidelines and update them to ensure their relevance and effectiveness. Chapter 11 reports on McGlashan et al.’s work as part of the Misogyny and The Red Pill (MANTRaP) project, which aims to understand and challenge the discursive manifestations of online misogyny in the so-called manosphere. As well as reporting on a wide range of engagement and impact-focused activities, their chapter shows how applications of corpus methods to this domain required particular methodological innovations which can also add value, and potential impact, within the academy. 
In Chapter 12, Archer reflects on the affordances of corpus approaches for informing training for UK police involved in crisis negotiation, pointing to the challenges associated with working with conflicting understandings and definitions of ‘impact’. To close the book, Pérez-Paredes offers a final reflection in the Afterword, in which he not only provides an effective synthesis of the volume, but also points forward to further pathways for development, application, and impact.
Echoing the disciplinary and topical diversity of these contributions, throughout the chapters that follow, the meaning of application varies across a complex of dimensions in which methodological application and application to contexts beyond the academy co-exist and, at times, productively converge. In some cases, application rests more firmly within the methodological plane, wherein the central innovation pertains to pushing academic research in a new direction or developing new resources and approaches, inter alia. Elsewhere, the focus is largely beyond the academy, as researchers seek to effect some social change through the operationalisation of corpus linguistics research. Although such approaches may appear distinct from one another, in reality – and indeed, across the volume – there is an evident, shared interest in both methodological and societal application of corpus linguistics.
From a methodological perspective, for example, critical reflections address the affordances of corpus approaches for developing new corpora of less-studied languages, such as sign language, and using corpora and corpus approaches in largely understudied educational and professional development contexts. Likewise, corpus methods are touted for their relevance for informing the language used to develop classroom materials and innovative language teaching and learning technologies. In other cases, traditional corpus techniques are revisited and adapted to allow researchers to access specific insights required for their studies, for example, through the study of online and media discourse. This collection of work, pioneering the methodological applications of corpus linguistics, demonstrates that while the affordances of corpus methods for enhancing and enriching empirical work are well established, there remain avenues for advancement and innovation.
In terms of application and impact beyond the academy, the notions of stakeholders and participants are evoked throughout, with research designed to influence professional practices and/or change professional mindsets. Corpus approaches are seen as ways to lend an empirical lens to research, which can garner engagement from teachers, curriculum designers, journalists, and a range of industry professionals. The nature of this kind of stakeholder and participant engagement is elaborated on in many of the reflections within this volume. An openness to participants’ perspectives shapes research design, as teachers’ needs, publishers’ expectations, and journalists’ practices influence researchers’ decisions and contribute to the iterative development of research projects and outputs. The ways in which corpus linguistics approaches are applied therefore can vary, with, in some cases, corpus linguistics being seen as a means to develop and test empirical findings that can, in turn, inform the creation of materials for training a range of stakeholders and practitioners. In other cases, corpus approaches afford means of contributing to a longer term, ongoing process of assessing and developing guidelines and resources in order to keep them up to date and effective for users.
While the diverse selection of chapters in this volume offers a broad overview of the current state of the art in the application of corpus linguistics, a number of core tenets emerge that shape applications of corpus linguistics, regardless of the context in which it is applied. In particular, through the authors’ critical reflections on application and impact, four broad dimensions that characterise the application of corpus linguistics emerged. These are: (1) the relational dimension; (2) the social dimension; (3) the methodological dimension; and (4) the institutional dimension. We would extend to our readers a further challenge – to adopt a critical perspective on application (and related notions such as impact) and to reflect on these four dimensions of application when navigating through the chapters of this book.
For the relational dimension, each study engages with, or anticipates engaging with, a range of stakeholders, participants, and practitioners. Throughout, the building of relationships and networks appears key to facilitating effective societal application and impact. An emerging message in this regard is that good relationships are hard-earned and need to be maintained and managed carefully. This can involve respecting others’ perspectives, managing expectations, remaining open to new ways of working, being flexible in one’s approach, changing research practices, and so on. While a polyphony of stakeholders can enrich a research study and offer new ways of working and thinking, reconciling the disparate views and agendas of stakeholders and participants and situating these within a wider research project can be challenging. In many cases, this kind of reflexivity was necessary to effect any change. However, accounting for these differing perspectives might require that we work within a different research paradigm, and in some cases this switch may challenge the empirical and positivist underpinning of much corpus linguistics research.
In terms of the social dimension, an emerging theme in this volume pertains to the applications of corpus linguistics being used to support social justice agendas. In many cases, this may appear obvious; for example, in this book we will see corpus research being used to address topics such as weight stigma, to challenge forms of discrimination like Islamophobia, to help challenge and confront online misogyny, and to improve policing practices. Yet in other cases, the social justice angle is perhaps less obvious, or more subtle, but compelling nevertheless. For example, in some cases, the goal of research presented in this book is to develop resources for under-served minority communities, such as deaf people and users of sign languages. There is an evident social justice motivation underpinning such work, as the research seeks to improve the lives of specific groups of people. Likewise, by developing a programme of research to enhance access and engagement with technology for education and respond to a digital divide, to improve the representation of spoken language in educational materials, or to challenge prescriptive views of language at a national and international level, corpus linguistics appears to be a means to underscore and support a social justice agenda. Beyond the academy, we may find ourselves working with relevant stakeholders and organisations that do not share our social values and for whom the messages we deliver may be confronting. In such cases, it can be worthwhile to reflect on the social value of empirical research which can be leveraged to support our observations and, in some cases, convince stakeholders of the value of our work and how it can support their own practices.
The methodological dimension pertains to the reflexivity of corpus methods and the researchers who employ them to advance fields of research and effect social change. In many cases, the notion of corpora is held to account, as the critical reflections throughout this volume highlight the challenges associated with balancing the needs of researchers, stakeholders, and participants. On the one hand, there is a growing need for larger but also more specialised datasets to support research, industry, and wider society, while, on the other hand, there emerges a contrasting need to protect the anonymity of members of small or vulnerable communities and ensure a fair and accurate representation of diverse populations and particular social groups therein. Questions of representation emerge throughout, as standard approaches employed in corpus development are revisited when contextualised beyond the academy. Reflections also orbit the need for methodological reflexivity, where taken-for-granted approaches to conducting corpus linguistics research are adapted for, and in some cases give way to, stakeholder needs. In this way, the primacy of frequency can be questioned, while the application of different statistical analyses and the development of new approaches emerge as responses to the needs and priorities of different stakeholder groups.
Finally, for the institutional dimension, it is worth reflecting on the contexts in which corpus linguistics research is facilitated and evaluated. In many cases, research funding helped project teams to develop a research agenda through the allocation of time, the facilitation of travel, and the provision of the resources necessary to build and maintain the kinds of relationships required to apply corpus linguistics and effect social change. Such funding is often oriented towards producing impactful research, and when evaluated at national levels the criteria for application and impact can sometimes appear limited for our purposes. In some cases, application and impact may not be (easily) rendered measurable and scalable. Likewise, the application of corpus linguistics to achieve some kind of societal impact may be static, focusing on an intervention at one point in time, or longitudinal, potentially spanning a large part of an individual researcher’s career (or even going beyond it). In academia, it seems that institutional bodies, such as universities, funding councils, and research evaluation bodies, can exert great influence over the aims and agendas of contemporary research. Reflections on such influences indicate that they can, at times, be conflicting (not only with each other, but also with researchers’ own goals and motivations), as well as potentially being overreaching. As such, it is imperative that researchers situate themselves within their work and address the affordances and limitations that come with such institutional involvement from an ontological perspective.
Ultimately, the goal of this volume was to shed some light on an often-occluded facet of corpus linguistics research pertaining to application (and, by extension, impact). This volume forces us to reflect on what application and impact mean in the world of corpus linguistics, and the contributions from researchers across the globe offer not only critical reflections on these issues, but also various blueprints for advancing the applications of corpus linguistics, in diverse contexts, hereafter. Through a critical engagement with the various dimensions of application that characterise the chapters in this volume, we hope that readers leave this book with a greater, more grounded, and more practical understanding of the decisions and processes involved in effecting change through corpus linguistics research.