
Spoken fluency revisited

Published online by Cambridge University Press:  24 September 2010

Michael McCarthy*
University of Nottingham, UK


An important priority for the English Profile programme is to incorporate empirical evidence of the spoken language into the Common European Framework (CEFR). At present, the CEFR descriptors relating to the spoken language include references to fluency and its development as the learner moves from one level to another. This article offers a critique of the monologic bias of much of our current approach to spoken fluency. Fluency undoubtedly involves a degree of automaticity and the ability quickly to retrieve ready-made chunks of language. However, fluency also involves the ability to create flow and smoothness across turn-boundaries and can be seen as an interactive phenomenon in discourse. The article offers corpus evidence for the notion of confluence, that is the joint production of flow by more than one speaker, focusing in particular on turn-openings and closings. It considers the implications of an interactive view of fluency for pedagogy, assessment and in the broader social context.

Research Article
Copyright © Cambridge University Press 2010

1. Introduction

The Common European Framework (CEFR) includes references to fluency in its descriptors at the higher levels. Even at B2, the lower end of that range, the learner should be able to produce language ‘with a fairly even tempo’ and ‘few noticeably long pauses’ (Council of Europe, 2001: 28), a statement strongly redolent of elements of the core definitions of fluency to be found in the literature over several decades. At the B2 level, the learner ideally can ‘interact with a degree of fluency and spontaneity that makes regular interaction with native speakers quite possible without strain for either party’ (Council of Europe, 2001: 24). This statement associates fluency with spontaneity and makes reference to interaction with another speaker, even if somewhat obliquely. In addition, the specific description of spoken language characterises the C2 user as being able to avoid or backtrack around any difficulty ‘so smoothly that the interlocutor is hardly aware of it’ (Council of Europe, 2001: 28), once again acknowledging the interactive dimension and the presence of a receiver, a theme we shall return to throughout this article.

The terms fluent(ly) and fluency are examples of terminology whose use is common outside the specialist jargon of linguistics and applied linguistics and which are used (unproblematically for their users) in everyday lay language. The Oxford English Dictionary, which attests the use of fluent in relation to speech as far back as 1589, defines it as: ‘flowing easily and readily from the tongue or pen’. The plain people have no difficulty understanding statements such as: ‘He/she is fluent in Japanese’ or ‘He/she speaks Swahili fluently’, or one by someone who says that they know a language but that their fluency is not great. Few other items of the terminology of linguists are so easily used in everyday parlance and the metaphor upon which the terms fluent and fluency are based (that of flow, like a river) is a relatively straightforward and easily comprehensible one.

The fact that fluency as a notion is firmly embedded in the broader public consciousness and in the official documentation of the CEFR means that a closer examination of it within the English Profile programme (English Profile) is both timely and apt. English Profile has declared itself to be empirically founded and thus it behoves us to tease out the empirical foundations of fluency, in particular, the criterial features (Buttery and Hawkins, this issue) that mark it out as it emerges in the spoken performances of the developing language learner and is perceived as present or absent in greater or lesser degrees by teachers, assessors, curriculum designers and the gatekeepers of employment, or other forms of social capital to which the language learner might aspire.

2. Previous accounts of spoken fluency

The literature on fluency is, unsurprisingly, extensive and the present paper leans on many expert views propounded over the decades, to which it adds empirical evidence from learner corpora and attempts to encompass the interactive dimension referred to in the introduction. Hieke (1985) concluded that ‘the literature on fluency reveals it to be replete with vacuous definitions’ (p. 135), reflecting the fact that, at least at that point in time, there seemed to be no common view of what the term meant. However, one can sum up the literature in terms of a small number of recurrent basic themes or preoccupations which represent the broad canvas upon which the debate over fluency is delineated. These themes include:

  1. Rate and smoothness of delivery: this theme encompasses objective metrics such as the number of words per speech unit or per minute, the location, distribution and duration of pauses, the regularity and smoothness of tempo and allied issues (Fillmore, 1979; Dechert, 1980; Towell, 1987; Towell et al., 1996; Lennon, 1990; Riggenbach, 1991; Freed, 1995; Kormos and Dénes, 2004; Wolf, 2008; Mochizuki and Ortega, 2008; Rossiter, 2009).

  2. Automaticity: this addresses the ability of the speaker instantaneously and effortlessly to retrieve units of speech, including words, prefabricated phrases and/or whole clauses (Fillmore, 1979; Rehbein, 1987; Gatbonton and Segalowitz, 1988; Towell et al., 1996; Chambers, 1998; Wood, 2001, 2006).

  3. Professional perceptions of fluency: this covers the evaluation and assessment of fluency and implications for professional practitioners such as teachers, curriculum designers and examiners (Derwing et al., 2004; Hasselgreen, 2004; Kormos and Dénes, 2004).

  4. Lay perceptions of fluency: this recurring theme addresses perceptions of fluency by non-professionals and their real-world implications, for example, the general public, employers (actual and potential) and social peer-groups (Tainer, 1988; Dustmann, 1994; Chiswick and Miller, 1998; Dávila and Mora, 2000; Shields and Wheatley Price, 2002; Yeh and Inose, 2003).

The first two thematic areas are typically, but certainly not exclusively, based on a notion of fluency as a monologic affair, the achievement of the single speaker, which is often evaluated under experimental or quasi-experimental conditions. In this conception, the single speaker performs fluently or non-fluently, to a greater or lesser degree. The second two thematic areas place a greater emphasis on the social contexts of fluency, especially the fourth. In this conception, fluency is more typically cited and judged in the perceptions of others and in performance with others (be it other language learners, interlocutors in social settings, employers or others). Not all studies can, of course, be pigeon-holed so simply; studies of language learners who have spent extended periods of time living in the target culture are often based on performances before and after the time spent in the target-language environment (e.g. Towell, 1987; Lennon, 1990). An example of a blended approach to the evaluation of fluency is found in Wolf's (2008) study, where the researcher acted as an ‘interlocutor’ with the learners under scrutiny, thus attempting to mirror a more natural type of interaction. Similarly, Lumley and O'Sullivan's (2005) study had language learners respond to stimuli from taped voices to investigate task effects such as gender of interlocutor.

3. Fluency in monologue and in interaction

Rate of delivery

Measures of fluency based on temporal features such as rate of delivery and pausing seem to correlate with the perceptions of informants. Freed (1995) compared two cohorts of language students, a stay-at-home group and a study-abroad group, using the average number of words per second as a measure of comparison; the study-abroad students delivered more words per second in oral performances. In a study by Kormos and Dénes (2004), several aspects of speech rate (excluding pauses) emerged as ‘the best predictors of fluency scores’ (p. 145) (see also Rossiter, 2009). On the other hand, Foster and Skehan (1999) judged extended silences within and between speaker turns in task performances to be ‘moments when performance is seriously disrupted and the subject has to engage in regrouping and unexpected on-line planning’ (p. 229).
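The temporal measures mentioned above can be made concrete in a short sketch. The Python fragment below is illustrative only and is not taken from any of the studies cited: it assumes a hypothetical list of time-aligned words for one speaker, and the 0.25-second threshold for counting a silence as a pause is an arbitrary assumption, since studies differ on where that line is drawn. It reports words per second over the whole run (speech rate), words per second with pause time excluded (articulation rate), and the number and mean length of pauses.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # onset time in seconds
    end: float    # offset time in seconds


def temporal_measures(words, pause_threshold=0.25):
    """Compute simple temporal fluency measures for one speaker's run.

    pause_threshold is an assumed cut-off (in seconds): silent gaps
    shorter than this are not counted as pauses.
    """
    if not words:
        raise ValueError("need at least one word")
    total_time = words[-1].end - words[0].start
    if total_time <= 0:
        raise ValueError("run must have positive duration")

    # Silent gaps between consecutive words that reach the threshold.
    pauses = [
        nxt.start - cur.end
        for cur, nxt in zip(words, words[1:])
        if nxt.start - cur.end >= pause_threshold
    ]
    pause_time = sum(pauses)
    speaking_time = total_time - pause_time

    return {
        "speech_rate": len(words) / total_time,           # pauses included
        "articulation_rate": len(words) / speaking_time,  # pauses excluded
        "pause_count": len(pauses),
        "mean_pause": pause_time / len(pauses) if pauses else 0.0,
    }
```

A measure of this kind describes a single speaker in isolation, which is precisely the monologic perspective that the remainder of this section seeks to broaden.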

However, actual speech rates vary greatly depending on context and speech genre (Tauroza and Allison, 1990); thus speed of delivery alone is clearly a blunt instrument for assessing fluency and cannot tell the whole story. Similarly, pausing is an extremely complex phenomenon which has long been recognised as being influenced by the cognitive complexity of speech production and the contextual circumstances of speaking, in which time is a central organising principle (Goldman-Eisler, 1968, 1972). Pauses may not necessarily be a sign of communicative failure but may indicate complex planning and boosted cognitive effort. Moreover, evidence from several different areas of investigation in communication research, psycholinguistics, sociolinguistics and computer-human-interface studies points to an understanding that human interlocutors fine-tune their speech and pausing rates in an attempt to converge with those of their interlocutor(s) (Street et al., 1983; Street and Capella, 1989; Giles et al., 1991; Bosshardt et al., 1997; Street, 2006; Kousidis and Dorran, 2009). Such fine adjustments may not be overtly conscious acts, but more a subliminal symphony of balanced tempo that is achieved in successful interactions, with speech and silence between interlocutors manifesting as a rhythmic co-construction. This harmonisation may be a significant contributor to the sense of ‘flow’ in multi-party talk. In this perspective, speed of delivery and pausing are interactive, co-created phenomena of discourse.


Automatic retrieval of words and fixed expressions undoubtedly contributes significantly to smooth performance and normally paced delivery (the norm being an interactively established one, as discussed above). However, Dörnyei (2009) distinguishes between speed and automaticity, and states that ‘fast processing is not necessarily automatic’ (p. 287), suggesting that automaticity must possess other properties, most readily those associated with the ability to retrieve words and other items via well-established connections in the mental lexicon. It is scarcely believable that fluent speakers reassemble on every new occasion of production the lexico-grammatical patterns that are so common in speech and which are abundantly evidenced in spoken corpora. However, as in the case of speed of delivery, the context of utterance affects the degree of automaticity that is required on any individual occasion (Bialystok, 1982), with multi-party conversation placing the highest demands on automaticity because of the competition for turn-taking. The cumulative research evidence of preference for very brief pauses between speaker turns in casual conversation (on average considerably less than one second) points to the conclusion that turn-boundaries are a locus of high demand on automaticity. Stivers et al. (2009) illustrate how, across a wide range of different languages, between-turn pauses are consistently very brief (for English around a quarter of a second), underpinning the general hypothesis that human interlocutors seek the shortest delay between speakers while still managing to avoid a discomforting number of overlaps. As with the discussion of convergence in speech rates above, there is some evidence that speakers tend to converge on pause-length between turns (Kousidis and Dorran, 2009). The taking of the speaking turn would seem, therefore, to be a rich vein for exploration of recurrence and automaticity in language forms.
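How between-turn pauses of the kind just discussed might be measured can be sketched as follows. The fragment is a minimal illustration under stated assumptions, not a reconstruction of the method of any study cited: it assumes a hypothetical list of turns, each a (speaker, start, end) tuple with times in seconds, sorted by start time. The gap at each change of speaker is the next turn's start minus the previous turn's end, so that a negative value indicates overlap.

```python
from statistics import mean, median


def turn_gaps(turns):
    """Return the gap in seconds at each change of speaker.

    turns: list of (speaker, start, end) tuples sorted by start time.
    A negative gap means the incoming speaker overlapped the outgoing one.
    Consecutive turns by the same speaker are ignored.
    """
    gaps = []
    for (spk_a, _, end_a), (spk_b, start_b, _) in zip(turns, turns[1:]):
        if spk_a != spk_b:
            gaps.append(start_b - end_a)
    return gaps


def summarise(gaps):
    """Summarise a non-empty list of gaps."""
    if not gaps:
        raise ValueError("no speaker changes found")
    return {
        "n": len(gaps),
        "mean": mean(gaps),
        "median": median(gaps),
        "overlaps": sum(1 for g in gaps if g < 0),
    }
```

Comparing such summaries across the participants in one conversation would be one way of looking for the convergence on pause-length referred to above, though that analysis is not attempted here.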

Another, allied aspect of automaticity is the high frequency of prefabricated expressions or language chunks (most typically two to four words in length; see O'Keeffe et al., 2007, for corpus-based lists of the most common ones in spoken British English). The extremely high frequency of occurrence of such chunks in native-speaker and expert-user conversation reveals their regular, fixed forms and the pragmatically specialised functions they have acquired over many millions of utterances (for example, the most common two-word spoken chunk, you know, and its function of monitoring the state of shared knowledge). Dörnyei (2009: 294–297) provides a summary of the concept of chunking from a psycholinguistic perspective and stresses the role of chunks in both production and perception. It has long been recognised that chunks and formulaic expressions of various kinds contribute directly to automaticity (e.g. Gatbonton and Segalowitz, 1988). The internal syntactic structures of chunks such as at the moment, you know (what I mean), I don't know if, here and there, things like that, etc. vary considerably, but they have in common that they are uttered relatively rapidly, automatically and as pre-assembled intonation units, without internal disruption, thus facilitating extended runs of connected speech that manifest less frequent pausing (Wood, 2006). Chunks enable greatly reduced retrieval and processing time (Conklin and Schmitt, 2008) and are thus more communicatively efficient both for producer and receiver. This economy of effort for all relevant parties offers a further useful support to the notion of ‘flow’ and an interactive basis for its existence.

That chunks are extremely frequent in everyday spoken interaction may be seen by a comparison of their frequency vis-à-vis the occurrences of high-frequency single words. The data are from a one-million-word sub-corpus of social conversations which form part of the 5-million-word CANCODE (see footnote 1). Table 1 incorporates the frequency of the two-word chunk you know into a list of the top 40 tokens.

Table 1 You know: frequency compared with single words (CANCODE sub-corpus; social conversations).

Know, despite its lexical nature, is within the top 40 tokens, the rest of which are, unsurprisingly, grammatical- or function-words. Of the 9,226 occurrences of know (rank 15 in Table 1), more than half are accounted for by you know constructions. Other high frequency chunks include I think, I mean, and then, sort of, a bit of, at the moment, in the morning, things like that, you know what I mean, or something like that, all of which are characterised in corpus audio recordings by smooth rendition at a pace as fast as, or faster than, their surrounding co-text. In the L2 context, the acquisition and appropriate use of formulaic sequences in spoken language has been shown to enhance the perception of oral proficiency (Boers et al., 2006).
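The comparison of chunk and single-word frequencies described above can be sketched in a few lines. This is a simplified illustration and not the procedure used to compile Table 1: the tokeniser is a deliberately crude assumption (lower-cased runs of letters and apostrophes), and the function names are hypothetical. Each chunk is counted as a contiguous word sequence and entered into the same frequency list as the single words; the words inside a chunk are still counted individually, so that one can also ask what share of the occurrences of know falls within you know.

```python
import re
from collections import Counter


def tokenize(text):
    """Crude tokeniser (an assumption): lower-cased letter/apostrophe runs."""
    return re.findall(r"[a-z']+", text.lower())


def frequencies_with_chunks(text, chunks):
    """Count single words, then add each chunk as an entry in its own right."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    for chunk in chunks:
        parts = chunk.split()
        n = len(parts)
        counts[chunk] = sum(
            1
            for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == parts
        )
    return counts


def share_within_chunk(counts, word, chunk):
    """Proportion of a word's occurrences that fall inside the given chunk.

    Assumes the word occurs exactly once in the chunk.
    """
    return counts[chunk] / counts[word]
```

Calling `frequencies_with_chunks(text, ["you know"]).most_common(40)` on a transcript would give a ranked list of the same general shape as the one the table reports.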


One type of automaticity already referred to is the ability of interlocutors to react and respond without delay when it is their turn to speak or when they wish to self-select for the next turn. The typically seamless progression of turn-taking in multi-party conversation, with little overlap or interruption, has long been recognised as a fundamental feature of talk (Sacks et al., 1974). Two basic features of turn-taking are likely to affect the creation and maintenance of flow: turn-opening and turn-closing, with both locations in the ongoing talk presenting themselves as potential points for smooth or disfluent transition. What happens at turn-boundaries may reveal a great deal about how fluency is constructed interactively, aside from the degree of flow that is (or is not) achieved by the single speaker within their turn.

Tao (2003) demonstrated how items which occur at the start of speaker turns attend to what the previous speaker has just said: turn-openers characteristically link and provide continuity with the immediately previous talk and can be seen as creating smooth transitions and flow. Evison and McCarthy (forthcoming) present the 20 most frequent turn-openers in the one-million-word sub-corpus of social conversations referred to above (see footnote 2). These are listed in Table 2.

Table 2 20 most frequent turn-opener tokens (CANCODE sub-corpus; social conversations).

These items show an overwhelming preference for linkage with the preceding utterance, whether as connectives (e.g. and, but), reactives (oh, [laughter]) or discourse management items (well, right). The preference seems strongly to be to construct one's turn so that it links with and flows from the previous speaker's utterance. Such processing demands are not uppermost in prepared or set-piece monologue performances. In conversation involving two or more parties, the imperative to create and maintain flow ceases to be the sole responsibility of the single speaker within the single speaking turn and becomes a joint responsibility for all participants. This includes a shared responsibility to fill silences and uncomfortably long pauses. For this reason, flow across turn-boundaries is perhaps better captured by the metaphor of confluence, reflecting the jointly produced artefact which constitutes an efficient and successful interaction.
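A frequency list of turn-openers of the kind discussed above could be derived along the following lines. The sketch is hypothetical and is not the procedure of Evison and McCarthy: it assumes one turn per line, with speaker tags of the form `<$1>` as in the extracts in this article, and it assumes (this is an interpretation of the transcripts, not something stated in them) that a turn beginning with `+` continues the same speaker's earlier turn and should therefore not be counted as a new opening.

```python
import re
from collections import Counter

# One turn per line, e.g. "<$2> And if you are ever in Lincolnshire ..."
TURN = re.compile(r"^<\$(\d+)>\s*(.+)$")


def turn_openers(lines):
    """Count the first word of each turn, lower-cased.

    Lines without a speaker tag (e.g. bracketed context notes) are skipped.
    Turns starting with '+' are treated as continuations (an assumption).
    """
    openers = Counter()
    for line in lines:
        match = TURN.match(line.strip())
        if not match:
            continue
        utterance = match.group(2)
        if utterance.startswith("+"):
            continue
        words = re.findall(r"[A-Za-z']+", utterance)
        if words:
            openers[words[0].lower()] += 1
    return openers
```

The sketch counts single-word openers only; non-verbal openers such as laughter, or multi-word openers, would need their own handling.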

Turn-closings are the reverse of the coin of turn-openings. The turn usually transfers to a new speaker at points such as the completion of syntactic or intonation units; these are the so-called transitional relevance points (TRPs) (Sacks et al., 1974) and, as discussed earlier with regard to the duration of pauses, transition is rapid and automatic. Research has already shown how syntactic elements can bond across turn boundaries, with speakers regularly co-creating structures such as main clause plus subordinate clause. Tao and McCarthy (2001) give examples of this in the context of non-restrictive which clauses across turns. There appears to be a marked tendency for certain lexical items and longer chunks to trigger speaker-change. Evison and McCarthy (forthcoming) give a list of such trigger items. They include vague language tokens such as or something, and stuff (like that), and everything, which invite the listener to fill in absent members of categories from shared knowledge (Evison, McCarthy and O'Keeffe, 2007). For example:

  (1)
    • [Council Tax is a form of local taxation in the UK]

    • <$2> The thing is though by living there Annie is avoiding Council Tax and everything.

    • <$1> I know.

  (2)
    • <$2> It was gold gold satin curtains+

    • <$1> Yes tha =

    • <$2> +and stuff like that and+

    • <$1> Yeah.

    • <$2> +and purples and things.

An immediate convergent response, albeit brief, ratifies the previous speaker's assumption of shared knowledge on the part of the listener. Evison, McCarthy and O'Keeffe (2007) also present evidence for the way high-frequency evaluative adjectives such as lovely, awful, wonderful, funny trigger listener responses, with a strong preference for convergent response tokens. A considerable proportion of the occurrences of these adjectives occur immediately before speaker-change. The importance of these types of items is that they invite reciprocity and convergence and project seamlessly to the following utterance by the next speaker. Extracts (3) and (4) from the CANCODE sub-corpus illustrate these triggers.

  (3)
    • [Speakers are discussing a well-known celebrity]

    • <$2> I couldn't believe she looks that awful.

    • <$1> Yeah. She must get through some cosmetics.

    • <$2> Disgusting.

  (4)
    • <$2> And if you are ever in Lincolnshire don't miss going to the old hall because it is absolutely wonderful.

    • <$1> Really? I've never been.

The contribution of turn-taking to fluency-as-confluence may be summed up in the observation that apart from the opening turn, all turns in a conversation display aspects of response. The primary motivation of turn-construction can therefore be seen as the creation of a responsive turn, rapidly and automatically. Turn-openings and closings are mirror-images of confluence; in other words, each single speaker engages in the co-creation of conversational flow, rather than simply achieving the goal of fluent runs for any individual speaker. The evaluation of fluency without this interactive dimension, it is argued, gives us only a partial picture of the conversational event. A fuller picture is obtained by examining the efforts of conversational participants to create confluence on many different levels.
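The observation made earlier in this section, that a considerable proportion of the occurrences of certain evaluative adjectives fall immediately before speaker-change, lends itself to a simple count. The sketch below is an assumption-laden illustration and does not reproduce the analysis of Evison, McCarthy and O'Keeffe: it again assumes one turn per line with `<$1>`-style speaker tags, treats only single-word items, and counts an item as a trigger when it is the last word of a turn and the next turn belongs to a different speaker.

```python
import re
from collections import Counter

TURN = re.compile(r"^<\$(\d+)>\s*(.+)$")


def turn_final_share(lines, items):
    """For each item, the share of its occurrences that close a turn
    immediately before a change of speaker.

    items: set of lower-cased single words (multi-word chunks such as
    'or something' are not handled in this sketch).
    """
    turns = []
    for line in lines:
        match = TURN.match(line.strip())
        if match:
            words = re.findall(r"[a-z']+", match.group(2).lower())
            turns.append((match.group(1), words))

    total, final = Counter(), Counter()
    for i, (speaker, words) in enumerate(turns):
        for word in words:
            if word in items:
                total[word] += 1
        next_speaker = turns[i + 1][0] if i + 1 < len(turns) else None
        changes = next_speaker is not None and next_speaker != speaker
        if words and words[-1] in items and changes:
            final[words[-1]] += 1

    return {word: final[word] / total[word] for word in total}
```

A high share for a given item would be consistent with its acting as a trigger for speaker-change, although it would not by itself show that the following response was convergent.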

4. Fluency in pedagogic contexts

How to assess fluency has been an issue for language practitioners for many decades, and language-proficiency scales often explicitly mention fluency as a component of particular levels of achievement. One of the problems with scales such as the CEFR is the lack of empirical underpinning of their descriptors. This is not a criticism of their founders and creators, who lacked access to the spoken corpora increasingly available to linguists and practitioners in the twenty-first century. The English Profile programme is an attempt to address the issue of empirical underpinning with the construction and investigation of massive spoken learner corpora collected, in the first instance, across the diverse languages and cultures of Europe. The questions to which the empirical investigation of the English Profile programme is seeking answers include whether learners who are already assigned to CEFR levels, either through examination systems or by their teachers and education systems (what one might call ‘pooled expert judgement’ – the collective voices of experience – which offers us an invaluable inter-subjectivity), actually display in their oral performances the kinds of features this paper claims to be at the heart of fluent performance. In this respect, English Profile lays emphasis on both the monologic and multi-party levels, with ‘interaction’ being seen as a fifth skill to complement the traditional four skills of speaking, listening, writing and reading. In ongoing English Profile research, the present author is currently examining large quantities of spoken learner data from Cambridge ESOL oral examinations to arrive at an understanding of the typical turn-taking patterns and use of chunks among learners of English at different levels. Initial findings point to a marked effect on learner turn-openings from the nature of the task involved (typically either responding to an examiner's questions or interacting with a peer candidate). This is by no means a novel or original insight; studies of test tasks have already noted various types of effect (see O'Sullivan, 2002 and Lumley and O'Sullivan, 2005 for two very good examples). Awareness of these task effects may, it is hoped, inform better and more efficient task design and a clearer understanding of how scales such as the CEFR can and should relate to actual performances. For example, this last preoccupation might steer us away from thinking of CEFR can-do statements in terms of real-world performances to a more realistic acknowledgement that what learners can do may not be best revealed by what they do do in situations such as oral test tasks (see O'Sullivan et al., 2002).

An example of the mismatch between oral test performance and probable real-world performances is seen in extract (5), taken from Cambridge ESOL's basic-level Key English Test (KET) (see footnote 3). KET is at A2 of the CEFR, and extract (5) is from a paired task in the oral examination. Two young female candidates ask each other questions, in this case about the interlocutor's breakfasting habits, using picture and vocabulary prompts:

  (5)
    • <CANDIDATE 01> Er when do you have breakfast?

    • <CANDIDATE 02> I have my breakfast er at er seven o'clock.

    • <CANDIDATE 01> Where do you have breakfast?

    • <CANDIDATE 02> Er in my kitchen in my house

    • <CANDIDATE 01> In what room?

    • <CANDIDATE 02> In the kitchen.

    • <CANDIDATE 01> And do you have coffee or tea for breakfast?

    • <CANDIDATE 02> Er tea.

    • <CANDIDATE 01> Er what do you eat?

    • <CANDIDATE 02> I eat toast and a cup of tea.

    • <CANDIDATE 01> How many days er how many times a day do you have it?

    • <CANDIDATE 02> Er two times.

    • [4 seconds pause]

    • <CANDIDATE 02> Sorry I don't understand you. Repeat the sentence please.

    • <CANDIDATE 01> How many times a day do you have breakfast?

    • <CANDIDATE 02> One time a day of course!

Clearly, the language here is stilted and rather artificial. Yet the candidates’ grammar is accurate and they show command of the appropriate vocabulary. Turn-openers are mostly abrupt ‘straight in’ wh-questions followed by information-rich answers (with the exception of Sorry). Hesitation markers are present (Er) but there is no backchannel behaviour. Common chunks occur (a cup of tea, how many times, of course). For the most part, turn transitions are normally paced. There is one point where a problem occurs: an uncommonly long pause of four seconds occurs when candidate 02 doubts her understanding of 01's (unusual) question (How many times a day do you have breakfast?). The examiner is present, but is not permitted, at this point, to act as an interlocutor and must maintain a sphinx-like demeanour. In real-world situations, abandoning one conversational party to their fate during a long, problematic pause would be unusual at this point, and one might expect any involved party to intervene with something like ‘I'm not sure she understands your question’. In a real-world situation it would be the responsibility of all parties to attempt to fill the silence and maintain the flow, not just the one who has encountered a communication problem. The communication is rescued, but only after an agonisingly long hiatus. It must be stressed that this is no criticism of the examination format or indeed of the examiner – experienced examiners will always make allowances for situational constraints, but it does bring home the precarious relationship between task and performance, and how confluence can be threatened even when the parties are comfortable with the lexico-grammar of their performances.

Research suggests that speech rate is greater where monologue is supported by backchannel responses from an interlocutor (Wolf, 2008), and in particular that oral narrative skills are boosted or dampened by the active or inactive behaviour of listeners (Bavelas et al., 2000). Where an interlocutor is passive or silent, the monologue speaker takes on the additional cognitive burden of filling all the silence. In multi-party casual conversation among equals, all speakers have a responsibility to fill silences, even if only through backchannel responses, or non-minimal but non-floor-grabbing responses (McCarthy, 2003). Guillot (1999), who states many viewpoints with which the present paper concurs, makes plain that fluency is:

far from being a one-sided speaker-related notion. It has emerged as the product of a (largely intuitive) fine tuning between participants in an exchange according to the parameters of the exchange, as a process of negotiation. (p. 41)

Learner data at higher CEFR levels do appear to display better command of the items that contribute to fluency and confluence. Extract (6), also from Cambridge ESOL oral examination data, this time at Cambridge Certificate in Advanced English (CAE) level (at C1 of the CEFR), shows an interestingly reversed picture to that of extract (5): the candidates here make grammatical errors, but the overall impression is one of a much more fluent and confluent performance, owing to the native-like linkages in turn construction and back-channelling (in bold):

  (6)
    • [Names have been omitted to preserve candidate anonymity; the text has been abridged for illustrative purposes]

    • <Examiner> First of all we'd like to know a little bit about you. Erm where do you both live?

    • <Candidate 01> I live in (place name) in South Korea yes.

    • <Candidate 02> And I live in (place name). It's in Switzerland and it's near Zurich.

    • <Examiner> (Candidate name) how long have you been studying English?

    • <Candidate 01> Well actually I study English mm in junior high school and high school for six years around for six years.

    • <Examiner> Different classes. Good. Now I'd like you to ask each other something about things you particularly like about living in this country and entertainment and leisure facilities in this area.

    • [intervening text]

    • <Candidate 02> Erm I like to go erm to see a movie.

    • <Candidate 01> Mm.

    • <Candidate 02> I see a lot of them since I've been here and I like to go to pubs and+

    • <Candidate 01> Ah.

    • <Candidate 02> +together with friends and =

    • <Candidate 01> = Yeah me too actually.

The interactive imperatives of fluency also emerge in Hasselgreen's (2004) important investigation of the role of what she terms ‘small words’ (which include some common chunks) in fluency. The small words which are typically present in the speech of fluent speakers are interactive and flow-sustaining; they encompass high-frequency types such as well, you know, sort of, right, and or something, two of which we have already commented on above. Such small words are usually subliminal for the native user (Watts, 1989) and have generally attracted little attention in the study of language learners’ vocabulary compared with the more salient, content-rich items. The small words which Hasselgreen puts under the microscope are ones which are non-propositional and which sustain interaction, contributing to that fifth skill mentioned above, and underpinning the sense of interactive flow.

Although many studies of fluency among learners have focused on monologue performance and on the reliable instruments of measuring speech rate and pausing, there is an increasing recognition of the need to confront the conceptualisation of fluency as an interactive phenomenon, and the CEFR, in its empirically-informed incarnation, will, it is hoped, take such research into account. With the massive expansion of learner spoken corpora planned for the English Profile programme, researchers will have rich resources for the better understanding of real learner behaviour, as well as the ever-increasing insights of research into native-speaker and expert user fluency based on corpora collected in non-pedagogical contexts (Prodromou, 2008).

5. Fluency and society

Research suggests that perceptions of fluency in the world of employment and in society in general have real and sometimes life-changing implications for people using languages other than their native or first language. A number of studies indicate that employment opportunities may be positively or negatively affected by perceived fluency, which influences the extent to which potential and actual employees get good or better jobs, especially in contexts such as those of recently arrived immigrants. Tainer (1988) underscored the negative economic consequences of inadequate language proficiency among immigrant groups (see also Dustmann, 1994). Meanwhile, Chiswick and Miller (1998) point to their own investigations and those of others which suggest that immigrants who become fluent in the language of their new country achieve greater economic success. Dávila and Mora (2000) speak of the ‘English deficiency earnings penalty’ (p. 369) with regard to levels of fluency among immigrants, as well as recent legislative pressures in the United States in relation to language standards and how these affect immigrant groups. In the United Kingdom, Shields and Wheatley Price (2002) report similar economic deficits associated with a lack of fluency. Elsewhere, Yeh and Inose (2003) state that lack of fluency is one of the contributory factors to acculturative stress and integration problems among international students. Fluency, deeply rooted as a common, lay metaphor for the perception of flow in language events, may be influencing the non-academic life of language learners and users more than we appreciate. Fluency should therefore be one of the pillars of investigation of the English Profile programme and establishing its criterial features should be one of the project's key concerns.

6. Conclusion

The spread of English language examinations and systems of evaluation such as the CEFR across the world, the increasing importance of interactive skills in a global economy and the desire for objective standards in English language education all push to the forefront the need for empirical ratification of how fluency is realised in real language use. This paper has argued that we will gain most if we see fluency as a co-created achievement, better captured by the metaphor of confluence. Successfully co-creating talk that flows and being able to establish within one's own turns and across turns the satisfactory experience of confluence for all participants is an elusive quality, the evidence for which we are unlikely to find in monologic contexts. Confluence will be best evidenced in the data of multi-party talk and in situations where learners can naturally display their abilities to create and sustain interaction. The continued collection of spoken corpora of native users and learners performing at all levels will provide us with a unique coign of vantage for the investigation of the role played by confluence in the creation of successful interaction.


1 The CANCODE (Cambridge and Nottingham Corpus of Discourse in English) was a collaborative project between the University of Nottingham, UK and Cambridge University Press. Cambridge University Press is the sole copyright holder. Details of the corpus and its construction may be found in McCarthy (1998).

2 Thanks are due to Dr Paula Buttery of RCEAL, University of Cambridge, UK, for advice and inspiration on certain aspects of the corpus search methodology.

3 I am grateful to Cambridge ESOL for allowing me access to transcripts and original recordings of oral examination data for the purposes of the English Profile programme.


Bavelas, J. B., Coates, L. & Johnson, T. (2000). Listeners as co-narrators. Journal of Personality and Social Psychology, 79 (6): 941–952.
Bialystok, E. (1982). On the relationship between knowing and using linguistic forms. Applied Linguistics, 3 (3): 181–206.
Boers, F., Eyckmans, J., Kappel, J., Stengers, H. & Demecheleer, M. (2006). Formulaic sequences and perceived oral proficiency: Putting a Lexical Approach to the test. Language Teaching Research, 10 (3): 245–261.
Bosshardt, H.-G., Sappok, C., Knipschild, M. & Hölscher, C. (1997). Spontaneous imitation of fundamental frequency and speech rate by nonstutterers and stutterers. Journal of Psycholinguistic Research, 26 (4): 425–448.
Chambers, F. (1998). What do we mean by fluency? System, 25 (4): 535–544.
Chiswick, B. & Miller, P. (1998). English language fluency among immigrants in the United States. Research in Labor Economics, 17: 151–200.
Conklin, K. & Schmitt, N. (2008). Formulaic sequences: are they processed more quickly than nonformulaic language by native and nonnative speakers? Applied Linguistics, 29 (1): 72–89.
Council of Europe (2001). Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University Press.
Dávila, A. & Mora, M. T. (2000). English fluency of recent Hispanic immigrants to the United States in 1980 and 1990. Economic Development and Cultural Change, 48 (2): 369–389.
Dechert, H. W. (1980). Pauses and intonation as indicators of verbal planning in second-language speech productions: Two examples from a case study. In Dechert, H. W. & Raupach, M. (Eds.), Temporal Variables in Speech (pp. 271–285). The Hague: Mouton.
Derwing, T., Rossiter, M., Munro, M. & Thomson, R. (2004). Second language fluency: judgments on different tasks. Language Learning, 54 (4): 655–679.
Dustmann, C. (1994). Speaking fluency, writing fluency and earnings of migrants. Journal of Population Economics, 7 (2): 133–156.
Dörnyei, Z. (2009). The Psychology of Second Language Acquisition. Oxford: Oxford University Press.
Evison, J. & McCarthy, M. J. (forthcoming). Social talk. In Barron, A. & Schneider, K. (Eds.), Pragmatics of Discourse. Berlin: Mouton de Gruyter.
Evison, J., McCarthy, M. J. & O'Keeffe, A. (2007). ‘Looking out for love and all the rest of it’: Vague category markers as shared social space. In Cutting, J. (Ed.), Vague Language Explored (pp. 138–157). Basingstoke: Palgrave Macmillan.
Fillmore, C. J. (1979). On Fluency. In Fillmore, C. J., Kempler, D. & Wang, W. (Eds.), Individual Differences in Language Ability and Language Behavior (pp. 85–101). New York: Academic Press.
Foster, P. & Skehan, P. (1999). The influence of source of planning and focus of planning on task-based performance. Language Teaching Research, 3 (3): 215–247.
Freed, B. F. (1995). What makes us think that students who study abroad become fluent? In Freed, B. F. (Ed.), Second language acquisition in a study abroad context (pp. 123–148). Philadelphia, PA: John Benjamins.
Gatbonton, E. & Segalowitz, N. (1988). Creative automatization: principles for promoting fluency within a communicative framework. TESOL Quarterly, 22 (3): 473–492.
Giles, H., Coupland, J. & Coupland, N. (1991). Contexts of Accommodation: Developments in Applied Sociolinguistics. Cambridge: Cambridge University Press.
Goldman-Eisler, F. (1968). Psycholinguistics. Experiments in Spontaneous Speech. London and New York: Academic Press.
Goldman-Eisler, F. (1972). Pauses, clauses, sentences. Language and Speech, 15: 103–113.
Guillot, M.-N. (1999). Fluency and its Teaching. Clevedon: Multilingual Matters.
Hasselgreen, A. (2004). Testing the spoken English of young Norwegians: A study of test validity and the role of ‘smallwords’ in contributing to pupils’ fluency. Cambridge: Cambridge University Press.
Hieke, A. E. (1985). A componential approach to oral fluency evaluation. The Modern Language Journal, 69 (2): 135–142.
Kormos, J. & Dénes, M. (2004). Exploring measures and perceptions of fluency in the speech of second language learners. System, 32 (2): 145–164.
Kousidis, S. & Dorran, D. (2009). Monitoring convergence of temporal features in spontaneous dialogue speech. Dublin Institute of Technology: Digital Media Centre Conference Papers. Accessed November 2009.
Lennon, P. (1990). Investigating fluency in EFL: a quantitative approach. Language Learning, 40 (3): 387–417.
Lumley, T. & O'Sullivan, B. (2005). The effect of test-taker gender, audience and topic on task performance in tape-mediated assessment of speaking. Language Testing, 22 (4): 415–437.
McCarthy, M. J. (1998). Spoken Language and Applied Linguistics. Cambridge: Cambridge University Press.
McCarthy, M. J. (2003). Talking back: ‘small’ interactional response tokens in everyday conversation. Research on Language in Social Interaction, 36 (1): 33–63.
Mochizuki, N. & Ortega, L. (2008). Balancing communication and grammar in beginning-level foreign language classrooms: A study of guided planning and relativization. Language Teaching Research, 12 (1): 11–37.
O'Sullivan, B. (2002). Learner acquaintanceship and oral proficiency test pair-task performance. Language Testing, 19 (3): 277–295.
O'Sullivan, B., Weir, C. & Saville, N. (2002). Using observation checklists to validate speaking-test tasks. Language Testing, 19 (1): 33–56.
Prodromou, L. (2008). English as a Lingua Franca. London: Continuum.
Rehbein, J. (1987). On fluency in second language speech. In Dechert, H. W. & Raupach, M. (Eds.), Psycholinguistic Models of Production (pp. 97–105). Norwood, NJ: Ablex.
Riggenbach, H. (1991). Towards an understanding of fluency: A microanalysis of nonnative speaker conversations. Discourse Processes, 14: 423–441.
Rossiter, M. J. (2009). Perceptions of L2 fluency by native and non-native speakers of English. Canadian Modern Language Review, 65 (3): 395–412.
Sacks, H., Schegloff, E. A. & Jefferson, G. (1974). A simplest systematics for the organisation of turn-taking for conversation. Language, 50 (4): 696–735.
Shields, M. & Wheatley Price, S. (2002). The English language fluency and occupational success of ethnic minority immigrant men living in English metropolitan areas. Journal of Population Economics, 15 (1): 137–160.
Stivers, T., Enfield, N. J., Brown, P., Englert, C., Hayashi, M., Heinemann, T., Hoymann, G., Rossano, F., de Ruiter, J. P., Yoon, K.-E. & Levinson, S. C. (2009). Universals and cultural variation in turn-taking in conversation. PNAS (Proceedings of the National Academy of Sciences), 106 (26): 10587–10592.
Street, R. (2006). Speech convergence and speech evaluation in fact-finding interviews. Human Communication Research, 11 (2): 139–169.
Street, R., Brady, R. M. & Putman, W. B. (1983). The influence of speech rate stereotypes and rate similarity on listeners’ evaluations of speakers. Journal of Language and Social Psychology, 2 (1): 37–56.
Street, R. & Capella, J. (1989). Social and linguistic factors influencing adaptation in children's speech. Journal of Psycholinguistic Research, 18 (5): 497–519.
Tainer, E. (1988). English language proficiency and the determination of earnings among foreign-born men. Journal of Human Resources, 23 (1): 108–122.
Tao, H. (2003). Turn initiators in spoken English: A corpus-based approach to interaction and grammar. In Leistyna, P. & Meyer, C. F. (Eds.), Corpus Analysis: Language Structure and Language Use (pp. 187–207). Amsterdam: Rodopi.
Tao, H. & McCarthy, M. J. (2001). Understanding non-restrictive which-clauses in spoken English, which is not an easy thing. Language Sciences, 23: 651–677.
Tauroza, S. & Allison, D. (1990). Speech rates in British English. Applied Linguistics, 11 (1): 90–105.
Towell, R. (1987). Variability and progress in the language development of advanced learners of a foreign language. In Ellis, R. (Ed.), Second Language Acquisition in Context (pp. 113–127). Toronto: Prentice-Hall.
Towell, R., Hawkins, R. & Bazergui, N. (1996). The development of fluency in advanced learners of French. Applied Linguistics, 17 (1): 84–119.
Watts, R. J. (1989). Taking the pitcher to the ‘well’: native speakers’ perception of their use of discourse markers in conversation. Journal of Pragmatics, 13: 203–237.
Wood, D. (2001). In search of fluency: what is it and how can we teach it? Canadian Modern Language Review, 57 (4): 573–589.
Wood, D. (2006). Uses and functions of formulaic sequences in second language speech: An exploration of the foundations of fluency. Canadian Modern Language Review, 63 (1): 13–33.
Wolf, J. P. (2008). The effects of backchannels on fluency in L2 oral task production. System, 36 (2): 279–294.
Yeh, C. J. & Inose, M. (2003). International students’ reported English fluency, social support satisfaction, and social connectedness as predictors of acculturative stress. Counselling Psychology Quarterly, 16 (1): 15–28.

Table 1 You know: frequency compared with single words (CANCODE sub-corpus; social conversations).


Table 2 20 most frequent turn-opener tokens (CANCODE sub-corpus; social conversations).