A Validation Study of the CEFR Levels of Phrasal Verbs in the English Vocabulary Profile

Published online by Cambridge University Press:  10 August 2012

Masashi Negishi
Tokyo University of Foreign Studies
Yukio Tono
Tokyo University of Foreign Studies
Yoshihito Fujita
Tokyo University of Foreign Studies


This article reports on a part of the development and validation project for the English Vocabulary Profile (EVP). The previous version of the EVP included 439 phrasal verbs as well as 4,666 individual word entries. Each of their meanings is ordered according to its CEFR level. The aims of the study are to identify the actual difficulty of each phrasal verb, to validate the tentative decision of the CEFR levels, and also to explore factors that explain the difficulties, by using textbook corpora. In order to carry out this research, we developed a phrasal verb test of 100 items, consisting of four A1 items, nineteen A2 items, forty B1 items and thirty-seven B2 items. Approximately 1,600 Japanese students took this test. We analysed the test data, using item response theory. The results of the test show that although the average difficulties of the phrasal verbs in each level were ordered according to the level prediction, the ranges of the difficulties in each level overlapped. The analysis of textbook corpora reveals that there is a complex relationship between the difficulty levels of phrasal verbs and their frequencies in the textbooks. We discuss its implications and possible improvements for the EVP.

Research Article
Copyright © Cambridge University Press 2012

1. Introduction

The English Vocabulary Profile (EVP, formerly known as the English Profile Wordlists) is part of the English Profile Programme, the aim of which, according to Kurteš and Saville (2008), is to produce Reference Level Descriptions for English linked to the Common European Framework of Reference for Languages (CEFR; Council of Europe 2001).

The core objective of the initial phase of the EVP project has been to establish which words are commonly known by learners around the world at the CEFR levels A1 to B2, and to assign these levels not merely to the words themselves but to their individual meanings (Capel 2010).

At the time of investigation, the EVP included 439 phrasal verbs as well as approximately 4,700 individual word entries. Each meaning that is listed in an entry is ordered according to its CEFR level. Decisions about level were based partly on the evidence of the Cambridge Learner Corpus (CLC). Additionally, for all EVP entries, a range of other sources was consulted, including native speaker evidence of frequency in the Cambridge International Corpus, ESOL exams vocabulary lists at A2 and B1 levels, coursebook wordlists, readers wordlists, vocabulary skills books, and the Cambridge English Lexicon (Hindmarsh 1980).

We investigated the validity of the initial decisions made on phrasal verbs, for which there is less learner evidence in the CLC, by administering a test to English learners. On 12 April 2011, the Preview version of the English Vocabulary Profile, including examples of the new C1 and C2 data, was launched under its new name. The current English Vocabulary Profile contains words, phrases, phrasal verbs and idioms. According to the glossary on the English Profile website, a phrasal verb is defined as a multi-word verb which involves an adverb particle, e.g. sit down, go away. On the other hand, there is a related term in the glossary called ‘multi-word verb’, which is defined as follows: a verb which ‘may be combined with one or two particles to function as a verb with a unitary meaning. There are three kinds of multi-word verb. Phrasal verbs have adverb particles . . . Prepositional verbs take a preposition . . . and phrasal-prepositional verbs take both an adverb and a preposition’ (e.g. ‘look down on’) (Carter & McCarthy 2006: 911). [Footnote 1] At the time of investigation, however, there was no such definition, so in this research ‘phrasal verbs’ included the above three types.

The purposes of this research are as follows:

  1. to identify the actual difficulty of each phrasal verb in the Japanese context;

  2. to validate the tentative decisions on CEFR levels for certain phrasal verbs in the EVP; and

  3. to explore factors that explain the difficulties for Japanese learners, using a textbook corpus.

2. Method

2.1. Participants

Some 1,622 Japanese students, consisting of 1,550 senior high school students and 72 university students, participated in this study. They were enrolled in fourteen different high schools and one university located in the areas of Tokyo, Fukushima and Akita prefectures in Japan. Although the majority of the participants were high school students, university students were invited to participate in this research, in order to obtain reliable data across all the levels.

2.2. Test sets

For this particular research, a phrasal verb test was developed. When the test was administered, the EVP included four phrasal verbs at A1 level, twenty-seven phrasal verbs at A2 level, 145 phrasal verbs at B1 level, and 263 phrasal verbs at B2 level, where phrasal verbs with multiple meanings were counted separately. There are fewer phrasal verbs at the lower CEFR levels, which might reflect the EFL/ESL acquisition order. Some 119 items were selected as a pilot test, which was initially administered to two Japanese PhD students studying in the UK, and then the test with English definitions was tried out on two native speakers of English. After excluding the items that they could not answer correctly, 100 items were selected. The final test included four A1 items, nineteen A2 items, forty B1 items and thirty-seven B2 items. In order to counterbalance the number of items at each level with the number of the participants at the corresponding level, we selected as many phrasal verbs as possible from the A1 and A2 levels; in fact, all the A1 phrasal verbs were included in the test, as we had only four of them. In principle, those at B1 and B2 levels were selected randomly. However, more phrasal verbs were selected from B1 level, since we presumed that we had fewer participants at B2 level. Also, the phrasal verbs with multiple meanings tended to be selected to check the validity of the level allocation. Here is an example of the question format:

He (w . . . . . .) <     > his mug and put it back on the shelf. [洗う, ‘to wash’]

The participants were required to fill in a verb in parentheses and an adverb or a preposition in angled brackets in each sentence, with the help of Japanese translation equivalents. The initial letter of the verb was indicated in the parentheses so that the possibility of more than one phrasal verb that fitted the context was avoided.

The test was designed in this way in order to tap students’ productive use. As noted above, the EVP has been based partly on written exam scripts in the CLC, which does not contain a large amount of evidence of the use of phrasal verbs at all levels. It was for this reason that we were asked to undertake this validation task.

2.3. Procedure

First of all, the participants answered a questionnaire asking which authorised textbook they had used in junior high school, and then they moved on to the test. The questionnaire and test were conducted in one regular lesson, which means it took approximately 50 minutes for them to finish both. In scoring their answers, one point was allotted for each correct answer. All the items were scored dichotomously (i.e. 0 or 1). Spelling and inflection mistakes were not penalised as long as they were intelligible. The statistical software packages used to analyse the data were R, ITEMAN, RASCAL (one-parameter IRT), and XCALIBRE (two-parameter IRT). The results were compared against the frequency distributions of textbook corpora to see if there was any effect of the amount of exposure from the textbook input.

3. Results

Table 1 shows the test statistics. The mean score of 31.443 suggests that, overall, the test was quite difficult for the students. However, the alpha was 0.965, which means that the test was internally consistent and reliable.
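As a rough illustration of how such a reliability coefficient can be obtained from dichotomously scored data, the following minimal sketch computes Cronbach's alpha from a matrix of 0/1 item scores. It assumes that the reported alpha is Cronbach's alpha (equivalent to KR-20 for dichotomous items); the function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student lists of 0/1 item scores.

    Assumption: every student has a score for every item.
    """
    n_items = len(scores[0])
    # Variance of each item across students
    item_variances = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of the students' total scores
    total_variance = pvariance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: four students, three items
toy_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(cronbach_alpha(toy_scores), 3))
```

Values close to 1 indicate that the items tend to rank students in the same way, which is the sense in which the test is described as internally consistent.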

Table 1 Test statistics

Figure 1 is the item by person distribution map, which shows that more students were located at the level of the easier items, and fewer students at the level of the more difficult items.

Figure 1 The item by person distribution map.

Figure 2 shows the box plots of item difficulty by CEFR level. The item difficulties were calculated with the program RASCAL to answer research questions 1 and 2. RASCAL is based on the one-parameter Rasch logistic IRT model for dichotomous data. In this model, the closer the index is to minus four, the easier the item, and the closer it is to plus four, the more difficult the item.
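To make the difficulty scale concrete, the sketch below evaluates the standard one-parameter Rasch formula, in which the probability of a correct response depends only on the difference between a learner's ability and the item's difficulty. The exact scaling applied by RASCAL is not described in the text, so the function and the ability value of 0.0 are illustrative assumptions; the two difficulty values are those reported for knock out and leave behind in the Discussion.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical learner of average ability (0.0) on two items from the Discussion
for name, difficulty in [("knock out", -3.47), ("leave behind", 2.28)]:
    print(name, round(rasch_probability(0.0, difficulty), 3))
```

Under these assumptions, an item with a strongly negative difficulty is answered correctly by almost all average learners, whereas an item with a difficulty above 2 is answered correctly by only a small minority.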

Figure 2 The box plots of item difficulty by CEFR level.

In Figure 2, the bottom and top of each box indicate the 25th and 75th percentiles (the lower and upper quartiles, respectively) and the band near the middle of the box is the 50th percentile (the median). The whiskers indicate minimum and maximum values, with white dots for outliers. The result of the one-way ANOVA indicates that the phrasal verbs at A1 level are significantly easier than those at the other three levels. However, there was no significant difference between A2, B1 and B2 levels, although the average difficulties seemed to increase as the levels went up.
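For readers unfamiliar with the procedure, the following sketch shows how the F statistic of a one-way ANOVA is formed from groups of item difficulties. It is a plain-Python illustration only, not the authors' R analysis; the function name and the three small groups of values are hypothetical, and no p-value is computed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA for a list of groups of numeric values."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical item difficulties for three levels
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(hypothetical_groups))
```

A large F value means that the group means differ by more than the spread within the groups would lead one to expect; with very unequal group sizes, such as four A1 items against forty B1 items, the result should be read with some caution.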

Tables 2 to 5 show definitions of the phrasal verbs with corresponding item difficulties. The phrasal verbs were classified according to the CEFR levels and in each group they were ordered in line with the item difficulties.

Table 2 Item Difficulties of A1 Phrasal Verbs

Table 3 Item Difficulties of A2 Phrasal Verbs

Table 4 Item Difficulties of B1 Phrasal Verbs

Table 5 Item Difficulties of B2 Phrasal Verbs

The results in Tables 2 to 5 show that there is a wide range of difficulty in each group, in particular A2, B1 and B2 levels. It seems that, at least for Japanese-speaking learners of English, the difficulty levels of phrasal verbs do not always correspond to the CEFR levels proposed in the EVP.

In addition to these analyses, a corpus of junior high school English textbooks was compiled to explore the effects of L2 input on the difficulties of the phrasal verbs. The corpus consists of six series of government-authorised English textbooks, covering years 7 to 9, which correspond to the first three years of learning English. Each student is supposed to use one of these government-authorised textbooks. Although most of the participants were senior high school (Year 10–12) students, the corpus did not contain senior high school textbooks, primarily because the types of textbooks used at high school varied greatly from school to school and the influence from high school textbooks was expected to be small for Year 10 students. The total size of the corpus was 29,251 tokens, or 3,291 types. The phrasal verbs were extracted by first preparing a list of search patterns for each of the phrasal verbs, using regular expressions, and then using a Perl script to automatically extract all the instances of phrasal verbs from the corpus.
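The extraction step can be sketched as follows. The original search patterns and Perl script are not reproduced in the article, so this is a Python approximation under stated assumptions: the two patterns, the allowance of up to two intervening words between verb and particle, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical search patterns: inflected verb forms, up to two
# intervening words (for separated particles), then the particle.
PATTERNS = {
    "wash up": r"\bwash(?:es|ed|ing)?\b(?:\s+\w+){0,2}?\s+up\b",
    "grow up": r"\b(?:grow|grows|grew|grown|growing)\b(?:\s+\w+){0,2}?\s+up\b",
}


def count_phrasal_verbs(text, patterns=PATTERNS):
    """Count the matches of each phrasal-verb pattern in a text."""
    counts = Counter()
    for verb, pattern in patterns.items():
        counts[verb] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


sample = "He washed up his mug. She grew up in Akita. Wash it up, please."
print(count_phrasal_verbs(sample))
```

Patterns of this kind will also match some strings that are not phrasal verbs, so in practice the extracted instances would need to be checked by hand.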

Table 6 shows the frequencies of the phrasal verbs in the corpus. Forty-eight of the 100 phrasal verbs appeared in the textbooks at least once; the remaining fifty-two did not occur at all. As can be seen, the frequency of phrasal verbs is very low in junior high school textbooks. This is to be expected, considering the overall level of junior high school English, which roughly corresponds to the A level of the CEFR. It was also appropriate that about half of the phrasal verbs did not appear, because our primary purpose for this corpus analysis was to investigate the relationship between the encounter with phrasal verbs in the textbooks and the learners’ performance on the corresponding phrasal verbs.

Table 6 Frequencies of the Phrasal Verbs

The accuracy rates of phrasal verbs were compared according to their relative frequencies in the textbooks. Phrasal verbs were grouped into four categories: NONE, LOW (one to two), MID (three to six), and HIGH (seven to nine). The mean differences in accuracy scores were examined for statistical significance. Figure 3 shows box plots of the differences in accuracy among phrasal verbs with different frequencies in the textbooks. The results of a one-way ANOVA and a Kruskal–Wallis test show a significant mean difference between the HIGH group and the NONE group. There was no statistically significant difference between the NONE and LOW groups, or between the MID and HIGH groups, although we could observe a gap in the median scores among those groups.
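The grouping described above can be sketched as a simple binning step followed by a per-group mean. The band boundaries follow the text; the function names, the treatment of counts above nine as HIGH, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def frequency_band(count):
    """Map a textbook frequency count to NONE, LOW, MID or HIGH."""
    if count < 0:
        raise ValueError("frequency count cannot be negative")
    if count == 0:
        return "NONE"
    if count <= 2:
        return "LOW"
    if count <= 6:
        return "MID"
    # The text gives seven to nine for HIGH; assumption: any higher
    # count would also be placed in this band.
    return "HIGH"


def mean_accuracy_by_band(items):
    """items: iterable of (textbook_frequency, proportion_correct) pairs."""
    grouped = defaultdict(list)
    for frequency, accuracy in items:
        grouped[frequency_band(frequency)].append(accuracy)
    return {band: mean(values) for band, values in grouped.items()}


# Hypothetical (frequency, accuracy) pairs, not taken from Table 6
hypothetical_items = [(0, 0.1), (0, 0.3), (1, 0.4), (5, 0.5), (8, 0.9)]
print(mean_accuracy_by_band(hypothetical_items))
```

The resulting group means (or the underlying per-group lists) are the kind of input that a one-way ANOVA or Kruskal–Wallis test would then compare.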

Figure 3 Differences in accuracy among phrasal verbs with different frequencies.

In order to estimate how much exposure native speakers have received in terms of item frequencies, a comparison was made between the frequencies of phrasal verbs and the words in the same frequency range in two native speaker corpora, the 100-million-word British National Corpus (BNC) and the 3.3-billion-word ‘English TenTen’ (enTenTen) Corpus available on the Sketch Engine.

Table 7 shows the estimated amount of exposure regarding the phrasal verbs under study. The columns show, from left to right, phrasal verbs, their frequencies in the BNC and enTenTen, and the words in the same frequency range, respectively. Whilst it is a simple estimate, the overall results suggest that the patterns of frequency distributions for different phrasal verbs look very similar across the two corpora and that some phrasal verbs were found to be quite low in frequency. [Footnote 2] Phrasal verbs are sometimes difficult to acquire due to their lack of compositional meanings. If the frequencies are very low, it might be natural for L2 learners to have less chance to encounter them in texts, although their component verbs and adverbs/prepositions are familiar to them (Waring & Takaki 2003).

Table 7 Estimated Exposure to Phrasal Verbs

4. Discussion

4.1. The gap between CEFR levels and item difficulties

The results of the analyses indicate that some of the B1 or B2 items are not necessarily as difficult as their level might indicate, and, conversely, that some A2 items are not always easy for Japanese students. The results also suggest possible factors for interpreting the gaps between CEFR levels and actual item difficulties.

For instance, knock out is a B2 item in the EVP, which means the phrasal verb is seen as relatively difficult for learners. However, its item difficulty is −3.47, so it is actually very easy for Japanese learners. The reason is that knock out has been borrowed from English into Japanese as a loanword, where it is used in boxing and also figuratively.

Another interesting item is leave behind, whose level at the time of this research was A2, but the item difficulty is 2.28, which means it is statistically more difficult. This can be attributed to its very low frequency in English textbooks in Japan, as indicated in Table 6. It appears only once in the textbook corpus, which indicates that Japanese students rarely come across this phrasal verb in the classroom. In addition, the frequency of behind is surprisingly low for a preposition in the textbooks, and therefore students do not acquire the core meaning of the word. Subsequent to our research, the level of leave behind has been raised to B1 in the EVP.

As Figure 2 shows, the ‘outliers’, such as grow up and hurry up, turned out to be much easier than the other phrasal verbs in their assigned levels. The content words in these phrasal verbs are semantically transparent, and the word up is no more than an intensifier. Therefore, when learners encounter these phrasal verbs for the first time, they can at least infer their meanings quite easily, although they may not be able to use them creatively from the beginning. The transparency of the content words in phrasal verbs seems to facilitate learners’ acquiring them.

The most difficult item out of 100 in this study is split up, whose level is B1, with the meaning ‘to finish a relationship’. The proportion correct is 0.00; only two students out of 1,622 got the item right. The reason might be that ‘relationship’ is one of the topics which tend to be avoided in authorised English textbooks in Japan, so Japanese students have very few opportunities to encounter English expressions related to this kind of topic. Japanese students as well as Japanese teachers rarely talk about relationships in the classroom. It is interesting to note that European learners have a lot of exposure in their textbooks to relationship issues and coverage of the language that attaches to them (Annette Capel, personal communication).

In this study, a number of phrasal verbs with multiple meanings were deliberately chosen. Those meanings are quite often assigned to different levels. According to the EVP, go on has three meanings within A1−B2 levels: ‘to happen’ as in What's going on?; ‘to continue to happen or exist’ as in The meeting went on until six o'clock.; and ‘to continue doing something’ as in We can't go on living like this. The first two meanings are labelled B1, and the last one is labelled B2. However, in our study, the B2 item turned out to be the easiest of the three for learners to acquire. Although this phrasal verb is given three distinct meanings, in our own view they are very similar. The essential meaning of go on is more or less related to ‘continuation’. Therefore, these differences in difficulty should be attributed to something other than ‘meaning’. A close examination reveals that only the third (B2) meaning has an animate subject. This is an easy construction for Japanese EFL learners, because animate subjects are ‘unmarked’ in the Japanese language, so they are more familiar to Japanese learners, and therefore easy to process. The reason why What's going on? appears to be easier than The meeting went on . . . might be that the former is learned as a formulaic expression.

Another example of a phrasal verb with multiple meanings is belong to. In the EVP, this phrasal verb in the meaning of ‘to own it’ is assigned A2; the meaning of ‘to be a member of a group or organisation’ is B1. However, in our study, the A2 item turned out to be more difficult than the B1 item. The item difficulty of the former is −0.29, and that of the latter −2.32. The reason might be that Japanese learners come across the B1 meaning much earlier than the A2 meaning, especially in school contexts, where they use belong to to talk about club or extra-curricular activities. Another reason might be that animate subjects are easier for Japanese learners, as is the case with go on.

4.2. Limitations of the study

Some limitations need to be noted regarding the present study. One is that although there are 439 phrasal verbs included in the A1–B2 EVP at the time of writing, only 100 of them were tested in this study. If different items had been selected, the results might have been different.

Another possible limitation is that the response to the elicitation device used in the research is different from the writing process of an exam answer. In the latter process, the examinees can decide which expressions to use in the essay by themselves, whereas in the former process, the examinees are required to produce the phrasal verb indicated by the stem of the test item. This difference might have affected the results.

Another limitation might concern the statistical procedure used for this research. While there were only four items at the A1 level, the number of items at the A2−B2 levels was much greater. Therefore, there might be some room for discussion on the validity of the use of a one-way ANOVA in the analysis.

Finally, it should be noted that only Japanese students participated in this study. If the test had been administered to learners with other L1 backgrounds, the results might well have been different.

5. Further research

For further research, three points should be considered. First, any new selection of phrasal verb items should be based on the frequency information from the English textbooks. In this study, test items were pre-selected without taking the frequency of occurrence of phrasal verbs in textbooks into consideration. There could be better candidates for which phrasal verbs to test if the test was initially designed on the basis of textbook corpus analysis. To this end, in a future study, textbook corpora should incorporate high school textbooks as well, in order to cover B1−B2-level vocabulary.

Second, numerical data such as frequencies from native speaker (NS) and non-native speaker (NNS) corpora could add more value to our data analysis, and more complex models, such as generalised linear regression, could be explored using those variables. Third, more advanced learners, such as university students or adult learners of English, should participate in the survey in order to test those items whose accuracy rates were very low.

6. Conclusion

The primary objective of this study was to identify the actual difficulty for Japanese learners of each phrasal verb in the English Vocabulary Profile, and to explore the factors that might explain those difficulties. The results of our analyses show that some items are not necessarily ordered as suggested by the EVP, which might suggest the need to make some adjustments to the CEFR level decisions. While adjustment was made for the level of leave behind, we await results from other L1 backgrounds, as there is some concern that the Japanese learning context might not be representative of learners worldwide. We hope that our test will be administered further in other countries.

As for exploring other factors that could determine the level of a phrasal verb, learner and input factors are worth investigating. Specific L1 characteristics, such as proximity in terms of language family and lexico-grammatical similarities and differences, might affect the order and the degree of phrasal verb acquisition. Moreover, learners are greatly influenced by the input they receive from textbooks, classroom activities and all sorts of exposure in real life.

Put in a much wider perspective, it might be possible that there is a core group of linguistic items such as words, phrasal verbs, etc. which show a general pattern of increasing difficulty, whereas there are some peripheral linguistic items that are context-specific or perhaps culture-specific. The acquisition of those peripheral linguistic items might be affected by the syllabus adopted, teaching materials used, the learner's L1, etc.

Although there are some limitations mentioned above, it is hoped that the results of this study will contribute to the improvement of the A1−B2 EVP, and give more reliable information to the future users of the full resource.


2 The estimate based on the web corpus could be biased toward the materials available on the Web. The issue of text types and representativeness for Web corpora still remains, but as Kilgarriff and Grefenstette (2003: 342) wrote: ‘The Web is a dirty corpus, but expected usage is much more frequent than what might be considered noise.’

3 Table 2 only provides frequencies for each phrasal verb. Ideally, dispersion measures should be provided as well, which could not be calculated due to the limitations of enTenTen, whose raw texts were not downloadable from the Sketch Engine site.


Capel, A. (2010). Insights and issues arising from the English Profile Wordlists project. Cambridge ESOL Research Notes 41, 2–7.
Carter, R. & McCarthy, M. (2006). Cambridge Grammar of English: A Comprehensive Guide. Cambridge: Cambridge University Press.
Council of Europe (2001). Common European Framework of Reference for Languages: learning, teaching, assessment. Cambridge: Cambridge University Press.
Hindmarsh, R. (1980). Cambridge English Lexicon. Cambridge: Cambridge University Press.
Kilgarriff, A. & Grefenstette, G. (2003). Introduction to the special issue on the web as corpus. Computational Linguistics 29.3, 333–347.
Kurteš, S. & Saville, N. (2008). The English Profile Programme: an overview. Cambridge ESOL Research Notes 33, 2–4.
Waring, R. & Takaki, M. (2003). At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language 15.2, 130–163.