
Do more proficient writers use fewer cognates in L2? A computational approach

Published online by Cambridge University Press:  05 October 2023

Liat Nativ
Affiliation:
Department of Computer Science, University of Haifa, Haifa, Israel
Yuval Nov
Affiliation:
School of Public Health, University of Haifa, Haifa, Israel
Noam Ordan
Affiliation:
The Israeli Association of Human Language Technologies, Israel
Shuly Wintner
Affiliation:
Department of Computer Science, University of Haifa, Haifa, Israel
Anat Prior*
Affiliation:
Department of Learning Disabilities and Edmond J. Safra Brain Research Center for Learning Disabilities, Faculty of Education, University of Haifa, Haifa, Israel
*Corresponding author: Anat Prior, Department of Learning Disabilities and Edmond J. Safra Brain Research Center for Learning Disabilities, Faculty of Education, University of Haifa, Haifa, Israel. E-mail: aprior@edu.haifa.ac.il

Abstract

Bilinguals often show evidence of cross-language influences (CLI), such as facilitation in processing cognates. Here we use computational methods to analyze spontaneous English texts written by hundreds of speakers of different L1s, at different levels of English proficiency, and to investigate writers’ preference for cognates over alternative word choices. We focus on English because the majority of its lexicon is of either Romance or Germanic origin, which allows an investigation of the preference of speakers of Germanic and Romance L1s for cognates between their L1 and English. Results show that L2 writers tend to prefer English cognates, and that this tendency weakens as English proficiency increases, suggesting diminishing effects of CLI. However, a comparison of the L2 writers with native English writers shows general overuse of cognates only for the Germanic, but not the Romance, L1 speakers, most likely due to the register of argumentative writing.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Table 1: Text statistics by proficiency level and language family.

Table 2: Text size comparison between TOEFL and LOCNESS.

Table 3: Number of occurrences of the words in synset 79, in essays of Romance L1 writers, from each proficiency level.

Figure 1: Germanic tendency (GT) of L1 German authors by proficiency and native English authors (Mean, SEM).

Figure 2: Romance tendency (RT) of L1 Romance authors by proficiency and native English authors (Mean, SEM).

Table 4: Number of occurrences and Romance tendency of synset 79, for the original data (L1 Romance authors) and for an example random permutation.

Figure 3: Histograms of the $T_i^{\rm perm}$ values, calculated from random permutations of the data, for Germanic (left) and Romance (right) tendencies. The arrows indicate the T values calculated from the original, non-permuted dataset.

Figure 4: Histograms of the random $T_i^{\rm perm}$ values, representing Germanic (left) and Romance (right) tendencies, when including data from native-author essays in the LOCNESS dataset. The arrows indicate the T values calculated from the original, non-permuted dataset.
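
The permutation histograms described in the captions above compare a statistic T computed on the original data with the values obtained after randomly shuffling the data. The sketch below illustrates that general idea only. The statistic used here (`mean_gap`, the difference in mean tendency between a "low" and a "high" proficiency group), the function names, and the scores are all hypothetical; they are not the article's definition of T and not its data.

```python
import random
from statistics import mean


def mean_gap(scores, labels):
    """Hypothetical statistic: mean score of the 'low' group minus the 'high' group."""
    low = [s for s, g in zip(scores, labels) if g == "low"]
    high = [s for s, g in zip(scores, labels) if g == "high"]
    return mean(low) - mean(high)


def permutation_test(scores, labels, statistic=mean_gap, n_perm=2000, seed=0):
    """Return the observed statistic, the permuted statistics, and a one-sided p-value."""
    rng = random.Random(seed)
    observed = statistic(scores, labels)
    shuffled = list(labels)
    perm_values = []
    for _ in range(n_perm):
        # Shuffling the labels breaks any real link between group and score.
        rng.shuffle(shuffled)
        perm_values.append(statistic(scores, shuffled))
    # Count permutations at least as extreme as the observed value;
    # the +1 terms count the observed arrangement itself.
    exceed = sum(1 for t in perm_values if t >= observed)
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, perm_values, p_value


# Invented illustrative scores (NOT taken from the study).
scores = [0.62, 0.58, 0.65, 0.60, 0.41, 0.39, 0.44, 0.40]
labels = ["low"] * 4 + ["high"] * 4
observed, perm_values, p_value = permutation_test(scores, labels)
print(f"observed T = {observed:.4f}, p = {p_value:.4f}")
```

A histogram of `perm_values` with a marker at `observed` would correspond to the kind of plot the captions describe: the further the marker lies in the tail of the permuted values, the smaller the p-value.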

Supplementary material: File

Nativ et al. supplementary material
File 29.7 KB
Supplementary material: File

Nativ_et_al._Dataset
