The Copenhagen Cross-Linguistic Naming Test (C-CLNT): Development and validation in a multicultural memory clinic population

Abstract Objective: Despite recent advances in cross-cultural neuropsychological test development, suitable tests for cross-linguistic assessment of language functions are not widely available. The aims of this study were to develop and validate a brief naming test, the Copenhagen Cross-Linguistic Naming Test (C-CLNT), for the assessment of culturally, linguistically, and educationally diverse older adult populations in Europe. Method: The C-CLNT was based on a set of standardized color drawings. Items for the C-CLNT were selected by considering name agreement and frequency across five European and two non-European languages. Ambiguities in some of the selected items and scoring criteria were resolved after pilot testing in 10 memory clinic patients. The final 30-item C-CLNT was validated by verifying its psychometric properties in 24 controls and 162 diverse memory clinic patients with affective disorder, mild cognitive impairment, and dementia. Results: The C-CLNT had acceptable scale reliability (coefficient alpha = .67) and good construct validity, with moderate to strong correlations with traditional language tests (r = .42–.75). Diagnostic accuracy for dementia was good and significantly better than that of the Boston Naming Test (BNT; areas under the curve of .80 vs .64, p < .001), but was poor for mild cognitive impairment. Only 3% of the variance in C-CLNT test scores was explained by immigrant background, while 6% was explained by age and years of education. In comparison, these proportions were 34% and 22%, respectively, for the BNT. Conclusions: The C-CLNT has promising clinical utility for cross-linguistic assessment of naming impairment in culturally, linguistically, and educationally diverse older adults.


Introduction
With increasing demographic aging, migration, and globalization, there is a pressing need for standardized neuropsychological tests suited for diverse older populations (Nielsen, 2022). A certain degree of diversity has always been present in European countries, but cultural, language, and educational diversity has increased greatly over the last seven decades due to increasing mobility within the European Union as well as immigration from countries outside Europe (Van Mol & De Valk, 2016). Although immigration patterns differ between countries, the largest immigrant groups across Europe originate from other European countries, the Middle East, North Africa, and South Asia, followed by groups of Sub-Saharan African and Latin American origin (Nielsen et al., 2022). Despite recent advances in cross-cultural neuropsychological test development, suitable tests for cross-linguistic assessment of language functions are not widely available (Franzen et al., 2020). Naming impairment is frequent across several neurocognitive disorders, including stroke (RELEASE Collaborators, 2021), traumatic brain injury (Strain et al., 2017), and a variety of neurodegenerative disorders (Grossman et al., 2004). Most dementia syndromes are associated with naming impairment due to varying degrees of semantic memory impairment, impaired lexical retrieval, or impaired visual perception, depending on the subtype (Taler & Phillips, 2008). For instance, anomia is one of the core features of semantic dementia (Gorno-Tempini et al., 2011), and although memory impairment is generally the core feature of early AD, anomia is another common feature, especially as the disease progresses (McKhann et al., 2011). Thus, assessment of naming impairment is standard in most neuropsychological assessments and is typically carried out with confrontation naming tasks (Strauss et al., 2006), for example, the Boston Naming Test (BNT; Kaplan et al., 2001).
Performance on confrontation naming tests is influenced by several linguistic and cultural variables. The difficulty level of the individual items depends on factors such as word frequency, familiarity, age of acquisition, length, visual complexity, and name and image agreement (Ivanova & Hallowell, 2013), and these vary between cultures and languages (Ardila, 2007; Bertola & Malloy-Diniz, 2018; George & Mathuranath, 2007). For instance, items such as a pretzel, beaver, and asparagus included in the BNT may be familiar to people living in North America and parts of Europe but less familiar or virtually unknown in other cultural contexts (Franzen et al., 2023). In contrast, abacus is a difficult item to name in North America but relatively easier in China as abacuses are more common there (Gollan et al., 2012).
Although the BNT has been adapted to several languages and has had widespread clinical and research applications (Maruta et al., 2011; Rabin et al., 2016), cross-cultural research has shown that the items included in the BNT are suboptimal for assessing confrontation naming abilities in culturally, linguistically, and educationally diverse populations. More specifically, several studies have shown large differences in BNT performance between ethnoracial groups in the United States (US) (Baird et al., 2007; Boone et al., 2007), and lower performances in bilinguals (Gollan et al., 2007; Kohnert et al., 1998; Roberts et al., 2002) and second-language speakers (Stålhammar et al., 2022), even after controlling for differences in level of education and other demographic variables. Furthermore, BNT performance has been shown to be influenced by level of education (Strauss et al., 2006). This may reflect increasing vocabulary and exposure to a wider range of concepts with increasing levels of education. In addition, it has been suggested that people with limited education and literacy may find it difficult to process items presented as black-and-white line drawings but perform better when the same items are presented in colored photographs or drawings (Reis et al., 2006; Reis et al., 2001). While some of these challenges may be overcome by using language- and culture-adapted versions of the BNT and applying relevant normative adjustments, this clearly does not solve all issues.
Use of language- and culture-specific confrontation naming tests and norms derived for native speakers has limited feasibility in most European memory clinics, in which patients may differ widely in their cultural, linguistic, and educational characteristics (Franzen et al., 2023; Nielsen, 2022). Therefore, a reliable and valid solution for clinical practice requires confrontation naming tests with potential applicability across diverse cultural and language groups (Franzen et al., 2022).
During the last two decades, there have been several efforts to develop cross-linguistic naming tests. However, many of these efforts have resulted in tests with an inadequate balance between cross-linguistic properties and sensitivity to naming impairment. Thus, tests such as Body Part Naming from the Cross-Cultural Neuropsychological Test Battery (Dick et al., 2002), the Cross-Linguistic Naming Test (Ardila, 2007), and Picture Naming from the European Cross-Cultural Neuropsychological Test Battery (Nielsen et al., 2018) have good cross-linguistic properties but poor sensitivity to milder language impairment in patients with Alzheimer's disease (AD) and other dementia disorders due to ceiling effects (Abou-Mrad et al., 2017; Araujo et al., 2020; Ardila, 2007; Dick et al., 2002; Gálvez-Lara et al., 2015; Nielsen et al., 2019b). In contrast, the abbreviated version of the Multilingual Naming Test (MINT; Ivanova et al., 2013) has high sensitivity to milder language impairment in AD but is biased toward more highly educated white English speakers in the US (Franzen et al., 2023; Li et al., 2022; Paplikar et al., 2022; Stasenko et al., 2019). More recent efforts include the Indian Council of Medical Research Picture Naming Test (ICMR-PNT; Paplikar et al., 2022) and the Naming Assessment in Multicultural Europe (NAME; Franzen et al., 2023). Both instruments have shown promising clinical utility for cross-linguistic assessment. However, the ICMR-PNT may be less useful outside the Indian subcontinent as some items, such as tabala (a musical instrument), are culture-specific, and the 60-item NAME is rather long, taking up to 20 minutes to administer, which impedes its clinical utility in a busy clinical setting.
Building on these efforts, our aims were to develop and validate a brief cross-linguistic naming test for assessment of culturally, linguistically, and educationally diverse older adult populations in Europe. We compared the diagnostic accuracy of the novel confrontation naming test and traditional language tests in a diverse memory clinic population as well as the psychometric properties of the tests. Items included objects as well as pictured actions since action naming impairment is also diagnostic of dementia (Parris & Weekes, 2001). The rationale for the study design, comparing patients with dementia, mild cognitive impairment (MCI), affective disorder, and subjective cognitive decline (SCD), is that these conditions are common differential diagnoses in patients referred for neuropsychological evaluation in memory clinic settings.

Participants
Patients were recruited from the Copenhagen University Hospital Memory Clinic at Rigshospitalet, which is a multidisciplinary outpatient clinic based in the Department of Neurology. For this study, patients with immigrant background referred for neuropsychological evaluation as part of their diagnostic assessment were selectively included between June 2021 and June 2022. Patients with a majority ethnic Danish background were consecutively recruited in the same period. As described below, all patients in the clinic are assessed with cognitive screening tests as part of the basic diagnostic assessment. One of the criteria for referral to more comprehensive neuropsychological evaluation (approximately 2 hours) in the clinic is a Mini-Mental State Examination (MMSE; Folstein et al., 1975) or Rowland Universal Dementia Assessment Scale (RUDAS; Storey et al., 2004) score ≥ 22 at the initial visit in the clinic, but patients with lower MMSE or RUDAS scores may also be referred if necessary (e.g., in the case of patients with aphasia). In total, 169 patients who completed both the Copenhagen Cross-Linguistic Naming Test (C-CLNT) and BNT were included in the study. Exclusion criteria included severe psychiatric symptoms and a diagnosis other than dementia, MCI, affective disorder, or SCD. In total, seven patients were excluded (two diagnosed with sequelae from traumatic brain injury, two with sequelae from stroke, two with epilepsy and psychiatric disorder, and one with atypical Parkinsonian disorder), resulting in a final sample of 162 patients.
All patients had an extensive diagnostic assessment including an interview with the patient and (when possible) an informant; a neurological, physical, and psychiatric examination including cognitive assessment with the MMSE and Addenbrooke's Cognitive Examination (Mathuranath et al., 2000) or the RUDAS and Multicultural Examination (Nielsen et al., 2019a) in case of cultural, linguistic, and/or educational barriers; laboratory screening with blood tests and electrocardiography; and structural brain imaging with magnetic resonance imaging and/or computerized tomography. Further investigations, including functional imaging with [18F]FDG-PET, amyloid imaging with [11C]PIB-PET, and/or dopamine transporter imaging with [18F]FE-PE2I PET, cerebrospinal fluid biomarker analysis, and comprehensive psychiatric or neuropsychological evaluation were performed on clinical indication. Diagnoses were based on evidence from all clinical and investigational results, except the C-CLNT, applying the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 2013) criteria for dementia, and diagnostic research criteria for specific dementia subtypes (Gorno-Tempini et al., 2011; McKeith et al., 2017; McKhann et al., 2011; Rascovsky et al., 2011; Sachdev et al., 2014), MCI (Winblad et al., 2004), and SCD (Jessen et al., 2014). Affective disorder (e.g., depression, anxiety, post-traumatic stress disorder) was diagnosed by applying the 10th edition of the International Classification of Diseases criteria (World Health Organization, 1993). Professional interpreters provided by interpretation services were freely available to patients during diagnostic assessments, including neuropsychological evaluation, when considered necessary.
Also, 24 cognitively intact participants aged 60 years or older were recruited from local general practice clinics and through the social networks of multicultural and multilingual researchers. Participants were assessed in their private homes, in the Copenhagen University Hospital Memory Clinic, or in another suitable location, depending on their preference. Participation was voluntary and without any economic incentive. All cognitively intact participants were living independently, reported no significant memory problems, psychiatric or neurological disorders, or substance abuse, and scored ≥ 24/30 points on the MMSE or ≥ 23/30 points on the RUDAS, and ≤ 6/15 points on the 5/15-item Geriatric Depression Scale (Weeks et al., 2003).

Procedure
All participants underwent an approximately two-hour clinical assessment, in which medical and demographic data were collected and neuropsychological tests, including the C-CLNT, were administered. All assessments were made by specialists in neuropsychology. The comprehensive neuropsychological evaluation in the Copenhagen University Hospital Memory Clinic is based on a flexible assessment approach, meaning that a standardized, fixed set of neuropsychological tests covering the main cognitive domains is given to most patients with some flexibility to add or subtract tests given the specific referral question (Nielsen et al., 2022). The applied tests generally come from the international literature, but locally developed tests are also used. In case of cultural, linguistic, and/or educational barriers, patients are mainly assessed with tests from the European Cross-Cultural Neuropsychological Test Battery (Nielsen et al., 2019b; Nielsen et al., 2018). Participants with immigrant background were assessed in their primary language, either by a multilingual neuropsychologist (in Danish, English, Kurdish, or Turkish; n = 30) or through interpreter-mediated assessment (n = 30).
Demographic data collected at the clinical assessment included data on age, sex, years of education, country of origin, and mother tongue. For participants with an immigrant background, years of residence in Denmark were calculated by subtracting the year of immigration from the year of the assessment. Also, mother tongue was classified as a European or non-European language, and cultural distance between the original culture and Danish culture was calculated using the Kogut and Singh Index (KSI; Kogut & Singh, 1988).
All participants were asked about any vision or hearing impairment and were assessed using their hearing aids or prescribed glasses when this was confirmed.
Results for the C-CLNT were compared with three traditional language tests: BNT (full and 15-item versions) and Category Fluency. In all tests, correct responses in any language were accepted.
The BNT contains 60 black-and-white line drawings ranked according to difficulty. In this study, the Danish adaptation of the BNT was used, in which the original items were ranked according to difficulty in a sample of older Danish typical participants (Jørgensen et al., 2017), using a discontinuation rule of six consecutive failures. The score is the number of correct responses, including responses after semantic cues. The score range is 0–60 points.
Scores for the abbreviated 15-item version of the BNT (BNT-15) introduced by the Consortium to Establish a Registry for Alzheimer's Disease (Morris et al., 1988) were extracted from the full BNT. The range of scores is 0–15 points.
In Category Fluency (Strauss et al., 2006), participants are given one minute to produce as many different animal names as possible. As it may be challenging to perform fast-paced simultaneous translation of animal names in interpreter-mediated assessments, in this study interpreters were instead instructed to say "yes" for every new animal name and the neuropsychologist put a checkmark for each "yes" on the record form. Immediately following the test, the neuropsychologist checked with the interpreter for any repetitions of animal names. The score is the number of different animal names produced in one minute.
Also, scores from the MMSE and RUDAS were treated as a single measure of general cognitive function (MMSE/RUDAS) in all comparisons. The rationale behind this was that the two instruments have the same range of scores (0–30 points), are highly correlated (Naqvi et al., 2015), have similar diagnostic performance for dementia (Nielsen & Jorgensen, 2020), and were used interchangeably with patients and cognitively intact participants, depending on participant characteristics. In total, 150 participants were assessed with the MMSE and 36 with the RUDAS.
The study adhered to the Declaration of Helsinki for experiments involving humans (reference no. 22007675) and was approved by the Danish Data Protection Agency (RH-2018-34).

Development of the Copenhagen Cross-Linguistic Naming Test
The C-CLNT was based on MULTIMAP, a free open-access database of 218 standardized color drawings representing both objects and actions (Gisbert-Muñoz et al., 2021). MULTIMAP includes relevant linguistic variables, including name agreement, frequency (per one million), and number of letters, across several languages (i.e., Spanish, Basque, Catalan, Italian, French, English, German, Mandarin Chinese, and Arabic). However, data on number of letters were not included in the present study as this variable seems less relevant when performance time is not an issue. Also, some languages, including Mandarin Chinese, are non-alphabetical, which makes this variable impractical. MULTIMAP name agreement was established through an online survey with 99 (English) to 128 (Mandarin Chinese) speakers of each language, and frequency data for the words in each language were extracted from text corpora in various online databases (Gisbert-Muñoz et al., 2021).
Based on the original set of MULTIMAP drawings and the procedures described for developing cross-language combinations (Gisbert-Muñoz et al., 2021), an initial set of 38 items (26 objects and 12 actions) was selected by considering MULTIMAP name agreement data (≥ 80% for objects, ≥ 75% for actions) across the Spanish, Italian, French, English, German, Mandarin Chinese, and Arabic languages. The adopted name agreement cutoff for object items followed the recommendations for developing bilingual naming tests based on the MULTIMAP drawings (Gisbert-Muñoz et al., 2021). However, as action word meanings are more variable across languages than object name meanings (Gentner, 2006), a slightly lower cutoff was used for action items in order to be able to include more items. Subsequently, eight items (glasses, horse, onion, egg, hand, table, weigh, hunt) were excluded due to their ambiguity in the Danish cultural context and to reduce cross-language differences in name agreement and frequency.
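The cutoff-based selection step described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the item records, field names, and agreement percentages are invented placeholders rather than MULTIMAP data, the helper names are hypothetical, and the cutoff is assumed to apply in each language separately (the text does not specify whether it was applied per language or to a mean).

```python
# Hypothetical sketch of cutoff-based item selection; all data below are
# made-up placeholders, not values from the MULTIMAP database.
LANGUAGES = ["Spanish", "Italian", "French", "English", "German",
             "Mandarin Chinese", "Arabic"]
CUTOFFS = {"object": 80.0, "action": 75.0}  # minimum name agreement (%)


def passes_agreement(item):
    """Return True if the item meets its cutoff in every target language.

    Assumption: the cutoff is applied per language, not to the mean.
    """
    cutoff = CUTOFFS[item["type"]]
    return all(item["agreement"][lang] >= cutoff for lang in LANGUAGES)


def select_items(items):
    """Keep only the names of items that pass the name agreement cutoff."""
    return [item["name"] for item in items if passes_agreement(item)]


# Illustrative records with invented agreement percentages.
example_items = [
    {"name": "item_a", "type": "object",
     "agreement": {lang: 95.0 for lang in LANGUAGES}},
    {"name": "item_b", "type": "object",
     "agreement": {**{lang: 90.0 for lang in LANGUAGES}, "Arabic": 70.0}},
    {"name": "item_c", "type": "action",
     "agreement": {lang: 76.0 for lang in LANGUAGES}},
]

selected = select_items(example_items)
print(selected)
```

In this toy example, the object item with 70% agreement in one language is dropped, while the action item at 76% is retained because of the lower action cutoff.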
The final C-CLNT consisted of 30 standardized color drawings (20 objects and 10 actions) with comparable name agreement (F(6, 174) = 2.05, p = .06) and frequency (F(6, 174) = 1.62, p = .11) across seven languages (Supplementary Table S1). The items were ordered according to mean frequency across the target languages and pilot tested in 10 memory clinic patients (2 AD, 4 MCI, 4 affective disorders; 6 male/4 female; mean age 71.2 ± 10.8 years; mean education 14.3 ± 2.3 years). Based on pilot test performances, ambiguity in the scoring criteria for the item bone was resolved (i.e., "meat bone" and "chicken bone" were not accepted as correct in Danish). The final set of items selected for the C-CLNT is presented in Table 1, and examples of items are provided in Figure 1.

Administration and scoring
Administration and scoring procedures for the C-CLNT are similar to those of the BNT (Kaplan et al., 2001). Participants are shown each item one at a time and allowed 20 seconds to respond.
When appropriate (e.g., in case of a visual misperception), a semantic cue can be provided. However, a semantic cue is not provided if the incorrect response falls within the same semantic category as the correct response (e.g., if a nail is named a "screw" or a fly is named a "bee" or "wasp"). If the semantic cue fails to elicit a correct response, a phonemic cue may be given. Participants are allowed 5 seconds to respond following a semantic or phonemic cue. There is no discontinuation rule. The administration time is generally < 5 minutes.
The C-CLNT total score is the number of correct responses, including responses after semantic cues. Responses after phonemic cues are not added to the total correct but may be noted to provide qualitative information about naming performance. In the context of multilingualism and inherent language mixing, participants are allowed to respond in any language. A correct response in any language is considered correct.
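The scoring rule above can be expressed as a minimal Python sketch. The response codes and function names are hypothetical and introduced only for illustration; they are not part of the published record form.

```python
# Hypothetical response codes for one administered item (assumed labels).
SPONTANEOUS = "spontaneous"    # correct without any cue
SEMANTIC_CUE = "semantic_cue"  # correct after a semantic cue
PHONEMIC_CUE = "phonemic_cue"  # correct only after a phonemic cue
INCORRECT = "incorrect"        # no correct response


def total_score(responses):
    """Count spontaneous and semantically cued correct responses.

    Responses that are correct only after a phonemic cue are excluded
    from the total, following the scoring rule described in the text.
    """
    counted = {SPONTANEOUS, SEMANTIC_CUE}
    return sum(1 for response in responses if response in counted)


def phonemic_cue_count(responses):
    """Number of items named only after a phonemic cue (qualitative note)."""
    return sum(1 for response in responses if response == PHONEMIC_CUE)


# Invented example protocol covering all 30 items.
example = ([SPONTANEOUS] * 25 + [SEMANTIC_CUE] * 2
           + [PHONEMIC_CUE] * 2 + [INCORRECT])

print(total_score(example), phonemic_cue_count(example))
```

Keeping the phonemic cue count separate mirrors the distinction in the text between the quantitative total and the qualitative information about naming performance.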

Statistical analyses
The significance of group differences on continuous variables was determined using analysis of variance (ANOVA) with pretesting for homogeneity of variances. Welch's ANOVA was used when the assumption of homogeneity of variances was not met. Effect sizes were calculated as partial eta squared (PES). Fisher's exact test or Pearson's χ² test was used to test the significance of group differences in the distribution of categorical variables. Internal consistency of the C-CLNT was determined by coefficient α as an approximation of scale reliability. To assess construct validity, Spearman's rank correlation coefficient was used to assess associations between the C-CLNT and traditional language tests. The effect of years of education, age, sex, and immigrant background on neuropsychological test scores was evaluated using hierarchical regression analyses with plots of residuals as model control. To assess discriminant validity, receiver operating characteristic (ROC) curve analysis was applied to examine the areas under the curve (AUC), sensitivity, and specificity of the C-CLNT and other language tests for dementia. AUCs were compared using the method proposed by DeLong et al. (1988). Optimal cutoff values were established with Youden's J (calculated as: J = sensitivity + specificity − 1). All analyses were performed with SPSS version 28.0. A p-value < .05 (two-tailed) was considered significant.
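As an illustration of how an optimal cutoff can be derived with Youden's J (J = sensitivity + specificity − 1), the following Python sketch scans candidate cutoffs on invented scores. The analyses in this study were run in SPSS; the function names and data here are hypothetical, and the sketch assumes that scores at or below the cutoff are classified as impaired and that both groups are non-empty.

```python
def sensitivity_specificity(scores, has_dementia, cutoff):
    """Classify scores <= cutoff as impaired; return (sensitivity, specificity)."""
    pairs = list(zip(scores, has_dementia))
    tp = sum(1 for s, d in pairs if d and s <= cutoff)
    fn = sum(1 for s, d in pairs if d and s > cutoff)
    tn = sum(1 for s, d in pairs if not d and s > cutoff)
    fp = sum(1 for s, d in pairs if not d and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, has_dementia):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, has_dementia, cutoff)
        j = sens + spec - 1
        # Keep the first (lowest) cutoff in case of ties.
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Invented toy data: six participants without and four with dementia.
scores = [30, 29, 29, 28, 27, 30, 26, 25, 28, 22]
has_dementia = [False] * 6 + [True] * 4

cutoff, j, sens, spec = best_cutoff(scores, has_dementia)
print(cutoff, round(j, 2), sens, spec)
```

For these toy data the maximum J is reached at a cutoff of ≤ 26; with real data, ties and small groups may make the choice of cutoff less clear-cut.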

Results
A total of 186 participants were included in the study, of which 126 (68%) had a majority ethnic Danish background and 60 (32%) had an immigrant background. Among participants with immigrant backgrounds, 24 originated from a Middle Eastern country, 15 from a South or East Asian country, 13 from another European country, 4 from a North African country, 3 from a Sub-Saharan African country, and one from an Oceanian country. In total, 45 (75%) of the participants with immigrant background had a non-European language as their mother tongue. The mean KSI cultural distance between the original cultures and Danish culture was 86.2 ± 19.3, ranging from 18.3 (Sweden) to 120.9 (Iraq). Compared to majority ethnic Danish participants, participants with immigrant background were significantly younger (68.3 (range: 42–87) vs 73.4 (range: 48–91) years; F(1, 183) = 13.93, p < .001) and had fewer years of education (10.4 (range: 0–17) vs 13.2 (range: 7–17) years; Welch's F(1, 72.73) = 11.89, p < .001). There was no significant difference in sex distribution. Among patients, 56 were diagnosed with dementia (19 AD, 14 vascular dementia (VaD), 4 mixed AD/VaD, 6 dementia with Lewy bodies/Parkinson's disease dementia, 2 frontotemporal dementia, 6 other specified dementia (normal pressure hydrocephalus, encephalitis, Wernicke-Korsakoff syndrome, HIV-associated neurocognitive disorder), and 5 unspecified dementia), 67 with MCI, 20 with affective disorder, and 19 with SCD. As patients with SCD did not have formal impairment on neuropsychological testing, or any other neurological or psychiatric diagnosis explaining cognitive complaints (Jessen et al., 2014), they were grouped with cognitively intact participants to form a control group. Participants' characteristics and neuropsychological test performance for the resulting four groups are presented in Table 2.
There were no significant group differences in sex and years of education, but there were significant differences in age. Participants with immigrant background obtained significantly lower scores on all language tests compared to participants with majority ethnic Danish background (see Fig. 2). However, differences were considerably smaller for the C-CLNT.

Scale reliability
Across all participant groups, coefficient α for the C-CLNT was .67, indicating acceptable scale reliability.
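For readers unfamiliar with coefficient α, the following Python sketch shows the standard computation, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), on an invented matrix of dichotomous item scores. The function name and data are hypothetical and bear no relation to the study data; the reported value of .67 was obtained in SPSS.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """Compute coefficient alpha from rows of per-participant item scores.

    item_scores: list of rows, one per participant, each a list of
    item scores (here 0 = incorrect, 1 = correct).
    """
    k = len(item_scores[0])  # number of items
    item_variances = [variance([row[i] for row in item_scores])
                      for i in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented toy data: five participants, four items.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = coefficient_alpha(toy_scores)
print(round(alpha, 2))
```

Because α depends on item variances, items that almost everyone names correctly contribute little variance, which is one reason tests with ceiling effects can show modest α values.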

Discriminant validity
ROC curve analysis revealed that the C-CLNT was highly accurate in discriminating the group of patients with dementia from the other groups (control, affective disorder, MCI). AUCs for the C-CLNT, full BNT, BNT-15, and Category Fluency are illustrated in Figure 3, and AUC values, optimal cutoff scores, sensitivity, and specificity are presented in Table 3. The AUC value for the C-CLNT (AUC = .80) was significantly higher than for the full BNT (AUC = .64, z = 3.50, p < .001) and BNT-15 (AUC = .59, z = 4.46, p < .001), but comparable to Category Fluency (AUC = .83). In a subsample of patients with MCI and controls alone (n = 109), the AUC for the C-CLNT was .53.
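To make the interpretation of the AUC concrete, the sketch below computes it as the probability that a randomly chosen patient with dementia scores lower than a randomly chosen participant without dementia, counting ties as one half (the Mann-Whitney formulation). The function name and scores are hypothetical, and the DeLong comparison of AUCs used in the study is not implemented here.

```python
def auc_lower_is_impaired(case_scores, comparison_scores):
    """AUC as P(case < comparison) + 0.5 * P(case == comparison).

    Assumes that lower test scores indicate greater impairment.
    """
    wins = 0.0
    for case in case_scores:
        for comparison in comparison_scores:
            if case < comparison:
                wins += 1.0
            elif case == comparison:
                wins += 0.5
    return wins / (len(case_scores) * len(comparison_scores))


# Invented toy scores for illustration only.
dementia_scores = [26, 25, 28, 22]
other_scores = [30, 29, 29, 28, 27, 30]

auc = auc_lower_is_impaired(dementia_scores, other_scores)
print(auc)
```

Under this formulation, an AUC of .50 corresponds to chance-level discrimination, which is why the value of .53 for MCI versus controls indicates poor accuracy.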
Overall, the accuracy of the C-CLNT in discriminating patients with dementia from the other groups did not significantly differ between participants with majority ethnic Danish and immigrant

Effects of demographic variables
When combining the four groups and correcting for MMSE/RUDAS score, there was a significant positive correlation between years of education and the C-CLNT (r = .18, p = .01), the full BNT (r = .47, p < .001), BNT-15 (r = .42, p < .001), and Category Fluency (r = .15, p = .05), and between age and the full BNT (r = .20, p = .008) and BNT-15 (r = .21, p = .005). Sex was not significantly related to any of the tests. When the influence of demographic variables on the C-CLNT and other language tests was evaluated with a series of hierarchical regression analyses controlling for MMSE/RUDAS score, significant effects of age, years of education, and immigrant background were present on all tests. However, the variance in test scores explained by immigrant background was 3% for the C-CLNT compared to 8%, 28%, and 34% for Category Fluency, BNT-15, and full BNT, respectively (see Table 5).
Repeating the regression analysis in participants with immigrant background and entering years of residence in Denmark, non-European mother tongue, and KSI cultural distance in the last block instead of immigrant background led to a similar picture (see Supplementary Table S2). In these analyses, the variance in test scores explained by years of residence in Denmark, non-European mother tongue, and KSI cultural distance was 3%, 8%, 18%, and 24% for the C-CLNT, Category Fluency, BNT-15, and full BNT, respectively. Adding the use of an interpreter to the regression analyses did not show any significant effects.

Abbreviated 20-item version of C-CLNT
An abbreviated version of the C-CLNT was created by excluding the 10 action items included in the full C-CLNT, leaving only the 20 object items. The 20-item C-CLNT was highly correlated with the full C-CLNT (r = .87, p < .001) and had comparable psychometric properties. The AUC of the 20-item C-CLNT for dementia was .78, 95% CI [.70–.86], which did not significantly differ from the AUC of the full C-CLNT (z = 1.21, p = .23). At the cutoff ≤ 18, the 20-item C-CLNT had a sensitivity of .83 and a specificity of .70.

Discussion
In this study, we described the development and validation of the C-CLNT for the assessment of naming impairment in a culturally, linguistically, and educationally diverse memory clinic patient population in Denmark. The C-CLNT is based on a set of standardized color drawings with data on linguistic variables available across several languages. Items for the final 30-item version of the C-CLNT were selected by considering name agreement and frequency across five European and two non-European languages. The C-CLNT was found to have promising psychometric properties and diagnostic accuracy for detecting naming impairment in culturally, linguistically, and educationally diverse patients with dementia but not MCI. Concerning psychometric properties, the internal reliability of the C-CLNT was acceptable according to standard criteria (coefficient alpha = .67). The convergent validity of the C-CLNT was also good, with scores being moderately to strongly correlated with traditional confrontation naming tests, moderately with Category Fluency, and only weakly with general cognitive functioning. Correlations between the C-CLNT and BNT were strongest in the subsample of majority ethnic Danish participants, reflecting the suboptimal utility of the BNT in culturally, linguistically, and educationally diverse populations.
In the context of the cultural, linguistic, and educational diversity among patients in European memory clinics, it is desirable to have a single standardized confrontation naming test for the cross-linguistic assessment of naming impairment in neurocognitive disorders. The results from this study indicate that the C-CLNT is suitable for detecting naming impairment in patients referred for neuropsychological evaluation in such a setting. Compared with traditional language tests, the diagnostic accuracy of the C-CLNT for dementia (AUC of .80) was significantly better than that of the full BNT (AUC of .64) and BNT-15 (AUC of .59), but comparable to Category Fluency (AUC of .83). At the cutoff ≤ 28, the sensitivity of the C-CLNT was .75, which indicates that it does not suffer from the low sensitivity and diagnostic accuracy described for several other cross-linguistic naming tests (Abou-Mrad et al., 2017; Araujo et al., 2020; Ardila, 2007; Dick et al., 2002; Gálvez-Lara et al., 2015; Nielsen et al., 2019b). Overall, the diagnostic accuracy of the C-CLNT was slightly lower than that reported for the MINT (AUC of .85; Stasenko et al., 2019), ICMR-PNT (AUC of .81–1.00; Paplikar et al., 2022), and NAME (AUC of .88; Franzen et al., 2023). However, a direct comparison of the tests is not possible as study methods and samples differed between studies. In future research, it would be interesting to make a head-to-head comparison of cross-linguistic naming tests in the same study population. Like other (cross-linguistic) naming tests (Li et al., 2022; Paplikar et al., 2022; Stasenko et al., 2019), the C-CLNT showed poor diagnostic accuracy for MCI. This is most likely because anomia is not a characteristic feature of MCI but typically presents in later stages of AD and other dementia disorders (McKhann et al., 2011).
Examination of error types revealed that typical errors generally reflected conceptual deficits and stimulus-bound responses (Rouleau et al., 1992). For instance, participants frequently responded "bee" or "wasp" for fly, which was considered to reflect a stimulus-bound response as the drawing used to depict a fly has yellow and black stripes on its lower back (see Fig. 1). Also, patients with dementia more frequently responded "dress" for skirt and "meat bone" or "chicken bone" for bone, which was considered to represent a conceptual deficit as the Danish language differentiates between bones in living creatures and bones for consumption, like the difference between cow and beef or pig and pork in English. These error types may not be related to impaired lexical retrieval and anomia but rather reflect impaired semantic memory and/or visual perception, which is also commonly impaired in dementia disorders and known to be important for confrontation naming ability (Taler & Phillips, 2008).
C-CLNT scores were not associated with sex or age, and only negligibly with years of education. Only 6% of the variance in C-CLNT scores was explained by age and years of education, with slightly lower variance explained by immigrant background (3%). In comparison, 34, 28, and 8% of the variance in test scores was explained by immigrant background on the full BNT, BNT-15, and Category Fluency, respectively. Combined with the findings on diagnostic accuracy, these findings support the cross-linguistic properties of the C-CLNT in a diverse memory clinic setting and highlight important limitations of traditional confrontation naming tests. Conversely, when adapting administration procedures to reduce bias in interpreter-mediated assessment, Category Fluency proved to have high clinical utility for cross-linguistic assessment. This is in line with previous reports of Category Fluency being relatively uninfluenced by culture and language (Nielsen et al., 2018), and having high diagnostic accuracy for dementia in multicultural populations (Nielsen et al., 2019b).
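As a reminder of what "proportion of variance explained" means, the sketch below computes R² for a single binary predictor, where R² equals the squared Pearson correlation between predictor and score. This is an illustrative assumption: the statistical model used in the study is not described in this section, and the data and the function name `r_squared` are invented for the example.

```python
def r_squared(x, y):
    """Proportion of variance in y explained by a simple linear regression on x.

    With one predictor, this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Invented toy data (assumption): 1 = immigrant background, 0 = majority background.
immigrant = [0, 0, 0, 0, 1, 1, 1, 1]
naming_score = [29, 30, 28, 30, 29, 28, 30, 27]

r2 = r_squared(immigrant, naming_score)
print(f"Variance in naming scores explained by background: {r2:.1%}")
```

A value near zero indicates that group membership tells us little about test performance, which is the desired property for a cross-linguistic test.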
An abbreviated 20-item version of the C-CLNT, using only the object items, was highly correlated with the full C-CLNT, which examined both action and object naming, and showed comparable psychometric properties and diagnostic accuracy for dementia. Although assessment of action naming has been suggested to contribute to differential diagnostics as object and action naming may be differentially affected across dementia disorders (Cotelli et al., 2006; Parris & Weekes, 2001), in the present study action naming did not contribute to the overall classification of dementia. Thus, the 20-item C-CLNT may generally be adequate for cross-linguistic assessment of naming impairment in a memory clinic setting. However, future studies might test the effects of action naming with a larger sample of items than used in the present study and a more diverse patient group.
This study has some limitations. Although we were able to analyze C-CLNT performance across patients with affective disorders, MCI, and dementia, the clinical groups were not fully matched on age and proportion of participants with immigrant background, which may have exacerbated some group differences. Furthermore, our dementia sample was too small to analyze the C-CLNT across specific dementia subtypes. Also, the C-CLNT demonstrated a ceiling effect in all clinical groups, except the dementia group, and was not able to discriminate between participants in the control and MCI group. Although the C-CLNT appears to be sensitive to naming impairment in patients with mild dementia, this may indicate that the C-CLNT is not sensitive to more subtle naming impairment. The scale reliability of the C-CLNT was acceptable, but not high. This means that it may not be consistent in measuring naming ability, and the results should be interpreted with caution. Additionally, in interpreter-mediated assessments, interpreters often struggled with translations of responses for items of the BNT as they did not know the corresponding words in Danish. Also, interpreters assisted in determining whether a nonstandard response was a correct synonym or an incorrect response. On Category Fluency, instructing the interpreters to simply say "yes" for every new animal name did not allow for more careful inspection of repetitions, intrusions, or questionable responses (e.g., same animal name in two languages). Other approaches include having the interpreters write down the responses in the language of their choice or recording and transcribing the responses. However, as the use of interpreters did not differ between clinical groups, these issues are unlikely to have significantly influenced the results. Finally, although the diagnostic accuracy of the C-CLNT did not significantly differ between participants with majority ethnic Danish and immigrant backgrounds, further studies comparing larger cultural and language groups are needed to support the cross-linguistic, cross-cultural, and diagnostic properties of the C-CLNT. Also, reliability metrics, including test-retest, intra-rater, and inter-rater reliability, should be established to provide further support for the psychometric properties of the C-CLNT. As suggested by the European Consortium on Cross-Cultural Neuropsychology (ECCroN) (Franzen et al., 2022), such studies should preferably take several diversity-related variables into account, including limited education and literacy, quality of education, and acculturation.
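The scale reliability discussed above is expressed as coefficient alpha. A minimal sketch of the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), is given below. The item responses are invented toy values (1 = correct, 0 = incorrect), and the function name is hypothetical; this is not the analysis code used in the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a participants x items matrix of item scores."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented toy data (assumption): 5 participants, 4 dichotomously scored items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(f"coefficient alpha = {cronbach_alpha(responses):.2f}")
```

Note that alpha tends to be attenuated when most participants score at or near the maximum, because a ceiling effect reduces both item and total score variance.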
In conclusion, the novel C-CLNT has promising clinical utility for cross-linguistic assessment of naming impairment in culturally, linguistically, and educationally diverse older adults. Although the C-CLNT was developed by taking into consideration the cultural and linguistic diversity in Europe and was validated in a diverse memory clinic population, the C-CLNT may also be suitable for assessment of naming impairment in other cultural and clinical contexts, including culturally and linguistically diverse populations in other world regions, and patients with stroke or traumatic brain injury. However, before such applications, further research is needed to establish the utility of the C-CLNT in these contexts.
Supplementary material. The supplementary material for this article can be found at https://doi.org/10.1017/S1355617723000437.

Figure 1. Examples of MULTIMAP items included in the C-CLNT (bone, fly, nail, hang). Items presented with permission from the authors.

Table 1. List of items comprising the C-CLNT, and item scores by group. MCI = mild cognitive impairment; N/A = not applicable.

Table 2. Participant characteristics and neuropsychological test performance. BNT = Boston Naming Test; C-CLNT = Copenhagen Cross-Linguistic Naming Test; MCI = mild cognitive impairment; MMSE = mini-mental state examination; RUDAS = Rowland Universal Dementia Assessment Scale. a Comparison based only on participants with an immigrant background.

Table 3. Diagnostic accuracy of the C-CLNT, BNT, and Category Fluency. BNT = Boston Naming Test; CI = confidence interval; C-CLNT = Copenhagen Cross-Linguistic Naming Test. a Optimal cutoff for discriminating between patients with dementia and other groups based on Youden's J.