The California Verbal Learning Test-III (CVLT-III): Adaptation, validation, and initial norms in the Hebrew-speaking Israeli population

Yoram Braw

doi:10.1017/S1355617725101616


Published online by Cambridge University Press: 28 November 2025


Yoram Braw, Department of Psychology, Ariel University, Ariel, Israel
Email: yoramb@ariel.ac.il

Abstract

Objective:

Neuropsychological assessments commonly include word list learning tasks to assess verbal memory and learning. The California Verbal Learning Test (CVLT) provides multiple outcome measures and information regarding strategies used to enhance the coding and retrieval of information. Despite its popularity, the CVLT has not yet been formally translated into Hebrew and adapted to the Israeli population.

Methods:

The CVLT-III was adapted to Hebrew (CVLT-III_Hebrew), and normative data of healthy Hebrew-speaking adults living in Israel (age range: 20 – 65, education range: 9 – 20) were collected (N = 235).

Results:

CVLT-III_Hebrew core scores were influenced by age, education level, and, to a lesser extent, sex. Normative data for the Hebrew-speaking Israeli population were generated using an overlapping interval strategy, and regression models were used to evaluate the necessity of adjusting core scale scores for sociodemographic variables. Internal reliability was very high. Clinicians can employ an easy-to-use calculator for adjusting CVLT-III_Hebrew core scores.

Conclusions:

The adapted CVLT-III_Hebrew provides a valuable tool for evaluating the verbal memory of Hebrew speakers. Caution, however, is warranted when assessing individuals with lower education levels, as the normative sample was relatively highly educated. This highlights the importance of expanding the normative sample to include a broader spectrum of educational levels and ages. Moreover, the inclusion of Israeli minority groups, currently unrepresented in this normative sample, is of importance.

Keywords

California verbal learning test (CVLT); memory; adaptation; cross-cultural; Hebrew; normative data

Information

Type: Research Article
Information: Journal of the International Neuropsychological Society, Volume 31, Issue 9-10, November 2025, pp. 695 – 708

DOI: https://doi.org/10.1017/S1355617725101616
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of International Neuropsychological Society

Statement of Research Significance

Research Question(s) or Topic(s): This study addressed the need for a Hebrew version of the California Verbal Learning Test, Third Edition (CVLT-III_Hebrew). It examined the test’s adaptation for Hebrew speakers in Israel and established initial norms for this population.

Main Findings: The CVLT-III was successfully translated and adapted into Hebrew. The study established initial norms for this version based on the performance of healthy Hebrew-speaking Israeli adults. Age and education level affected test performance, while sex impacted performance to a lesser degree. The CVLT-III_Hebrew showed high internal reliability.

Study Contributions: This study provides the first formal Hebrew adaptation and initial norms for the CVLT-III in Israel. The study’s findings offer clinicians a valuable tool for evaluating verbal memory while emphasizing the need to expand the norms to include individuals with lower education levels and those belonging to Israeli minorities.

Introduction

The evaluation of memory and learning, commonly impacted by neuropsychiatric disorders, is integral to neuropsychological assessments (Lezak et al., 2012; Reynolds et al., 2021; Sherman et al., 2022). Verbal memory is frequently assessed using word lists, with the California Verbal Learning Test (CVLT; Delis et al., 1987) consistently ranked among the three most popular memory tasks by clinicians (Rabin et al., 2023). Unlike memory tasks that screen for impairment by providing a single outcome score, the CVLT offers an in-depth analysis of memory processes and a wealth of quantitative outcome measures. Notably, the test is unique in its ability to assess learning strategies. More specifically, the CVLT’s primary word list includes semantically related words, allowing the examiner to clarify strategy use by comparing the examinee’s performance in the free-recall and cued-recall trials (Bair et al., 2023). The CVLT thereby differs from tasks based on a presentation of unrelated words, such as the Rey Auditory Verbal Learning Test (RAVLT; Rey, 1964). The CVLT’s reliability and validity, especially those of its core scores, are well established (see reviews: Delis et al., 2017; Farrer & Drozdick, 2020a; Lezak et al., 2012, pp. 478 – 481; Sherman et al., 2022, pp. 624 – 635).
The CVLT consistently shows robust construct validity across age groups and clinical populations, including neurological and psychiatric disorders (e.g., traumatic brain injury and schizophrenia), and its scores correlate with relevant fronto-temporal brain structures (e.g., Keith et al., 2023). Importantly, the CVLT has been shown to detect age-related declines in verbal memory processes such as acquisition, recall, and recognition discrimination, with older adults specifically showing an increase in recall errors and response bias. This sensitivity aids in detecting early signs of neurodegenerative disorders (e.g., Alzheimer’s disease) and tracking the course of these patients’ memory deficits over time. Regarding reliability, the CVLT-III has robust alternate-form reliability for its core scores, adequate test–retest reliability, and excellent internal reliability (Delis et al., 2017, pp. 36 – 44).

The impact of cross-cultural factors on examinees’ performance in cognitive testing has garnered increased research attention in recent years (Fernández & Evans, 2022; Franzen et al., 2022; Merkley et al., 2023; Ramani et al., 2024). These studies, including those undertaken in Israel, stressed the limited number of cross-culturally adapted tests and the importance of using local norms (e.g., Kave et al., 2022; Staios et al., 2023). The CVLT-III’s updated and extensive normative data span ages 16 to 90, with participants demographically matched to the most recent census of the U.S. population (Delis et al., 2017). This represents progress as it minimizes the biases inherent in traditional tests, which often depend on norms established by a homogeneous group of white, English-speaking, middle-class, and highly educated individuals (Ardila, 2020). It is also reassuring that ethnicity explained a negligible percentage (0.3%) of the normative sample’s variance (Delis et al., 2017, p. 35). Concerns regarding the impact of cross-cultural factors, however, have not been sufficiently alleviated to date. Normative data for the CVLT from non-English-speaking examinees and countries other than the U.S. are scarce (for recent publications, see Campos-Magdaleno et al., 2024; Feyzioğlu, 2020; Garcia-Herranz et al., 2022; Lou et al., 2022; Romaszko-Wojtowicz et al., 2023). Moreover, at least some of these studies suggest differences in performance (Kim & Kang, 1999) or differ in the impact of sociodemographic variables such as sex and education level on CVLT performance compared to the normative sample (Chang et al., 2010; Lou et al., 2022). These findings further stress the importance of using local norms and were the impetus for the current research project. More specifically, while Hebrew translations of the CVLT were created for specific research projects (Poreh et al., 2015; Toren et al., 2000), these were ad hoc translations of previous CVLT editions rather than the latest third edition. In addition, these studies did not provide normative data. The current study aimed to meet this need by adapting the CVLT-III to the Israeli Hebrew-speaking population (CVLT-III_Hebrew) and establishing initial normative data among healthy Israeli adults.

Age and education level were hypothesized to significantly impact CVLT-III_Hebrew performance. Consistent with previous research, age was expected to be the strongest sociodemographic predictor of CVLT performance (Delis et al., 2017, pp. 77 – 81; Lezak et al., 2012, pp. 478 – 481; Sherman et al., 2022, pp. 624 – 635). For example, the CVLT-III manual reports that age accounted for 25.9% of the variance in the sum of raw scores for Trials 1 – 5, with education level explaining an additional 4.5% (Delis et al., 2017, p. 35). Age was therefore hypothesized to inversely impact CVLT-III_Hebrew performance, while education level was expected to have a positive, though smaller, impact. An effect of participant sex on CVLT-III_Hebrew performance was also hypothesized, with females expected to perform better than males (Delis et al., 2017, pp. 79 – 80; Hirnstein et al., 2023). However, this hypothesis was made more tentatively as the effect of sex on CVLT performance has not been uniformly identified in cross-cultural research (see Discussion for a comprehensive review). Finally, while non-linear age effects on CVLT performance, particularly a more rapid decline in later years, have been observed, their impact is generally weaker and primarily evident in studies that include geriatric populations (Lou et al., 2022). Given that older adults were not evaluated in the current study, a quadratic effect of age was not hypothesized.
To further validate the adapted test, the author assessed the CVLT-III_Hebrew’s internal reliability, evaluated the impact of sociodemographic characteristics on test performance, and compared the performance of the Israeli sample to that of participants who were tested using other non-English versions of the CVLT and to case reports of participants whose performance was analyzed using the CVLT-III’s normative data (Delis et al., 2017). The project thereby aimed to add a rigorously researched word list memory test with unique characteristics to the tools at the clinician’s disposal.

Method

Participants

Healthy adults participated in the study (N = 249). They were recruited through announcements on Ariel University’s online research platform and social networks between 10/2020 and 10/2023. Inclusion criteria were: (a) Adult age (≥ 18 years and ≤ 65 years). (b) Self-reported fluency in Hebrew. Exclusion criteria were: (a) Major neuropsychiatric disorders. (b) Neurodevelopmental disorders, including learning disabilities and Attention-Deficit/Hyperactivity Disorder (ADHD). (c) Medical conditions that may impair cognition (e.g., diabetes and sleep apnea). (d) Motor and sensory disability precluding cognitive testing (e.g., uncorrected hearing impairment). (e) Alcohol and drug abuse. Inclusion/exclusion criteria were determined based on a written self-report, with any ambiguities resolved by the research staff before the participant signed the study’s informed consent form. Participants were not compensated for study participation.

Ten candidates were excluded from the study due to pre/co-morbid neurological (n = 2), psychiatric (n = 2), and neurodevelopmental disorders (n = 6). Two additional candidates were excluded due to diabetes. The data of two participants were excluded from analyses based on an a priori decision rule for detecting poorly motivated participants: a motivation item score < 4 in the study’s debriefing survey (following Berger et al., 2021; Braw et al., 2024). Overall, data from 235 healthy adults were analyzed (age range: 20 – 65, education level range: 9 – 20, n females = 163, n males = 72). Table 1 presents the sociodemographic characteristics of the participants.

Table 1.

Sociodemographic characteristics of the normative sample (per age group and total)

Note. Data of parametric variables are presented as mean ± SD.

The study was performed in accordance with the Helsinki Declaration and approved by Ariel University’s Ethics Committee (approval no.: AU-SOC-YB-20221218). All participants signed a written informed consent form before entering the study.

Tools

California Verbal Learning Test-III (CVLT-III; Delis et al., 2017): The test stimuli comprise two word lists, each containing 16 nouns (lists A and B). List A comprises four categories (i.e., furniture, vegetables, means of transportation, and animals), with four words in each category. List B also comprises four categories, two of which are identical to those of list A. The administration procedure is as follows: (a) Immediate recall trials (trials 1 – 8): These trials included: (1) Learning trials (trials 1 to 5 free recall): An immediate recall of list A, which is repeated five times. (2) Interference trial (list B free recall): An immediate recall of list B. (3) Short Delay Free Recall (SDFR): A free recall of list A. (4) Short Delay Cued Recall (SDCR): A free recall of words belonging to each semantic category after being provided with the name of the category. (b) Delayed memory trials (trials 9 – 12): These trials included: (1) Long Delay Free Recall (LDFR; trial 9): A free recall of list A after a 20-minute delay. (2) Long Delay Cued Recall (LDCR; trial 10): A free recall of list A after being provided with the names of the categories. (3) Long Delay Yes/No Recognition (trial 11): The participant is read a list of words, either words from list A or foils, and is requested to respond “yes” if the word belongs to list A and “no” if it does not. (4) Forced Choice Recognition (trial 12): This optional trial, aimed at assessing performance validity, was not included in the current study. The CVLT’s outcome measures are presented in the test manual (Delis et al., 2017), as well as the tables that accompany the current study (for additional information, see Farrer & Drozdick, 2020a; Lezak et al., 2012, pp. 478 – 481; Sherman et al., 2022, pp. 624 – 635).
Two measures (Total recall [sum of correct responses across Trials 1 – 5] and Yes/No recognition Hits−False Positives) are not part of the core scores listed in the CVLT-III’s manual (Delis et al., 2017, p. 4). These measures were added following the request of an anonymous reviewer due to their utility and use in clinical practice.

Adaptation procedure

The author directly translated the CVLT-III words from English to Hebrew based on the following criteria: (a) The frequency of each translated word was checked in Linzen’s word frequency database (Linzen, 2009). Low-frequency words (i.e., a frequency of ≤ six appearances per million words, following Kavé et al., 2019) were replaced by more frequently used words belonging to the same semantic category. Concurrently, the four most prototypical words in each category were avoided (following Delis et al., 2017, p. 29). All words belonging to one list B category (“parts of a house”) are low-frequency words in Hebrew. It was, therefore, decided to replace the category with that of “nature” (e.g., mountain). This procedure was similar to that used when the CVLT was adapted to Chinese and the “tools” category was replaced (Chang et al., 2010). (b) When multiple Hebrew words corresponded to a single English word, the original word was replaced with a single Hebrew word belonging to the same semantic category. (c) In several cases, a plural word in English was translated to the singular form of the Hebrew word due to significant differences in the number of syllables. Following current standards (International Test Commission, 2017; Nguyen et al., 2024), the adapted CVLT-III was evaluated and further reviewed by a multidisciplinary team that included two licensed rehabilitation psychologists and a speech therapist. The team members were native Hebrew speakers, familiar with the Israeli culture, and experienced in cognitive testing. The revised edition of the CVLT-III_Hebrew underwent three iterative cycles until reaching its final form.
Each cycle comprised pilot testing of the CVLT-III_Hebrew by graduate students in clinical neuropsychology, followed by further revision of the CVLT. The final items in CVLT-III_Hebrew were deemed familiar to Israelis across age, sex, and socioeconomic class divides. The CVLT-III’s publisher approved the project (Pearson; Master License Agreement No. LSR-620161, April 25, 2023).

Procedure

The experimental procedures were conducted in a quiet, well-lit room with the experimenter sitting on the opposite side of a table from the participant. After signing an informed consent form and filling out a demographic-medical questionnaire, the participants performed the CVLT-III_Hebrew’s immediate recall trials. After 20 minutes in which they performed non-verbal filler tasks, the participants performed the delayed trials of the CVLT-III_Hebrew (LDFR, LDCR, and Yes/No recognition) and completed a debriefing survey in which they noted their motivation to perform the experimental procedures as instructed (1 – 7 Likert scale; higher scores indicating stronger motivation). Trained graduate students in clinical neuropsychology conducted all experimental procedures.

Data analyses

Statistical analyses generally followed the procedures implemented in the Spanish Multicenter Normative Studies (NEURONORMA), a large-scale project aimed at providing normative data to clinicians in a cross-cultural context (e.g., Pena-Casanova et al., 2009; Pena-Casanova et al., 2012; Perez-Enriquez et al., 2024), which applied procedures that were previously developed as part of the Mayo Older American Normative Studies (MOANS; Ivnik et al., 1992). The analysis comprised these steps:

a. Preliminary statistical procedures: Prior to conducting linear regressions, a series of preliminary statistical procedures were performed to ensure data integrity and model validity. First, descriptive statistics (means, SDs, and ranges) for all variables were calculated to summarize the sample characteristics and identify any unusual patterns in the data. Outlier detection was conducted using both SD criteria (values exceeding ±3 SD from the mean) and the more robust Sn measure, which was evaluated using the R robustbase package (Jones, 2019). The selection of predictors was determined by theoretical relevance and empirical associations. As part of the data preparation, age and education level were mean-centered to reduce bias due to multicollinearity and improve the interpretation of coefficients, and squared terms for these mean-centered variables were also created to explore nonlinear associations (Espenes et al., 2023). Regarding empirical associations among the remaining potential sociodemographic predictors, I examined Pearson product–moment correlations to minimize the impact of multicollinearity as a potential confounder and simplify the main regression models. A criterion of |r| > .8 for pairwise correlations was utilized as an initial indicator of potentially problematic multicollinearity among these predictors.
b. Evaluating the impact of sociodemographic variables on CVLT-III_Hebrew performance: To justify a parsimonious prediction model for the primary analyses, the impact of sociodemographic variables on CVLT-III_Hebrew performance was explored by performing linear regressions in which five sociodemographic variables predicted each of the CVLT-III_Hebrew’s core raw scores: mean-centered age, mean-centered age², mean-centered education level, mean-centered education level², and sex. Predictors were retained if they significantly contributed to the overall model (p < .05) and the unique variance (squared semi-partial correlation, sr²) was at least 5%. The analyses indicated that the non-linear terms had a negligible impact. Only two of the 15 models were significant, and the quadratic age term (mean-centered age²) was not a significant predictor in any model. Although the quadratic education term (mean-centered education level²) significantly predicted total intrusions (p = .043), its contribution of unique variance fell well below the 5% criterion (sr² = .017). Given these findings, the exploration of non-linear effects was not pursued further. Consequently, all main analyses utilized a simplified model that included mean-centered age, mean-centered education level, and sex as predictors.
c. Division of normative data according to age: The sample was stratified by age, a decision informed by the impact of aging on CVLT-III_Hebrew performance (noted in the Introduction) and our observation that Israeli clinicians are more familiar with the traditional normative data presentation (delCacho-Tena et al., 2024). This was done using the overlapping interval strategy (Pauker, 1988), which enabled the sample size within each age group to reach the minimum recommended size of 50 to 70 participants per age group, thereby increasing the stability of means and SDs (Bridges & Holler, 2007; Piovesana & Senior, 2018). Next, scaled scores (SS; Mean = 10, SD = 3, range: 2 – 18) per age group were created to approximate the normal distribution upon which linear regressions could be performed. This was done by transforming the CVLT-III_Hebrew core raw scores into percentile ranks (i.e., cumulative percentiles) and then SSs per age group.
d. Norm adjustments based on sociodemographic variables: The overall contribution of sociodemographic variables to the prediction of CVLT-III_Hebrew performance was evaluated using linear regressions in which sociodemographic variables (mean-centered age, mean-centered education level, and sex) predicted each of the 15 CVLT-III_Hebrew core raw scores. Next, the need to adjust core SSs based on sociodemographic variables was evaluated using linear regressions in which each core SS served as the dependent variable and the sociodemographic variables (mean-centered age, mean-centered education level, and sex) served as predictors, per age group (Footnote 1). Adjusted scaled scores (SS_adj) were calculated using the formula: SS_adj = SS − [B_age × (age − 39.5) + B_education × (education level − 15) + B_sex × sex], with sex coded as male = 0 and female = 1 (Footnote 2). All adjusted SSs were truncated to the lower whole number. The selection of sociodemographic variables to be included in the formula was based on the earlier-mentioned criteria (i.e., the variable significantly contributed to the model and sr² was ≥ 5%).
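
The adjustment formula in step d can be sketched in a few lines of Python. This is an illustration only: the function name and the coefficient values in the usage line are hypothetical (the actual B values are those listed in Table 10), and "truncated to the lower whole number" is interpreted here as rounding down with floor().

```python
import math

# Centering constants taken from the adjustment formula in the text.
AGE_CENTER = 39.5
EDUCATION_CENTER = 15.0


def adjust_scaled_score(ss, age, education, sex,
                        b_age=0.0, b_education=0.0, b_sex=0.0):
    """Return SS_adj = SS - [B_age*(age - 39.5) + B_edu*(edu - 15) + B_sex*sex].

    sex is coded male = 0, female = 1. A coefficient left at 0.0 stands for a
    variable that did not meet the retention criteria for this score.
    """
    if sex not in (0, 1):
        raise ValueError("sex must be 0 (male) or 1 (female)")
    correction = (
        b_age * (age - AGE_CENTER)
        + b_education * (education - EDUCATION_CENTER)
        + b_sex * sex
    )
    # Assumption: truncation to the lower whole number is implemented as floor().
    return math.floor(ss - correction)


# Hypothetical coefficients, for illustration only (not taken from Table 10).
print(adjust_scaled_score(10, age=50, education=12, sex=1,
                          b_age=-0.1, b_education=0.25))
```

With all coefficients left at zero the function returns the unadjusted SS, which mirrors the "N.R." (no adjustment needed) entries of Table 10.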

Comparisons with earlier normative samples and internal reliability

To compare the current study’s normative data with existing published norms, the following statistical comparisons were performed: (a) I applied two different normative standards to the raw core scores of the CVLT-III_Hebrew obtained from the current sample: the newly developed Israeli norms and the original CVLT-III norms (Delis et al., 2017). SSs and index scores, normed using each of the normative data sets, were then compared using paired-samples t-tests. In addition, the CVLT performance of three case reports, presented by Farrer and Drozdick (2020b; Footnote 3), was normed using both normative data sets and then compared using paired-samples t-tests. (b) The current sample’s CVLT performance was compared using one-sample t-tests to representative studies of healthy participants conducted in a cross-cultural context. As the studies differed in reported CVLT scores, the most commonly reported measure (Total recall; sum of correct responses across Trials 1 – 5) was selected (i.e., Spanish, Turkish, Korean, and Chinese versions; references for the studies are presented in the Results section).

Considering the challenges that word list learning tasks pose for estimating internal reliability due to item interdependence, the split method was utilized (see discussion in Sherman et al., 2022, p. 627). More specifically, the immediate recall trials were split (trials 1 & 3 vs. trials 2 & 4, and trials 2 & 4 vs. trials 3 & 5), and the Spearman-Brown formula was applied to the average of the correlations (lengthening factor = 2.5).
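
As a rough illustration of this reliability estimate, the sketch below applies the general Spearman-Brown prophecy formula, r_k = k·r / (1 + (k − 1)·r), to the mean of two split correlations. The function names and the example correlations are hypothetical; the lengthening factor of 2.5 is the value reported above (presumably five learning trials relative to two trials per half).

```python
def spearman_brown(r, lengthening_factor=2.5):
    """General Spearman-Brown formula: k*r / (1 + (k - 1)*r)."""
    k = lengthening_factor
    return k * r / (1 + (k - 1) * r)


def split_reliability(r_split_a, r_split_b, lengthening_factor=2.5):
    """Average two split correlations, then apply Spearman-Brown.

    r_split_a: correlation of trials 1 & 3 with trials 2 & 4
    r_split_b: correlation of trials 2 & 4 with trials 3 & 5
    """
    mean_r = (r_split_a + r_split_b) / 2
    return spearman_brown(mean_r, lengthening_factor)


# Hypothetical split correlations, for illustration only.
print(split_reliability(0.8, 0.8))
```

With a lengthening factor of 2 the same function reduces to the familiar split-half correction, 2r / (1 + r).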

Additional analyses and general remarks

Supplementary Material 1 presents Pearson product–moment correlations between CVLT-III_Hebrew core raw scores and sociodemographic variables (age, education level, and sex). Analyses were conducted using SPSS 27.0, with p < .05 considered statistically significant in all statistical analyses.

Results

Division of normative data into age groups and calculation of unadjusted scaled scores

Six overlapping age groups were created: 18 – 30 years (n = 93), 26 – 38 years (n = 94), 34 – 46 years (n = 50), 40 – 52 years (n = 44), 48 – 60 years (n = 52), and 56 – 65 years (n = 46). Sociodemographic characteristics of the sample and CVLT-III_Hebrew raw core and process scores per age group can be found in Table 1 and Tables 2–3, respectively.

Table 2.

Core raw scores according to age

Note. All data are presented as Mean ± SD.

FP = False positives, LDCR = Long Delay Cued Recall, LDFR = Long Delay Free Recall, SDCR = Short Delay Cued Recall, SDFR = Short Delay Free Recall, T1/2/3/4/5 = Trial 1/2/3/4/5, Total recall (T1–5) = Sum of correct responses across Trials 1–5.

Table 3.

Process raw scores according to age

Note. All data are presented as Mean ± SD.

LDCR = Long Delay Cued Recall, LDFR = Long Delay Free Recall, SDCR = Short Delay Cued Recall, SDFR = Short Delay Free Recall, T1–T5 = Trials 1 to 5.

Unadjusted SS (SS; Mean = 10, SD = 3, range: 2 – 18) and percentiles for the CVLT-III_Hebrew core raw scores per age group can be found in Tables 4–9.

Table 4.

CVLT-III_Hebrew core SSs; age range = 18 – 30 years, median age = 24, n = 93

Data adjustments based on sociodemographic variables

Linear regressions in which sociodemographic variables (mean-centered age, mean-centered education level, and sex) predicted each of the CVLT-III_Hebrew core raw scores revealed that mean-centered age and mean-centered education level each significantly predicted five, partially overlapping, scores. Semi-partial correlations and coefficients of determination (sr ²) reflecting the associations between sociodemographic variables and CVLT-III_Hebrew core raw scores can be found in Supplementary Material 2.

Regressions in which sociodemographic variables (mean-centered age, mean-centered education level, and sex) predicted SSs per age group indicated the need for 13 age-based adjustments, seven education level-based adjustments, and two sex-based adjustments. Table 10 presents formulas for adjusting CVLT-III_Hebrew core SSs.

Supplementary Material 3 features a calculator designed for conveniently adjusting core SSs, using an examinee’s sociodemographic data and CVLT-III_Hebrew performance. This calculator is also accessible online at https://bit.ly/45rqXFd.

Clinical example

Calculation of SS and SS_adj of a 38-year-old female examinee (education level = 8 years) with a T1 correct (i.e., number of correct responses in trial 1) raw score of 7: (a) Determine SS and %tile: Locate the raw score in Table 6 and determine the SS, which is in the left column, and the percentile, which is in the right column (SS = 9, %tile = 29 – 40, respectively). Note that although both Tables 5 and 6 cover the examinee’s age, Table 6 was selected because the examinee’s age (38 years) is closer to the median age of the group presented in Table 6 than to that of the group presented in Table 5 (40 vs. 32 years, respectively). (b) Determine whether SS adjustments are mandated: SSs necessitating adjustments are marked using a gray background in Tables 4–9, as is the case for T1 correct in the 34 – 46 years age range (see Table 6). (c) Calculate SS adjustments if mandated: The SS of T1 correct can be adjusted using the regression formulas that are listed in Table 10 or by using the CVLT-III_Hebrew’s norm calculator (i.e., the spreadsheet in Supplementary Material 3 or online at https://bit.ly/45rqXFd). Using the norm calculator, the examinee’s T1 correct SS (= 9) should be adjusted based on the examinee’s sex; this is done by entering the examinee’s sex in the relevant cell (column A, row 8) and the T1 correct SS in the suitable place in the upper right table (column E, row 5). The SS_adj is then automatically calculated and presented in the lower table (= 11).
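
The table-selection rule used in step (a) of the clinical example can be sketched as follows. The age ranges and median ages are those given in the captions of Tables 4–9; the function name is hypothetical, and the tie-breaking behavior (choosing the younger group when two medians are equally close) is an assumption, since the text does not address ties.

```python
# (table number, minimum age, maximum age, median age), from the captions of Tables 4-9.
NORM_TABLES = [
    (4, 18, 30, 24.0),
    (5, 26, 38, 32.0),
    (6, 34, 46, 40.0),
    (7, 40, 52, 46.0),
    (8, 48, 60, 54.0),
    (9, 56, 65, 60.5),
]


def select_norm_table(age):
    """Return the number of the table whose age group covers `age` and whose
    median age is closest to it."""
    candidates = [t for t in NORM_TABLES if t[1] <= age <= t[2]]
    if not candidates:
        raise ValueError("age is outside the normative range (18 to 65 years)")
    # Assumption: on a tie, min() keeps the first (younger) group.
    return min(candidates, key=lambda t: abs(age - t[3]))[0]


print(select_norm_table(38))  # the examinee in the clinical example
```

For the 38-year-old examinee, Tables 5 and 6 both cover the age, and the distances to the medians (6 vs. 2 years) lead to Table 6, as in the example.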

Table 5.

CVLT-III_Hebrew core SSs; age range = 26 – 38 years, median age = 32, n = 94

Table 6.

CVLT-III_Hebrew core SSs; age range = 34 – 46 years, median age = 40, n = 50

Table 7.

CVLT-III_Hebrew core SSs; age range = 40 – 52 years, median age = 46, n = 44

Table 8.

CVLT-III_Hebrew core SSs; age range = 48 – 60 years, median age = 54, n = 52

Table 9.

CVLT-III_Hebrew core SSs; age range = 56 – 65 years, median age = 60.5, n = 46

Note. FP = False positives, LDCR = Long Delay Cued Recall, LDFR = Long Delay Free Recall, SDCR = Short Delay Cued Recall, SDFR = Short Delay Free Recall, SS = Scaled scores, T1/2/3/4/5 = Trial 1/2/3/4/5, Total recall (T1–T5) = Sum of correct responses across Trials 1–5.

¹ Higher raw scores indicate poorer performance.

SSs necessitating adjustments are marked using a gray background.

Supplementary Material 3 includes a spreadsheet for the adjustment of CVLT-III_Hebrew core SSs based on the examinee’s sociodemographic data. The spreadsheet is also available online at the following link: https://bit.ly/45rqXFd.

Table 10.

Adjustments of CVLT-III_Hebrew core SSs based on socio-demographic variables (age, education level, and sex)

Note. Grayscale levels differ according to the variable that is used to adjust the SS (age, education level, or sex).

CVLT-III_Hebrew core SSs can be adjusted using the spreadsheet provided in Supplementary Material 3 (also available online at the following link: https://bit.ly/45rqXFd).

FP = False positives, LDCR = Long Delay Cued Recall, LDFR = Long Delay Free Recall, N.R. = Not relevant (i.e., no adjustments of SS needed). SDCR = Short Delay Cued Recall, SDFR = Short Delay Free Recall, SS = Scaled scores, T1/2/3/4/5 = Trial 1/2/3/4/5, Total recall (T1 – T5) = Sum of correct responses across Trials 1 – 5.
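The general logic of a regression-based SS adjustment can be illustrated with a minimal Python sketch. It assumes one common form of such adjustments, in which the demographically predicted deviation (unstandardized coefficient B multiplied by the mean-centered predictor) is subtracted from the SS. The centering constants are the means reported in the article's footnote on mean-centering (age = 39.5 years, education level = 15); whether the per-age-group formulas in Table 10 use these exact constants is an assumption. The coefficient in the usage line is hypothetical and is not taken from Table 10, and sex is left out because its coding is not given in this section. For clinical use, the formulas in Table 10 or the norm calculator should be applied.

```python
MEAN_AGE = 39.5        # mean age reported in the article's footnote
MEAN_EDUCATION = 15.0  # mean education level reported in the same footnote


def adjust_ss(ss, age, education, b_age=0.0, b_education=0.0):
    """Subtract the demographically predicted deviation from a scaled score.

    b_age and b_education are unstandardized regression coefficients (B).
    A coefficient of 0 means that no adjustment is made for that variable.
    """
    predicted_deviation = (
        b_age * (age - MEAN_AGE)
        + b_education * (education - MEAN_EDUCATION)
    )
    return ss - predicted_deviation


# Hypothetical coefficient for illustration only, NOT a value from Table 10.
adjusted = adjust_ss(ss=9, age=38, education=12, b_education=0.5)
print(adjusted)
```

With a positive education coefficient, an examinee with below-average schooling receives an upward adjustment, which is the intended direction of a demographic correction of this kind.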

Comparisons with earlier normative samples and internal reliability

Comparisons with earlier normative data sets: (a) Comparisons with the CVLT-III’s normative data: The CVLT-III_Hebrew core scores indicated poorer performance when using the normative data from the current study versus the original CVLT-III norms (Delis et al., 2017), ps < .001 (see Supplementary Material 4). Correspondingly, all three case reports presented in Supplementary Material 5 exhibited lower Total recall SSs when using the current study’s norms compared to those derived using the original normative sample (Delis et al., 2017); t(10) = 5.68, p < .001, d = 1.71; t(10) = 4.30, p = .002, d = 1.30; t(10) = 8.19, p < .001, d = 2.47 (case reports 1 through 3, respectively). (b) Significantly more words were recalled in trials 1 to 5 (Total recall; sum of correct responses in Trials 1 – 5) by participants in the current study compared to those in studies of healthy participants that were performed in a cross-cultural context (Campos-Magdaleno et al., 2024; Garcia-Herranz et al., 2022; Kim & Kang, 1999; Lou et al., 2022)⁴, ps < .001. A notable exception was Feyzioğlu (2020), in which Turkish healthy adults had a significantly higher Total recall than participants in the current study (p = .046). Additional information about the studies, such as the sociodemographic characteristics of their participants, is available in Supplementary Material 6.
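The effect sizes reported for the three case reports can be checked arithmetically. The short Python sketch below assumes that each comparison was a paired t-test, so that df = 10 corresponds to n = 11 paired scores and Cohen's d = t / √n. This is an inference from the reported statistics, not a formula stated in the article; the function name is illustrative.

```python
import math


def cohens_d_from_paired_t(t_value, df):
    """Cohen's d for a paired (one-sample) t-test: d = t / sqrt(n), n = df + 1."""
    n = df + 1
    return t_value / math.sqrt(n)


# (t value, reported d) for case reports 1 through 3, as given in the text.
reported = [(5.68, 1.71), (4.30, 1.30), (8.19, 2.47)]

for t_value, d_reported in reported:
    d = cohens_d_from_paired_t(t_value, df=10)
    print(f"t(10) = {t_value:.2f}: computed d = {d:.3f}, reported d = {d_reported:.2f}")
```

Under this assumption, the computed values agree with the reported ones to two decimal places.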

Internal reliability based on the split-half method was very high (= .90).
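For readers unfamiliar with split-half reliability, the following Python sketch shows one conventional way to compute it: the items are split into two halves, the half scores are correlated, and the Spearman-Brown formula steps the correlation up to full test length. The odd/even split, the function names, and the small data set are assumptions made for illustration; the article does not specify how the halves were formed, and the reported coefficient (.90) cannot be reproduced without the study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd/even split-half reliability; one row of item scores per examinee."""
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(first_half, second_half))


# Hypothetical item-level data (1 = correct, 0 = incorrect), four examinees.
hypothetical_items = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]
print(split_half_reliability(hypothetical_items))
```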

Additional analyses

Supplementary Material 6 presents Pearson product–moment correlations between CVLT-III_Hebrew core raw scores and sociodemographic variables (age, education level, and sex).

Discussion

The current study aimed to provide normative data for the CVLT-III_Hebrew, a translation of the CVLT-III to Hebrew, and its adaptation to the Israeli population. To achieve this aim, 235 healthy adults, aged 18 to 65 years, performed the CVLT-III_Hebrew. After ensuring data integrity, the data were stratified by age, a decision informed by the impact of aging on CVLT-III_Hebrew performance and our observation that clinicians in Israel are more familiar with the traditional normative data presentation (delCacho-Tena et al., 2024). This was done using the overlapping interval strategy (Pauker, 1988), which maximizes the sample size for normative data generation. Next, raw scores were transformed into scaled scores (SSs; Mean = 10, SD = 3, range: 2 – 18) per age group, approximating the normal distribution upon which linear regressions exploring the impact of sociodemographic variables could be performed. These regressions evaluated the need and extent of score adjustments in several stages. First, the non-linear effects of sociodemographic factors on CVLT-III_Hebrew performance were explored, with the analyses indicating that quadratic variables (i.e., mean-centered age² and mean-centered education level²) had a negligible impact. This was likely related to the fact that the current study did not include geriatric participants, an age group in which accelerated decline in verbal memory is expected (Liampas et al., 2023). Given these findings, non-linear effects were not further explored, and all main analyses used a simplified model that included mean-centered age, mean-centered education level, and sex as predictors. Second, the overall impact of sociodemographic factors on CVLT-III_Hebrew performance was investigated. 
This was done using whole-sample regressions in which the sociodemographic variables (i.e., mean-centered age, mean-centered education level, and sex) predicted the raw core CVLT-III_Hebrew scores. These analyses indicated that age and education level each significantly predicted five, partially overlapping, core scores. More specifically, age and education level explained up to 5.1% and 4% of the shared variance, respectively. These findings align with the well-established age-related decline in declarative memory functioning (Lighthall et al., 2019), which is linked to structural changes in medial temporal lobe (MTL) structures, particularly the hippocampus, and its connectivity with other cortical and subcortical regions (Dickerson & Eichenbaum, 2010; Nyberg, 2017). Correspondingly, a decline in CVLT performance with aging was consistently found in earlier studies (Lezak et al., 2012; Sherman et al., 2022, pp. 625 – 626), including those that were performed outside of the U.S. (e.g., Argento et al., 2014; Campos-Magdaleno et al., 2024; Chang et al., 2010; Garcia-Herranz et al., 2022; Kim & Kang, 1999; Lou et al., 2022)⁵. The impact of education level on CVLT performance found in the current study also corresponds to earlier findings (Lezak et al., 2012; Sherman et al., 2022, pp. 625 – 626). 
Education level, however, was a weaker predictor of CVLT performance than age in the current study, aligning with the CVLT-III’s decision not to stratify scores by education (Delis et al., 2017). In contrast, examinees’ sex had only a minor impact on CVLT-III_Hebrew performance. More specifically, sex did not significantly predict performance on any CVLT-III_Hebrew measures and accounted for ≤ 1% of the unique variance. This contrasts with the CVLT-III’s normative data, in which sex explained an additional 5.1% of the variance beyond the variance that was explained by age (Delis et al., 2017), corresponding to the small but reliable advantage of females over males in verbal-episodic memory across different ages and task types (see meta-analysis; Hirnstein et al., 2023). At the same time, the extent and consistency of this sex effect across all CVLT measures vary in cross-cultural studies. Some studies found sex effects across many measures (Argento et al., 2014; Garcia-Herranz et al., 2022; Kim & Kang, 1999), while others observed effects only in specific measures or described them as not robust (Campos-Magdaleno et al., 2024; Chang et al., 2010; Lou et al., 2022). Regarding the current study, the observation that sex had a limited impact on CVLT-III_Hebrew performance might stem from the relative underrepresentation of older males in our sample (see also the limitations paragraph). 
This is particularly relevant as memory function is known to be influenced by an interaction between age and sex (Asperholm et al., 2019). Enlarging the normative sample for the CVLT-III_Hebrew would allow for a more thorough investigation into how sociodemographic variables contribute to test performance. The highly educated nature of the current study’s sample in this regard is a critical factor, as it may partially account for the performance differences observed between our participants and the original normative sample (Delis et al., 2017). Notably, when the current study’s norms were applied, our participants were generally rated as performing more poorly (∼ one SD) than when the original norms were used. This discrepancy was further confirmed by re-norming three relevant case reports (Farrer & Drozdick, 2020b) using both our current norms and the original CVLT-III norms (see Supplementary Material 6). 
Furthermore, significantly more words were recalled in trials 1 to 5 (Total recall; sum of correct responses in Trials 1 – 5) by participants in the current study compared to most studies of healthy participants that were performed in a cross-cultural context (Campos-Magdaleno et al., 2024; Garcia-Herranz et al., 2022; Kim & Kang, 1999; Lou et al., 2022).⁶ Both lines of evidence suggest that participants with relatively high verbal memory functioning might be overrepresented in the current study’s sample, perhaps due to its high education level, although other sources may be at play (e.g., four of the earlier-mentioned cross-cultural studies included geriatric participants; Campos-Magdaleno et al., 2024; Garcia-Herranz et al., 2022; Kim & Kang, 1999; Lou et al., 2022). The influence of education level on CVLT-III_Hebrew performance is further discussed in the limitations paragraph, along with a call for expanding the normative data of the CVLT-III_Hebrew and thereby minimizing possible sources of bias.
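The raw-score-to-SS transformation mentioned at the start of this discussion (Mean = 10, SD = 3, range: 2 – 18, derived per age group) can be illustrated with a minimal Python sketch. It assumes a conventional percentile-based normalization: the mid-rank percentile of a raw score within its age group is converted to a z score through the inverse normal distribution and then rescaled. The exact procedure used in the study is not described in this section, so the function name, the mid-rank definition, and the data are all assumptions made for illustration.

```python
from statistics import NormalDist


def raw_to_scaled_score(raw, group_raw_scores):
    """Convert a raw score to a scaled score (mean 10, SD 3, range 2-18).

    Uses the mid-rank percentile of `raw` within the age group's raw scores.
    Measures on which higher raw scores indicate poorer performance would
    need to be reversed before this step.
    """
    n = len(group_raw_scores)
    below = sum(score < raw for score in group_raw_scores)
    equal = sum(score == raw for score in group_raw_scores)
    percentile = (below + 0.5 * equal) / n
    # Keep the percentile strictly inside (0, 1) so that inv_cdf is defined.
    percentile = min(max(percentile, 0.5 / n), 1 - 0.5 / n)
    z = NormalDist().inv_cdf(percentile)
    return max(2, min(18, round(10 + 3 * z)))


# Hypothetical raw scores of one age group (not study data).
hypothetical_group = [3, 4, 5, 6, 7]
print([raw_to_scaled_score(raw, hypothetical_group) for raw in hypothetical_group])
```

A raw score at the group median maps to an SS of 10 under this scheme, and scores further from the median map to correspondingly lower or higher SSs.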

A second set of regressions was conducted to determine the necessity and degree of sociodemographic adjustments. These regressions, with core SSs as the dependent variables for each age group, indicated the need to make 13 adjustments based on age, seven based on education level, and two based on sex. Most age adjustments were needed for the 26 to 38 age range, while education level adjustments primarily applied to participants aged 48 or older. In other words, notable age-related performance changes in younger adults and the comparatively low education level of older adults required additional adjustments to the CVLT-III_Hebrew scores. Sex-based adjustments, on the other hand, were limited in number and showed no clear link to the participants’ age ranges. These sociodemographic adjustments to core SSs can be readily implemented using a norm calculator (refer to the Supplementary Material 3 spreadsheet or the online calculator available at: https://bit.ly/45rqXFd), as exemplified in the clinical example that was presented in the Results section. By applying these standardized normative procedures (converting raw scores to SSs and then adjusting for sociodemographic variables as needed), clinicians can directly compare an examinee’s CVLT-derived core and process scores. This standardization also allows for straightforward comparison of the examinee’s CVLT performance with results from other neuropsychological tests, simplifying the understanding of their unique strengths and weaknesses (Slick & Sherman, 2022). Such comparisons are becoming more common with the increase in cognitive tests at the disposal of the Israeli clinician. 
More specifically, Kavé and Sapir-Yogev recently developed a story recall task (Kave & Sapir-Yogev, 2020), adding to the former adaptation of tests such as the well-established Hebrew translation of the RAVLT (Vakil & Blachstein, 1997; Vakil et al., 1998; Vakil et al., 2010). Regarding the latter task, it should be noted that the CVLT-III and RAVLT are not identical despite their many commonalities. More specifically, the CVLT-III’s first list comprises words belonging to four semantic categories, and it includes cued recall trials that aid in detecting the use of strategies to code and retrieve the words. The semantic composition of the first list also means that the examinee’s CVLT scores no longer express verbal learning ability per se but rather the interaction between their memory and conceptual functions (Lezak et al., 2012, p. 478). Clinicians are therefore advised to exercise discretion when comparing a patient’s performance in the two tests (e.g., a patient tested using the RAVLT and then retested later using the CVLT). Overall, the provision of initial sociodemographically adjusted CVLT-III_Hebrew norms and the availability of an intuitive score adjustment calculator will markedly increase the available tools at the disposal of the Israeli clinician and will hopefully lead to improved diagnostic clarity, more tailored intervention strategies, and precise tracking of cognitive alterations in clinical settings.

As part of the current study, I aimed to uphold key requirements for normative studies, including appropriate inclusion and exclusion criteria, use of standardized test administration and scoring, and statistical analyses that are adequate for clarifying the contribution of key sociodemographic variables (Casaletto & Heaton, 2017; Mondini et al., 2023; delCacho-Tena et al., 2024). A key strength of this CVLT adaptation is the CVLT-III_Hebrew’s excellent internal reliability (= .90), consistent with the split-half reliability (r = .94) reported by Sherman et al. (2022, p. 627). Additionally, the relationships observed between sociodemographic variables and CVLT-III_Hebrew performance were mostly as anticipated (see earlier discussion regarding the impact of sex). However, this study has several important limitations to consider. First, the sizes of two age groups (n = 44 for the 40 – 52 years group, n = 46 for the 56 – 65 years group) fell slightly below the minimum recommended sample size of 50 – 70 participants per age group (Bridges & Holler, 2007; Piovesana & Senior, 2018). Moreover, the skewness of some CVLT-III_Hebrew raw scores (e.g., number of intrusions) likely requires a larger sample size per age group to ensure accurate results, as previous work recommends a minimum of 85 participants per cell when dealing with such skewed data (Piovesana & Senior, 2018). This concern is somewhat alleviated by the transformation of raw scores to SSs, which approximates the normal distribution. However, it is still recommended to further increase the size of the normative sample in the future. 
Cautious interpretation is particularly warranted for the older age cohorts due to their smaller sample size and a skewed sex distribution. For example, the 48 – 60 years age group was composed of 44 females and only 8 males. Additionally, clinicians should be cautious when using the normative data from the current study when assessing examinees with limited schooling, as the study’s participants were relatively well-educated. This may reflect the fact that the percentage of people with an academic degree in Israel is among the highest in the world (e.g., 57.9% of the Israeli Jewish population has post-secondary education; Israeli Central Bureau of Statistics, 2023). However, it raises concerns when testing examinees with lower educational levels and may also explain, at least partially, why participants in the current study recalled more words in the CVLT-III_Hebrew than participants in most earlier studies that were performed in a cross-cultural context, as noted earlier. Second, Israel has several minorities that differ in religion and a myriad of sociodemographic variables (e.g., quality of schooling). For example, approximately 21% of the Israeli population comprises Arab citizens, encompassing diverse religious and cultural groups (i.e., Muslims, Christians, Druze, and Circassians; Israeli Central Bureau of Statistics, 2022). As the current study provides normative data based on the performance of Jewish Israelis, employing the norms when testing examinees from minority groups should be done cautiously. This also calls for complementing the current study with normative studies of ethnic and religious minorities in Israel. Better stratification of participants according to their place of residence is also called for, considering the underrepresentation of participants from the periphery of Israel in the current study (see Table 1) and the socio-economic disparities associated with place of residence in Israel (Israeli Central Bureau of Statistics, 2024). 
Evaluating the impact of socio-economic status, which was not gathered as part of the current study, is also recommended in these future studies (Farah, 2017). Third, the CVLT-III includes a forced-choice trial (termed Forced Choice Recognition) used as an embedded validity indicator (Axelrod et al., 2021, pp. 132 – 137). This optional trial was not performed as part of the current study, a limitation considering the importance of performance validity determination in neuropsychological assessment (Bush et al., 2005; Sweet et al., 2021). Providing normative data for this trial, analyzing measures that were not gathered as part of the current study (e.g., serial position effects), as well as adapting the alternate and brief forms of the CVLT-III, is therefore warranted and will promote clinical work and empirical research. Finally, as an anonymous reviewer noted, words belonging to one list B category (“parts of house”) were replaced in the CVLT-III_Hebrew with another category (“nature”). Evaluating the impact of this change on examinees’ performance in future studies is also of value.

Summary

With the increasing integration of neuropsychological assessments in Israel (Kave et al., 2020; Vakil & Hoofien, 2016) and the demand for additional word list learning tests in Hebrew, the need arose to adapt the CVLT-III to Hebrew. The current study details the adaptation process and provides initial normative data for the CVLT-III_Hebrew. These aims were achieved by deriving age-adjusted SSs for the CVLT-III_Hebrew and then utilizing regression-based adjustments to control the impact of sociodemographic variables. These adjustments can be easily performed using a norm calculator (Supplementary Material 3), which is also available as an online tool at https://bit.ly/45rqXFd. At the same time, caution is warranted when testing examinees with a low education level and older examinees (i.e., ≥ 48 years), as elaborated earlier. Clinicians should also be aware that the normative data may be biased when evaluating examinees from minority groups in Israel. Enlarging the sample size, evaluating Israeli minorities, and adapting the optional forced-choice recognition memory subtest of the CVLT-III are among the endeavors that await further research and will enhance the coverage of the CVLT-III_Hebrew norms. With the increasing acknowledgment of the unique challenges of performing neuropsychological assessments in a cross-cultural context, these endeavors are important and will advance the services neuropsychologists provide to the Hebrew-speaking population.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S1355617725101616.

Acknowledgments

The study’s findings are based on data gathered as part of a graduate course in Rehabilitation psychology titled “Advancement of Neuropsychological Assessment in Israel”, supervised by Prof. Yoram Braw. I thank the following students for aiding in data collection: Tsofiya Ansbacher, Noam Baruch, Liraz Bernstein, Moriah Cohen, Sapir Eliyahu, Yaara Elran, Hatzav Hanoch, Linoy Karni, Mor Nahari, Sarah Oved, Elior Oren, Chen Rashef, Yotam Shuker, Shaked Yeshaiahu, Mai Akiva, Noy Harel, and Daniela Winter.

Funding statement

None.

Competing interests

The authors declare that they have no conflict of interest.

Footnotes

¹ Note that these regressions were performed per age group with core SSs serving as the dependent variables, in contrast to the first set of regressions in which analyses were performed on the whole sample and raw scores served as the dependent variables. This stems from their different aims (i.e., evaluating the overall contribution of sociodemographic variables to the prediction of CVLT performance vs. determining the need and parameters for adjusting SSs based on the examinee’s sociodemographic characteristics).

² Means were subtracted from age and education level to improve the standardized reference (mean age = 39.5 years, mean education level = 15); B = unstandardized regression coefficients.

³ Two additional case reports by Farrer & Drozdick (2020b) were excluded from analysis. These reports involved pediatric and geriatric examinees, age ranges that were not covered by the current normative study.

⁴ Studies were excluded from the comparison if they used the CVLT Brief Form (Chang et al., 2010), included neuropsychiatric patients (Romaszko-Wojtowicz et al., 2023), or lacked raw CVLT data (Argento et al., 2014).

⁵ The rate of acquisition, overall recall, and recognition discrimination in CVLT are particularly vulnerable to the effect of aging (Delis et al., 2017, p. 78), similar to the current study’s findings (see Supplementary Material 2).

⁶ An exception was Feyzioğlu (2020), in which Turkish healthy adults had a significantly higher Total recall than participants in the current study.

References

Ardila, A. (2020). Cross-cultural neuropsychology: History and prospects. RUDN Journal of Psychology and Pedagogics, 17(1), 64–78.CrossRef Google Scholar

Argento, O., Pisani, V., Incerti, C. C., Magistrale, G., Caltagirone, C., & Nocentini, U. (2014). The California verbal learning test-II: Normative data for two Italian alternative forms. The Clinical Neuropsychologist, 28(Suppl 1), S42–54.CrossRef Google Scholar PubMed

Asperholm, M., Nagar, S., Dekhtyar, S., Herlitz, A., & Ginsberg, S. D. (2019). The magnitude of sex differences in verbal episodic memory increases with social progress: Data from 54 countries across 40 years. PLoS ONE, 14(4), e0214945.CrossRef Google Scholar PubMed

Axelrod, B. N., Miller, J. B., & LaBuda, J. A. (2021). Embedded performance validity scores in standard memory tests (ch. 7). In Boone, K. B. (Ed.), Assessment of feigned cognitive impairment: A neuropsychological perspective (2 ed. pp. 124–145). Guilford Press.Google Scholar

Bair, J. L., Patrick, S. D., Noyes, E. T., Hale, A. C., Campbell, E. B., Wilson, A. M., Ransom, M. T., & Spencer, R. J. (2023). Semantic clustering on common list-learning tasks: A systematic review of the state of the literature and recommendations for future directions. Journal of Clinical and Experimental Neuropsychology, 45(7), 652–692.CrossRef Google Scholar PubMed

Berger, C., A., L., Braw, Y., Elbaum, T., Wagner, M., & Rassovsky, Y. (2021). Detection of feigned ADHD using the MOXO-d-CPT. Journal of Attention Disorders, 25(7), 1032–1047.CrossRef Google Scholar PubMed

Braw, Y. C., Elbaum, T., Lupu, T., & Ratmansky, M. (2024). Chronic pain: Utility of an eye-tracker integrated stand-alone performance validity test. Psychological Injury and Law, 17(2), 139–151.CrossRef Google Scholar

Bridges, A. J., & Holler, K. A. (2007). How many is enough? Determining optimal sample sizes for normative studies in pediatric neuropsychology. Child Neuropsychology, 13(6), 528–538.CrossRef Google Scholar PubMed

Bush, S. S., Ruff, R. M., Troster, A. I., Barth, J. T., Koffler, S. P., Pliskin, N. H., Reynolds, C. R., & Silver, C. H. (2005). Symptom validity assessment: Practice issues and medical necessity NAN policy & planning committee. Archives of Clinical Neuropsychology, 20(4), 419–426.CrossRef Google Scholar PubMed

Campos-Magdaleno, M., Nieto-Vieites, A., Frades-Payo, B., Montenegro-Pena, M., Facal, D., Lojo-Seoane, C., & Delgado-Losada, M. L. (2024). Normative data for the spanish versions of the CVLT, WMS-logical memory, and RBMT from a sample of middle-aged and old participants. Psychological Assessment, 36(2), 114–123.CrossRef Google Scholar PubMed

Casaletto, K. B., & Heaton, R. K. (2017). Neuropsychological assessment: Past and future. Journal of the International Neuropsychological Society, 23(9-10), 778–790.CrossRef Google Scholar PubMed

Chang, C. C., Kramer, J. H., Lin, K. N., Chang, W. N., Wang, Y. L., Huang, C. W., Lin, Y. T., Chen, C., & Wang, P. N. (2010). Validating the chinese version of the verbal learning test for screening alzheimer’s disease. Journal of the International Neuropsychological Society, 16(2), 244–251.CrossRef Google Scholar PubMed

delCacho-Tena, A., Christ, B. R., Arango-Lasprilla, J. C., Perrin, P. B., Rivera, D., & Olabarrieta-Landa, L. (2024). Normative data estimation in neuropsychological tests: A systematic review. Archives of Clinical Neuropsychology, 39(3), 383–398.CrossRef Google Scholar PubMed

Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (1987). California verbal learning test. The Psychological Corporation.Google Scholar

Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2017). California verbal learning test (Third edn. NCS Pearson, Inc.Google Scholar

Dickerson, B. C., & Eichenbaum, H. (2010). The episodic memory system: Neurocircuitry and disorders. Neuropsychopharmacology, 35(1), 86–104.CrossRef Google Scholar PubMed

Espenes, J., Eliassen, I. V., Ohman, F., Hessen, E., Waterloo, K., Eckerstrom, M., Lorentzen, I. M., Bergland, C., Halvari Niska, M., Timon-Reina, S., Wallin, A., Fladby, T., & Kirsebom, B. E. (2023). Regression-based normative data for the rey auditory verbal learning test in norwegian and swedish adults aged 49-79 and comparison with published norms. The Clinical Neuropsychologist, 37(6), 1276–1301.CrossRef Google Scholar PubMed

Farah, M. J. (2017). The neuroscience of socioeconomic status: Correlates, causes, and consequences. Neuron, 96(1), 56–71.CrossRef Google Scholar PubMed

Farrer, T. J., & Drozdick, L. W. (2020a). Essentials of the California Verbal Learning Test: CVLT-C, CVLT-2, & CVLT3. John Wiley & Sons.Google Scholar

Farrer, T. J., & Drozdick, L. W. (2020b). Illustrative case reports (ch. 7). In Essentials of the California Verbal Learning Test: CVLT-C, CVLT-2, & CVLT3. John Wiley & Sons.Google Scholar

Fernández, A. L., & Evans, J. (2022). Cross-cultural testing: Adaptation, development, or cross-cultural tests?. In Fernández, A. L., & Evans, J. (Eds.), Understanding cross-cultural neuropsychology: Science, testing, and challenges (pp. 125–134). Routledge.CrossRef Google Scholar

Feyzioğlu, A. (2020). California verbal learning test: The normative study of turkish adult sample. Haydarpaşa Numune Medical Journal, 60(4), 383–394.Google Scholar

Franzen, S., Pomati, S., Papma, J. M., Nielsen, T. R., Narme, P., Mukadam, N., Lozano-Ruiz, Á., Ibanez-Casas, I., Goudsmit, M., Fasfous, A., Daugherty, J. C., Canevelli, M., Calia, C., van den Berg, E., & Bekkhus-Wetterberg, P. (2022). Cross-cultural neuropsychological assessment in Europe: Position statement of the european consortium on cross-cultural neuropsychology (ECCroN). The Clinical Neuropsychologist, 36(3), 546–557.CrossRef Google Scholar

Garcia-Herranz, S., Diaz-Mardomingo, M. D. C., Suarez-Falcon, J. C., Rodriguez-Fernandez, R., Peraita, H., & Venero, C. (2022). Normative data for the Spanish version of the California verbal learning test (TAVEC) from older adults. Psychological Assessment, 34(1), 91–97.CrossRef Google Scholar PubMed

Hirnstein, M., Stuebs, J., Moe, A., & Hausmann, M. (2023). Sex/Gender differences in verbal fluency and verbal-episodic memory: A meta-analysis. Perspectives on Psychological Science, 18(1), 67–90.CrossRef Google Scholar PubMed

International Test Commission. (2017). The ITC Guidelines for Translating and AdaptingTests (Second edition; version 2.4). http://www.intestcom.org.Google Scholar

Israeli Central Bureau of Statistics. (2022). Population, by population group, religion, sex and age, 2022 Census estimate [Data Set]. Central Bureau of Statistics Retrieved from https://www.cbs.gov.il/en/Surveys/Pages/Population-Census.aspx Google Scholar

Israeli Central Bureau of Statistics. (2023). Quality of Life, Sustainability and National Resilience Indicators, 2023. Central Bureau of Statistics Retrieved from https://www.cbs.gov.il/he/publications/Pages/2024/%D7%9E%D7%93%D7%93%D7%99-%D7%90%D7%99%D7%9B%D7%95%D7%AA-%D7%97%D7%99%D7%99%D7%9D-%D7%A7%D7%99%D7%99%D7%9E%D7%95%D7%AA-%D7%95%D7%97%D7%95%D7%A1%D7%9F-%D7%9C%D7%90%D7%95%D7%9E%D7%99-2023.aspx Google Scholar

Israeli Central Bureau of Statistics. (2024, July 29). Characterization and Classification of Geographical Units by the Socio-Economic Level of the Population 2021 (Press Release No. 230/2024). Retrieved from https://www.cbs.gov.il/en/publications/pages/2025/%D7%90%D7%A4%D7%99%D7%95%D7%9F-%D7%99%D7%97%D7%99%D7%93%D7%95%D7%AA-%D7%92%D7%90%D7%95%D7%92%D7%A8%D7%A4%D7%99%D7%95%D7%AA-%D7%95%D7%A1%D7%99%D7%95%D7%95%D7%92%D7%9F-%D7%9C%D7%A4%D7%99-%D7%94%D7%A8%D7%9E%D7%94-%D7%94%D7%97%D7%91%D7%A8%D7%AA%D7%99%D7%AA-%D7%9B%D7%9C%D7%9B%D7%9C%D7%99%D7%AA-%D7%A9%D7%9C-%D7%94%D7%90%D7%95%D7%9B%D7%9C%D7%95%D7%A1%D7%99%D7%99%D7%94-%D7%91%D7%A9%D7%A0%D7%AA-2021.aspx

Ivnik, R. J., Malec, J. F., Smith, G. E., Tangalos, E. G., Petersen, R. C., Kokmen, E., & Kurland, L. T. (1992). Mayo’s older Americans normative studies: WAIS-R norms for ages 56 to 97. The Clinical Neuropsychologist, 6(S1), 1–30.

Jones, P. R. (2019). A note on detecting statistical outliers in psychophysical data. Attention, Perception & Psychophysics, 81(5), 1189–1196.

Kave, G., Bloch, A., Shabi, A., & Maril, S. (2020). Neuropsychological assessment in the Israeli healthcare system: A practitioners’ survey. Israel Journal of Health Policy Research, 9(1), 46.

Kavé, G., Gorokhod, R., Yerushalmi, A., & Salner, N. (2019). Frequency effects on spelling in Hebrew-speaking younger and older adults. Applied Psycholinguistics, 40(5), 1173–1188.

Kave, G., & Sapir-Yogev, S. (2020). Associations between memory and verbal fluency tasks. Journal of Communication Disorders, 83, 105968.

Kave, G., Sapir-Yogev, S., Bregman, N., & Shiner, T. (2022). On the importance of using local tests and local norms in the assessment of memory. Applied Neuropsychology-Adult, 29(6), 1492–1498.

Keith, C. M., Haut, M. W., Wilhelmsen, K., Mehta, R. I., Miller, M., Navia, R. O., Ward, M., Lindberg, K., Coleman, M., McCuddy, W. T., Deib, G., Giolzetti, A., & D’Haese, P. F. (2023). Frontal and temporal lobe correlates of verbal learning and memory in aMCI and suspected Alzheimer’s disease dementia. Neuropsychology, development, and cognition. Section B, Aging, Neuropsychology and Cognition, 30(6), 923–939.

Kim, J. K., & Kang, Y. (1999). Normative study of the Korean-California verbal learning test (K-CVLT). The Clinical Neuropsychologist, 13(3), 365–369.

Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Memory I: Tests (ch. 11). In Neuropsychological assessment (5th ed.). Oxford University Press.

Liampas, I., Folia, V., Ntanasi, E., Yannakoulia, M., Sakka, P., Hadjigeorgiou, G., Scarmeas, N., Dardiotis, E., & Kosmidis, M. H. (2023). Longitudinal episodic memory trajectories in older adults with normal cognition. The Clinical Neuropsychologist, 37(2), 304–321.

Lighthall, N. R., Conner, L. B., & Giovanello, K. S. (2019). Learning and memory in the aging brain: The function of declarative and nondeclarative memory over the lifespan. In The aging brain: Functional adaptation across adulthood (pp. 73–109). American Psychological Association. https://doi.org/10.1037/0000143-004

Linzen, T. (2009). Corpus of blog postings collected from the Israblog website. Tel Aviv University.

Lou, F. H., Yang, G. T., Cai, L. H., Yu, L. C., Zhang, Y., Shi, C., & Zhang, N. (2022). Effects of age, sex, and education on California verbal learning test-II performance in a Chinese-speaking population. Frontiers in Psychology, 13, 9935875.

Merkley, T. L., Esopenko, C., Zizak, V. S., Bilder, R. M., Strutt, A. M., Tate, D. F., & Irimia, A. (2023). Challenges and opportunities for harmonization of cross-cultural neuropsychological data. Neuropsychology, 37(3), 237–246.

Mondini, S., Cappelletti, M., & Arcara, G. (2023). Test scores (ch. 5). In Methodology in neuropsychological assessment: An interpretative approach to guide clinical practice (pp. 63–83). Routledge.

Nguyen, C. M., Rampa, S., Staios, M., Nielsen, T. R., Zapparoli, B., Zhou, X. E., Mbakile-Mahlanza, L., Colon, J., Hammond, A., Hendriks, M., Kgolo, T., Serrano, Y., Marquine, M. J., Dutt, A., Evans, J., & Judd, T. (2024). Neuropsychological application of the International Test Commission guidelines for translation and adapting of tests. Journal of the International Neuropsychological Society, 30(7), 621–634.

Nyberg, L. (2017). Functional brain imaging of episodic memory decline in ageing. Journal of Internal Medicine, 281(1), 65–74.

Pauker, J. D. (1988). Constructing overlapping cell tables to maximize the clinical usefulness of normative test data: Rationale and an example from neuropsychology. Journal of Clinical Psychology, 44(6), 930–933.

Pena-Casanova, J., Blesa, R., Aguilar, M., Gramunt-Fombuena, N., Gomez-Anson, B., Oliva, R., Molinuevo, J. L., Robles, A., Barquero, M. S., Antunez, C., Martinez-Parra, C., Frank-Garcia, A., Fernandez, M., Alfonso, V., Sol, J. M., & for the NEURONORMA Study Team (2009). Spanish multicenter normative studies (NEURONORMA project): Methods and sample characteristics. Archives of Clinical Neuropsychology, 24(4), 307–319.

Pena-Casanova, J., Casals-Coll, M., Quintana, M., Sanchez-Benavides, G., Rognoni, T., Calvo, L., Palomo, R., Aranciva, F., Tamayo, F., & Manero, R. M. (2012). Spanish normative studies in a young adult population (NEURONORMA young adults project): Methods and characteristics of the sample. Neurología, 27(5), 253–260.

Perez-Enriquez, C., Garcia-Escobar, G., Florido-Santiago, M., Pique-Candini, J., Arrondo-Elizaran, C., Grau-Guinea, L., Pereira-Cuitino, B., Manero, R. M., Puig-Pijoan, A., Pena-Casanova, J., & Sanchez-Benavides, G. (2024). Spanish normative studies (NEURONORMA-plus project): Norms for the Wisconsin card sorting test, the modified Taylor complex figure, and the Ruff-Light trail learning test. Neurologia (Engl Ed), 39(3), 235–243.

Piovesana, A., & Senior, G. (2018). How small is big: Sample size and skewness. Assessment, 25(6), 793–800.

Poreh, A. M., Avital, R., Dines, P. L., & Levin, J. B. (2015). The effects of age of language acquisition on verbal memory tests in a sample of older adults immigrants. Psychology & Neuroscience, 8(1), 66–74.

Rabin, L. A., Nester, C. O., & Barr, W. B. (2023). Current neuropsychological test usage practices. In Boyle, G. J., Stern, Y., Stein, D. J., Sahakian, B. J., Golden, C. J., Lee, T. M.-C., & Chen, S.-H. A. (Eds.), The SAGE handbook of clinical neuropsychology: Clinical neuropsychological assessment and diagnosis (pp. 13–22). Sage.

Ramani, S., Young, G., & Zakzanis, K. K. (2024). The culturally minded independent psychological examiner: A review of Indian and Chinese cultural characteristics and its implications for psychological injury. Psychological Injury and Law. https://doi.org/10.1007/s12207-024-09513-8

Rey, A. (1964). L’examen clinique en psychologie [The clinical exam in psychology]. Presses Universitaires de France.

Reynolds, C. R., Altmann, R. A., & Allen, D. N. (2021). Neuropsychological testing (ch. 13). In Mastering modern psychological testing: Theory and methods (2nd ed.). Springer.

Romaszko-Wojtowicz, A., Borkowska, A., Opalach, C., Romaszko, M., Łowczak, A., & Buciński, A. (2023). California verbal learning test trial among the Polish homeless. Acta Elbing, 50(1), 10–18.

Sherman, E. M. S., Tan, J. E., & Hrabok, M. (2022). Memory (ch. 10). In A compendium of neuropsychological tests: Fundamentals of neuropsychological assessment and test reviews for clinical practice (4th ed.). Oxford University Press.

Slick, D. J., & Sherman, E. M. S. (2022). Psychometrics in neuropsychological assessment (ch. 1). In A compendium of neuropsychological tests: Fundamentals of neuropsychological assessment and test reviews for clinical practice (4th ed., pp. 1–23). Oxford University Press.

Staios, M., Kosmidis, M. H., Tsiaras, Y., Nielsen, T. R., Papadopoulos, A., Kokkinias, A., Velakoulis, D., March, E., & Stolwyk, R. J. (2023). Do normative data specific to Greek Australian older adults improve validity of neuropsychological assessment results? Journal of the International Neuropsychological Society, 29(10), 953–963.

Sweet, J. J., Heilbronner, R. L., Morgan, J. E., Larrabee, G. J., Rohling, M. L., Boone, K. B., Kirkwood, M. W., Schroeder, R. W., Suhr, J. A., & Conference Participants (2021). American Academy of Clinical Neuropsychology (AACN) 2021 consensus statement on validity assessment: Update of the 2009 AACN consensus conference statement on neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 35(6), 1053–1106.

Toren, P., Sadeh, M., Wolmer, L., Eldar, S., Koren, S., Weizman, R., & Laor, N. (2000). Neurocognitive correlates of anxiety disorders in children: A preliminary report. Journal of Anxiety Disorders, 14(3), 239–247.

Vakil, E., & Blachstein, H. (1997). Rey AVLT: Developmental norms for adults and the sensitivity of different memory measures to age. The Clinical Neuropsychologist, 11(4), 356–369.

Vakil, E., Blachstein, H., & Sheinman, M. (1998). Rey AVLT: Developmental norms for children and the sensitivity of different memory measures to age. Child Neuropsychology, 4(3), 161–177.

Vakil, E., Greenstein, Y., & Blachstein, H. (2010). Normative data for composite scores for children and adults derived from the Rey auditory verbal learning test. The Clinical Neuropsychologist, 24(4), 662–677.

Vakil, E., & Hoofien, D. (2016). Clinical neuropsychology in Israel: History, training, practice and future challenges. The Clinical Neuropsychologist, 30(8), 1267–1277.

Table 1. Sociodemographic characteristics of the normative sample (per age group and total)

Table 2. Core raw scores according to age

Table 3. Process raw scores according to age

Table 4. CVLT-IIIHebrew core SSs; age range = 18 – 30 years, median age = 24, n = 93

Table 5. CVLT-IIIHebrew core SSs; age range = 26 – 38 years, median age = 32, n = 94

Table 6. CVLT-IIIHebrew core SSs; age range = 34 – 46 years, median age = 40, n = 50

Table 7. CVLT-IIIHebrew core SSs; age range = 40 – 52 years, median age = 46, n = 44

Table 8. CVLT-IIIHebrew core SSs; age range = 48 – 60 years, median age = 54, n = 52

Table 9. CVLT-IIIHebrew core SSs; age range = 56 – 65 years, median age = 60.5, n = 46

Table 10. Adjustments of CVLT-IIIHebrew core SSs based on socio-demographic variables (age, education level, and sex)

