Validation and Normative Data of the Spanish Version of the Face Name Associative Memory Exam (S-FNAME)

Abstract
Objective: The relevance of episodic memory for predicting brain aging is well established. The Face Name Associative Memory Exam (FNAME) is a valued associative memory measure related to Alzheimer's disease (AD) biomarkers, such as amyloid-β deposition in individuals with preclinical AD. A previous validation of the Spanish version of the FNAME test (S-FNAME) provided normative data and psychometric characteristics. However, that study was limited to subjects attending a memory clinic and included a small sample with an unequal gender distribution. The purpose of this study was to assess the psychometric properties of the S-FNAME and provide normative data in a larger, independent sample of cognitively healthy individuals. Method: S-FNAME was administered to 511 cognitively healthy volunteers (242 women, aged 41–65 years) participating in the Barcelona Brain Health Initiative cohort study. Results: Factor analysis supported construct validity, revealing two underlying components (face-name and face-occupation) that explained 95.34% of the total variance, with satisfactory goodness of fit. Correlations between S-FNAME and the Rey Auditory-Verbal Learning Test were statistically significant and confirmed convergent validity. We also found weak correlations with non-memory tests, supporting divergent validity. Women obtained better scores, and S-FNAME performance was positively correlated with education and negatively correlated with age. Finally, we generated normative data. Conclusions: The S-FNAME test exhibits good psychometric properties, consistent with previous findings, making it a valid and reliable tool to assess episodic memory in cognitively healthy middle-aged adults. It is a promising test for the early detection of subtle memory dysfunction associated with abnormal brain aging.


INTRODUCTION
Episodic memory is part of the declarative memory system and is defined as the ability to learn and store unique events or personal experiences (emotions, thoughts, and perceptions) that involve temporal and spatial information (Lezak, Howieson & Loring, 2004; Pause et al., 2013; Weintraub et al., 2013).
In this context, associative memory is an important domain in early AD diagnosis. Tasks involving cross-modal associations, such as face-name pairs, have been suggested as promising tools because of their complex nature and high ecological validity (Loewenstein, Curiel, Duara & Buschke, 2018).
Neuroimaging studies have provided further support for using the face-name paradigm as a marker of prodromal AD (Rentz et al., 2013; Jurick et al., 2018). Findings revealed that the encoding and memory formation of novel associations is differentially impaired in the early stages of AD compared with normal aging. This paradigm has also been used to optimize the characterization and differentiation of Mild Cognitive Impairment (MCI) subtypes (Rentz et al., 2011; Polcher et al., 2017; Jurick et al., 2018; Kormas et al., 2018; Rubiño & Andrés, 2018).
The S-FNAME test asks examinees to remember 16 face-name pairs and 16 face-occupation pairs, for a total of 32 cross-modal pairs. The initial learning phase includes initial cued recalls for face-name pairs and face-occupation pairs, followed by an immediate cued recall of both the name and the occupation associated with each face. Finally, a 30-Minute Delayed Cued Recall requires examinees to recall all the information associated with each face after a 30-min delay.
Convergent validity has previously been tested by calculating correlations with other memory tests, such as the Rey Auditory-Verbal Learning Test, the Rey-Osterrieth Complex Figure Test, the Free and Cued Selective Reminding Test, and the Word List Learning test from the Wechsler Memory Scale-Third Edition (Amariglio et al., 2012; Papp et al., 2014; Vila-Castelar et al., 2019).
The FNAME Exam is currently considered a valuable tool in clinical and research settings. For example, according to Vila-Castelar et al. (2019), FNAME was applied in the US POINTER study (NCT03688126), whose objective was to reduce risks and protect the brain through lifestyle intervention. Also, the European Prevention of Alzheimer's Dementia (EPAD) Scientific Advisory Group for Clinical and Cognitive Outcomes recommended FNAME as an appropriate cognitive test associated with preclinical brain changes that should be included in clinical assessment protocols (Ritchie et al., 2017). In Catalonia, the S-FNAME is used in the Fundació ACE Healthy Brain Initiative (FACEHBI) (Rodriguez-Gomez et al., 2017), a broad longitudinal cohort of 200 middle-aged adults with Subjective Cognitive Decline, focused on increasing the understanding of preclinical AD.
As mentioned above, a previous study in Spain provided normative data and psychometric characteristics of the S-FNAME. However, that study had some limitations regarding its sample: although it was composed of cognitively healthy individuals, it was limited to subjects attending a memory clinic, and it was small (n = 110) with an unequal gender distribution. In summary, given the relevance of assessing cross-modal associative face-name memory as a sensitive measure of preclinical AD, we aimed to determine the validity and reliability of the S-FNAME test in a larger, population-based sample, and to develop population-specific normative tables based on a Spanish sample aged between 41 and 65 years.

METHOD

Participants
This study was carried out within the in-person assessment of an ongoing longitudinal, prospective, population-based cohort study in Barcelona: the Barcelona Brain Health Initiative (BBHI; Cattaneo et al., 2018). A total of 511 cognitively healthy volunteers (242 women) aged between 41 and 65 years (M = 52.66, SD = 7.05) completed the neurocognitive assessment (Table 1) and were included in this study.
Most of the participants were residents of Catalonia (96.09%); only 3.91% came from other areas of Spain. Most participants (95%) were Catalan-Spanish bilinguals, and 5% spoke only Spanish. Catalan-dominant bilinguals (58.5%) reported being early, highly proficient bilinguals who were regularly exposed to both languages, most of them living in a highly bilingual context such as the city of Barcelona or its metropolitan area.
Participants with a history of, or a current diagnosis of, neurological or psychiatric disease, traumatic brain injury (TBI) with loss of consciousness, substance abuse/dependence, or treatment with psychopharmacological drugs were excluded. We also excluded examinees with objective deficits in the neuropsychological tests included in the BBHI protocol (see Cattaneo et al., 2018) and those who scored below 26 points on the Mini-Mental State Examination (Folstein, Folstein, & McHugh, 1975; Blesa et al., 2001).
All participants provided explicit informed consent, and the protocol was approved by the local ethics committee, the Comité d'Ètica i Investigació Clínica de la Unió Catalana d'Hospitals.

Procedures and Materials
S-FNAME was administered according to the previously published standardized procedure. It was the first test administered during the BBHI cognitive assessment session (Cattaneo et al., 2018), and its administration took between 35 and 40 min. It is essential to highlight that no other memory tests were administered between the S-FNAME initial learning and the 30-min delayed recall.
The S-FNAME administration began with the face study phase: subjects were presented with all 16 faces, four faces to a page, one in each quadrant. Participants were asked to look at each face for 2 s while the examiner pointed to it. In the Initial study of face-name pairs (FN-N), participants were shown the same 16 faces with names underneath. They had to learn the name associated with each face during this single trial. Then, in the Initial cued recall of face-name pairs, the subjects were presented with the faces again and were asked to recall the name corresponding to each face. The score was the number of correctly recalled pairs (ILN).
The Initial study of face-occupation pairs (FN-O) consisted of presenting the same 16 faces, this time with occupations underneath. Subjects were asked to study the face-occupation associations. In the Initial cued recall of face-occupation pairs, examinees were shown the faces and asked to recall the occupation associated with each one. The score was the number of correctly recalled pairs (ILO). Then, in the Immediate cued recall, subjects were presented with all the faces and asked to recall both the name (CRN score) and the occupation (CRO score) associated with each.
Finally, in the 30-min delayed cued recall, participants were again shown the faces and were asked to evoke the name (CRN30 score) and occupation (CRO30 score) associated with each face. All scores (ILN, ILO, CRN, CRO, CRN30, and CRO30) ranged from 0 to 16.
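The scoring scheme described above can be illustrated with a short sketch. It is not part of the published procedure: the class name `SFNameScores` and its field names are hypothetical, and the code assumes only what the text states, namely six subscores ranging from 0 to 16 and the subtotals and total defined in the Data Analysis section (the three name scores summed, the three occupation scores summed, and all six summed).

```python
from dataclasses import dataclass, fields

MAX_SUBSCORE = 16  # each subscore counts correctly recalled pairs out of 16 faces


@dataclass(frozen=True)
class SFNameScores:
    """Hypothetical container for the six S-FNAME subscores."""

    iln: int    # initial cued recall of names
    ilo: int    # initial cued recall of occupations
    crn: int    # immediate cued recall of names
    cro: int    # immediate cued recall of occupations
    crn30: int  # 30-min delayed cued recall of names
    cro30: int  # 30-min delayed cued recall of occupations

    def __post_init__(self):
        # Reject any subscore outside the 0-16 range stated in the text.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= MAX_SUBSCORE:
                raise ValueError(f"{f.name}={value} is outside 0-{MAX_SUBSCORE}")

    @property
    def fn_n(self) -> int:
        """Name subtotal: ILN + CRN + CRN30."""
        return self.iln + self.crn + self.crn30

    @property
    def fn_o(self) -> int:
        """Occupation subtotal: ILO + CRO + CRO30."""
        return self.ilo + self.cro + self.cro30

    @property
    def total(self) -> int:
        """S-FNAME Total: sum of all six subscores."""
        return self.fn_n + self.fn_o


# Made-up example values, not data from the study.
example = SFNameScores(iln=6, ilo=10, crn=8, cro=12, crn30=7, cro30=12)
print(example.fn_n, example.fn_o, example.total)  # 21 34 55
```

With this layout, the subtotals range from 0 to 48 and the total from 0 to 96.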
The Rey Auditory-Verbal Learning Test (RAVLT) was chosen to obtain evidence of convergent validity because it is one of the gold standard instruments for episodic memory assessment. We included the First Trial, Total Learning, and Delayed Recall scores. The non-memory tests were used to evaluate divergent validity, given that these tasks measure constructs other than those assessed by the FNAME (visuospatial and visuoconstructive abilities, fluid intelligence, processing speed, and cognitive flexibility).

Data Analysis
Statistical analyses were performed using SPSS version 22.0 (Statistical Package for Social Sciences, Chicago, IL, USA). A frequency table illustrates the distribution of the sociodemographic variables (age ranges, educational level, gender) (see Table 1). Descriptive analyses for age (as a continuous measure) and years of education were included (see Table 2).
Years of education were estimated by explicitly asking the participants to report the total time of formal education achieved counting from the time when education becomes obligatory in Spain (primary school). In Spain, the mandated educational system includes elementary/primary school (6 years), secondary obligatory school (4 years) and baccalaureate/high school or Middle Grade Vocational Training (2 years). Higher education comprises undergraduate degree (4 years) and postgraduate degrees (specialization, master, and PhD programs).
Table 2 also shows the S-FNAME scores (ILN, ILO, CRN, CRO, CRN30, CRO30), the subtotal scores for names (FN-N = ILN + CRN + CRN30) and occupations (FN-O = ILO + CRO + CRO30), the total score (S-FNAME Total = ILN + ILO + CRN + CRO + CRN30 + CRO30), and performance on the other neuropsychological tests.
To examine construct validity, we carried out an exploratory factor analysis (EFA) to determine the underlying factorial structure. Then, a confirmatory factor analysis (CFA) was run using IBM SPSS AMOS. To examine the goodness of fit of the factorial structure, we used absolute and incremental fit indicators: Chi-square (χ²), Normed Chi-square (χ²/df), Goodness of Fit Index (GFI), Adjusted Goodness of Fit Index (AGFI), Root Mean Square Error of Approximation (RMSEA), Normed Fit Index (NFI), and Tucker-Lewis Index (TLI) (Hair, Anderson, Tatham, & Black, 1999; Mulaik, 2009).
Convergent validity was assessed using Pearson correlation coefficients between S-FNAME and RAVLT scores: RAVLT Total Learning (sum of RAVLT learning trials I, II, III, IV, and V) and RAVLT Delayed Recall (retrieval after 30 min). Divergent validity was examined using Pearson correlation coefficients with non-memory tests: the Trail Making Test parts A and B (TMTA and TMTB) and the Matrix Reasoning and Block Design subtests from the WAIS-IV. Reliability was calculated using Cronbach's α to assess the internal consistency of S-FNAME.
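For readers unfamiliar with the reliability index, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) · (1 − Σ item variances / variance of the total). It does not reproduce the SPSS procedure used in the study, and the text does not state which items entered the calculation; the function name `cronbach_alpha` and the input layout (one row per respondent, one column per item) are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for the items and for the
    per-respondent total score.
    """
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must have the same number of items")

    # Variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the summed score across respondents.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total score has zero variance")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up data: three respondents, two items.
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))  # 0.667
```

In the toy data, each item has variance 1 and the total score has variance 3, so α = 2 · (1 − 2/3) ≈ 0.667.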
Correlations between demographic variables and S-FNAME scores were calculated, and multiple linear regression analyses were used to confirm the contribution of gender, age, and education to S-FNAME scores.
Finally, an Analysis of Variance (ANOVA) was used to examine the multivariate effect of gender, age ranges (<55 and ≥55 years old), and educational level (<16 and ≥16 years of formal education) for all the S-FNAME scores (ILN, ILO, CRN, CRO, CRN30, CRO30, subtotals FN-N, FN-O, and S-FNAME Total). The configuration of the age groups was obtained after multiple comparisons between different age ranges, in order to guarantee that the resulting groups reflected significant differences in S-FNAME scores, instead of arbitrarily dividing age (Ferreira & Campagna, 2014).
Then, norms were generated considering the combination of those variables that reflected a significant effect. For all the analyses, statistical significance was determined when p < .05.
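The grouping described above can be sketched as follows. The cut-offs (55 years of age, 16 years of formal education) come from the text; the function names, the string labels, and the sample records are hypothetical and serve only to show how participants would fall into the resulting cells.

```python
from collections import Counter

AGE_CUTOFF = 55        # years; groups are <55 and >=55
EDUCATION_CUTOFF = 16  # years of formal education; groups are <16 and >=16


def stratum(gender, age, education_years):
    """Return the (gender, age band, education band) cell for one participant."""
    age_band = "<55" if age < AGE_CUTOFF else ">=55"
    education_band = "<16" if education_years < EDUCATION_CUTOFF else ">=16"
    return (gender, age_band, education_band)


def count_strata(participants):
    """Count participants per cell; each participant is (gender, age, education)."""
    return Counter(stratum(*p) for p in participants)


# Made-up records, not data from the study.
sample = [("woman", 47, 18), ("man", 61, 12), ("woman", 55, 16), ("man", 54, 15)]
for cell, n in sorted(count_strata(sample).items()):
    print(cell, n)
```

Counting participants per cell in this way makes it easy to see which subgroups are too small to support stable norms.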

RESULTS
Descriptive analyses for age, years of formal education, S-FNAME performance (ILN, ILO, CRN, CRO, CRN30, CRO30, FN-N, FN-O, and S-FNAME Total), and the scores on the other neurocognitive tests used for validation are presented in Table 2. As shown, subjects obtained higher total scores on the face-occupation than on the face-name tests.
S-FNAME scores and age were negatively correlated (p < .001). In contrast, years of formal education were positively associated with S-FNAME scores (p < .001) (Table 3).
To examine the construct validity of S-FNAME, we ran an exploratory factor analysis (EFA) followed by a confirmatory factor analysis (CFA). EFA was carried out using principal component analysis and direct Oblimin rotation (Hair et al., 1999). The factor analysis was viable according to the Kaiser-Meyer-Olkin measure of sampling adequacy (KMO = .821) and Bartlett's test of sphericity (χ²(15) = 4638.35; p < .001). The determinant of the correlation matrix was, as expected, close to 0 (0.001; Hair et al., 1999).
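As a quick consistency check on the reported degrees of freedom (a standard property of Bartlett's test of sphericity rather than something stated by the authors): for p observed variables the test has p(p − 1)/2 degrees of freedom. Assuming that the six S-FNAME subscores (ILN, ILO, CRN, CRO, CRN30, CRO30) were the variables entered in the analysis, this gives:

```latex
\[
df = \frac{p(p-1)}{2} = \frac{6 \times 5}{2} = 15
\]
```

This matches the 15 degrees of freedom reported for the test.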
EFA yielded two factors with eigenvalues greater than 1. This solution explained 95.34% of the total variance. Factor 1 loaded the name scores (ILN, CRN, CRN30) and factor 2 the occupation scores (ILO, CRO, CRO30), as expected based on theory and previous findings (Table 4). Following the confirmatory modeling strategy (Hair et al., 1999), we carried out a CFA of the two-factor model to verify its goodness of fit. The CFA revealed satisfactory goodness-of-fit values. The chi-square value was low but statistically significant (χ²(8) = 21.86; p = .005). A p-value greater than .05 is expected for a well-fitting model, but this indicator should not be considered decisive in large samples because of its sensitivity to sample size. The χ²/df index is more appropriate because it is less sensitive to sample size; χ²/df = 2.73 showed adequate absolute goodness of fit (Hair et al., 1999; Mulaik, 2009). GFI and AGFI were satisfactory, above .95 (GFI = .98; AGFI = .96). Also, RMSEA was within an acceptable range, between .05 and .08 (RMSEA = .058). Regarding the incremental fit indicators, NFI and TLI were above .90, reflecting good fit (NFI = .97; TLI = .96) (Hair et al., 1999; Mulaik, 2009).
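The arithmetic behind these fit indicators can be verified with a few lines of code. The sketch below only re-computes the normed chi-square from the reported values and compares the other indices with the thresholds quoted in this paragraph; the variable names are hypothetical, and the thresholds are those stated in the text, not universal rules.

```python
# Reported CFA values, copied from the text.
chi_square = 21.86
degrees_of_freedom = 8
reported = {"GFI": 0.98, "AGFI": 0.96, "RMSEA": 0.058, "NFI": 0.97, "TLI": 0.96}

# Normed chi-square: chi-square divided by its degrees of freedom.
normed_chi_square = chi_square / degrees_of_freedom

# Acceptance rules as quoted in the paragraph above.
criteria = {
    "GFI": lambda v: v > 0.95,
    "AGFI": lambda v: v > 0.95,
    "RMSEA": lambda v: 0.05 <= v <= 0.08,
    "NFI": lambda v: v > 0.90,
    "TLI": lambda v: v > 0.90,
}
meets_criterion = {name: criteria[name](value) for name, value in reported.items()}

print(f"chi-square/df = {normed_chi_square:.2f}")  # chi-square/df = 2.73
for name, ok in meets_criterion.items():
    print(f"{name} = {reported[name]}: {'meets' if ok else 'does not meet'} the stated threshold")
```

All five indices meet the thresholds quoted in the text, and 21.86 / 8 rounds to the reported 2.73.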
To examine the convergent validity of S-FNAME, Pearson correlation coefficients (r) were calculated between S-FNAME scores (ILN, ILO, FN-N, FN-O, and S-FNAME Total) and RAVLT scores (Trial I, Total Learning, and Delayed Recall). Statistically significant (p < .001) associations with the RAVLT scores were found, with medium effect sizes (Table 5).
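As a reminder of what the coefficient measures, here is a minimal pure-Python sketch of Pearson's r. The effect-size labels use Cohen's conventional benchmarks (|r| of about .10 small, .30 medium, .50 large), which is an assumption: the text reports "medium" and "small" effect sizes without stating the cut-offs it used. Function names and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def effect_size_label(r):
    """Label |r| using Cohen's conventional benchmarks (assumed, not from the text)."""
    magnitude = abs(r)
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "negligible"


# Made-up scores for five examinees, not data from the study.
fname_total = [40, 52, 47, 60, 55]
ravlt_total = [38, 45, 50, 52, 47]
r = pearson_r(fname_total, ravlt_total)
print(f"r = {r:.2f} ({effect_size_label(r)})")
```

The same function applies to the divergent validity analysis, where correlations with non-memory tests are expected to be smaller in magnitude.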
To examine divergent validity, we obtained Pearson correlation coefficients between S-FNAME scores and the non-memory tests. We found an inverse association between S-FNAME scores and TMTA and TMTB scores (p < .001), except for the correlation between FN-O and TMTA, which was close to 0. Also, S-FNAME measures were positively associated (p < .001) with the Matrix Reasoning and Block Design subtests of the WAIS-IV, but with small effect sizes (Table 5).

Normative Data
Multiple linear regression analyses were carried out to examine the contribution of the demographic variables as predictors of S-FNAME performance. The results confirmed the significant contribution of age, gender, and years of formal education to the variance of S-FNAME scores (p < .001), as illustrated in Table 6.
Finally, we stratified scores by gender, age, and educational level in Table 7 and calculated norms through percentile transformations (see Tables 8-10).
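To illustrate what a percentile transformation involves, the sketch below computes the percentile rank of a raw score within a reference group. The text does not specify which percentile definition was used; the mid-rank convention shown here (scores below plus half of the tied scores) is one common choice and is an assumption, as are the function name and the made-up reference scores.

```python
def percentile_rank(score, reference_scores):
    """Percentile rank of `score` within `reference_scores` (mid-rank convention).

    Returns the percentage of reference scores below `score`, counting
    ties as half.
    """
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    below = sum(1 for s in reference_scores if s < score)
    equal = sum(1 for s in reference_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(reference_scores)


# Made-up S-FNAME Total scores for one hypothetical stratum.
stratum_scores = [10, 20, 20, 30, 40]
print(percentile_rank(20, stratum_scores))  # 40.0
print(percentile_rank(40, stratum_scores))  # 90.0
```

In practice the reference group would be the stratum matching the examinee's gender, age range, and educational level.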

DISCUSSION
The present study aimed to assess the psychometric properties of the S-FNAME and develop normative data based on a large sample of cognitively healthy middle-aged adults. Refining highly sensitive measures to detect prodromal markers of major neurocognitive disorders is extremely valuable and much-needed work (Polcher et al., 2017). Face-name association tests are highly demanding cross-modal measures that, because of their complexity, minimize possible compensatory strategies and can identify changes not detected by other tests (Loewenstein et al., 2018; Rubiño & Andrés, 2018; Alegret et al., 2020).
Our results confirmed that S-FNAME is a valid and reliable measure. Factor analysis revealed two underlying dimensions with satisfactory goodness of fit. Face-name association scores (ILN, CRN, CRN30) loaded on one component and face-occupation association scores (ILO, CRO, CRO30) on the other. This two-factor model demonstrates the test's construct validity and supports previous FNAME validation findings (Amariglio et al., 2012; Kormas et al., 2018).
Convergent validity was demonstrated by positive, statistically significant correlations with the RAVLT, a traditional and widely used episodic memory test. Specifically, the RAVLT initial learning and long-term memory scores were significantly associated with the immediate and delayed recall face-name and face-occupation scores, respectively. These results confirmed, like previous studies, that S-FNAME is an episodic memory test (Amariglio et al., 2012; Papp et al., 2014; Kormas et al., 2018; Vila-Castelar et al., 2019). Furthermore, we found inverse correlations between S-FNAME and TMT performance and positive correlations, with small effect sizes, with Matrix Reasoning and Block Design. All of these standardized tests are non-memory measures. The associations with tests measuring cognitive functions other than memory were weaker than those with memory tests, which provides evidence of the divergent validity of S-FNAME. These findings are similar to those reported in previous research (Kormas et al., 2018; Vila-Castelar et al., 2019). Results also revealed excellent internal consistency of the test, supporting its reliability.
S-FNAME performance was related to gender, age, and educational level. Women obtained higher scores than men on all S-FNAME measures, as previously reported. Previous findings have shown that women perform better on episodic memory tasks (Andreano & Cahill, 2009). Our findings are also in line with the results of Rentz et al. (2017), who demonstrated that women scored higher than men in early midlife. Still, gender differences were attenuated after menopause, especially in encoding and retrieval, with differences in storage and consolidation remaining.
Concerning education, previous results are contradictory: while some studies reported a significant positive effect similar to our findings (Papp et al., 2014; Vila-Castelar et al., 2019), other authors failed to find this association (Amariglio et al., 2012; Kormas et al., 2018). This apparent inconsistency is possibly due to methodological issues and cultural differences between the samples used. Along these lines, Rubiño & Andrés (2018) highlight that the effect of age and educational level should be examined experimentally and verified in future studies using appropriate versions of the test in different samples.
Face-occupation component scores were higher than face-name component scores. It has been suggested that the face-name task is more sensitive for detecting early changes in abnormal aging because it is a more demanding and ecologically relevant task in older adults. It has also been related to higher brain Aβ deposition in healthy individuals with subjective cognitive decline (Rentz et al., 2011; Jurick et al., 2018). Specifically, using S-FNAME, worse face-name performance was significantly related to higher amyloid-β deposition in the bilateral Posterior Cingulate Cortex (Sanabria et al., 2018). Rentz et al. (2017) and Rubiño & Andrés (2018) explained that face-name associations require pairing unrelated, abstract, and unique information, which makes them more difficult than face-occupation associations, which involve previously stored semantic knowledge. Finally, we developed valuable normative data. Carrying out standardization studies is exceptionally relevant in clinical neuropsychology and research (Lezak et al., 2004; Peña-Casanova et al., 2009; Alegret et al., 2012; Del Pino, Peña, Schretlen, Ibarretxe-Bilbao & Ojeda, 2015).
S-FNAME is a promising tool for detecting subtle decline in abnormal aging (Ritchie et al., 2017; Loewenstein et al., 2018). Population-specific normative tables for the age range included in this study are useful given the importance of early detection of meaningful preclinical AD changes. Standardizing and validating sensitive tests allows a better understanding of prodromal dementia and the application of timely treatment to prevent disability (Ritchie et al., 2017; Buckley & Pascual-Leone, 2020).

Limitations
Our sample size was satisfactory for psychometric studies (Prieto & Muñiz, 2000; Evers, Sijtsma, Lucassen, & Meijer, 2010) and met the requirements for running factor analyses (Hair et al., 1999; Costello & Osborne, 2005). It was also larger than those used in previous validation studies in Spain, which mentioned sample size and gender distribution as limitations. However, for normative studies, larger samples are recommended to obtain subgroups with a sufficient number of subjects according to the distribution of sociodemographic variables.
Our sample was representative of the current Spanish population in terms of age and gender. However, it is not representative of the educational distribution (Instituto Nacional de Estadística [INE], 2019). Our cohort includes a large proportion of participants with higher education and a small number of individuals who completed only elementary or obligatory secondary school. Therefore, the main limitation of our study is that the educational level of the sample is above the Spanish population average. Del Pino et al. (2015) noted that an unequal distribution and an overrepresentation of highly educated individuals are recurrent in Spanish normative studies. Psychometric studies of other memory tests have also faced such limitations (Speer et al., 2013; Lavoie et al., 2018). Specifically, in previous validation studies of the FNAME Exam, Alegret et al. (2012) and Papp et al. (2014) divided educational level using the same cut-off point (16 years of formal education).
Further, a recent study examining the relationship between educational measures and dementia risk suggests that stratification into high and low education (i.e., tertiary vs. non-tertiary education) revealed the strongest associations.
It is crucial to emphasize that clinicians must be careful when interpreting results derived from these norms in the evaluation of individuals with low educational levels. As Alegret et al. (2012) mentioned, normative research would ideally obtain data from epidemiological sampling; however, when individuals are recruited into studies through referral clinics, bias regarding educational level or socioeconomic status is common. Hence, we strongly recommend that future research stratify samples according to the specific educational levels of the Spanish system.
The sample was obtained from the baseline assessment of the BBHI cohort study and, consequently, is composed of healthy, cognitively unimpaired subjects. A limitation of our psychometric study is the lack of clinical samples that would allow us to determine the sensitivity and specificity of the S-FNAME. Such data are important in the assessment of aging for choosing instruments capable of detecting cognitive impairment. Thus, future studies should target specific patient populations, including Spanish clinical samples, in order to provide such psychometric properties.
Finally, the need for longitudinal studies that provide evidence of the predictive validity of the S-FNAME remains. The BBHI study aims to determine brain health predictors and includes S-FNAME as one possible early marker of changes associated with aging (Cattaneo et al., 2018).