Visual associative learning to detect early episodic memory deficits and distinguish Alzheimer's disease from other types of dementia

Objective: We investigated how well a visual associative learning task discriminates Alzheimer's disease (AD) dementia from other types of dementia and how it relates to AD pathology. Methods: 3,599 patients (63.9 ± 8.9 years old, 41% female) from the Amsterdam Dementia Cohort completed two sets of the Visual Association Test (VAT) in a single test session and underwent magnetic resonance imaging. We performed receiver operating characteristic curve analysis to investigate the VAT's discriminatory ability between AD dementia and other diagnoses and compared it to that of other episodic memory tests. We tested associations between VAT performance and medial temporal lobe atrophy (MTA), and amyloid status (n = 2,769, 77%). Results: Patients with AD dementia performed worse on the VAT than all other patients. The VAT discriminated well between AD and other types of dementia (area under the curve range 0.70–0.86), better than other episodic memory tests. Six hundred forty patients (17.8%) learned all associations on VAT-A, but not on VAT-B, and they were more likely to have higher MTA scores (odds ratios ranging from 1.63 for MTA 0.5 to 5.13 for MTA ≥ 3, all p < .001) and to be amyloid positive (odds ratio = 3.38, 95%CI = [2.71, 4.22], p < .001) than patients who learned all associations on both sets. Conclusions: Performance on the VAT, especially on a second set administered immediately after the first, discriminates AD from other types of dementia and is associated with MTA and amyloid positivity. The VAT might be a useful, simple tool to assess early episodic memory deficits in the presence of AD pathology.


Introduction
Episodic memory impairment clinically characterizes Alzheimer's disease (AD) (Scheltens et al., 2021). The medial temporal lobe plays a prominent role in episodic memory functioning (Gomar et al., 2017), the perirhinal and hippocampal regions in particular (Gomar et al., 2017; Qin et al., 2009). The medial temporal lobe is also the area most prone to early neuropathological changes observed in AD, most notably the spreading of tau (Cho et al., 2016).
Visual associative learning is an episodic memory paradigm that seems especially dependent on the function of the entorhinal and hippocampal regions of the medial temporal lobe (Barnett et al., 2016; de Rover et al., 2011) and may thus be impaired in early stages of AD. Consequently, visual associative learning has the potential to contribute to accurate and timely diagnosis, which is important for optimal patient management and potential treatment. Several studies have shown that visual associative learning tasks discriminate well between people living with AD and healthy controls (Hicks et al., 2021; Lindeboom et al., 2002), and that performance on these tasks relates to medial temporal lobe functioning (Rombouts et al., 1998; de Rover et al., 2011).
The Visual Association Test (VAT) is a brief visual associative memory task, originally developed in the Netherlands, in which patients are asked to recall unusual associations of line-drawn images (Lindeboom et al., 2022; Lindeboom et al., 2002). While the VAT is usually administered as a single set of images, we propose an alternative administration where a second, different set of images is administered immediately following the first. This might allow investigation of proactive interference, in which the information encoded during an initial task interferes with the ability to encode additional information in a second task (Keppel & Underwood, 1962), and which is a potentially valuable marker of subtle episodic memory deficits. Proactive interference has previously been shown to be greater in individuals with mild cognitive impairment, compared to cognitively normal older individuals (Ebert & Anderson, 2009; Hanseeuw et al., 2010; Loewenstein et al., 2004). The alternative administration method is mentioned in the manual as an experimental method (Lindeboom et al., 2022), for which an evidence base is currently lacking.
In this study, we aimed to investigate the value of this alternative administration for differential diagnosis of AD. First, we investigated the discriminatory ability of the VAT for distinguishing between different types of dementia, including AD, frontotemporal dementia, dementia with Lewy bodies and vascular dementia, and compared it to other episodic memory tasks. Then, we related VAT scores to markers of AD, including medial temporal lobe atrophy (MTA) and amyloid-β accumulation. Based on earlier studies with other memory tasks, we hypothesized that VAT performance would be worse in those diagnosed with AD dementia, compared to other types of dementia, and that more severe MTA and amyloid-β positivity would be associated with poorer performance on the VAT, specifically on a second set administered immediately after a first.

Participants
We included patients from the Amsterdam Dementia Cohort (van der Flier & Scheltens, 2018), who visited the Alzheimer Center Amsterdam between May 2001 and June 2022. All patients underwent extensive screening that included taking patient history, neurological examination, neuropsychological assessment, brain magnetic resonance imaging (MRI), and lumbar puncture. Diagnoses were made in consensus meetings according to diagnostic criteria for mild cognitive impairment (Petersen et al., 2014), AD dementia (McKhann et al., 2011), frontotemporal dementia, including the behavioral and right-temporal subtypes, and frontotemporal dementia associated with amyotrophic lateral sclerosis (Neary et al., 1998; Rascovsky et al., 2011; Ulugut Erkoyun et al., 2020), dementia with Lewy bodies (McKeith et al., 2017), vascular dementia (Roman et al., 1993), primary progressive aphasia (Gorno-Tempini et al., 2011), and other types of dementia. When none of these criteria were met and there were no primary psychiatric disorders, participants were labeled as reporting subjective cognitive decline (Jessen et al., 2014). More details on study procedures are provided elsewhere (van der Flier & Scheltens, 2018).
For the analyses in the current study, we included patients who completed at least one trial on two sets of the VAT in the same assessment session, and who had structural MRI and/or amyloid-β status available. This study was approved by the medical ethical review board of VU University Medical Center and carried out in accordance with the Helsinki Declaration of 1975. Written informed consent was obtained from all patients.

Neuropsychological assessment
All patients underwent a standardized neuropsychological assessment of approximately one hour as part of the routine diagnostic workup that included several tests for attention, speed of processing, executive functioning, visuospatial abilities, language, and episodic memory. The full neuropsychological assessment is described in more detail in van der Flier et al. (2014).

Visual association test (VAT)
In the VAT, patients are presented with six cue cards with line drawings of common objects or animals (e.g., a gorilla) and are asked to name them (Lindeboom et al., 2022). If necessary, naming is aided by the experimenter and responses may be oral, written, drawn, or mimed. Next, association cards with these same drawings are shown in an unusual combination with another object (e.g., a gorilla holding an umbrella). Of note, the subject is merely instructed to name both objects depicted; that is, there is no explicit instruction to memorize what object was associated with each cue. Subsequently, the cue cards are shown again without delay and patients are asked to recall the associated object (Lindeboom et al., 2022). We administered a single set of six images ("VAT-A"), according to established administration guidelines, giving patients up to three opportunities to recall the associated objects (Lindeboom et al., 2022). When the patient recalled all six associations, administration ended (and maximum trial scores of 6 were carried forward to any remaining trials). While current practice is to stop administration after completion of this single set, we then administered a second set ("VAT-B") immediately after completing the first, with the exact same instructions. When the participant had recently completed the VAT elsewhere (<12 months prior to their visit), we administered parallel sets "C" and "D" (n = 261, 7%), with different cue and association cards, in an identical fashion. According to the VAT's manual, all four sets have equal difficulty and discriminatory ability (Lindeboom et al., 2022). Below, we will simply refer to the set administered first as "VAT-A" and the set administered second as "VAT-B".
The number of correctly recalled associated objects was tallied for each trial and tallies for the three trials were summed for each set, with total scores per set ranging from 0 (no association recalled correctly) to 18 (all associations recalled correctly). In some cases (n = 324, 9%), a third trial was not administered, even if not all six associations were learned. In those instances, we carried forward the score on the second trial (results from sensitivity analyses excluding these cases are included in Supplemental Tables 3, 5, and 8). Further, for both sets separately, the score on the third and final trial was used to determine whether a patient successfully learned all associations (trial score of 6, considered normal) or not (trial score of ≤ 5, considered abnormal). Based on this performance, patients were grouped as follows: (1) patients who learned all associations on both sets (VAT-A and VAT-B normal), (2) patients who learned all associations on the first, but not on the second set (VAT-A normal, VAT-B abnormal), and (3) patients who did not learn all associations on either set (VAT-A and VAT-B abnormal).
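The scoring and grouping rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring code: the function names (`score_set`, `classify_patient`) are hypothetical, the handling of a set with a single administered trial below 6 is an assumption (the text only describes carrying forward a score of 6 or the second-trial score), and the label returned for a patient with an abnormal first set but a normal second set is a placeholder, because that pattern is not among the three groups described.

```python
N_PAIRS = 6    # associations per VAT set
N_TRIALS = 3   # maximum number of recall trials per set


def score_set(trial_scores):
    """Return (total_score, learned_all) for one VAT set.

    trial_scores holds the scores (0-6) of the trials that were actually
    administered. Missing trials are filled by carrying the last
    administered score forward: a 6 after early discontinuation, or the
    second-trial score when no third trial was given.
    """
    if not 1 <= len(trial_scores) <= N_TRIALS:
        raise ValueError("expected between 1 and 3 administered trials")
    if any(not 0 <= s <= N_PAIRS for s in trial_scores):
        raise ValueError("trial scores must lie between 0 and 6")
    filled = list(trial_scores)
    while len(filled) < N_TRIALS:
        filled.append(filled[-1])  # carry the last score forward
    total = sum(filled)                 # ranges from 0 to 18
    learned_all = filled[-1] == N_PAIRS  # final trial score of 6 is "normal"
    return total, learned_all


def classify_patient(set_a_trials, set_b_trials):
    """Assign one of the three performance groups described in the text."""
    _, a_normal = score_set(set_a_trials)
    _, b_normal = score_set(set_b_trials)
    if a_normal and b_normal:
        return "VAT-A and VAT-B normal"
    if a_normal:
        return "VAT-A normal, VAT-B abnormal"
    if not b_normal:
        return "VAT-A and VAT-B abnormal"
    # Placeholder: this pattern is not one of the three groups in the text.
    return "VAT-A abnormal, VAT-B normal (not described)"
```

For example, a patient who recalls all six associations on the second trial of the first set (`[3, 6]`) receives a total of 15 for that set, because the 6 is carried forward to the third trial.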

Other episodic memory tests
Other episodic memory tests included in the present study were the Dutch version of the Auditory Verbal Learning Test (AVLT; Saan & Deelman, 1986) and the Rey-Osterrieth Complex Figure Test (RCFT; Meyers & Meyers, 1995). For the AVLT, we used the total score over five trials on the immediate recall (range 0-75) and the total score of the delayed recall (range 0-15), administered after a 15-minute interval during which no other episodic memory tests were administered. For the RCFT, we used the immediate recall after 3 minutes (range 0-36). For both tests, higher scores represent better memory performance.

Statistical analyses
All analyses were run in R version 4.3.1 ("Beagle Scouts"; R Core Team, 2023). Differences between the three VAT performance-based groups in sociodemographic characteristics were tested using analysis of variance with Tukey's HSD correction for multiple testing for continuous data and chi-squared tests for categorical data.
Using VAT total scores, we performed receiver operating characteristic (ROC) analysis and calculated the area under the curve (AUC) for various contrasts of diagnoses, for VAT-A and VAT-B separately. To calculate the combined AUC of VAT-A and VAT-B, we first used logistic regressions with contrasts of diagnoses as dependent variables and VAT-A and VAT-B total scores as independent variables. We then saved the predicted values and entered those in the ROC analysis. We also calculated AUCs for the AVLT (immediate and delayed recall) and RCFT. An AUC between 0.7 and 0.8 was considered acceptable, between 0.8 and 0.9 was considered excellent, and >0.9 was considered outstanding (Hosmer & Lemeshow, 2000).
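To make the AUC and its interpretation bands concrete, the sketch below computes an AUC for a single score as the probability that a randomly chosen patient with AD dementia scores lower than a randomly chosen patient with another diagnosis, counting ties as one half. This is not the authors' R code, and it does not cover the combined VAT-A and VAT-B model, which requires fitting a logistic regression first. The function names and the example scores are made up for illustration, and the treatment of values exactly on a band boundary, as well as the label for values below 0.7, are assumptions.

```python
from itertools import product


def auc_lower_score_is_case(case_scores, control_scores):
    """AUC for a test on which cases are expected to score lower.

    Equals the proportion of (case, control) pairs in which the case has
    the lower score, with ties counted as half a pair.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    concordant = 0.0
    for case, control in product(case_scores, control_scores):
        if case < control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


def interpret_auc(auc):
    """Map an AUC onto the bands used in the text (Hosmer & Lemeshow, 2000)."""
    if auc > 0.9:
        return "outstanding"
    if auc >= 0.8:   # boundary handling is an assumption
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    return "below acceptable"  # label not taken from the text


# Made-up VAT total scores (range 0-18), purely for illustration.
ad_scores = [4, 7, 9, 9, 18]
other_scores = [10, 15, 17, 18, 18]
example_auc = auc_lower_score_is_case(ad_scores, other_scores)
example_label = interpret_auc(example_auc)
```

Because lower VAT scores point toward AD dementia, the comparison runs in the "lower is a case" direction; for a marker on which cases score higher, the inequality would be reversed.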
Ordinal logistic regressions were employed to analyze the relationship between VAT performance in the above-mentioned groups and the average MTA score of both hemispheres. We report odds ratios (OR) and 95% confidence intervals (95%CI), adjusted for sex, age, and education. Models that were additionally adjusted for diagnosis are shown in the Supplementary Material.
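As a reading aid for the reported odds ratios, the sketch below shows how an OR and its 95% confidence interval follow from a logistic regression coefficient and its standard error, and how a standard error can be recovered from a published interval. It assumes Wald-type intervals that are symmetric on the log-odds scale; the text does not state how the intervals were computed, so this is an assumption, and the function names are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Recover the log-odds standard error from a reported Wald interval."""
    if not 0 < lower < upper:
        raise ValueError("expected 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Amyloid odds ratio as reported in the abstract: OR = 3.38, 95%CI = [2.71, 4.22].
implied_se = se_from_ci(2.71, 4.22)
or_value, ci_lower, ci_upper = odds_ratio_ci(math.log(3.38), implied_se)
```

Under the Wald assumption, the rebuilt interval matches the reported bounds after rounding to two decimals, which suggests the published interval is approximately symmetric around log(OR).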
Next, using linear mixed models (LMMs) with random intercepts and slopes for trials, we modeled learning curves over three trials on both sets, and included three-way interactions with set, trial, and average MTA score.
In the subsample of patients for whom amyloid status was determined, we ran ordinal logistic regressions to investigate the relationship between the three VAT performance groups and amyloid status. Within amyloid positive patients, we also modeled learning curves over three trials on both sets, by average MTA score. Finally, in sensitivity analyses, we investigated the influence of using parallel versions on all models by separating patients who completed the original sets (VAT-A and VAT-B) from those who completed the parallel sets (VAT-C and VAT-D). These analyses are reported in the Supplementary Material.
Most patients learned all associations on both sets (n = 2,442, 68%). Six hundred forty patients (18%) learned all six associations on set A, but did not learn all associations on set B, while 517 patients (14%) were unable to learn all associations on either set. Patients who learned all associations on both sets were younger, had more years of education, and had higher GDS scores than those who did not. MMSE, VAT, AVLT, and RCFT scores were highest in patients who learned all associations on both sets and lowest in patients who were unable to learn all associations on either set. Patients in this latter group were more likely to be female (see Table 1). Patients who completed the parallel sets of VAT-C and VAT-D (n = 261, 7%) were more likely to be diagnosed with AD dementia, were more often amyloid positive and had higher MTA scores than patients who completed the original sets of VAT-A and VAT-B. Those who completed the parallel sets were also more likely to have an abnormal performance on both sets (n = 60, 23%) than those who completed the original sets (n = 517, 14%). Below, when describing analyses including all patients, we will refer to the set administered first as set A and the set administered second as set B.
Figure 1 shows the proportions of patients in each of the performance groups, stratified by diagnosis. Fewer than a third of patients with AD dementia (n = 246; 29%) learned all associations on both sets, while this proportion was considerably larger in all other diagnostic groups (see Fig. 1).

Differential diagnosis
Patients diagnosed with AD dementia performed worse on the VAT than all other patients. AUCs were computed to investigate how well the VAT could distinguish AD dementia from other diagnoses. The AUCs are displayed in Table 2 and visualized in Figure 2. The combined VAT-A and VAT-B had an outstanding ability to discriminate between AD dementia and subjective cognitive decline (AUC = 0.93, 95% confidence interval (95%CI) = [0.91, 0.94]), and an acceptable ability to distinguish mild cognitive impairment from AD dementia (AUC = 0.74, 95%CI = [0.71, 0.76]). The immediate recall of the AVLT and RCFT discriminated similarly well between AD dementia and both subjective cognitive decline and mild cognitive impairment, as did the delayed recall of the AVLT (see Table 2).
As displayed in Table 2, both VAT-A and VAT-B separately had an acceptable ability to discriminate AD dementia from other types of dementia, particularly from frontotemporal dementia and primary progressive aphasia. The combination of VAT-A and VAT-B had a slightly better discriminatory ability than either set separately. VAT-A and VAT-B combined were also superior to the AVLT (both immediate and delayed recall) and RCFT in discriminating between different types of dementia.
Discriminatory ability of the VAT was virtually the same when restricted to patients who completed the parallel sets of the VAT, albeit with wider confidence intervals due to the smaller sample. AUCs stratified by VAT version are shown in Supplemental Table 2, and without scores carried forward in Supplemental Table 3.

Relation to Alzheimer's disease pathology
Patients with higher MTA scores were up to five times more likely to learn all associations on the first but not the second set, relative to learning all associations on both sets. Similarly, patients with higher MTA scores were more likely to be unable to learn all associations on either set. With each increase in MTA score, the odds of being unable to learn all associations increased (see Table 3). These associations remained identical among only patients who completed the original sets of VAT-A and VAT-B but were not evident in the group of patients who completed VAT-C and VAT-D. Results stratified by VAT version are reported in Supplemental Table 4, and without scores carried forward in Supplemental Table 5. We also investigated the relationship between performance on the VAT and global cortical atrophy and white matter hyperintensities. More global cortical atrophy, but not white matter hyperintensities, was related to difficulty learning all associations on only the second or on both sets. All results are included in Supplemental Table 6.
LMMs showed that scores on set B were consistently lower across all average MTA scores (all p < .001). Scores on trials 2 and 3 increased less from trial 1 with each successive MTA score, on set B more so than on set A, as derived from the three-way interaction between set, trial, and MTA score. Figure 3 shows learning curves over the three trials of VAT sets A and B, stratified by the average MTA score of the two hemispheres. The learning curve on set B was consistently lower than on set A, with patients who had an MTA of 0 achieving a steeper learning curve on VAT-B than on VAT-A. In patients with higher MTA scores, the learning curves appear almost parallel, with the learning curve on the second set lying lower than on the first set.
Amyloid-β status was available for 2,769 patients (76.9%), 1,379 of whom (49.8%) were amyloid positive. Amyloid positive patients were more likely to learn all associations on set A but not on set B (odds ratio (OR) = 3.38, 95% confidence interval (95%CI) = [2.71, 4.22], p < .001) than to learn all associations on both sets. Likewise, amyloid positive patients were more likely to be unable to learn all associations on either set (OR = 6.35, 95%CI = [4.84, 8.34], p < .001). These findings remained after additional adjustments for diagnosis (see Supplemental Table 7) and without carrying forward scores (Supplemental Table 8). Learning curves by MTA among amyloid positive patients are shown in Supplemental Figure 1.

Discussion
In this study, we investigated the diagnostic value of a short visual associative memory task, the VAT, specifically for distinguishing AD from other types of dementia. The VAT discriminated well between AD dementia and earlier syndromes (SCD and MCI), as well as other types of dementia, just as well as or even better than other episodic memory tests. We showed that, even in patients with relatively little atrophy of the medial temporal lobe, performance on a second set of images presented immediately after the first set was poorer. Together, our findings provide evidence for the good diagnostic value of the VAT, especially when administered as two consecutive short memory tasks.
The total score of VAT-A over three trials had a fair diagnostic accuracy for distinguishing between AD dementia and other types of dementia, with the distinction between AD dementia and dementia with Lewy bodies being the most difficult. Others have shown the favorable diagnostic value of the administration of a single set of the VAT (Lindeboom et al., 2002; Meyer et al., 2016). Meyer et al. (2016) argued that the absence of clear floor effects on the VAT in prodromal AD makes the VAT a suitable instrument for assessing early memory deficits. A previous study showed that a dichotomized score on only the first set was predictive of future progression to dementia (Jongstra et al., 2018). Here we show that the combined discriminatory ability of VAT-A and VAT-B, when administered as two separate sets, improves diagnostic accuracy. Moreover, this administration of the two sets of the VAT had slightly higher diagnostic accuracy than the more elaborate AVLT (both immediate and delayed recall) and the visual RCFT in distinguishing between AD dementia and other types of dementia.
Next, we showed that performance on the VAT also relates to MTA, which is characteristic of AD. Patients who successfully learned the first set of associations but could not learn all associations on the second set were more likely to have more severe MTA than patients who learned all associations on both sets. Indeed, associative learning has been related to hippocampal dysfunction (Collie et al., 2002), which is a hallmark of AD. Previous studies have shown the relationship between performance on associative learning tests and hippocampal atrophy (Miller et al., 2008), as well as activation of the temporal lobe (Rombouts et al., 1998).
Importantly, we demonstrate here that using the standard administration of a single set of six images might lead to the inaccurate conclusion that there is no learning deficit: almost one fifth of our sample learned all associations on the first set of six images but was subsequently unable to learn all associations on the second set. Notably, the image pairs in both sets have been found to be of equivalent difficulty (Lindeboom et al., 2022), so it seems likely that the lower performance on VAT-B is due to some other phenomenon.
The developers of the VAT already hinted at the possibility of eliciting early signs of memory impairment using this alternative administration where two sets of six image pairs are presented separately and immediately following one another. They hypothesized that this phenomenon is driven by proactive interference, in which the information encoded during an initial task interferes with the ability to encode additional information in a second task (Keppel & Underwood, 1962), and which is a potentially valuable marker of subtle episodic memory deficits. Proactive interference has previously been shown to be greater in individuals with mild cognitive impairment, compared to cognitively normal older individuals (Ebert & Anderson, 2009; Hanseeuw et al., 2010; Loewenstein et al., 2004). As such, tasks that can elicit proactive interference may allow for early diagnosis of memory deficits. Indeed, several others have suggested that proactive interference may occur in early disease stages (Hanseeuw et al., 2010; Villeneuve & Belleville, 2012). However, thus far, no empirical evidence has been presented to show that the VAT may be used to elicit early learning deficits potentially due to proactive interference.
Our findings provide the first evidence for the clinical utility of the VAT, administered as two short visual associative memory tasks, as part of the standard neuropsychological assessment for dementia diagnosis. Not only does performance on the VAT discriminate between AD and other types of dementia, but it also relates to AD neuropathological changes. What is more, the VAT also holds several other advantages over traditionally used word list learning tasks. First, the VAT is often perceived as less burdensome by the patient than the intimidating Auditory Verbal Learning Test where they need to learn long lists of words. Second, the VAT can be completed in less time, particularly in those who have no learning deficits, as the administration of a set may be discontinued when all associations have been learned. Third, the VAT relies less heavily on language. Patients with word finding difficulties may use other means to convey that they know what object was associated with the cue card. Last, there is a version of the VAT available that was designed to be more broadly culturally applicable by employing colored pictures instead of black-and-white line drawings (Franzen et al., 2019). This Modified VAT was shown to be better suited for relatively low-educated and non-Western immigrants (Franzen et al., 2019). Together, these characteristics may further support the selection of the VAT for differential dementia diagnostics.
Based on these findings and our clinical experience, we recommend the following administration logic: administer the first set of six images up to three times, until all associations are learned. If, after three trials, three or fewer associations have been learned (e.g., a learning curve of 1-1-2 or 0-2-3), it is not necessary to administer a second set, because the learning deficit is already evident. However, if on the last trial, four or more associations are recalled, administering a second set immediately after might reveal more subtle learning deficits.
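The recommended administration logic can be written as a small decision rule. The sketch below is one possible reading of the recommendation, not a validated clinical algorithm: the function name is hypothetical, and raising an error when the first set has not yet been completed is a design choice made for this illustration.

```python
def should_administer_second_set(first_set_trials):
    """Decide whether a second VAT set is worth administering.

    first_set_trials holds the scores (0-6) of the trials administered on
    the first set, in order. The first set is complete once all six
    associations are recalled or three trials have been given.
    """
    if not 1 <= len(first_set_trials) <= 3:
        raise ValueError("expected between 1 and 3 administered trials")
    last_score = first_set_trials[-1]
    if last_score == 6:
        # All associations learned; the first set may have ended early.
        return True
    if len(first_set_trials) < 3:
        raise ValueError("first set not finished: administer another trial")
    # After three trials: four or more recalled suggests a possible subtle
    # deficit worth probing; three or fewer is already an evident deficit.
    return last_score >= 4
```

With the learning curves mentioned in the text, `[1, 1, 2]` and `[0, 2, 3]` both return `False`, whereas a curve such as `[3, 4, 5]` returns `True`.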
In our study, we administered a parallel version of the VAT (sets C and D) when patients were referred to our clinic for a second or third opinion and had recently undergone neuropsychological assessment prior to their visit to our memory clinic. As a result, these patients were more likely to be diagnosed with AD dementia, probably because they were further along the disease trajectory. Previous research showed no significant difference between sets A and C, and sets B and D, albeit in a small sample (Lindeboom et al., 2022). Based on this finding, we did not make a distinction between VAT-A/B and VAT-C/D in our main analyses. However, we found that the association between performance on VAT-C/D and MTA and amyloid diminished after adjustments for age, sex, and education, contrary to the association with the performance on VAT-A/B. These findings could suggest that VAT-C and VAT-D might not be suitable as parallel tests. On the other hand, patients in our sample who completed sets C and D may represent a specific group of patients with a complex clinical disease presentation, as they required a second (and sometimes third) opinion from a specialized memory clinic for their diagnostic workup. Future research should reveal whether VAT-C/D can indeed be used as a valid parallel test for VAT-A/B in this experimental administration.
This work has some limitations. First, clinical diagnosis was not completely independent of performance on the VAT, which may have led to incorporation bias. Diagnoses were made in a multidisciplinary consensus meeting where the overall performance on the neuropsychological assessment contributed to the diagnosis of subjective cognitive decline, mild cognitive impairment or (any type of) dementia. As the VAT was only one piece of the puzzle, we believe the degree of circular reasoning is limited; however, we cannot completely rule out circularity. Second, our study sample was relatively young and highly educated, which limits generalizability to older populations and groups of patients who did not have access to formal education. It would be worthwhile to assess this experimental administration using the Modified VAT in diverse populations to determine whether our findings generalize. In addition, a direct comparison between this alternative and the standard administration method is needed to further support the outcomes of the current study. Furthermore, collecting normative data will form the next step toward implementation of this alternative administration in clinical practice. Future work might also employ other visual associative learning tasks to investigate whether this phenomenon is test-specific or broadly evocable.
An important strength of our study was the inclusion of a large sample of well-phenotyped individuals who underwent broad, standardized neuropsychological testing. Furthermore, the VAT has been validated in multiple studies and is already widely used in standardized neuropsychological assessment, and therefore the administration we propose can easily be implemented in clinical care. The proposed alternative administration requires no materials beyond those of the existing test.
In conclusion, the alternative administration of the VAT as two consecutive short memory tasks, a novel approach to administering a widely used test, provides an easy and accessible way to capture AD-related early memory deficits and contributes to differential diagnosis.
Supplementary material. For supplementary material accompanying this paper visit https://doi.org/10.1017/S1355617724000079
Aβ = amyloid-β, APOE = apolipoprotein E, AVLT = Auditory Verbal Learning Test, CSF = cerebrospinal fluid, GDS = Geriatric Depression Scale, IQR = interquartile range, M = median, MTA = medial temporal lobe atrophy, VAT = Visual Association Test, MMSE = Mini-Mental State Examination, P-tau = phosphorylated tau, RCFT = Rey-Osterrieth Complex Figure Test. Data are displayed as mean ± standard deviation, unless stated otherwise. ¹ Percentages in the subgroups represent row percentages.

Figure 1. Proportion of patients with normal performance on both sets of the VAT (in yellow), normal performance on VAT-A, but abnormal performance on VAT-B (in light gray) and abnormal performance on both sets (in dark gray), stratified by clinical stage (top panel) and dementia type (bottom panel). AD = Alzheimer's disease, DLB = dementia with Lewy bodies, FTD = frontotemporal dementia, MCI = mild cognitive impairment, PPA = primary progressive aphasia, SCD = subjective cognitive decline, VAT = Visual Association Test, VD = vascular dementia.

Figure 3. Learning curves over three trials on VAT-A (red) and VAT-B (blue), stratified by MTA score. MTA = medial temporal lobe atrophy, VAT = Visual Association Test.

Table 2. Areas under the curves for distinguishing AD dementia from clinical stages and other types of dementia, with 95% confidence intervals, and contrasts

Table 3. Odds ratios for VAT learning patterns based on MTA scores. MTA = medial temporal lobe atrophy, OR = odds ratio, VAT = Visual Association Test, 95%CI = 95% confidence interval. MTA 0 and learning all associations on both sets served as the reference categories. Models are adjusted for baseline age, sex, and education.