Normative Data for the CERAD-NP for Healthy High-Agers (80–84 years) and Effects of Age-Typical Visual Impairment and Hearing Loss

Objectives: This study aims to establish reference data for nondemented adults between 80 and 84 years of age based on the German version of the Consortium to Establish a Registry for Alzheimer's Disease Neuropsychological (CERAD-NP) test battery and to assess the possible influence of hearing and vision impairments on CERAD-NP performance. Methods: Two hundred one volunteers were examined with the German CERAD-NP test battery, and 18 test scores were calculated from the data. The sample included 99 men (49%), the mean age was 81.8 years (SD = 1.3), and the mean years of education were 13.9 (SD = 3.1). Percentiles for continuous and percentile ranks for discrete test scores were calculated separately for four norm groups. The groups were classified according to gender and education. Multiple regression analysis was used to predict cognitive performance from visual acuity and hearing ability. Results: The normative data obtained were consistent with other findings from younger and older age groups. Worse visual acuity predicted slower performance in the Trail Making Test (TMT). None of the other CERAD-NP tests were correlated with sensory functions. Conclusions: Using age-appropriate reference data, such as those established here for the 80–84 year age group, can help to improve the detection of cognitive decline and prevent biases that arise when old-old adults are compared to younger old adults. Visual acuity should be considered an influencing factor on TMT performance.

While most of these data sets span a wide age range, the usefulness for OA beyond the age of 80 is limited because normative data sets for high-agers are often small and not well balanced according to gender or education (Miller et al., 2015). A few studies have tried to combat this bias by studying the CERAD-NP performance of high-agers specifically. Beeri and colleagues (2006) obtained normative data from a sample of 196 healthy individuals 85 years and over in the USA. Additionally, Luck and colleagues (2009) published data for the age group >75 years in Germany, but only included the memory subtests of the CERAD-NP. Both studies concluded that using norms based on younger cohorts or only small, biased samples of an older age group can lead to more false positives and the subjective interpretation of results (Beeri et al., 2006; Luck et al., 2009).
Especially in the German-speaking area, there is a lack of normative data in this age range (>80 years) beyond the memory subtests. The German version of the CERAD-NP (Memory Clinic Basel (2005)) includes two additional tests (Trail Making Test and Verbal Fluency with S-Words) that measure executive functions (Schmid, Ehrensperger, Berres, Beck, & Monsch, 2014). This version was validated (Aebi, 2002), and normative data are available for the original CERAD-NP tests based on a sample of 1100 Swiss healthy OA (49-92 years). The current normative data for the additional CERAD-NP tests are based on 604 Swiss healthy OA (50-88 years). However, the sample sizes of both these data sets are unequally distributed across age groups. For example, only one person was included for highly educated females in the age group >80 years. Luck et al. (2018) published an update of the CERAD-NP norms based on data obtained between 2011 and 2014, but only included the age range from 60 to 79 years.
This presents a good basis for applying the CERAD-NP to older samples both in Germany and internationally. Considering the increasing number of individuals older than 80 years in western societies and the concomitant increase of patients with dementia (Deutsche Alzheimer Gesellschaft e.V., 2018; Statistisches Bundesamt (Destatis), 2019), the frequency of and need for neuropsychological testing in this age group will continue to grow. Hence, reliable and comprehensive data sets for neuropsychological tests are required. Furthermore, normative data must be updated regularly to account for cohort effects and socio-environmental changes (Dickinson & Hiscock, 2011). Thus, the primary objective of this study is to complement already published data by providing a current and comprehensive normative data set for individuals between 80 and 84 years of age that is based on all CERAD-NP subtests. Our sample can be considered part of the old-old age group, which is used as a term to describe adults between 75 and 85 years old (Boyd & Bee, 2006). As a secondary outcome, we use the same sample to investigate how performance relates to sight and hearing performance.

INFLUENCE OF HEARING AND VISION
Standard cognitive tests almost exclusively use visual and auditory stimuli and oral test instructions. Furthermore, visual impairments like cataracts, glaucoma or macular degeneration, and hearing loss are very common in populations over 80 years of age (Hesse, Eichhorn, & Laubert, 2014; Reitmeir et al., 2017). Even though treatment and support with aids are routinely available, there is still a portion of individuals who do not regularly use their aids (Oberg, Marcusson, Nagga, & Wressle, 2012; Tsai, 2009) or who have impairments that cannot be sufficiently corrected or reversed by treatment (Nowak, 2006). Considering this, epidemiological study samples, as well as individuals receiving neuropsychological testing, probably cover a broad range of sensory functioning, and this must be considered when testing them.
As expected, impairments in visual and hearing ability have been shown to result in poorer performance in the Mini-Mental State Examination (MMSE), part of the CERAD-NP, and other screening tools (Dupuis et al., 2015; Lim & Loo, 2018). To date, the relationship between hearing or vision and CERAD-NP performance in OA has not been clarified in any known studies. This knowledge is needed to better interpret the performance of OA with sensory impairments. Therefore, the secondary aim of this study is to examine the association between corrected visual acuity and corrected hearing ability and CERAD-NP performance.
Cross-sectional and longitudinal data have suggested that a substantial amount of variance in cognition can be explained by the quality of sensory functions (Li & Lindenberger, 2002). This might be because fewer resources are available for cognitive tasks when additional cognitive effort is required for perceptual success in the presence of sensory deficits (McCoy et al., 2005; Wood et al., 2010), the so-called effortfulness hypothesis. Alternatively, the common cause hypothesis assumes that common neuropathological processes account for the changes in sensory and cognitive function (Uchida et al., 2019). A mixture of both explanations is most likely responsible for the strong connections between sensory and cognitive functioning during old age (Li & Lindenberger, 2002).

Sample
The nondemented volunteers were recruited as part of the SENDA (Sensor-based systems for early detection of dementia) study at the Chemnitz University of Technology, Germany. The study was approved by the Ethics Committee of the Chemnitz University of Technology (Faculty of Behavioral and Social Sciences) on December 19, 2017 (V-232-17-KM-SENDA-07112017) and is included in the German Clinical Trials Register (DRKS00013167). Recruitment strategies as well as inclusion and exclusion criteria can be found in Table 1. Among others, the following exclusion criteria were applied here: diagnosed psychological disorders (e.g., major depressive episode, anxiety disorder, substance use disorder) and diagnosed neurocognitive disorders (e.g., delirium, dementia due to Alzheimer's disease, dementia due to vascular disease). Eligibility was determined via telephone interview carried out by a trained study nurse. Furthermore, the face-to-face Montreal Cognitive Assessment (MoCA, Nasreddine et al., 2005) and the MMSE (Folstein, Folstein, & McHugh, 1975) as part of the CERAD-NP were carried out. For more information refer to the SENDA study protocol by Müller et al. (2020).
Between January 2018 and March 2020, 201 volunteers (born between 1933 and 1939, age 80-84 years, M = 81.8, SD = 1.3) were recruited in Chemnitz and its surroundings. This five-year age range was chosen to ensure comparability with other normative data sets (Beeri et al., 2006; Luck et al., 2018) and to prevent biases arising from wide age ranges (Miller et al., 2015). Neither the younger (79 years, n = 8) nor the older (85-91 years, n = 35) participants from the SENDA study were used here because the numbers were deemed too small to be representative. The sample was well balanced according to gender (99 males and 102 females) and included 122 highly educated (>12 years of education) compared to 79 less educated individuals (≤12 years of education). The corrected hearing and visual acuity status of participants was representative of independently living old-old adults. This sample incorporated impairments ranging from normal functioning to moderate, but excluded impairments that would inhibit independent living or activities of daily living. Table 2 contains additional sample characteristics.

Sociodemographic variables
The sociodemographic variables age, gender, and years of education (including school and further professional education) were obtained via a short, structured interview prior to neuropsychological testing. Education was dichotomized into high level of education (>12 years of education) and low level of education (≤12 years of education) according to Welsh et al.'s (1994) classification system.

CERAD-NP
The extended CERAD-NP was carried out by trained staff and strictly followed the manual provided by the Memory Clinic Basel. This included the following tests: Verbal Fluency Animals and S-Words, Boston Naming Test, MMSE, Wordlist Learning, Recall and Recognition, Constructional Praxis Copying and Recall, Trail Making Test (TMT) A and B. Eighteen test scores were calculated from these tests (Table 3). The only change pertained to the presentation of visual stimuli for the Boston Naming Test (pictures) and the Wordlist Learning and Recognition (words). A custom-made LabVIEW 2015 (National Instruments, Austin, TX, USA) script was used to present stimuli in the center of a screen (using the same size and font as the original stimuli) for standardized implementation.

Sensory testing
During testing, participants used the same aids (i.e., glasses and/or hearing devices) they normally use during everyday life. Corrected visual acuity was determined by the Freiburg Visual Acuity Test with Landolt C (Bach, 1996). Participants sat three meters from the screen and completed 18 trials to obtain the logarithm of the minimum angle of resolution (logMAR). This parameter is a measure of visual acuity loss and logMAR scores from 0 to .5 are considered (near) ...

Table 1. Recruitment strategies, inclusion criteria, and exclusion criteria

Recruitment strategies
- Calls for participation via (free) local newspapers
- Calls for participation via the official university website
- Invitation letters sent to 3300 Chemnitz residents in cooperation with the registration office (random selection from addresses with the following criteria: German citizens, age 80-90 years, no nursing homes)
- Word of mouth from already enrolled volunteers

Inclusion criteria
- Age ≥ 80 years (a)
- Independent means of travel to and from the testing facility
- German fluency at native language level

Exclusion criteria
- Medical ban from sports and other strenuous activities
- Diagnosed psychological disorders (e.g., major depressive episode, anxiety disorder, substance use disorder)
- Diagnosed neurocognitive disorders (e.g., delirium, dementia due to Alzheimer's disease, dementia due to vascular disease)
- Montreal Cognitive Assessment < 19
- Permanent impairments due to brain surgery or stroke
- Other neurological diseases (e.g., epilepsy, Parkinson's disease, neuropathy)
- Severe diseases of the respiratory system (e.g., COPD stage 4, severe asthma)
- Severe diseases of the cardiovascular system (e.g., cardiac arrhythmia, heart failure, arterial occlusive disease)
- Severe diseases of the musculoskeletal system (e.g., severe arthritis, orthopedic operations in the last 6 months)
- Diabetes with diagnosed neuropathy
- Substance abuse
- Current participation in other clinical trials

(a) Participants were also included if they turned 80 during the course of the baseline measurements, which included 3 separate testing days.
To quantify corrected hearing performance, one practice list (18) and three test lists (4, 14, 20) from the Freiburg monosyllabic test (part of the Freiburg speech test (Hahlbrock, 1953)) were presented at four sound levels (35 dB, 47 dB, 24 dB, 53 dB) without background noise via headphones (SHARK ZONE H10 Gaming Stereo-Headset, Sharkoon Technologies GmbH, Germany). The same order was used for all participants and the number of correctly repeated words (out of 20) was recorded for each test list. The rate of understanding at the 24 dB sound level was calculated as a percentage because this list displayed the widest range (0-20) and greatest variance (SD = 5.17) of the test lists.

Statistical Analysis
The analysis was done with IBM SPSS Statistics Version 27 (IBM Corp., Armonk, NY, USA). For each CERAD-NP score, a 2 × 2 analysis of variance (ANOVA) with between-subject factors sex (male/female) and education (high/low) was used to determine whether normative data should be calculated for the whole sample or subdivided into different groups. The results indicated that only eight scores were not significantly influenced by either gender or level of education (Table 4). Therefore, all further analyses were done separately for the following groups: (1) males with >12 years of education, (2) males with ≤12 years of education, (3) females with >12 years of education, and (4) females with ≤12 years of education. Mean, standard deviation, minimum, maximum, skew, and kurtosis were calculated for each score, and distributions were tested for normality with Shapiro-Wilk tests.
Percentile ranks (PR) for discrete test scores and percentiles (2.28, 6.68, 10, 15.87, 25, 50, 75, 90) for continuous test scores were calculated because the majority of variables were not normally distributed and therefore did not allow for the calculation of standard norms. Afterward, standard norm equivalents in the form of z-scores were calculated using area transformation (Lienert & Raatz, 1998). The detailed steps are explained in the supplement. PR are only ordinal scales but can be easily interpreted for individual diagnostics, because they show how common an individual's test score is (Crawford, Garthwaite, & Slick, 2009). Z-scores are interval scales that can be used for group statistics and the interpretation of differences (Woerner, Müller, & Hasselhorn, 2017). In addition, they can be transformed into all other commonly used scales such as T or IQ scales by linear transformation.
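The relationship between percentiles, z-scores, and other standard scales can be illustrated with a short sketch. It assumes that the area transformation amounts to applying the inverse standard normal distribution function to a percentile; the authors' exact steps are given in the supplement and may differ in detail. All function names are illustrative and not part of the original analysis.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percentile_to_z(percentile: float) -> float:
    """Convert a percentile (strictly between 0 and 100) to a z-score
    using the inverse cumulative standard normal distribution."""
    if not 0.0 < percentile < 100.0:
        raise ValueError("percentile must lie strictly between 0 and 100")
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def z_to_t(z: float) -> float:
    """Linear transformation to the T scale (mean 50, SD 10)."""
    return 50.0 + 10.0 * z


def z_to_iq(z: float) -> float:
    """Linear transformation to the IQ scale (mean 100, SD 15)."""
    return 100.0 + 15.0 * z


if __name__ == "__main__":
    # Percentiles reported for the continuous test scores
    for p in (2.28, 6.68, 10, 15.87, 25, 50, 75, 90):
        z = percentile_to_z(p)
        print(f"percentile {p:>5}: z = {z:+.2f}, "
              f"T = {z_to_t(z):.1f}, IQ = {z_to_iq(z):.1f}")
```

Under this assumption, the percentiles 2.28, 6.68, and 15.87 correspond to approximately 2, 1.5, and 1 SD below the mean, respectively.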
Table 2 note: ... (Nasreddine et al., 2005), GDS = Geriatric Depression Score (15-item version, Gauggel & Birkner, 1999), SWL = Satisfaction with Life Scale (mean score, Diener, Emmons, Larsen, & Griffin, 1985), CCI = Charlson Comorbidity Index (Charlson, Pompei, Ales, & MacKenzie, 1987). *Health data were only available for n = 189 and all measures were self-reports.

Table 4 note: ... (1,197). The direction of the effect was the same for all significant effects. Females performed better than males and the high education group performed better than the less educated.

In the final phase of analysis, multiple linear regression analyses for the whole sample were carried out with predictors age, gender, and years of education in a first step to control for these demographic variables. Visual acuity and hearing performance were then included as predictors for each CERAD-NP score in order to test whether they were related to performance beyond the effects of the demographic variables. Results were only reported when a significant R² change was obtained from including sensory predictors. Data from one participant were excluded because no visual acuity test data were available.
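The two-step regression logic described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the data are synthetic, the effect sizes in the data-generating model are hypothetical, and the helper names (`ols_r2`, `f_change`) are assumptions introduced here. The sketch computes R² for the demographic model and for the model with added sensory predictors, and derives the F statistic for the R² change; the p-value is omitted to keep the example within the standard library.

```python
import random


def ols_r2(predictors, y):
    """Return R^2 of an ordinary least squares fit (intercept added)."""
    rows = [[1.0] + list(r) for r in predictors]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda i: abs(a[i][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for i in range(k):
            if i != c:
                factor = a[i][c] / a[c][c]
                a[i] = [u - factor * v for u, v in zip(a[i], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def f_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic for the R^2 change when q predictors are added
    (k_full = number of predictors in the full model)."""
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - k_full - 1))


# Synthetic, purely illustrative data (NOT the SENDA data)
rng = random.Random(42)
n = 200
demographics, sensory, tmt_a = [], [], []
for _ in range(n):
    age = rng.uniform(80, 84)
    female = float(rng.random() < 0.5)
    education = rng.gauss(13.9, 3.1)
    logmar = rng.uniform(0.0, 0.5)
    hearing = rng.uniform(0.0, 100.0)
    demographics.append([age, female, education])
    sensory.append([logmar, hearing])
    # Hypothetical model: slower TMT A with larger logMAR
    tmt_a.append(60.0 + 32.0 * logmar - 0.5 * education + rng.gauss(0.0, 15.0))

# Step 1: demographic predictors only; step 2: sensory predictors added
r2_step1 = ols_r2(demographics, tmt_a)
r2_step2 = ols_r2([d + s for d, s in zip(demographics, sensory)], tmt_a)
f_stat = f_change(r2_step1, r2_step2, n=n, k_full=5, q=2)
print(f"R2 step 1 = {r2_step1:.3f}, R2 step 2 = {r2_step2:.3f}, "
      f"F change = {f_stat:.2f}")
```

Because the step 1 model is nested in the step 2 model, R² cannot decrease in step 2; the F statistic indicates whether the increase is larger than expected by chance.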

RESULTS
The ANOVA (Table 4) revealed that the highly educated group performed better at Fluency Animals, MMSE, Constructional Praxis Copying and Recall. There were trends in the same direction for TMT A and TMT B. In addition, females performed better than males in the Wordlist Learning task (List 2, 3, and Total), Wordlist Recall, TMT B, and Fluency S-Words. A trend-level effect in the same direction was found for performance in Fluency Animals, Wordlist Learning List 1, and TMT A. The results for females and males did not differ significantly for any other scores. Due to these results, norms were reported stratified according to gender and level of education.
An overview of the performance in each test score and the distribution of the data in the normative sample can be obtained from Table 5. Data from one person were missing for TMT B because this person did not want to complete it. Data from another person were retrospectively excluded from the analysis for the tasks Wordlist Learning, Recall, and Recognition because a Wordlist Total score of 2 (the next worst score in the overall sample was 9) indicated a lack of motivation during the learning trials. The normative data (subdivided according to sex and education) are presented in detail for each CERAD-NP score in a separate table in the supplement. The data in each table are presented from worse to better scores for easier interpretation. The discussion includes an example of how to use these reference tables.

Influence of Hearing and Vision
For the majority of CERAD-NP scores (16 out of 18), performance was not related to either hearing or visual acuity. However, visual acuity predicted performance in TMT A and B (Table 6). In both cases, worse visual acuity (indicated by larger logMAR) was related to worse task performance (more time for TMT A and B). Hearing performance predicted only TMT B. Again, hearing loss (indicated by fewer correctly repeated words) was associated with deficits in task performance (more time for TMT B). The estimates for the regression coefficients are presented in Table 6 and can be used to derive practical implications. For example, an increase of 1 in the logMAR scale means the time needed for the TMT A increases by approximately 32 s and for TMT B by approximately 86 s.
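The practical reading of these coefficients can be illustrated with a small calculation. The sketch below uses only the approximate values quoted in the text (32 s and 86 s per unit of logMAR) and assumes a linear relationship with all other predictors held constant; the exact estimates are those in Table 6, and the function name is illustrative.

```python
# Approximate unstandardized coefficients quoted in the text
# (seconds of additional completion time per 1.0 logMAR)
SECONDS_PER_LOGMAR = {"TMT A": 32.0, "TMT B": 86.0}


def expected_slowing(delta_logmar: float) -> dict:
    """Predicted change in completion time (in seconds) for a given
    change in logMAR, holding the other predictors constant."""
    return {test: slope * delta_logmar
            for test, slope in SECONDS_PER_LOGMAR.items()}


if __name__ == "__main__":
    for delta in (0.1, 0.3, 0.5):
        change = expected_slowing(delta)
        print(f"logMAR +{delta:.1f}: "
              f"TMT A +{change['TMT A']:.1f} s, TMT B +{change['TMT B']:.1f} s")
```

Under these assumptions, a difference of 0.1 logMAR would correspond to roughly 3 s more for the TMT A and roughly 9 s more for the TMT B.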

DISCUSSION
This study aimed to present normative data for all scores derived from the extended CERAD-NP for a sample of nondemented adults between 80 and 84 years of age. The normative data were presented as PR for discrete scores and as percentiles for continuous test scores and can be used as a reference point for performance of the old-old taking into consideration sex and educational level. Moreover, the effect of visual acuity and hearing on test performance was studied and indicated good robustness towards corrected sensory impairments. Only performance in the TMT was shown to suffer from lower visual acuity.
As shown in many previous studies, the demographic variables education and sex significantly influence CERAD-NP performance (e.g., Beeri et al., 2006; Kirsebom et al., 2019). Higher levels of education positively bias the performance (D. Y. Lee et al., 2004; Luck et al., 2018; Welsh et al., 1994). This was replicated in our sample and our highly educated group performed significantly better than the less educated group in the Fluency Animals, MMSE, Constructional Praxis Copying and Recall tests. The report of sex differences in CERAD-NP performance is not quite as one-sided, but seems to be more in favor of women performing better than their male counterparts of the same age (Beeri et al., 2006; Luck et al., 2018; McCurry et al., 2001). Females in our sample also performed better than males in a number of scores encompassing a wide variety of cognitive functions (language skills, memory, and executive functions). Males did not score significantly better than females in any of the test scores. Taken together, these findings support the use of education- and sex-specific norms in neuropsychological testing, which is already common practice.
The validity of the data set was examined by comparing it with other normative data sets. This is only possible to a limited extent as reports often differ with regard to the exact characteristics of the study sample and calculations of norm values (Woerner et al., 2017). Nevertheless, we used data from Luck et al. (2009) to evaluate our data because age and nationality of both studies matched. In general, good agreement was found between both study samples. Their categorization of educational level with three categories (high, medium, low) differed slightly from our dichotomous categorization (≤12 years vs. >12 years). This dichotomous variable makes our data set comparable internationally (Beeri et al., 2006; Nasreddine et al., 2005; Welsh-Bohmer, Gearing, Saunders, Roses, & Mirra, 1997; Welsh et al., 1994). Only the high and low education groups of Luck et al. (2009) were used for comparison as they had the most overlap with our groups. Scores available for comparison were: Fluency Animals, Wordlist Total, Wordlist Recall, Wordlist Recognition, and Wordlist Savings. Only the lower end of the data distribution was compared as this is decisive for the detection of impairments. Table 7 shows the highest score that is considered at least one standard deviation below the mean. The values were slightly higher in our sample with a 0-2 absolute point difference. One reason for this discrepancy could be the performance advantage of university-based samples compared to community-based samples, as this advantage remains even after controlling for educational level (Andel et al., 2003). University-based samples include volunteers who sign up for longitudinal studies with multiple visits at the university (comparable to the SENDA study). In contrast, community-based samples are recruited directly in the community at senior centers (Andel et al., 2003) or from primary care facilities (Luck et al., 2009).
Furthermore, these differences might be caused by the quality of education, an influencing factor on late-life cognition and health (Barba et al., 2021; Carvalho et al., 2015). Although samples with the same educational level were compared, it is unclear whether the quality of education was also comparable. A second reason for this discrepancy might be the open-ended age range in Luck et al. (2009), which resulted in a maximum participant age of 98 years. Including only a restricted age range decreases the risk of false-positive results for participants at the upper end of the age range. It has also been shown that even small differences in age can lead to significant differences in average performance (Beeri et al., 2006; Miller et al., 2015). For example, a group of 80-84 year-olds performed better than a group of 85-89 year-olds, which again differed from a group of 90-95 year-olds (Miller et al., 2015). Comparison with an adjacent younger age group (75-79 years) from a recent population-based study in Germany (Luck et al., 2018) further supported the validity of our data set. When comparing means and 1 SD cut-offs, the younger group performed better than the present sample (80-84 years) across most scores, which confirms the negative relationship between age and cognitive performance. For Boston Naming, MMSE, Constructional Praxis Copying, Constructional Praxis Recall, and Constructional Praxis Saving, these differences only had a range of one point. In all other scores, the differences were even more pronounced. For example, the performance of a highly educated woman would be considered one standard deviation below the mean in the Fluency Animals score if she named 19 or fewer animals according to Luck et al. (2018). In comparison, the age-appropriate data presented here suggest the same cut-off is at 16 points. This further illustrates the increased risk of false positives when a younger reference group is used, even when the age differences (in this case 75-79 vs. 80-84 years) are relatively small (Beeri et al., 2006; Luck et al., 2018; Miller et al., 2015). An exception was the Fluency S-Words score where across all educational levels the older participants achieved slightly higher scores. It has been suggested that verbal fluency might be less affected by age because it reflects crystallized abilities like vocabulary and knowledge (Beeri et al., 2006).
Last, the data were compared to norms of nondemented volunteers of the directly following age range (85-89 years) from a US study (Beeri et al., 2006). It was expected that our sample would perform similarly or better because of their younger age. Comparing the highest score that was considered at or below the 10th percentile showed that for Boston Naming, MMSE, Wordlist Learning List 1, List 2, List 3, Total and Recall the values were either the same or within one point. For Fluency Animals, Constructional Praxis Copying, TMT A, and TMT B the differences were much more pronounced and always showed worse performance in the older age group. Somewhat surprising is the big drop-off in the Constructional Praxis Copying task (i.e., 10th percentile for highly educated males was ≤5 (Beeri et al., 2006) vs. ≤8 in the SENDA sample). Considering that the testing of the older sample was carried out at participants' homes instead of during lab visits, it is possible that more participants with movement restrictions (including fine motor impairments) were included.
Taken together, the comparisons presented above demonstrate that results from the SENDA study fit well into the previously published data. In addition, these findings are a valuable addition to the existing literature because they included all scores, instead of a small selection, and we provide clinically relevant percentiles (related to 1, 1.5, and 2.0 SD).
The following example illustrates how the normative tables provided in the supplement can be used in practice. For the sample case (woman, 83 years old, 14 years of education) the following performances were recorded: Wordlist Learning Total-17, Wordlist Savings-35%. The Wordlist Total score is discrete and, hence, the number of points (17) must be looked up in the first column. In the same row, we find a PR of 21.3% in the column "Female >12 years education," which means that 21.3% of the reference sample scored the same or fewer points. This is also equal to a z-score of −.8, which indicates that the performance was below average but did not reach the −1.5 SD cut-off usually used to determine mild cognitive impairments. In contrast, the Wordlist Savings score is continuous and must be compared to the numbers given in the column "Female > 12 years education." Looking for the closest number above the score reached (35%), we find 40%, which is equal to the 6.68th percentile and z = −1.50. From this, we know that less than 6.68% of the reference sample performed worse than the sample case and that this test performance is more than 1.5 SD below the reference average indicating impairment in recall performance.
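The lookup procedure in this example can also be expressed as a short sketch. Only two table entries are taken from the worked example above (17 points with a PR of 21.3, and the value of 40% at the 6.68th percentile); all other table values in the code are hypothetical placeholders, and the real values must be taken from the supplementary tables. The function names and the use of the inverse normal distribution to obtain z are likewise assumptions made for illustration.

```python
from bisect import bisect_left
from statistics import NormalDist

# Excerpt for "Female >12 years education". Only the entry for 17 points
# (PR 21.3) is taken from the worked example; a real table has one row
# per possible score.
WORDLIST_TOTAL_PR = {17: 21.3}

# (table value in %, percentile) pairs sorted by value. Only the 40% row
# (6.68th percentile) comes from the worked example; all other values are
# HYPOTHETICAL placeholders.
WORDLIST_SAVINGS = [(25.0, 2.28), (40.0, 6.68), (50.0, 10.0), (60.0, 15.87),
                    (70.0, 25.0), (85.0, 50.0), (95.0, 75.0), (100.0, 90.0)]


def discrete_lookup(score, table):
    """Return (percentile rank, z) for a discrete score.
    z is derived from the PR via the inverse standard normal CDF."""
    pr = table[score]
    return pr, NormalDist().inv_cdf(pr / 100.0)


def continuous_lookup(score, table):
    """Return the percentile of the closest tabulated value at or above
    the score, or None if the score exceeds all tabulated values."""
    values = [value for value, _ in table]
    idx = bisect_left(values, score)
    return table[idx][1] if idx < len(table) else None


if __name__ == "__main__":
    pr, z = discrete_lookup(17, WORDLIST_TOTAL_PR)
    print(f"Wordlist Total = 17: PR = {pr}, z = {z:.1f}")
    print(f"Wordlist Savings = 35%: at or below percentile "
          f"{continuous_lookup(35.0, WORDLIST_SAVINGS)}")
```

For continuous scores, the sketch treats a score equal to a tabulated value as belonging to that percentile, which is one possible reading of "closest number above the score reached".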
A further argument for providing this new data set for individuals 80-84 years of age is that neuropsychological reference data are ideally kept up-to-date to deal with cohort effects and socio-environmental changes that can alter typical test performance (Dickinson & Hiscock, 2011). The Flynn effect describes the phenomenon of generational gains in intelligence testing, which means that later-born cohorts typically have higher levels of fluid intelligence (Flynn, 1987; Skirbekk, Stonawski, Bonsang, & Staudinger, 2013). Similarly, it has been shown that the performance of OA in processing speed, language, executive function, and verbal memory tasks has improved across birth cohorts and that this trend could be ongoing in the future (Dodge et al., 2017; Skirbekk et al., 2013). Therefore, it is important to publish data shortly after data collection and to also include information about birth cohorts, as was done here. Using outdated references could potentially lead to missing cases of cognitive impairment or limit us to only being able to detect them later in the transition to disease.
Some limitations of this sample must be considered. Establishing the dementia-free status was based on self-report (no diagnosed dementia) and neuropsychological screening, but did not include a full clinical assessment. Therefore, the inclusion of as yet undetected cases of dementia cannot be completely ruled out. However, the number of such cases should be minimal because, in addition to the participants' self-reports of clinical diagnoses, performance in the MoCA was used to exclude such cases. Another potential limitation may arise because the birth cohort included in the sample (born between 1933 and 1939) grew up in Germany in the aftermath of World War II (1939-1945). This has been shown to have long-lasting effects on health and lifestyle into old age (Conzo & Salustri, 2019; Havari & Peracchi, 2017). In addition, all participants were current residents of Chemnitz and its surroundings and the vast majority of them had lived in eastern Germany all their lives. From this follows a very distinct difference in the socialization conditions during their working adulthood in the GDR (German Democratic Republic) compared to people who lived in the FRG (Federal Republic of Germany). This may result in the sample not being representative for the whole German population of this age group. Comparisons between East and West German OA have shown that East German women perform better in memory and fluid intelligence tests compared to their West German counterparts (Rupprecht, 2000). It is assumed that this effect is caused by the higher rate of employment for women in the GDR (Rupprecht, 2000). Beyond this, a bias during recruitment cannot be excluded, which probably favored more educated and healthier adults.
The final limitation relates to sample size. Sample sizes of N = 50-75 are considered a sufficient compromise between costs for data acquisition and generalizability for neuropsychological test norms (Bridges & Holler, 2007). The group of men with 12 or fewer years of education is relatively small (n = 24) compared to the other groups, which all meet this recommendation. Other studies also reported problems finding enough male participants with a low educational level (Beeri et al., 2006; Welsh et al., 1994). As there are no current and complete reference values for this age group, this sample must still be considered a valuable expansion of the existing data.

Influence of Hearing and Vision
As a secondary outcome, we were also interested in whether CERAD-NP performance might be related to hearing ability and/or visual acuity, even in a sample of nondemented participants suffering from an age-typical decline in vision and/or hearing. The results indicate that most CERAD-NP subtests are robust regarding the age-related sensory loss found in an old-old age group. This reinforces the good practical application of the test battery. It should be considered that all participants were asked to use vision and hearing aids during testing. Hence, this does not mean that sensory performance per se is irrelevant for test performance. Rather, it suggests that as long as no pathological visual or hearing impairments are present, the tasks can be conducted adequately. Nevertheless, visual acuity predicted the TMT A and TMT B scores in our sample. The TMT is a visual search paradigm, where 25 numbers (TMT A) or 13 numbers and 12 letters (TMT B) are distributed over a sheet of paper and must be connected in the correct order. Hence, the negative effect of visual acuity loss (even when corrected) on performance time is not surprising. This is also in accordance with findings that patients with glaucoma performed worse in the TMT B (S. S. Lee, Wood, & Black, 2020). Therefore, the time needed to perform TMT A and B must be interpreted with caution. Fortunately, the third score TMT B/A, which is the quotient of both times, showed no relationship to visual acuity. As the visual search demands of both conditions are similar, the slowing in both due to visual impairments seems to cancel out. In addition, the TMT B/A was found to be a purer measure of executive functions (Arbuthnott & Frank, 2010) and to be less susceptible to effects of demographics (Christidi, Kararizou, Triantafyllou, Anagnostouli, & Zalonis, 2015). In summary, this supports the utilization of the TMT B/A score.
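Why a quotient can be insensitive to visual slowing may be illustrated with a small sketch. It assumes, purely for illustration, that reduced visual acuity slows both TMT parts by the same proportional factor; this assumption is not tested in the study, and a purely additive slowing would not cancel exactly. The completion times and the function name are hypothetical.

```python
def tmt_ratio(time_a_s: float, time_b_s: float) -> float:
    """Return the TMT B/A quotient from completion times in seconds."""
    if time_a_s <= 0:
        raise ValueError("TMT A time must be positive")
    return time_b_s / time_a_s


if __name__ == "__main__":
    # Hypothetical completion times without visual slowing
    baseline_a, baseline_b = 60.0, 150.0
    # Hypothetical proportional slowing of 20% in both parts
    slowing = 1.2
    slowed_a, slowed_b = baseline_a * slowing, baseline_b * slowing

    print(f"B/A without slowing: {tmt_ratio(baseline_a, baseline_b):.2f}")
    print(f"B/A with slowing:    {tmt_ratio(slowed_a, slowed_b):.2f}")
```

Under the proportional assumption, both single times increase while the quotient stays the same, which is consistent with the observation that TMT B/A showed no relationship to visual acuity.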
Only a single score (TMT B) was associated with hearing ability, although less so than with visual acuity. Worse hearing performance was related to longer times needed for the TMT B. This relationship seems counterintuitive, as there is no hearing involved in solving this task. However, the TMT B is known to be one of the more difficult tasks involving a high cognitive load. As a result, a high number of participants with dementia are unable to complete it (Schmid et al., 2014). Therefore, the relationship with hearing ability could be caused by fatigue, which, according to the effortfulness hypothesis, would be more severe in individuals with worse hearing as they would expend a lot more effort across the whole testing session understanding the oral instructions. The common cause hypothesis may also explain the relationship between hearing ability and task performance in tasks with no auditory stimuli (TMT B), indicating that the sensory and cognitive systems were affected by the same neuropathological processes. In summary, lower sensory performance seems to be a concern for TMT performance, but the alternative scoring option (TMT B/A) can be used instead. It is noteworthy that all participants used glasses as needed and were, in general, considered to have normal to moderate-low corrected vision. The negative effects were present even though participants did not report any problems with the visual stimuli.
To conclude, this data set of nondemented individuals between 80 and 84 years of age presents reference data for the application of the CERAD-NP in this age group in Germany and any population similar to the sample described here. The normative tables presented include all information required to easily evaluate test scores in comparison to the typical performance of this age group, while also taking into consideration sex and educational level. This will help improve the diagnostic process of dementia in old-old age because individuals who should be referred for further diagnostics can be identified. In the future, these references will need to be supplemented by additional normative data sets that include individuals 85 years and older in order to cover the entire age spectrum for neuropsychological testing.