Dissociating Statistically Determined Normal Cognitive Abilities and Mild Cognitive Impairment Subtypes with DCTclock

Objective: To determine whether the DCTclock can detect differences across groups of patients seen in the memory clinic for suspected dementia. Method: Patients (n = 103) were classified into the following groups: cognitively normal (CN), subtle cognitive impairment (SbCI), amnestic cognitive impairment (aMCI), and mixed/dysexecutive cognitive impairment (mx/dysMCI). Nine outcome variables included a combined command/copy total score and four command and four copy indices measuring drawing efficiency, simple/complex motor operations, information processing speed, and spatial reasoning. Results: Total combined command/copy score distinguished between groups in all comparisons with medium to large effects. The mx/dysMCI group had the lowest total combined command/copy scores of all groups. The mx/dysMCI group scored lower than the CN group on all command indices (p < .050, all analyses) and lower than the SbCI group on drawing efficiency (p = .011). The aMCI group scored lower than the CN group on spatial reasoning (p = .019). Smaller effect sizes were obtained for the four copy indices. Conclusions: These results suggest that DCTclock command/copy parameters can dissociate CN, SbCI, and MCI subtypes. The larger effect sizes for command clock indices suggest these metrics are sensitive in detecting early cognitive decline. Additional research with a larger sample is warranted.


INTRODUCTION
The clock drawing test (CDT) is one of the oldest and most widely used neuropsychological tests due to its ease of administration, brevity, and ability to capture a wide range of neuropsychological functions (Cosentino et al., 2004; Libon et al., 1996). The CDT comprises two conditions: producing a drawing to command, followed by copying a model of a clock. Successful performance requires accessing the semantic attributes associated with a clock, the necessary linguistic abilities to translate the command for time setting into the correct graphomotor response, motor operations, spatial reasoning and organization, working memory, and the capacity for mental planning (Cosentino et al., 2004; Libon et al., 1996).
The literature using the traditional paper and pencil CDT is substantial and includes many investigations of its accuracy in detecting cognitive changes in aging and neurodegenerative disease (Hazan et al., 2018). CDT performance has also been shown to distinguish between dementia subtypes and between mild cognitive impairment (MCI) subtypes (Ahmed et al., 2016; Cosentino et al., 2004; Kozora & Cullum, 1994; Libon et al., 1996; Price et al., 2011; Royall et al., 1998). For example, Ahmed et al. (2016) found that CDT errors across MCI subtypes are highly associated with language skills, including naming and verbal concept formation. Prior research using analog scoring methods has shown that while participants with Alzheimer's disease (AD) typically improve from the command to the copy test condition, participants with disproportionate dysexecutive impairment, as seen in vascular dementia (VaD) and Parkinson's disease (PD), often fail to improve. This behavior is thought to be secondary to the impaired frontal systems operations that can typify these disorders (Cosentino et al., 2004; Libon et al., 1996; Price et al., 2011). Indeed, it has been shown that participants with VaD make more errors overall and tend to make the same errors on both command and copy conditions, suggesting an inability to alter mental set as test conditions change (Cosentino et al., 2004; Libon et al., 1996; Price et al., 2011). Analog clock drawing behavior has also been shown to be associated with neuroimaging biomarkers of disease. For example, Shoyama et al. (2011) obtained analog clock drawings to command from young normal controls and assessed brain activity using multichannel near-infrared spectroscopy. These investigators found that total time to completion was correlated with increased prefrontal oxygen hemoglobin recruitment.
Despite its merits and longevity, the traditional CDT poses some challenges as a diagnostic and screening tool. For example, standard 3-, 5-, or 10-point scoring systems tend to capture only a small number of features or errors that might indicate cognitive impairment. Additional problems associated with analog clock drawing scoring systems revolve around the need to establish inter-rater reliability and the time necessary to score protocols. These problems tend to limit how the CDT could be used in settings such as primary medical care to screen for neurocognitive impairment.
Over the past decade, innovations in digital technology have enabled researchers to create a digital clock drawing test (dCDT) that captures a wide range of clock drawing behavior in real time, yielding thousands of variables or features (Müller et al., 2019; Schejter-Margalit et al., 2021; Souillard-Mandar et al., 2016). Recent research has shown that machine learning algorithms using features extracted from the dCDT are able to classify dementia and non-dementia patients into their respective groups (Binaco et al., 2020; Souillard-Mandar et al., 2016; Dion et al., 2020). For example, Binaco et al. (2020) analyzed digital clock drawing features using machine learning algorithms. In this research, neural networks employing an information theoretic feature selection approach achieved the best 2-group classification, at or above 83%, between patients diagnosed with AD versus MCI, between amnestic versus mixed/dysexecutive MCI, and between CN versus amnestic or mixed/dysexecutive MCI subtypes. In another study, Davoudi et al. (2021) extracted digital clock drawing kinematic, time-based, and visuospatial features and examined how well these features could classify AD, VaD, and normal control participants into their respective groups. Optimal area under the curve was achieved using a combination of command and copy variables measuring kinematic (mean pen pressure, ratio of pen pressure to velocity), time-based, and graphomotor features.
Perhaps some of the most innovative and potentially informative data that can be extracted using the dCDT are the variety of time-based parameters. For example, prior research has revealed that the majority of clock drawing time is spent not actually drawing. This behavior, called think time (time spent not putting ink on the test form), has been demonstrated in patients with multiple sclerosis, MCI (Dion et al., 2020), and community volunteers evaluated as part of the Framingham Heart Study (Piers et al., 2017). Digital clock drawing research has also uncovered a number of decision-making latency variables defined by the time elapsed between clock drawing components (Piers et al., 2017). Research shows that these decision-making latencies vary in the command versus the copy test condition (Piers et al., 2017). Another recent validation study in non-demented older adults found that total clock drawing time positively correlated with performance in multiple cognitive domains, while selected decision-making latencies were negatively correlated with performance on many of the same tasks (Dion et al., 2020).
Behavior often seen on the CDT includes the tendency of patients to initiate the drawing of numbers inside the clock face using anchor digits (i.e., the numbers 12, 6, 3, and 9). Lamar et al. (2016) studied cognitively normal (CN) older adults who used this anchoring organizational strategy. Participants using this strategy had better performance on executive and memory tasks and exhibited greater regional integration within the left orbitofrontal and temporal cortices and the right anterior cingulate/right frontal gyrus.
In sum, there is growing support that digital clock drawing metrics aid in the differential diagnosis of cognitive diseases of aging and underlying disruptions in brain function. However, the dCDT (Souillard-Mandar et al., 2016) requires some post-processing. Moreover, normative data are limited. To help make the transition from research to widespread clinical use, it would be useful for a test to require little to no post-processing and to provide clock drawing indices expressed as standard scores measuring the constructs that underlie successful performance. Recently, a dCDT, DCTclock™, has become commercially available as part of the Linus Health platform (https://linus.health). The DCTclock™ builds upon previous dCDT research, capturing metrics previously described in the literature (e.g., 'think time,' spatial organization, drawing size) with machine learning analytics. The DCTclock™ diverges from prior dCDTs by introducing a cloud-based scoring platform requiring no examiner post-processing, four age-adjusted composite scores for both command and copy conditions, and a composite total command/copy score designed to be user friendly and aid clinical interpretation. In a recent paper, Rentz and colleagues (2021) studied a group of CN participants who had amyloid and tau positron emission tomography (PET) imaging and a group of participants with MCI or early AD. Among participants with imaging biomarkers of amyloid and tau, the DCTclock™ total score and spatial reasoning index scores were associated with greater amyloid and tau burden. Despite these interesting findings, there is limited research on these DCTclock™ metrics to date.
The current study sought to further investigate the utility of DCTclock™ generated metrics for distinguishing between statistically defined MCI subtypes. In the current study, the DCTclock™ was administered to memory clinic patients who were classified as presenting with subtle cognitive impairment (SbCI; Edmonds et al., 2015) or MCI using statistically determined criteria (Bondi et al., 2014; Jak et al., 2009). Past research regarding MCI suggests different clinical and pathology outcomes depending on specific MCI subtypes (Schneider et al., 2009). Therefore, a test that is able to dissociate between MCI subtypes would have considerable utility in both primary and specialty care settings. The goal of the current research was to assess differences across participant groups in the total score and composite metrics generated by DCTclock™.

Participants
Participants in the current research (n = 103; 100% White older adults) were patients recruited from the Rowan University, New Jersey Institute for Successful Aging, Memory Assessment Program (MAP). All MAP patients underwent a comprehensive neuropsychological evaluation and were examined by a social worker and board-certified geriatric psychiatrist. A magnetic resonance imaging (MRI) study of the brain and appropriate serum blood tests were obtained to evaluate for reversible causes of dementia. A clinical diagnosis was determined for each patient at an interdisciplinary team conference. Participants diagnosed with MCI presented with subjective cognitive complaints and/or evidence of cognitive impairment relative to age and education, preservation of general functional abilities, and the absence of dementia. Participants were excluded if there was any history of head injury, substance abuse, major psychiatric disorders (including major depression), epilepsy, or B12, folate, or thyroid deficiency. For all participants, a knowledgeable family member was available to provide information regarding functional status. This study was approved by the Rowan University Institutional Review Board with consent obtained consistent with the Declaration of Helsinki.

Neuropsychological Assessment
The neuropsychological protocol used to classify MCI subtype assessed three domains of cognition: executive control, naming/lexical access, and episodic memory. Measures of visuospatial functioning were not assessed or used for MCI subtype classification. From this protocol, nine parameters, three from each neurocognitive domain, were used to classify MCI subtype as described below (Emrani et al., 2018). All test scores were expressed as z-scores derived from normative data.

Executive Control
This cognitive domain was assessed with three tests: the Boston Revision of the Wechsler Memory Scale-Mental Control subtest (Lamar et al., 2002); the letter fluency test (Spreen & Strauss, 1990); and the Trail Making Test-Part B (Reitan & Wolfson, 1985). The dependent variable for the mental control subtest was the total non-automatized accuracy index (see Lamar et al., 2002 for full details). The dependent variables obtained from the letter fluency test and Trail Making Test-Part B were demographically corrected scores provided by Heaton et al. (2004).

Lexical Access/Language
This domain was also assessed with three tests: the 60-item version of the Boston Naming Test (BNT) (Kaplan et al., 1983); a test of semantic ('animals') fluency in which participants were asked to produce as many names of animals as possible in 60 s, excluding perseverations and extra-category intrusion responses (Spreen & Strauss, 1990); and the Wechsler Adult Intelligence Scale-III (WAIS-III) Similarities subtest (Wechsler, 1997). The dependent variables for the BNT and 'animal' fluency tests were standard scores adjusted for age, sex, and race obtained from Heaton et al. (2004). The dependent variable obtained from the WAIS-III Similarities subtest was the age-corrected scale score (Wechsler, 1997).

Memory and Learning
This cognitive domain was assessed with the 9-word California Verbal Learning Test (CVLT)-Mental Status test (Delis et al., 2000). This test was scored and administered using standard instructions. Three CVLT-short form variables were used in the current research including total immediate free recall, delayed free recall, and the delayed recognition measure adjusted for age, sex, and education.

Determination of Clinical Subtypes
Single and Multi-Domain MCI. Jak et al. (2009) criteria were used to determine MCI subtype. Single domain MCI was diagnosed when participants scored >1.0 standard deviation below normative expectations on any two of the three measures within a single cognitive domain. Mixed MCI was diagnosed when participants scored >1.0 standard deviation below normative expectations on any two of the three measures within two or more cognitive domains. Based on these procedures, 21 participants were diagnosed with single domain amnestic MCI (aMCI), 6 participants were diagnosed with single domain dysexecutive MCI, and 22 were diagnosed with mixed or multi-domain MCI. Because of the small number of dysexecutive MCI participants, a combined mixed/dysexecutive (mx/dys) MCI subgroup (n = 28) was constructed. This decision was made based on prior research (Emrani et al., 2018; Eppig et al., 2012; Libon et al., 2011) where mixed/dysexecutive participants presented with similar patterns of impairment on executive tests. Table 1 shows descriptive statistics for neuropsychological performance, demographics, and clinical ratings in each group.
SbCI. Thirty-three of the 54 participants not meeting Jak et al. (2009) criteria for MCI were classified as presenting with subtle cognitive impairment (SbCI) using a modification of Edmonds et al. (2015) criteria. These participants scored >1 SD below the mean on two of the nine neuropsychological measures in different cognitive domains (Edmonds et al., 2015).

Cognitively Normal (CN) Group
Twenty-one participants did not meet criteria for either SbCI (Edmonds et al., 2015) or MCI (Bondi et al., 2014;Jak et al., 2009). One individual presented with some, but very little cognitive impairment, such that only one of the nine neuropsychological parameters was below the 1 SD cutoff. All of these participants were combined into a single group and labeled as presenting with CN.
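The classification rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' scoring code: the function name `classify`, the domain labels, and the dictionary layout are hypothetical, and the SbCI rule is read here as "at least two low scores, each in a different domain."

```python
from typing import Dict, List

CUTOFF = -1.0  # z-score threshold: more than 1.0 SD below normative expectations


def classify(z_scores: Dict[str, List[float]]) -> str:
    """Classify one participant from three z-scores in each of three domains.

    Hypothetical helper: keys are domain names, values are the three
    normative z-scores obtained for that domain.
    """
    # Count how many measures fall below the cutoff in each domain.
    low = {domain: sum(z < CUTOFF for z in zs) for domain, zs in z_scores.items()}
    # A domain counts as impaired when two of its three measures are low.
    impaired = [domain for domain, n in low.items() if n >= 2]
    if len(impaired) >= 2:
        return "mixed MCI"
    if len(impaired) == 1:
        return f"single-domain MCI ({impaired[0]})"
    # No impaired domain: look for low scores spread across different domains.
    domains_with_low = sum(n >= 1 for n in low.values())
    if domains_with_low >= 2:
        return "SbCI"
    return "CN"


example = {
    "executive": [-1.1, 0.0, 0.3],
    "language": [0.0, 0.5, -0.4],
    "memory": [-1.5, 0.2, 0.0],
}
print(classify(example))  # one low score in each of two domains
```

In this sketch a single-domain result for the memory domain would correspond to aMCI and one for the executive domain to dysexecutive MCI; the study then merged the dysexecutive and mixed cases into one mx/dysMCI group.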

The dCDT
DCTclock™ is based on the traditional paper and pencil clock drawing task and was originally designed, with cooperation from the Clock Sketch Consortium, at Lahey Clinic and the Massachusetts Institute of Technology (Souillard-Mandar et al., 2016). It was further developed and licensed for research use by Digital Cognition Technologies Inc., now part of Linus Health, and is cleared by the Food and Drug Administration for cognitive assessment. Participants are presented with a paper test form containing a faint dot pattern and handed a digital pen that looks and functions like a normal pen but contains a camera sensor that captures pen position every 12 ms. The instructions used to administer the DCTclock™ are consistent with traditional CDT administration and include both command and copy test conditions. In the command condition, participants are asked to "draw the face of a clock, put in all numbers, and set the hands for 10 after 11." Upon completion of the command test condition, the copy test condition is administered, whereby participants are asked to copy a model of a clock with hands set for '10 after 11'. The digital pen allows thousands of clock drawing features to be captured and analyzed as a series of timestamped (x,y) coordinates.
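To make the idea of timestamped (x,y) coordinates concrete, the sketch below separates total drawing time into ink time and think time. It is an illustration only and not the DCTclock algorithm: the `(time_ms, x, y)` tuple layout, the function name, and the 24 ms gap threshold (two sampling intervals) are assumptions, as is the premise that samples are recorded only while the pen touches the paper.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (time in ms, x, y); hypothetical layout


def ink_and_think_time(samples: List[Sample], max_gap_ms: float = 24.0):
    """Split elapsed time into ink time and think time, both in ms.

    Assumes samples exist only while the pen is on the paper, so any gap
    between consecutive samples longer than max_gap_ms is treated as a
    pen lift (time spent not drawing).
    """
    ink = think = 0.0
    for (t0, _, _), (t1, _, _) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= max_gap_ms:
            ink += dt      # pen stayed on the paper between the two samples
        else:
            think += dt    # pen was lifted between the two samples
    return ink, think


# Two short strokes separated by a one-second pause.
samples = [(0, 0, 0), (12, 1, 0), (24, 2, 0), (1024, 5, 5), (1036, 6, 5)]
ink, think = ink_and_think_time(samples)
print(ink, think, think / (ink + think))
```

Latencies between specific clock components could be derived the same way once each stroke has been labeled, which is the step the commercial scoring platform automates.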
DCTclock™ produces multiple objective measurements that were derived from approximately 5000 digital clock drawings using machine learning algorithms (Binaco et al., 2020; Davis et al., 2014). Machine learning algorithms were previously developed to calculate meaningful clock scores based on their ability to discriminate performance between thousands of healthy controls and participants from different diagnostic groups including aMCI, AD dementia, PD, and other neurodegenerative disorders (Davis et al., 2014; Souillard-Mandar et al., 2016). The DCTclock™ algorithm and scoring process have been described in detail elsewhere (Rentz et al., 2021). Table 2 contains a description of the 9 DCTclock™ indices used for this analysis.

Statistical Analyses
Hierarchical linear regression models with block-wise predictor entry were constructed to investigate differences among groups on the DCTclock™ composite indices and the total command/copy score. Due to the lack of normative data for education and sex for the composite indices, education and sex were entered into Step 1 of the hierarchical models for these variables to allow for the interpretation of group differences after controlling for variability among these factors. Unlike the composite indices, the DCTclock™ total command/copy score is not adjusted for age, and therefore we entered age, in addition to education and sex, in Step 1 of the hierarchical models using the total command/copy score. In Step 2, dummy coded variables representing between-group differences among the CN and MCI group subtypes were entered into the model. Dummy coding is a method frequently utilized in regression analysis to allow for the coding and incorporation of categorical predictors into the model (see Tabachnick & Fidell, 2013 for details). This coding sets one level of the variable as the reference group, to which all other groups are then compared. In order to obtain a description of all possible group differences, K − 1 dummy codes, where K represents the number of levels of the categorical predictor, were created and included in the regression analysis. The results produced from Step 2 were interpreted to assess for between-group differences after controlling for demographics.
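The two-step procedure described above can be sketched with NumPy. This is a minimal illustration, not the authors' analysis code: the data are synthetic and generated only for the example, and the helper names `dummy_code` and `r_squared` are hypothetical. The block builds K − 1 dummy codes against a reference group, fits Step 1 (demographics) and Step 2 (demographics plus group), and computes the change in R².

```python
import numpy as np


def dummy_code(groups, reference):
    """Return the non-reference levels and their K-1 indicator columns."""
    levels = [g for g in sorted(set(groups)) if g != reference]
    cols = np.array([[1.0 if g == lv else 0.0 for lv in levels] for g in groups])
    return levels, cols


def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


# Synthetic, illustrative data only (not study data).
rng = np.random.default_rng(0)
groups = ["CN", "SbCI", "aMCI", "mx/dysMCI"] * 10
education = rng.normal(15, 2, size=len(groups))
sex = rng.integers(0, 2, size=len(groups)).astype(float)
shift = {"CN": 0.0, "SbCI": -0.3, "aMCI": -0.5, "mx/dysMCI": -1.0}
y = np.array([shift[g] for g in groups]) + rng.normal(0, 1, size=len(groups))

levels, D = dummy_code(groups, reference="CN")  # CN rows are all zeros

step1 = np.column_stack([education, sex])  # Step 1: demographics only
step2 = np.column_stack([step1, D])        # Step 2: add the group dummy codes
r2_step1 = r_squared(step1, y)
r2_step2 = r_squared(step2, y)
delta_r2 = r2_step2 - r2_step1             # variance explained by group beyond demographics
print(levels, round(delta_r2, 3))
```

Re-running the sketch with a different `reference` argument re-expresses the same model as contrasts against another group, which is one way to obtain every pairwise comparison.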

Preliminary Analyses
Four participants with DCTclock™ composite index scores in excess of 3.29 were identified as outliers and removed from analyses (Tabachnick & Fidell, 2013). No violations associated with the ordinary least squares estimator were identified. Descriptive statistics for all nine DCTclock™ parameters can be found in Table 3.
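For readers unfamiliar with the 3.29 criterion, the sketch below shows one common way such a screen is applied: scores are standardized and any case with an absolute z-score above 3.29 is flagged. Whether the study standardized within the sample or relied on the normed index scores is not stated, so the within-sample standardization here is an assumption, as is the helper name `flag_outliers`.

```python
import numpy as np


def flag_outliers(scores, cutoff=3.29):
    """Return a boolean mask marking scores whose |z| exceeds the cutoff.

    Assumption: z-scores are computed within the sample using the
    sample standard deviation (ddof=1).
    """
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean()) / scores.std(ddof=1)
    return np.abs(z) > cutoff


# Nineteen identical scores and one extreme value.
mask = flag_outliers([0] * 19 + [100])
print(int(mask.sum()))
```

Note that in very small samples no case can reach |z| = 3.29, because the largest attainable within-sample z-score is bounded by (n − 1)/√n.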
Table 2. Description of the DCTclock™ indices
Drawing Efficiency: The efficiency the participant demonstrated during the process of drawing the clock. This considers metrics such as total time spent compared to amount of ink used, pen strokes and ink length, size of the drawing, etc.
Simple/Complex Motor Operations: The motor components involved during the process of drawing the clock. This considers metrics including speed and oscillatory motion and can be helpful in parsing out graphomotor concerns.
Information Processing Speed: The ability to process information demonstrated during the process of drawing the clock. This considers metrics including latencies, pauses, and relative time spent thinking (without pen to paper) versus actively drawing.
Spatial Reasoning: The spatial abilities demonstrated during the process of drawing the clock. This considers metrics including geometric and spatial placement of the various properties of the drawing.
Note: Score calculation is automated and cloud-based. Composite and subscale scores are calculated for both command and copy conditions and normed with respect to cognitively healthy individuals. Composite scales and subscale metrics are adjusted for age.

Overall, of the three indices that distinguished aMCI from CN (total score, command information processing, and command spatial reasoning), the greatest effect size was achieved using the command spatial reasoning index (β = −.35). Of the six indices that distinguished mx/dysMCI from CN (total score, command drawing efficiency, command information processing, command spatial reasoning, copy drawing efficiency, copy information processing), the greatest effect size was achieved using the DCT total score and command spatial reasoning index (both β = −.59). None of the copy indices significantly distinguished aMCI from CN participants (p > .05).
Distinguishing mx/dysMCI from aMCI. DCTclock™ measures that significantly distinguished mx/dysMCI participants from aMCI participants included DCT total score (t = −2.82, p = .006), copy drawing efficiency (t = −2.11, p = .037), and copy spatial reasoning (t = −2.54, p = .012). Participants in the mx/dysMCI group scored lower on these three indices compared to those in the aMCI group. Overall, DCT total score and the copy spatial reasoning index score generated the greatest effect size (both β = −.33).

Distinguishing SbCI from MCI and CN
DCTclock™ measures that significantly distinguished SbCI participants from CN participants included the DCT total score (t = 2.85, p = .005), the command condition spatial reasoning index (t = 2.41, p = .018), and the copy condition drawing efficiency (t = 2.09, p = .04) and simple motor indices (t = 2.14, p = .035). Only the command condition spatial reasoning index distinguished SbCI from mx/dysMCI (t = 3.04, p = .003), and none of the scores distinguished SbCI from aMCI.

DISCUSSION
Our findings suggest that DCTclock™ metrics can accurately distinguish between Jak/Bondi neuropsychologically defined clinical MCI subtypes, SbCI, and normal cognitive aging in a memory clinic sample. Critically, while individual index scores varied from one group comparison to the next, the DCTclock™ total score, a single score that aggregates across all command and copy condition metrics, revealed significant differences in performance patterns across groups.
In previous research, Cosentino et al. (2004) found that clock drawing errors in the command condition were associated with overall illness severity and degraded access to semantic knowledge. Errors produced in the copy test condition were associated with dysexecutive difficulty. These findings underscore the complementary, but different, neurocognitive abilities that underlie successful clock drawing. It is very likely that these neurocognitive disabilities contribute to a reduced DCTclock™ total score. More research is necessary to test this supposition. Nonetheless, the current research suggests that the DCTclock™ total score could be a reasonable omnibus measure to screen for many of the important cognitive domains that underlie MCI and SbCI.
It has been suggested that the biological substrate underlying insidious onset AD/VaD spectrum syndromes (Emrani et al., 2021a) has its origin years before clinical symptoms emerge. Thus, there is an urgent need to develop effective and time-efficient tests to screen for emergent neurodegenerative illness. The traditional venue to assess for pre-dementia/dementia illness has been the specialty memory clinic. Yet, the worldwide prevalence of dementia syndromes, such as AD, suggests that screening for dementia should become part of routine primary care. The brevity, ease of administration, automated scoring, and sensitivity to AD biomarkers suggest DCTclock™ could provide the means to screen for neurocognitive impairment in the primary care environment.
In addition to the total score, we found evidence that DCTclock™ index scores that capture more nuanced aspects of clock drawing performance also have utility, particularly for distinguishing between cognitive profiles to inform differential diagnosis. The index score with the best dissociation between CN versus SbCI and Jak/Bondi determined MCI groups in our sample was the command spatial reasoning index, a compilation measuring clock face circularity and the spatial relationships of the components drawn within the clock face (i.e., digits, clock hands). These data suggest that nuanced changes in motor, executive, and visuospatial functioning critical for clock organization and construction may characterize SbCI and distinguish between profiles of early-stage amnestic versus executive cognitive decline. These findings lend empirical support to prior research showing heteromodal ventral stream alterations in normal control participants who did not use anchor digits to organize their clock drawings, thereby displaying less spatial organization/reasoning in their approach. If looked at longitudinally, subtle motor and spatial reasoning deficits may be a harbinger of cognitive impairment given the role of ventral stream visual processing regions in signaling the emergence of SbCI and conversion from normal cognition to MCI and then to AD (Lee et al., 2008; Thomann et al., 2008). The findings of the current study also add to data reported by Rentz and colleagues (2021), who showed that the spatial reasoning index score was associated with greater cerebral amyloid and tau burden. Interestingly, however, their finding was also specific to the spatial reasoning index score but from the copy condition rather than the command condition.
In the current study, copy condition index scores tended to have the most utility when distinguishing SbCI from normal cognition, and distinguishing mixed/dysexecutive MCI from amnestic MCI and normal cognition. Of the various index scores in the copy condition, comparisons of group performance, with the exception of the amnestic MCI versus normal cognition comparison, most consistently differed on the drawing efficiency index. Consistent with the findings reported by Cosentino et al. (2004), these findings might suggest that clock drawing to copy is specifically linked to dysexecutive difficulty. Group differences on copy condition index scores underscore the benefit of this test condition and are consistent with a large corpus of prior clock drawing literature (see Cosentino et al., 2004; Price et al., 2011; Wiggins et al., 2021). However, further research and replication will be needed to parse the differential contributions of the command versus copy conditions for DCTclock™ indices in normal aging and early-stage cognitive decline. Further research is also needed to clarify the potentially [...] (Emrani et al., 2021a, 2021b), resulting in better diagnostic decision-making. Previous research has suggested that participants diagnosed with amnestic MCI may be at greater risk to progress to pathologically confirmed AD (Guillozet et al., 2003; Grundman et al., 2004; Devlin et al., 2021). Patients with mixed or dysexecutive MCI may be expected to revert to a CN state or progress to other dementia syndromes such as frontotemporal dementia, dementia with Lewy bodies, VaD associated with small vessel disease, or depression (Schneider et al., 2009; Ferman et al., 2013; Dugger et al., 2015).
The current research is not without limitations. First, our sample size is modest, overwhelmingly White, and highly educated, which limits the generalizability of our findings. Gathering digital clock protocols from ethnically and racially diverse patients and non-native English speakers is critical. Also, the exact relation between DCTclock™ indices and education needs to be determined. Second, data were collected from self-referrals presenting to a specialized memory and aging program because of memory concerns. As stated above, to maximize the effectiveness of any neurocognitive screening test, data need to be collected in diverse settings such as primary medical care, family medicine, and obstetrics/gynecology, where many women get their primary care. Third, biomarkers such as cerebral or cerebrospinal fluid (CSF) amyloid and tau levels, or brain volumetrics and vascular disease markers, were not available for this analysis. We therefore cannot confirm the distinctness of our Jak/Bondi defined clinical groups, or how these groups relate to neurodegenerative neuropathology or cerebrovascular disease. In this regard, there is a need to gather DCTclock™ data on diverse clinical samples with dementia biomarkers for further validation. Fourth, we acknowledge that other neuropsychological tests/domains of cognitive functioning could have been used for classification of MCI groups. The rationale for using the protocol that we did was based on prior research showing that the specific neuropsychological tests used were able to illustrate key neurocognitive constructs and differentiate between MCI subtypes (Emrani et al., 2018). Moreover, in addition to Jak/Bondi criteria, other means of classifying MCI patients in relation to DCTclock™ performance should be examined. Lastly, the current study lacks the inclusion of test data from the visuospatial functioning domain. This domain is relevant to clock drawing performance and should be examined in the future.
Despite these limitations, the current study contributes to the literature in that this is the first report on the ability of DCTclock™ metrics to distinguish between CN and clinical SbCI/MCI subtypes. The data described above, along with recent findings described by Rentz and colleagues (2021), build upon years of prior digital clock drawing and machine learning research (Binaco et al., 2020; Lamar et al., 2016; Libon et al., 2014; Piers et al., 2017; Souillard-Mandar et al., 2016). Collectively, these data provide evidence for a commercialized and Food and Drug Administration (FDA)-approved digital clock drawing tool, DCTclock™, that has the capacity to leverage this technology for broader clinical and research use. The provision of normative data, automated scoring, and simplified composite metrics from the DCTclock™ system may improve the usability and efficiency of machine learning-based analytics of clock drawing performance. Finally, a tablet-based version of the DCTclock™ has recently been developed by Linus Health. As the field moves further toward tablet-based digital assessment, additional research is needed to investigate and compare the validity of digital pen versus tablet-based approaches to clock drawing assessment.

FINANCIAL SUPPORT
There is no financial support to disclose for this research.