Highlights
• Integrating clinical judgment with objective measures can refine patient selection for shunt surgery in idiopathic normal pressure hydrocephalus (iNPH).
• A more individualized approach better captures patient heterogeneity than rigid cut-offs in iNPH.
• Gait and balance remain robust predictors, with psychomotor speed offering additional insight for clinical judgment in shunt referral.
Introduction
The prevalence of cognitive disorders is increasing, which is a significant public health concern as the global population continues to age. 1 Idiopathic normal pressure hydrocephalus (iNPH) is one such condition, predominantly affecting older adults. According to a systematic review, the prevalence of probable iNPH ranges from 10 to 22 cases per 100,000 individuals across all age groups, increasing to 5.9% among individuals aged 80 years and above. Reference Zaccaria, Bacigalupo and Gervasi2 iNPH is characterized by ventricular enlargement in the presence of normal CSF pressure. Reference Hakim and Adams3 Clinically, iNPH is defined by a triad of symptoms: gait disturbance, cognitive decline and urinary incontinence. Reference Adams, Fisher, Hakim, Ojemann and Sweet4 Timely management of iNPH is imperative, as patients may experience symptom improvement following shunting. However, its clinical features lack specificity, complicating the diagnostic process. Reference Graff-Radford and Jones5,Reference Passos-Neto, Lopes, Teixeira, Studart Neto and Spera6 Radiological signs such as ventriculomegaly are frequently misattributed to brain atrophy associated with normal aging, Reference Jack, Shiung and Gunter7,Reference Moore, Kovanlikaya and Heier8 contributing to the underdiagnosis of iNPH. It is estimated that up to 80% of iNPH cases go unrecognized or remain untreated. Reference Kiefer and Unterberg9
A key diagnostic and prognostic tool for iNPH is the CSF tap test (CSF-TT), which involves the removal of 30–50 mL of CSF via lumbar puncture. This procedure is used to assess potential symptom improvement and to estimate the likelihood of a positive response to shunt surgery. Reference Wikkelsø, Hellström, Klinge and Tans10,Reference Wikkelsø, Andersson, Blomstrand and Lindqvist11 Pre- and post-CSF-TT evaluations generally focus on cognitive and gait performance, with post-CSF-TT improvements being interpreted as indicators of surgical benefit. Reference Nakajima, Yamada and Miyajima12 Cognitive impairments in iNPH typically affect fronto-subcortical functions, Reference Nakajima, Yamada and Miyajima12–Reference Ogino, Kazui and Miyoshi15 whereas gait disturbances commonly involve slowed walking speed and impaired balance. Reference Chunyan, Rongrong and Youping16,Reference Gallagher, Marquez and Osmotherly17 As such, CSF-TT protocols generally include targeted assessments of these functions. However, no consensus currently exists regarding the selection of tests or the definition of significant post-CSF-TT improvement, resulting in marked heterogeneity in clinical practice. Reference Passos-Neto, Lopes, Teixeira, Studart Neto and Spera6,Reference Nakajima, Yamada and Miyajima12,Reference Mihalj, Dolić, Kolić and Ledenko18,Reference Nunn, Jones and Morosanu19 In the absence of standardized criteria, a wide array of tests is employed, some lacking specificity or posing considerable demands on older adults, thereby increasing the burden for both patients and clinicians.
Efforts to define what constitutes a “meaningful improvement” have yielded variable and sometimes arbitrary thresholds. For instance, cognitive improvement has been defined as performance gains of at least one standard deviation on 50% of tests, Reference Duinkerke, Williams, Rigamonti and Hillis20,Reference Thomas, McGirt and Woodworth21 any improvement on at least one cognitive measure Reference Nimni, Weiss, Cohen and Laviv22 or a 3-point increase on the Mini-Mental State Examination. Reference Chunyan, Rongrong and Youping16,Reference Oike, Inoue, Matsuzawa and Sorimachi23 For gait, various cut-off values have been proposed. In responders, average post-CSF-TT improvements of 3.98 s on the Timed Up and Go test, 0.08 m/s on the 10-Meter Walk Test (10MWT) and 5.29 points on the Berg Balance Scale have been reported. Reference Gallagher, Marquez and Osmotherly17 Other studies have suggested that a 23% increase in walking speed may be clinically relevant, Reference Stolze, Kuhtz-Buschbeck and Drücke24 although a threshold as low as 5% has been used. Reference Damasceno, Carelli, Honorato and Facure25
Identifying patients most likely to benefit from shunt surgery remains challenging. A systematic review highlighted that a positive CSF-TT response is a strong predictor of shunt responsiveness (positive predictive value: 92%; specificity: 75%), whereas a negative response fails to reliably rule out potential benefit (negative predictive value: 37%; sensitivity: 58%), Reference Mihalj, Dolić, Kolić and Ledenko18 underscoring the need for more individualized and clinically nuanced interpretation. A key limitation of previous studies is their reliance on absolute score changes or predefined cut-offs, which often overlook the broader clinical context and individual characteristics, such as demographic factors.
The specialized Department of Neurological Sciences at Centre Hospitalier Universitaire de Québec – Hôpital de l’Enfant-Jésus (CHU-HEJ), recognized as a major neurological center in Canada, serves as the reference clinic for neurology in Eastern Quebec, covering the entire region for the assessment and management of iNPH. At this center, a standardized CSF-TT protocol has been in place for more than 15 years. This protocol includes cognitive and gait assessments within a flexible framework that fosters clinical judgment for evaluating patients with suspected iNPH. Rather than relying on statistical thresholds to determine shunt eligibility, referral decisions at CHU-HEJ are informed by a combination of objective test results and clinician expertise, allowing for more individualized and context-sensitive decisions.
The primary objective of this retrospective study was to identify which pre- to post-CSF-TT change indices in cognitive and gait variables, along with patient characteristics, were statistically associated with the likelihood of being referred for shunt surgery following the protocol at CHU-HEJ. Referral status (shunt or no shunt) was determined based on clinical judgment, rather than predefined improvement criteria. The aim was to better understand the clinical variables that most influence decision-making regarding surgery referral. A secondary objective was to compare pre- and post-CSF-TT results in the shunt and no shunt groups. This analysis aimed to identify which cognitive and gait variables were most sensitive to change and whether these measures were aligned with those that influenced referral decisions (Objective 1). Both objectives were designed to be complementary, linking the examination of pre- to post-CSF-TT sensitivity with the variables influencing referral decisions. The analysis of pre- to post-CSF-TT measures follows the framework used in prior iNPH studies, while the investigation of which factors influence referral decisions represents a novel aspect. Together, these complementary objectives clarify the clinical utility of cognitive and gait measures by highlighting their responsiveness and relevance within an integrative clinical framework.
Methods
Study design
This study is a retrospective chart review of 175 patients diagnosed with probable iNPH. Patients were evaluated by a neurologist at CHU-HEJ between January 2010 and May 2022. Inclusion criteria were: (i) meeting the diagnostic criteria for probable iNPH as defined by the 2021 Japanese guidelines for normal pressure hydrocephalus (NPH) Reference Nakajima, Yamada and Miyajima12 (patient records were selected based on their alignment with these updated diagnostic criteria); (ii) being native French-speaking (as French is the predominant and official language of Quebec); and (iii) having completed the CSF-TT protocol at CHU-HEJ. Patients with known factors contributing to secondary hydrocephalus, such as subarachnoid hemorrhage or meningitis, as well as those with congenital NPH, were excluded from the study. The data entry process was conducted in duplicate, and discrepancies were cross-verified to ensure data quality. Ethical approval for the study was obtained from the Comité d’éthique de la recherche du Centre Hospitalier Universitaire de Québec – Université Laval (Project #2021-5509).
All patients were initially evaluated by a neurologist through a comprehensive medical examination, which included a brain MRI procedure and a series of neurological and cognitive screening tests. Brain images were examined for ventricular enlargement indicated by an Evans ratio of 0.3 or higher Reference Evans26,Reference Hashimoto, Ishikawa, Mori and Kuwana27 and for disproportionately enlarged subarachnoid space hydrocephalus. Reference Nakajima, Yamada and Miyajima12 A clinical nurse administered the Montreal Cognitive Assessment (MoCA), which has been validated as a screening tool for mild to severe cognitive impairment. Reference Nasreddine, Phillips and Bédirian28 Based on clinical symptoms and neuroimaging features, patients presenting a clinical profile consistent with possible iNPH were referred to the CHU-HEJ medical day unit for completion of the standardized CSF-TT protocol. Following completion of the protocol, all patients included in the study ultimately met the criteria for probable iNPH according to the 2021 Japanese guidelines. Reference Nakajima, Yamada and Miyajima12 Although neurologists considered potential alternative diagnoses or co-occurring conditions during the clinical evaluation, all patients included in the present study were selected only if iNPH remained the primary working hypothesis in their medical file. Thus, while secondary co-pathologies may have been suspected, they did not outweigh iNPH as the leading diagnostic consideration.
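For readers less familiar with the Evans ratio, it is conventionally computed as the maximal width of the frontal horns of the lateral ventricles divided by the maximal internal diameter of the skull on the same axial slice. The sketch below applies the 0.3 cut-off mentioned above; the function names and the example measurements are hypothetical and are not taken from the study data.

```python
EVANS_CUTOFF = 0.3  # threshold stated in the text (0.3 or higher)


def evans_ratio(frontal_horn_width_mm, inner_skull_diameter_mm):
    """Ratio of maximal frontal horn width to maximal inner skull diameter."""
    if inner_skull_diameter_mm <= 0:
        raise ValueError("inner skull diameter must be positive")
    return frontal_horn_width_mm / inner_skull_diameter_mm


def has_ventriculomegaly(frontal_horn_width_mm, inner_skull_diameter_mm):
    """True when the Evans ratio meets or exceeds the 0.3 cut-off."""
    return evans_ratio(frontal_horn_width_mm, inner_skull_diameter_mm) >= EVANS_CUTOFF


# Hypothetical measurements, for illustration only.
print(round(evans_ratio(42.0, 130.0), 2))  # 0.32
print(has_ventriculomegaly(35.0, 130.0))   # False
```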
The standardized CSF-TT protocol followed a fixed weekly schedule. On Tuesday morning (9 AM), patients completed the pre-neuropsychological assessment, followed by the pre-physiotherapy assessment on Tuesday afternoon (1 PM). The lumbar puncture was performed the next morning on Wednesday at 9 AM. Post-CSF-TT assessments were then conducted the same day: the post-physiotherapy at 11 AM and the post-neuropsychological at 1 PM.
Demographic and clinical data
Age, biological sex and the number of years of formal education were obtained from patients’ medical records at the time of admission. Years of formal education were treated as a continuous variable, contextualized within the Quebec education system: elementary school corresponds to 6 years; high school diploma, 11 years; college diploma, 13 years (14 for technical degrees); bachelor’s degree, 16 years; master’s degree, 18 years; doctoral degree, 21 years; and medical or postdoctoral fellowship, 23 years. When patients held multiple degrees, the highest level was considered. Additionally, the quantity of CSF withdrawn during the CSF-TT protocol and the MoCA score from the initial medical evaluation were obtained from the patient’s medical record.
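The education coding described above can be sketched as a simple lookup. The year values come from the text; the dictionary keys and the function name are hypothetical labels introduced only for illustration.

```python
# Years of formal education by highest completed level (values from the text).
EDUCATION_YEARS = {
    "elementary": 6,
    "high_school": 11,
    "college": 13,
    "college_technical": 14,
    "bachelor": 16,
    "master": 18,
    "doctorate": 21,
    "medical_or_postdoc": 23,
}


def years_of_education(degrees):
    """Return the years associated with the highest level among a patient's degrees."""
    if not degrees:
        raise ValueError("at least one education level is required")
    return max(EDUCATION_YEARS[d] for d in degrees)


# A patient holding both a college diploma and a bachelor's degree is coded as 16.
print(years_of_education(["college", "bachelor"]))  # 16
```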
Neuropsychological assessment
The tests were selected according to evidence-based principles, focusing on psychomotor speed, information processing speed, attention, short-term and working memory and executive functions. Reference Ogino, Kazui and Miyoshi15,Reference Nimni, Weiss, Cohen and Laviv22,Reference Laidet, Herrmann, Momjian, Assal and Allali29,Reference Peterson, Savulich, Jackson, Killikelly, Pickard and Sahakian30 The same battery of tests was administered in a standardized order at both pre- and post-CSF-TT: Grooved Pegboard test, 31 Delis–Kaplan Executive Function System (D-KEFS) (Trail Making test [TMT], Color-Word Interference), Reference Delis, Kaplan and Kramer32 phonemic verbal fluency and categorical verbal fluency, Reference Delis, Kaplan and Kramer32,Reference St-Hilaire, Hudon and Vallet33 Wechsler Adult Intelligence Scale IV (Symbol Search, Coding, Digit Span forward and backward) Reference Wechsler34 and Neuropsychological Assessment Battery (Numbers and Letters subtest, Part A). Reference Stern and White35 A total of 18 neuropsychological variables (raw scores) were selected for analysis. The instruments were administered by qualified neuropsychologists, who followed standard instructions for each test. The 18 retained variables, along with the modifications made to the administration, are presented in Supplementary Table 1.
Physiotherapy assessment
The tests were selected according to evidence-based principles, focusing on deficits in gait and balance that are consistent with the neurological characteristics associated with iNPH. Reference Wikkelsø, Hellström, Klinge and Tans10,Reference Chunyan, Rongrong and Youping16,Reference Gallagher, Marquez and Osmotherly17,Reference Lilja-Lund, Nyberg, Maripuu and Laurell36 The same tests were administered at pre- and post-CSF-TT: 10-Meter Walk Test – normal pace (10MWT-C), Reference Kim, Park, Lee and Lee37 10-Meter Walk Test – Dual Task (10MWT-DT), Reference Lilja-Lund, Nyberg, Maripuu and Laurell36 Timed Up and Go Reference Podsiadlo and Richardson38 and Berg Balance Scale. Reference Berg, Wood-Dauphinee, Williams and Maki39 Four gait/balance variables (raw scores) were selected for analysis (see Supplementary Table 1). The administration of these tests was conducted by qualified physiotherapists, who followed standardized instructions for each test.
Lumbar puncture
The day following pre-CSF-TT, a single lumbar puncture was performed by a neurologist using a 20-gauge needle to remove 30–50 mL of CSF. During the procedure, the neurologist ensured that CSF pressure was within the normal range (≤200 mmH2O). Patients were re-evaluated for gait and cognitive performance 2 hours and 4 hours after CSF removal, respectively. Reference Nakajima, Yamada and Miyajima12
The neuropsychologist and the physiotherapist each compared pre- and post-CSF-TT cognitive and gait performances using raw or standardized scores. Based on these quantitative data, they formulated a clinical judgment of the changes observed in the patient’s functioning. This judgment was informed by the number, magnitude and clinical relevance of the changes. The neurologist then integrated these professional impressions with the broader clinical profile of the patient to make the final decision regarding referral for shunt surgery. Patients were subsequently categorized into two groups: those referred for surgery (shunt) and those not referred (no shunt). Patients in the shunt group who ultimately declined surgery were still considered part of this group in the study.
Statistical analysis
All analyses were performed using SAS® 9.4 software. Descriptive statistics, including mean and standard deviation, were calculated for baseline demographic and clinical variables (age, number of years of education, volume of CSF withdrawn during lumbar puncture, MoCA screening test score). For group comparisons, normality of data distribution was assessed with visual inspection and Shapiro–Wilk tests. Homogeneity of variance was subsequently verified using Levene’s test. Two-tailed independent Student’s t-tests were applied for variables meeting both normality and homogeneity assumptions; otherwise, Mann–Whitney U tests were employed. Categorical variables, such as biological sex, were analyzed using Pearson’s chi-square tests. Statistical significance was defined as p < 0.05. The specific statistical test applied to each variable is detailed in Table 1.
Table 1. Demographic and clinical data: descriptive statistics and comparison between groups

* Significant differences (p < 0.05) are shown in bold.
† Parametric test used (independent Student’s t-test).
‡ Pearson Chi-squared.
Stepwise logistic regression was employed to identify predictors of shunt surgery referral, using pre- to post-CSF-TT cognitive and gait change indices calculated from raw scores as (post − pre)/pre. The binary outcome was referral status: shunt or no shunt. A forward stepwise procedure (p < 0.05 for inclusion and exclusion) was used, with sex, age and years of education entered as covariates. Missing data were handled via multiple imputation (100 datasets, predictive mean matching). The dataset was partitioned into an 80% training set (n = 141, proportional random selection) and a 20% test set (n = 34) for model validation.
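As a minimal sketch of the change index defined above, the function below computes the relative change from raw pre- and post-CSF-TT scores. The function name and the example scores are hypothetical; the sign convention follows the text, so that a positive index reflects improvement for measures where higher is better (walking speed, Berg Balance Scale) and a negative index reflects improvement for timed tasks such as the TMT.

```python
def change_index(pre, post):
    """Relative change from pre- to post-CSF-TT: (post - pre) / pre."""
    if pre == 0:
        raise ValueError("pre-CSF-TT score must be non-zero")
    return (post - pre) / pre


# Hypothetical scores, for illustration only.
berg_index = change_index(pre=40, post=44)  # balance score rises by 10%
tmt_index = change_index(pre=60, post=54)   # completion time (s) falls by 10%

print(berg_index)  # 0.1
print(tmt_index)   # -0.1
```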
Mixed-effects ANOVA models for repeated measures compared pre- and post-CSF-TT cognitive and gait outcomes between the shunt and no shunt groups. These models identified change-sensitive variables and assessed their alignment with referral decisions. Fixed effects included Group, Time and their interaction (Group × Time), with random intercepts for participants nested within groups. Models were estimated using restricted maximum likelihood, with optimal covariance structures selected by the Akaike information criterion and heterogeneous variance terms applied when necessary. Degrees of freedom were adjusted using the Kenward–Roger method. Model assumptions of normality and homogeneity were verified, with robustness checks conducted if violations occurred; linearity of effects was assumed. Significance was evaluated using Type III tests (p < 0.05), with Bonferroni-adjusted post hoc comparisons. Partial eta-squared was reported for significant interactions. Reference Correll, Mellinger and Pedersen40
Results
Demographic and clinical data of patients
A total of 175 patients underwent the iNPH-related cognition and gait/balance assessments before and after CSF-TT. Of these, 119 patients (68%) were referred for shunt surgery (shunt group), whereas 56 patients (32%) showed insufficient clinical improvement after CSF-TT to warrant referral for surgery (no shunt group). Of the patients referred for surgery, 114 underwent shunt placement, while 5 declined the procedure. Postoperative follow-up, which consisted of a brief clinical evaluation by a neurologist to assess overall postoperative improvement, was available for 94 patients, among whom 85 (90.4%) met the criteria for definite iNPH. In the present study, analyses were restricted to pre- and post-CSF-TT data, independent of post-shunt outcomes. Post-shunt outcomes are reported exclusively to document the final diagnostic classification.
Pre-CSF-TT MoCA scores did not significantly differ between the shunt (n = 107; M = 21.4, SD = 3.7) and the no shunt (n = 45; M = 20.2, SD = 4.1) groups; U = 1973.00, p = 0.078. However, as reported in Table 1, the two groups statistically differed in terms of mean age, with the shunt group being younger on average than the no shunt group. The age range in the shunt group varied from 60 to 85 years old, while in the no shunt group, it ranged from 61 to 90 years. Both groups exhibited similar demographic and clinical characteristics for other variables.
Logistic regression
The indices of the predictor variables used in the logistic regression model (Objective 1) are presented in Table 2. A stepwise logistic regression model was conducted to identify predictors of referral for shunt surgery (see Table 3). The model selected the 10MWT-C, TMT Condition 5 and Berg Balance Scale indices as significant predictors. The number of years of education was a significant covariate. Although age and biological sex did not reach statistical significance, they were retained in the model as covariates due to their documented theoretical relevance. Reference Peterson, Savulich, Jackson, Killikelly, Pickard and Sahakian30,Reference Chang, Agarwal, Williams, Rigamonti and Hillis41–Reference Solana, Poca, Sahuquillo, Benejam, Junqué and Dronavalli43
Table 2. Descriptive statistics of indices used in the multivariate logistic regression model

Note: iNPH = idiopathic normal pressure hydrocephalus; M = mean; SD = standard deviation; D-KEFS = Delis–Kaplan Executive Functions System; TMT = Trail Making Test; WAIS-IV = Wechsler Adult Intelligence Scale – Fourth Edition; NAB = Neuropsychological Assessment Battery; 10MWT = 10-Meter Walk Test.
* Indices calculated using this formula: (Post–Pre)/Pre.
† Efficiency is calculated with the formula ((236 – Numbers and Letters Part A errors raw score)/Numbers and Letters Part A speed raw score).
Table 3. Multivariate logistic regression analysis results

Note: CI = confidence interval; 10MWT-C = 10-Meter Walk Test – normal pace; TMT = Trail Making Test.
* Significant p values < 0.05 are in bold.
Table 4. Results of repeated-measures ANOVA for neuropsychological and gait/balance variables

Note: Refer to the supplementary table for the exact number of observations (n) used for each variable and the corresponding descriptive results.
CSF-TT = CSF tap test; D-KEFS = Delis–Kaplan Executive Function System; TMT = Trail Making Test; WAIS-IV = Wechsler Adult Intelligence Scale – Fourth Edition; NAB = Neuropsychological Assessment Battery; 10MWT = 10-Meter Walk Test.
† Efficiency is calculated with the formula ((236 – Numbers and Letters Part A errors raw score)/Numbers and Letters Part A speed raw score).
dfs vary due to missing values on some indicators.
* p < 0.05; ** p < 0.01; *** p < 0.001.
A 10% improvement in walking speed, reflected by a 0.1 unit increase in the relative change index for the 10MWT-C, corresponds to a 55% greater likelihood of referral for shunt surgery (OR = 1.55, 95% CI [1.19, 2.03]). Similarly, a 10% reduction in completion time (reflecting improved performance) on the TMT Condition 5, indicated by a 0.1 unit decrease in its relative change index, is linked to a 32% higher chance of being referred for surgery (OR = 1.32, 95% CI [1.07, 1.64]). Improvements in balance, quantified as a 0.1 unit increase in the Berg Balance Scale relative change index (10% gain from pre-CSF-TT scores), increase the odds of referral by 60% (OR = 1.60, 95% CI [1.09, 2.36]). Each additional year of education is associated with a 13% increase in the odds of being referred for shunt surgery (OR = 1.13, 95% CI [1.00, 1.27]).
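Because the odds ratios above are expressed per 0.1-unit change in each index, they can be rescaled to other magnitudes of change if one assumes, as a logistic model does, that the log-odds are linear in the index. The sketch below illustrates this rescaling; the function name is hypothetical, the odds ratios are those reported in the text, and the result should be read as an adjusted odds ratio with the other model terms held constant. For TMT Condition 5, the `change` argument is the size of the reduction in completion time.

```python
import math

# Odds ratios per 0.1-unit improvement in the change index, as reported in the text.
OR_PER_TENTH = {
    "10MWT-C": 1.55,
    "TMT Condition 5": 1.32,
    "Berg Balance Scale": 1.60,
}


def odds_ratio_for_change(or_per_tenth, change):
    """Odds ratio implied for an arbitrary improvement, assuming linear log-odds."""
    beta_per_unit = math.log(or_per_tenth) / 0.1  # coefficient per full unit of the index
    return math.exp(beta_per_unit * change)


# A 20% gain in walking speed corresponds to two 0.1-unit steps: 1.55 ** 2.
print(round(odds_ratio_for_change(OR_PER_TENTH["10MWT-C"], 0.2), 2))  # 2.4
```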
Because multiple imputation was used to address missing data, a Wald test could not be employed to evaluate the overall model fit. Nevertheless, the model explained 35.0% (Nagelkerke R²) of the variance in referral status for shunt surgery, with a correct classification rate of 76.5%. The model yielded an area under the Receiver Operating Characteristic curve of 0.798, with a sensitivity and specificity of 95.7% and 36.4%, respectively. A probability threshold of 0.378 represents the critical value for decision-making: if the predicted probability exceeds this threshold, the model classifies the patient as referred for shunt surgery.
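The decision rule and the performance metrics above can be illustrated with a short sketch. The threshold is the one reported in the text. The confusion-matrix counts, however, are an assumption: the article does not report them, and they are simply one set of counts on a 34-patient test set that reproduces the reported sensitivity, specificity and classification rate, under the further assumption that these metrics were computed on that test set.

```python
THRESHOLD = 0.378  # probability cut-off reported in the text


def refer(predicted_probability, threshold=THRESHOLD):
    """Classify as referred for shunt surgery when the probability exceeds the cut-off."""
    return predicted_probability > threshold


def metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# ASSUMED counts (not reported in the article): 23 shunt and 11 no-shunt patients.
sens, spec, acc = metrics(tp=22, fn=1, tn=4, fp=7)
print(round(sens * 100, 1), round(spec * 100, 1), round(acc * 100, 1))  # 95.7 36.4 76.5
```

Under these assumed counts, the low specificity corresponds to 7 of 11 non-referred patients being classified as referrals, which is in line with a decision rule tuned to avoid missing candidates.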
Mixed-effects ANOVA
Only significant interactions between group (shunt vs no shunt) and time (pre- and post-CSF-TT) are reported to identify the most change-sensitive variables and assess their concordance with those influencing referral decisions. Significant Group × Time interactions were observed for the Word Reading condition of the D-KEFS Color-Word Interference Test, the 10MWT-C, the 10MWT-DT and the Berg Balance Scale. Overall, gait and balance improvements post-CSF-TT were more pronounced in the shunt group, whereas the no shunt group showed greater gains in information processing speed. Both the 10MWT-C and the Berg Balance Scale not only demonstrated sensitivity to change but also emerged as key variables associated with referral decisions. The primary results are detailed in Table 4. Supplementary Table 2 provides the sample sizes and descriptive statistics for each test, as the number of available data points varies due to missing data inherent to retrospective clinical chart reviews.
For the Word Reading condition of the Color-Word Interference Test, a significant Group × Time interaction was observed, F(1, 59.4) = 4.46, p = 0.039, partial η² = 0.070, indicating a moderate effect size. Reference Norouzian and Plonsky44,Reference Richardson45 Post hoc analyses revealed that at pre-CSF-TT, the shunt group demonstrated significantly faster execution times compared to the no shunt group (p = 0.028). Between pre- and post-CSF-TT, only the no shunt group improved significantly, with a notable reduction in execution time (p = 0.044), while the shunt group showed no significant change (p = 0.598). At post-CSF-TT, the difference between groups was no longer significant (p = 0.207), indicating that the performance gap observed at baseline had diminished.
For the 10MWT-C, a significant Group × Time interaction was observed, F(1, 172) = 25.61, p < 0.001, partial η² = 0.130, indicating a moderate to large effect size. Reference Norouzian and Plonsky44,Reference Richardson45 There was no significant difference between groups at pre-CSF-TT (p = 0.697). The shunt group improved significantly from pre- to post-CSF-TT (p < 0.001), whereas the no shunt group did not show a significant change (p = 0.259). This resulted in higher post-CSF-TT scores for the shunt group compared to the no shunt group, although this difference did not reach the 0.05 significance level (p = 0.060).
For the 10MWT-DT, a significant Group × Time interaction was observed, F(1, 118) = 6.94, p = 0.010, partial η² = 0.056, indicating a moderate effect size. Reference Norouzian and Plonsky44,Reference Richardson45 No significant difference was found between groups at pre-CSF-TT (p = 0.690). Both groups improved significantly from pre- to post-CSF-TT (shunt: p < 0.0001; no shunt: p = 0.005). Post-CSF-TT scores were higher in the shunt group compared to the no shunt group, but this difference was not statistically significant (p = 0.108).
For the Berg Balance Scale, a significant Group × Time interaction was observed, F(1, 164) = 31.60, p < 0.001, partial η² = 0.162, indicating a large effect size. Reference Norouzian and Plonsky44,Reference Richardson45 At pre-CSF-TT, there was no significant difference between groups (p = 0.408). Both groups improved significantly from pre- to post-CSF-TT (p < 0.001), with the shunt group showing a notably greater increase, resulting in higher post-CSF-TT scores compared to the no shunt group (p = 0.001).
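The effect sizes reported above are consistent with the standard conversion from an F statistic to partial eta-squared, η²p = (F × df_effect) / (F × df_effect + df_error). The article does not state which computation was used, so the sketch below is only a consistency check under that assumption; the function name is hypothetical, and the F values and degrees of freedom are those reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported for the Group x Time interactions.
reported = {
    "Word Reading": (4.46, 1, 59.4),
    "10MWT-C": (25.61, 1, 172),
    "10MWT-DT": (6.94, 1, 118),
    "Berg Balance Scale": (31.60, 1, 164),
}

effect_sizes = {
    name: round(partial_eta_squared(*values), 3) for name, values in reported.items()
}
print(effect_sizes)
# {'Word Reading': 0.07, '10MWT-C': 0.13, '10MWT-DT': 0.056, 'Berg Balance Scale': 0.162}
```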
Discussion
The primary objective of this retrospective study was to identify which pre- to post-CSF-TT change indices in cognitive and gait variables, along with patient characteristics, were statistically associated with the likelihood of being referred for shunt surgery. The logistic regression model identified three significant predictors using performance change indices derived from pre- to post-CSF-TT assessments: the 10MWT-C, the TMT Condition 5 and the Berg Balance Scale. These findings suggest that improvements in gait speed, psychomotor speed and balance were most influential in clinical referral decisions. Years of education also emerged as a significant covariate, with higher education associated with an increased likelihood of referral. The second objective of this study was to compare the shunt and no shunt groups to identify the cognitive and gait tests most sensitive to change following the CSF-TT. The findings indicated that the Berg Balance Scale, the 10MWT-C, the 10MWT-DT and the Word Reading condition of the Color-Word Interference test exhibited significant group differences in response to the CSF-TT. This overlap with results from Objective 1 underscores the key role of balance and gait variables in referral decision-making.
The two groups were relatively homogeneous in terms of demographic and clinical characteristics. Age emerged as the sole variable that exhibited a statistically significant difference, with younger patients being more frequently referred for shunt surgery. This pattern likely reflects a clinical tendency to prioritize younger individuals, given their greater potential for functional recovery and lower risk of postoperative complications. Reference Bugalho, Alves and Ribeiro46–Reference Andrén, Wikkelsø, Hellström, Tullberg and Jaraj48 No significant group differences were observed on the MoCA total score. However, in both groups, the mean score was below the widely accepted clinical cut-off for probable cognitive impairment (i.e., scores < 26), Reference Nasreddine, Phillips and Bédirian28 reinforcing the relevance of the MoCA as a screening tool for cognitive deficits in iNPH populations. Reference Wesner, Etzkorn and Bakre49 Furthermore, there was no significant difference in the volume of CSF removed during the tap test between groups, consistent with previous studies indicating that variations within the standardized range (30–50 mL) have no measurable effect. Reference Thakur, Serulle, Miskin, Rusinek, Golomb and George50
Logistic regression is a widely used method for developing predictive tools. However, the principal objective of the present model was not prediction per se but to provide greater insight into the clinical decision-making process at CHU-HEJ. As such, performance metrics are reported here descriptively to facilitate interpretation; they are not used as a basis for diagnostic classification. The model demonstrated relatively high sensitivity (95.65%) but low specificity (36.36%), a pattern that reflects the clinical priority of avoiding under-referral but also underscores the need to improve decision precision. The model explained 35.0% of the variance in referral status, suggesting that important elements of the decision-making process lie outside the variables included in the model. Despite controlling for known influences such as age, biological sex and education, Reference Mitrushina and Satz42,Reference Solana, Poca, Sahuquillo, Benejam, Junqué and Dronavalli43,Reference Lezak, Howieson, Bigler and Tranel51 a substantial proportion of relevant factors remains unaccounted for. Clinician experience, for instance, may influence the interpretation of symptoms and the probability of referral. Experienced professionals may draw on tacit knowledge and subtle clinical cues, which could influence their decisions. Similarly, broader clinical considerations, including vascular Reference Bådagård, Braun, Nilsson, Stridh and Virhammar52,Reference Uchigami, Sato and Samejima53 and neurological comorbidities Reference Malm, Graff-Radford and Ishikawa54 as well as patient functionality, often influence referral decisions in ways that are difficult to quantify. This highlights the necessity for models that incorporate both quantifiable outcomes and experiential clinical expertise.
Nevertheless, the findings of this study suggest that specific domains, particularly gait speed and balance, play a central role in guiding surgical referrals. Both the 10MWT-C and the Berg Balance Scale have been shown to serve as predictors and as variables sensitive to change, consistent with current guidelines Reference Nakajima, Yamada and Miyajima12,Reference Relkin, Marmarou, Klinge, Bergsneider and Black55 and previous studies. Reference Gallagher, Marquez and Osmotherly17,Reference Gallagher, Marquez, Dal and Osmotherly56 Notably, post-CSF-TT gait-related improvements have demonstrated a predictive value greater than 90% for shunt responsiveness, Reference Marmarou, Young and Aygok57 establishing gait and balance as core clinical indicators. The present findings reinforce this evidence: referral decisions at CHU-HEJ appear to rely heavily on these measures. While the 10MWT-DT also showed sensitivity to change, it was not retained in the predictive model. This may reflect the fact that only walking speed was analyzed, without consideration of cognitive performance during the dual task. The cognitive dimension may have been incorporated into the judgment process by clinicians, and the model may have failed to capture this nuance. The two groups demonstrated improvements on the 10MWT-DT, with the shunt group exhibiting more substantial post-CSF-TT gains. This pattern aligns with previous studies emphasizing the utility of dual-task paradigms for identifying responders to CSF drainage, Reference Allali, Laidet, Beauchet, Herrmann, Assal and Armand58,Reference Allali, Laidet and Armand59 particularly when assessments are conducted several days post-procedure. Reference Schniepp, Trabold and Romagna60
As for cognition, only one variable (TMT Condition 5), a measure of psychomotor speed, emerged as a significant predictor in the logistic regression model. Interestingly, this task did not show significant between-group differences in response to CSF-TT, indicating that its inclusion in the predictive model may reflect its influence on clinician judgment rather than overt performance-based change. This finding suggests that while TMT Condition 5 may influence clinical impressions, it may lack sensitivity in detecting improvements through objective, performance-based evaluation. Prior research has emphasized the utility of psychomotor speed tasks in detecting CSF-TT responsiveness, particularly those involving upper limb motor function. Reference Mandir, Hilfiker and Thomas61–Reference Liu, Wei and Dong64 For instance, performance on the Line Tracing Test, a task similar to TMT Condition 5, improved by 12% following CSF-TT among responders, offering a useful benchmark for interpreting change. Reference Tsakanikas, Katzen, Ravdin and Relkin63 The incorporation of alternative speed tasks such as the Line Tracing Test could facilitate the refinement of the assessment protocol and enhance the predictive validity of clinical decision tools.
In contrast, the Word Reading condition of the Color-Word Interference test, a measure of information processing speed involving visuo-oral integration, was the only cognitive measure to demonstrate significant group differences following the CSF-TT. Specifically, the shunt group outperformed the no shunt group at baseline. However, only the no shunt group exhibited a significant improvement post-CSF-TT, resulting in a loss of between-group difference at follow-up. While this pattern is statistically significant, it is clinically unexpected. It may reflect a ceiling effect in the shunt group, limiting their potential for measurable gains, whereas the no shunt group, with slower baseline performance, had more room for improvement.
Overall, these findings highlight the need to refine the cognitive assessment protocols in the context of CSF-TT to improve their sensitivity for detecting cognitive changes. The timing of assessments, 4 hours post-lumbar puncture in the present study, may be a contributing factor, as emerging evidence indicates that cognitive improvements often require more time to manifest. For instance, previous studies have demonstrated significant cognitive gains up to one week post-lumbar puncture using the Mini-Mental State Examination and the Frontal Assessment Battery. Reference Matsuoka, Akakabe, Iida, Kawahara and Uchiyama65 Similarly, other findings reported notable improvements in executive function after two lumbar punctures conducted 24 hours apart. Reference Rocha, Kowacs, Krause, Pizzani, Ramina and Teive66 These findings imply that extending follow-up intervals or implementing repeated assessments may enhance the detection of cognitive change. Such an approach may also increase the sensitivity of dual-task testing, which, as noted above, often relies on subtle cognitive shifts over time. Reference Schniepp, Trabold and Romagna60 The short assessment window used in this study therefore represents a limitation, as some patients who might have benefited from shunt surgery could have been missed. In our clinical context, CHU-HEJ serves the entire eastern region of the province of Quebec, and logistical constraints make it difficult to extend assessments beyond the current schedule without robust scientific evidence. Therefore, longer or repeated follow-ups would be more feasible in a research setting to better determine the optimal timing for post-CSF-TT assessments.
Education level emerged as a significant covariate in the logistic regression model, highlighting its potential influence on the clinical decision-making process. The present study did not set out to assess the direct effect of years of education on test scores; instead, education was evaluated for its predictive value for referral decisions. The impact of education on predicting referral status suggests a potential influence on clinical judgment, possibly through enhanced test performance or cognitive reserve mechanisms. Reference Piche, Armand, Allali and Assal67 A meta-analysis Reference Calamia, Markon and Tranel68 demonstrated that patients with Alzheimer’s disease exhibited minimal practice effects on cognitive tests; however, higher education slightly stabilized performance across repeated administrations. Although patients with iNPH generally show limited practice effects due to cognitive impairments, Reference Solana, Poca, Sahuquillo, Benejam, Junqué and Dronavalli43 education may nevertheless impact cognitive and gait test performance, thereby affecting referral decisions. This finding is significant for the optimization of pre- and post-CSF-TT protocols. Moreover, higher education has been associated with less functional decline 36 months post-shunt, Reference Chen, Xian and Wang69 suggesting a possible protective effect. Thus, further research is warranted to clarify the impact of education on short- and long-term outcomes in iNPH.
Overall, the present study highlights a key point: clinical decision-making for iNPH is not limited to rigid thresholds or absolute performance gains. Rather, it reflects a more integrative process, where changes are interpreted in the broader context of each patient’s clinical portrait. This judgment-based approach, as currently applied at CHU-HEJ, may be more reflective of real-world complexity than approaches relying solely on cut-offs. In this sense, the findings provide empirical support for a flexible model of care that integrates both quantitative findings and clinician expertise.
Strengths and limitations
This study has several strengths. First, it adopts a novel yet practice-oriented approach by examining a long-standing protocol in a real-world clinical setting to gain a deeper understanding of the decision-making process underlying referrals for shunt surgery. Grounded in the realities of patient care, this work helps identify the factors that guide judgment and provides a valuable foundation for improving current practices. Second, the study contributes to the development of more individualized clinical decision-making by using performance indices that account for each patient’s baseline functioning, rather than relying solely on absolute change scores. The inclusion of demographic factors further enhances the ability to tailor interpretations to individual patient profiles.
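The difference between an absolute change score and a baseline-adjusted index can be illustrated with a short sketch. The formula below (percent change relative to the pre-CSF-TT score) is one common choice and is an assumption here; it is not necessarily the performance index used in this study, and the function name and patient values are hypothetical.

```python
def relative_change(pre, post, higher_is_better=True):
    """Percent change from baseline, signed so that positive means improvement."""
    if pre == 0:
        raise ValueError("baseline score must be non-zero")
    change = (post - pre) * 100.0 / abs(pre)
    # For timed tests (e.g. seconds to walk 10 m), a lower score is better,
    # so the sign is flipped to keep "positive = improvement".
    return change if higher_is_better else -change


# Two hypothetical patients, both 2 s faster on a timed walk after CSF-TT.
patient_a = relative_change(pre=10.0, post=8.0, higher_is_better=False)
patient_b = relative_change(pre=20.0, post=18.0, higher_is_better=False)
print(patient_a, patient_b)
```

Both hypothetical patients share the same absolute gain, yet the relative index is twice as large for the patient with the faster baseline, which is the kind of individual context a fixed absolute threshold would ignore.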
This study also has limitations. The retrospective design limited control over the completeness and consistency of the data, which were extracted from clinical records. This real-world context supports ecological validity and demonstrates the feasibility of implementing a structured CSF-TT protocol in clinical practice, but it may have introduced variability in documentation and reduced standardization across cases. Additionally, applying the most recent diagnostic criteria (2021) to older cases dating back to 2010 may have introduced classification bias. Although neurologists considered potential alternative medical explanations and comorbid neurological conditions when establishing the working diagnosis, these conditions were not systematically assessed or recorded in a standardized manner in the clinical files. As a result, some comorbidities or medical conditions may have contributed to cognitive deficits unrelated to iNPH, which could influence the interpretation of our findings. Missing data posed another challenge, particularly for executive function tests, which were often unavailable due to task difficulty in more severely impaired patients (e.g., TMT Condition 4, Color-Word Interference – Inhibition). This highlights the need for assessment protocols that are better adapted to the full range of cognitive severity observed in iNPH. To address the missing data, multiple imputation using predictive mean matching was employed to limit bias and provide plausible estimates. Finally, the absence of post-shunt outcome data precludes direct validation of referral decisions, underscoring the importance of prospective studies with long-term follow-up.
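For readers unfamiliar with predictive mean matching, the following is a deliberately simplified sketch of the idea, not the procedure or software used in this study. It assumes a single predictor, fits one ordinary least-squares line on the observed cases, and fills each missing score with the observed value of a randomly chosen donor among the `k` cases with the closest predicted value. Full implementations also draw the regression parameters anew for each imputed dataset and pool the resulting estimates; both steps are omitted here. All names and data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k=3, rng=None):
    """Return a copy of ys in which each None is replaced by a donor's observed value."""
    rng = rng or random.Random()
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Each donor is stored as (predicted value, actually observed value).
    donors = [(intercept + slope * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is None:
            target = intercept + slope * x
            nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
            y = rng.choice(nearest)[1]  # borrow a real observed score
        completed.append(y)
    return completed


def multiple_imputation(xs, ys, m=5, k=3, seed=0):
    """Generate m completed datasets; they differ only in the donor draws."""
    rng = random.Random(seed)
    return [pmm_impute(xs, ys, k=k, rng=rng) for _ in range(m)]


# Hypothetical data: age as predictor, a timed test score with two missing values.
ages = [68, 71, 74, 75, 78, 80, 83, 85]
scores = [41.0, 45.5, None, 52.0, 58.5, None, 66.0, 70.5]
datasets = multiple_imputation(ages, scores, m=5, k=3, seed=42)
```

Because every imputed value is borrowed from an observed case, the filled-in scores always stay within the range of real data, which is one reason the method is often preferred for skewed or bounded clinical measures.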
Conclusion
Taken together, these findings lay the groundwork for developing individualized predictive models to support decision-making in iNPH. Instead of relying on fixed cut-off scores, logistic regression approaches that incorporate the most informative measures, such as the 10MWT-C, the Berg Balance Scale and either the TMT Condition 5 or the Line Tracing test, may offer a more flexible and clinically relevant alternative. To enhance model accuracy and clinical applicability, current assessment protocols must first be refined, particularly for cognition, where improvements may take longer to manifest. Extending the interval between pre- and post-CSF-TT assessments beyond the standard 4-hour window could help better capture these changes and improve the sensitivity of certain measures. While this study did not aim to create a predictive tool per se, its methodology provides a valuable foundation. Future prospective studies integrating pre- and post-CSF-TT data, along with post-shunt outcomes, are critical for building robust predictive models. Incorporating clinical and demographic variables, such as age, education, sex and vascular comorbidities, would further support a personalized approach by better accounting for patient heterogeneity and guiding more tailored surgical decisions.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/cjn.2026.10544.
Author contributions
Conceptualization: FBM, SC, LV, CH
Methodology: FBM, SC, YN, LV, CH
Project administration: FBM, AN
Analysis: FBM
Supervision: SC, LV, CH
FBM wrote the original draft of the manuscript, and all authors contributed to its review and editing.
Funding statement
FBM was supported by scholarships from the Canadian Institutes of Health Research (187549) and Fonds de Recherche du Québec (https://doi.org/10.69777/313551). This study was funded by the Normal Pressure Hydrocephalus Research Fund of the Clinique Interdisciplinaire de Mémoire at CHU-HEJ.
Competing interests
The authors have no potential conflicts of interest to disclose.



