Introduction
Expanded multiplex gastrointestinal panels (GIPs), simultaneously testing over 20 pathogens, are now routinely used for rapid and comprehensive diagnosis of infectious diarrhea. Reference Buss, Leber and Chapin1–Reference Riddle, DuPont and Connor3 However, clinical interpretation of GIP results may be difficult due to the nature of molecular diagnostics. Reference Miller, Binnicker and Campbell4,Reference Shane, Mody and Crump5 Their high sensitivity may lead to the detection of persistent organisms with unclear clinical significance and may not adequately differentiate between viable pathogenic organisms and remnant nucleic acids or non-pathogenic colonizers. Reference Deshpande, Pasupuleti and Patel6–Reference Nicol9
Clinicians may order repeat testing for a variety of reasons: to confirm potential false positives, evaluate new or progressing symptoms, confirm pathogen clearance (“test-of-cure”), or fulfill public health reporting for outbreak-prone organisms. Reference Mandelia, Procop, Richter, Worley, Liu and Esper7,Reference Hitchcock, Hogan, Budvytiene and Banaei8,Reference Park, Hitchcock, Gomez and Banaei10 Multiple clinical society guidelines have cautioned against repeat stool testing for Clostridioides difficile within 7 days, citing low yield and high false-positive rates. Reference Miller, Binnicker and Campbell4,Reference Kelly, Fischer and Allegretti11,Reference McDonald, Gerding and Johnson12 However, similar guidance for GIPs is lacking, leading to variability in testing practices and a gap in knowledge about the clinical utility of early repeat testing. One single-center study of 106 patients with repeat GIP testing demonstrated that follow-up testing within four weeks rarely led to a change in results. Reference Park, Hitchcock, Gomez and Banaei10 Another single-center study of 228 patients identified a repeat testing rate of 1.8% within four weeks. Reference Saif, Dooley, Baghdadi, Morgan and Coffey13 However, these studies were limited by small sample sizes and may not accurately identify differences between repeat tests across a large 4-week testing interval.
The objectives of our study were to 1) characterize patients receiving repeat GIP testing within 14 days of an index GIP, 2) evaluate the diagnostic yield of repeat testing, 3) evaluate pathogen persistence on repeat testing, and 4) identify which clinical settings contributed most to repeat testing. We hypothesized that repeat testing within 14 days would rarely detect new pathogens and that persistent positivity would be common among initially positive patients. Our study aims to inform evidence-based policies around GIP utilization and diagnostic stewardship.
Methods
Study population and data extraction
We conducted a multicenter retrospective cohort study including adults (≥18 years) tested with a GIP within an integrated health system from January 1, 2019, through March 1, 2024. This healthcare system comprised 12 hospitals and associated outpatient centers across Ohio and Florida. GIPs were defined as completed orders for the BioFire FilmArray Gastrointestinal Panel® (bioMérieux, France). Demographics, testing location, sample collection date, comorbidities, and microbiology results were extracted from electronic health records. High-risk patients were defined as patients with ≥1 of 10 predefined chronic conditions identified through International Classification of Diseases (ICD-10) codes documented prior to or at the time of testing, as previously described Reference Shu, Wang and Misra14 (Supplementary Table S1). Low-risk patients were those with none of these 10 predefined chronic conditions. Comparative epidemiology of diarrheal organisms detected by GIP in this cohort, as well as overall GIP yield, were previously described by Shu et al. Reference Shu, Wang and Misra14 Encounter setting was identified by the department ordering the GIP and classified as inpatient, outpatient, or emergency department. Ordering providers were identified by National Provider Identifier (NPI) and provider specialties were extracted from National Plan and Provider Enumeration System (NPPES) databases. Reference Bindman15 Manual chart review of 50 randomly selected patients was conducted to validate data accuracy.
Laboratory testing practices
The BioFire FilmArray GIP simultaneously tests for 21 pathogens (12 bacteria, 5 viruses, 4 parasites). Reference Buss, Leber and Chapin1 We analyzed the first diarrheal case (test interval spanning 14 calendar days) per patient, excluding cases with invalid/missing results and Clostridioides difficile results. C. difficile results on the GIP were excluded from analysis due to variations in laboratory reporting and preexisting guidelines discouraging C. difficile repeat testing. Repeat testing was defined as a second GIP completed within 14 days of an index GIP (calendar days between sample collection), excluding confirmatory testing conducted on the same stool sample. For patients with multiple repeat tests, only the test with the largest testing interval within 14 days was analyzed. GIP testing was ordered at the discretion of treating clinicians, with no institutional policies or restrictions in place. Institutional best practices encourage providers to reserve GIP usage for individuals meeting criteria for testing per IDSA recommendations. Reference Shane, Mody and Crump5 All GIP tests were processed using standard laboratory protocols. 16 Following the January 2024 recall due to norovirus concerns, all norovirus positives were repeat tested on the same platform and only reported as positive if reproducibly positive. 17 These repeat tests done for quality control were not considered a repeat GIP test.
Outcomes and definitions
Positivity was defined as detection of any pathogen on the GIP and all analyses were stratified by results of the index GIP (index negative vs index positive). The primary outcome was diagnostic yield of repeat GIP testing, defined as detection of any new pathogen not detected on the index GIP. The secondary outcome was pathogen persistence, defined as detection of the same organism on both index and repeat testing, among index-positive patients. Antibiotic-treatable organisms were previously defined as bacteria and parasites with guideline-recommended antimicrobial therapy per Infectious Diseases Society of America (IDSA) guidelines: Campylobacter spp., Plesiomonas shigelloides, Salmonella spp., Shigella/EIEC; Vibrio spp. (parahaemolyticus, vulnificus), Vibrio cholerae, Yersinia enterocolitica, Cyclospora cayetanensis, Entamoeba histolytica, and Giardia lamblia. Reference Shane, Mody and Crump5,Reference Clark, Sidlak, Mathers, Poulter and Platts-Mills18
Statistical analysis
Baseline characteristics were presented as median and interquartile range (IQR) for continuous variables and proportions (%) for categorical variables within each testing group (repeat testing vs single testing only). Pearson’s χ2 tests and Wilcoxon rank-sum tests compared baseline characteristics for categorical and continuous variables across testing groups, respectively. We calculated 95% confidence intervals (95% CI) for proportions using the Clopper-Pearson method for primary and secondary outcomes. Number needed to test (NNT) was calculated as the inverse of the diagnostic yield for any organism and organisms warranting antibiotic treatment, with 95% CIs derived from the corresponding confidence limits. Multiple logistic regression analyzed the associations between testing interval and diagnostic yield, controlling for index-result. Heatmap visualization was used to display changes in pathogen detection patterns between index and repeat testing. Care transition patterns were visualized using Sankey flow diagrams to illustrate patient movement between clinical settings. Provider continuity was assessed by matching NPI codes or ordering departments between index and repeat test orders. Analyses were conducted with R version 4.5.1 on a complete-case basis and utilized a 5% significance level. Given the exploratory nature of secondary and subgroup analyses, no adjustments were made for multiple comparisons.
Sensitivity analyses
Sensitivity analyses were performed using 7-, 14-, 28-, and 56-day testing intervals to evaluate the effect of the repeat-testing window on the primary and secondary outcomes. We used a Pearson’s χ2 tests to compare differences in diagnostic yield of repeat testing between “high-risk” and “low-risk” patients, as previously described. Reference Shu, Wang and Misra14 Subgroup analyses were conducted on 1) patients receiving repeat testing with an index test between January 26, 2024 through March 1, 2024, and 2) all repeat testing excluding norovirus results to evaluate the impact of the 2024 BioFire FilmArray GIP recall related to concerns about norovirus false positivity. 17
Results
Baseline characteristics of patients receiving repeat testing
During the study period, 16,502 adult patients underwent GIP testing across 12 hospitals, with an overall detection rate of 22%. Of these, 507 patients (3.1%, 95% CI: 2.8–3.4%) received repeat testing within 14 days of their index test, with a median interval of 6.2 days (IQR: 2.6–9.7 days) between tests.
Of the 507 repeat tests, 415 (82%) were conducted in patients with index-negative results compared to 92 (18%) with index-positive results (Table 1). Among the 92 index-positive patients, 16 (17%) identified multiple pathogens, detecting either two or three organisms. Patients who underwent repeat testing differed significantly from those with single testing across multiple baseline characteristics. Patients with repeat testing were older (median age 69 vs 66 years; P = .003), more likely to be male (46% vs 40%; P = .019), and more likely to have immunocompromising conditions including organ transplant (7.7% vs 3.0%; P < .001), inflammatory bowel disease (9.1% vs 6.1%; P = .005), metastatic cancer (9.1% vs 5.7%; P = .001), and primary immunodeficiency (7.9% vs 5.1%; P = .004).
Baseline characteristics of patients receiving repeat tests compared to single testing only

Index-negative repeat GIPs became positive in 19/415 cases (4.6%, 95% CI: 2.8–7.1%) and index-positive repeat GIPs remained positive in 47/92 cases (51%, 95% CI: 40–62%) (Figure 1).
Positivity of repeat GIPs within 14 days, stratified by index GIP result. Each bar represents the total number of repeat GIP tests ordered per day after index test. Light blue represents positive repeat GIPs, light gray represents negative repeat GIPs.

Primary outcome: diagnostic yield of repeat testing
The overall diagnostic yield of repeat GIP testing was 4.1% (95% CI: 2.6–6.3%) representing 21 of 507 repeat tests detecting new pathogens (Figure 2). Diagnostic yield was not significantly different across testing intervals: ≤3 days (reference), 4–7 days (OR 0.65, 95% CI: 0.16–2.32), and 8–14 days (OR 1.21; 95% CI: 0.45–3.58) (P = .6) or between index-negative (reference) and index-positive tests (OR 0.45, 95% CI: 0.01–1.58, P = .2).
Heatmap of changes in pathogen detection on repeat testing, stratified by index GIP result. Light gray represents no change (0%). Abbreviations: EPEC, enteropathogenic E. coli; EAEC, enteroaggregative E. coli; ETEC, enterotoxigenic E. coli; STEC, Shiga toxin-producing E. coli; EIEC, enteroinvasive E.coli.

Among patients with index-negative tests, 19 of 415 (4.6%; 95% CI: 2.8–7.1%) detected new pathogens on repeat testing. The most frequently identified organisms in these patients were norovirus (n = 7), enteropathogenic Escherichia coli (EPEC) (n = 4), and Campylobacter species (n = 2). Conversely, only 2 of 92 index-positive tests (2.2%; 95% CI: 0.3%–7.7%) detected new pathogens on repeat testing. Both index-positive cases involved new detection of viral pathogens (one case each of norovirus and astrovirus) in patients readmitted with persistent diarrheal symptoms after discharge.
Of the 21 new pathogens detected on repeat testing, only 4 new detections (19%) involved organisms with guideline-recommended antimicrobial treatment: Campylobacter (n = 2), Yersinia spp. (n = 1), and Vibrio spp. (n = 1). (Supplementary Table S2). The NNT to identify one new pathogen was 24 (95% CI: 16–39) tests overall, and 127 (95% CI: 50–455) tests to identify one new pathogen warranting antimicrobial treatment.
Secondary outcome: pathogen persistence
Another common reason that clinicians may repeat GIP testing is to assess pathogen clearance for test-of-cure or for public health reporting purposes. Of 92 index-positive tests, GIPs were repeated for those with bacterial detections in 63% of cases, viruses in 37%, and parasites in 12%. Among these index-positive patients, 47 (51%; 95% CI: 40.4%–61.7%) demonstrated persistence of at least one pathogen from their initial test (Figure 2). Persistence rates were similar among viral (n = 20/34; 58.8%), bacterial (n = 27/58; 46.6%) and parasitic pathogens (n = 4/11, 36.4%) (P = .28), though interpretation is limited due to small sample sizes. The most prevalent persistent organisms were norovirus (n = 13/27, 48.1%), EPEC (9/22, 40.9%), and Campylobacter (n = 7/16, 43.8%).
Complete pathogen clearance (“negative change”) occurred in 45/92 (49%; 95% CI: 38.6–59.1%) index-positive patients when GIPs were repeated within 14 days. The most cleared organisms were norovirus (n = 14/27, 51.9%), EPEC (n = 13/22, 59.1%), Campylobacter (n = 9/16, 56.3%), Salmonella (n = 6/9, 66.7%), and Cryptosporidium (n = 5/8, 62.5%). Patients with complete pathogen clearance (median 9.2 days, IQR: 6.6–11.3) repeated their test in a longer test interval compared to those without pathogen clearance (median 5.0 days, IQR: 1.4–9.6) (P < .001).
Healthcare utilization and care transitions
To identify system factors that impact repeat testing, we investigated GIP repeat ordering factors such as encounter setting, clinician specialty, and individual clinician. Most repeat testing occurred during transitions of care (Figure 3). Among the 507 patients receiving repeat tests, 137 (27.0%; 95% CI: 23.2–31.1%) had testing performed in a different encounter setting compared to their index test, 325 (64.1%; 95% CI: 59.8–68.3%) by a different clinician specialty, and 436 (86.0%; 95% CI: 82.8–88.6%) by a different individual clinician.
Flow diagrams of transitions between index and repeat GIP tests for a) clinical setting and b) clinician specialty.

Most repeat testing occurred in inpatient-to-inpatient settings (n = 297) and emergency department to emergency department (n = 55). The most common encounter transitions were from emergency department to inpatient settings (n = 74). There was no significant difference in diagnostic yield in patients with repeat testing conducted in the same encounter setting compared to different encounter settings (3.8% vs 5.0%, P = .5).
Only 71 of 507 patients (14.0%) had both tests ordered by the same clinician. Among cases with provider continuity, the diagnostic yield was not significantly different from cases with different ordering providers (2.8% vs 4.4%, P = .8) (Table 2).
Test characteristics of patients receiving repeat testing ordered by the same clinician or different clinicians

Sensitivity analyses
Varying the repeat testing window demonstrated consistently low diagnostic yield. Compared to an overall diagnostic yield of GIPs of 22% in our cohort, a 7-day repeat testing window yielded 3.6% new pathogen detection, 14-day window yielded 4.1%, 28-day window yielded 4.9%, and 56-day window yielded 5.5%. (Supplementary Table S3) There was also no significant difference in diagnostic yield in repeat testing between high-risk and low-risk patients (4.1% vs 4.1%; P > .9). Reference Shu, Wang and Misra14
In January 2024, a recall was announced related to concerns about norovirus false positives. 17 To assess the effect of the recall on repeat ordering rates, we conducted subgroup analyses 1) identifying patients with repeat testing following the recall, and 2) reanalyzing the NNT to detect an organism (excluding norovirus results) given concerns about false positivity potential. Between January 26, 2024 and March 1, 2024, 614 patients underwent GIP testing, of which 53 (8.6%) were positive for norovirus. 12/614 (2.0%) patients received a repeat test, including 10 who were index-negative and 2 who were index-positive (1 detecting norovirus and Campylobacter, 1 detecting Campylobacter). When excluding norovirus results from the overall cohort, the NNT to identify one new pathogen (excluding norovirus) was 37 (95% CI: 22–64) tests overall. There were no changes to the NNT to identify one pathogen warranting antibiotic treatment.
Discussion
In this multicenter cohort study, we found that repeat GIP testing within 14 days rarely yields new diagnostic information. Over 80% of repeat testing was conducted in patients with index-negative tests. Fewer than 5% of patients identified a new pathogen on repeat testing, with over 120 repeat tests needed to detect one new pathogen warranting antimicrobial treatment. Among index-positive patients, over half continued to test positive for the same pathogen on follow-up. Lastly, most repeat tests were ordered by a different clinician, highlighting that most repeat testing occurs during transitions of care.
The low detection of new pathogens supports that most diarrheal illnesses either resolve spontaneously or do not evolve into new infections within a 14-day period. Reference Shane, Mody and Crump5,Reference Park, Hitchcock, Gomez and Banaei10,Reference Cerqueira, Do, Enderle and Ren19 The most common new pathogens (norovirus and EPEC) are self-limiting infections in immunocompetent hosts and do not necessitate antimicrobial therapy. Reference Shane, Mody and Crump5 Additionally, the persistent detection of pathogens on repeat testing in index-positive individuals supports existing concerns about the detection of nonviable organisms in molecular testing, suggesting test-of-cure or public health reporting should not be done with molecular testing. Reference Miller, Binnicker and Campbell4,Reference Deshpande, Pasupuleti and Patel6,Reference Cerqueira, Do, Enderle and Ren19–Reference McCartney, Thilakarathna, Hluchy, Pillai, Chui and Berenger21 Culture-independent methods for test-of-cure could also lead to missed detection of difficult-to-treat and multidrug-resistant (MDR) organisms. Culture-based methods remain critical for pathogen characterization and public health reporting in these cases, as it allows for higher specificity and subsequent susceptibility testing. Reference Shane, Mody and Crump5,Reference Harrington, Buchan and Doern22 Our findings expand upon a prior smaller single-center study of 106 patients that demonstrates poor clinical utility of repeat GIP testing within 4 weeks and also recommends against using a GIP for test-of-cure purposes. Reference Park, Hitchcock, Gomez and Banaei10
These findings have implications for diagnostic stewardship. In the absence of new symptoms, repeat GIP testing within two weeks offers minimal added value. Repeat testing was more frequently ordered in index-negative patients, as well as high-risk, immunocompromised patients with no significant difference in diagnostic yield compared to low-risk patients. We observed significant regional variation in repeat testing, with higher rates at Florida sites compared to Ohio, potentially reflecting differences in testing culture, patient populations, or provider training. This highlights the variability in testing practices in the absence of standardized stool testing algorithms and further supports the need for institutional policies or clinical decision support (CDS) tools.
Our finding that most repeat testing was requested by a different clinician has not been previously described in the literature. This could be due to lack of recognition of prior results, the desire to use a routine testing approach regardless of previous tests, or belief that tests-of-cure are required for these pathogens. The risks of repeat testing can include increased costs and overutilization on the healthcare system, unnecessary antimicrobial use for incidental findings, or extended isolation precautions due to persistent detection of nonviable pathogens. Reference Shane, Mody and Crump5,Reference Moon, Bleak and Rosenthal23,Reference O’Neal, Murray, Dash, Al-Hasan, Justo and Bookstaver24 Future studies should investigate the impact of repeat testing on treatment decisions and healthcare costs. As repeat testing occurs primarily during transitions of care, institutions and patients may benefit from implementing CDS tools (e.g. alerts informing providers of existing prior results) or order restrictions (e.g. hard stops) that discourage repeat GIP testing within 14 days, similar to recommendations for C. difficile repeat testing. Reference Deshpande, Pasupuleti and Patel6,Reference McDonald, Gerding and Johnson12,Reference Saif, Dooley, Baghdadi, Morgan and Coffey13,Reference Gupta and Khanna25,Reference Luo, Spradley and Banaei26 Previous studies on CDS tools have demonstrated varying effectiveness at limiting unnecessary testing, with hard stops more effective than soft stops. Reference Saif, Dooley, Baghdadi, Morgan and Coffey13,Reference Procop, Keating and Stagno27,Reference Baghdadi, Coffey, Leekha, Johnson, Diekema and Morgan28
This study has several limitations. We aimed to provide evidence-based guidance for institutional policies regarding repeat molecular testing, while acknowledging that certain clinical scenarios may warrant exceptions to general recommendations. First, its retrospective nature may be subject to unmeasured confounding. For example, false positive recalls for norovirus, as well as cross-reactivity concerns with E. coli species, may affect the specificity of these tests and the organisms identified in our study. Reference Hitchcock, Hogan, Budvytiene and Banaei8,17 However, sensitivity analyses excluding norovirus results and patients tested after the January 2024 norovirus recall did not significantly change the conclusions from this study. Second, we were unable to capture clinical rationale for repeat testing (such as new symptoms, test-of-cure, public health reporting, or unintended repeat testing) due to the absence of standardized clinical documentation. Third, the time line between symptom onset and testing is not standardized. Variations in when a patient is tested during their disease course make it difficult to generalize pathogen persistence rates. Additionally, while our study analyzed repeat testing within 14 days, sensitivity analyses suggested that diagnostic yield did not substantially change even over longer intervals up to 56 days. The “optimal” cutoff for repeat testing intervals will require future studies with larger sample sizes or prospective designs. Fourth, our study focused on molecular detection methods that may not differentiate between nucleic acids and viable organisms. Future studies should investigate the associations between pathogen persistence rates using culture-independent and culture-based methods. Lastly, although we present one of the largest cohorts in repeat GIP testing, our findings may not generalize to other hospital systems with different institutional guidelines and patient populations. For example, GIP ordering may be influenced by the logistics of hospital systems, with different institutional guidelines, patient populations, and hierarchical ordering practices.
In conclusion, repeat GIP testing within 14 days rarely results in the identification of new pathogens and is most ordered during transitions of care. These findings can be used to support institutional policies that discourage repeat testing within 14 days, which would improve diagnostic testing stewardship. Future studies should explore whether CDS interventions to limit repeat GIP testing can improve diagnostic stewardship and patient outcomes.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/ice.2026.10464.
Data availability statement
Individual-level data collected is not publicly available. All data relevant to this study were presented in the article.
Acknowledgments
The authors thank Oleg Lisheba in the Cleveland Clinic Data and Analytics Department for assistance with data extraction, Jacqueline Fox for clinical research support, and members of Center for Populations Health Research group for statistical advice.
Author contributions
J.S. and A.D. devised the project, the main conceptual ideas, and the outline. J.S. collected, cleaned, and analyzed the data. J.E.D. assisted with the project outline and conceptual development. H.W., A.M., and D.D.R. provided clinical expertise and verified the results. J.S., A.S.N., and J.E.D. contributed to the statistical analysis and provided technical expertise. J.S. and A.D. wrote the manuscript with input from all authors. All authors read and approved the final version.
Financial support
None reported.
Competing interests
H.W. has received compensation from Hologic for scientific advisory work and has served as an investigator on sponsored research through her employer with Hologic and Cepheid. DDR has performed, is performing, or has engaged in the potential for funding of sponsored research projects funded through his employer with the following entities: Abbott, Altona, BD, bioMérieux, Cepheid, Hologic, Luminex, Meridian, Pattern Bio, Qiagen, Q-Linea, Rapid Diagnostics, Roche, Selux Diagnostics, and Thermo Fisher; and DDR has equity and/or receives scientific advisory compensation from Renascent Diagnostics and Next Gen Diagnostics. All other authors report no conflicts of interest relevant to this article.
IRB approval
This study was approved by the Cleveland Clinic IRB as “minimal risk research” under IRB protocol 24–651.

