
Comparing Mycobacterium tuberculosis transmission reconstruction models from whole genome sequence data

Published online by Cambridge University Press:  09 June 2023

Benjamin Sobkowiak*
Affiliation:
Division of Respiratory Medicine, University of British Columbia, Vancouver, BC, Canada; British Columbia Centre for Disease Control, Vancouver, BC, Canada
Kamila Romanowski
Affiliation:
British Columbia Centre for Disease Control, Vancouver, BC, Canada; Department of Medicine, University of British Columbia, Vancouver, BC, Canada
Inna Sekirov
Affiliation:
British Columbia Centre for Disease Control, Vancouver, BC, Canada; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
Jennifer L. Gardy
Affiliation:
Bill and Melinda Gates Foundation, Seattle, WA, USA
James C. Johnston
Affiliation:
Division of Respiratory Medicine, University of British Columbia, Vancouver, BC, Canada; British Columbia Centre for Disease Control, Vancouver, BC, Canada
*Corresponding author: Benjamin Sobkowiak; Email: bs2259@yale.edu

Abstract

Genomic epidemiology is routinely used worldwide to interrogate infectious disease dynamics. Multiple computational tools exist that reconstruct transmission networks by coupling genomic data with epidemiological models. Resulting inferences can improve our understanding of pathogen transmission dynamics, and yet the performance of these tools has not been evaluated for tuberculosis (TB), a disease process with complex epidemiology including variable latency and within-host heterogeneity. Here, we performed a systematic comparison of six publicly available transmission reconstruction models, evaluating their accuracy when predicting transmission events in simulated and real-world Mycobacterium tuberculosis outbreaks. We observed variability in the number of transmission links that were predicted with high probability (P ≥ 0.5) and low accuracy of these predictions against known transmission in simulated outbreaks. We also found a low proportion of epidemiologically supported case–contact pairs were identified in our real-world TB clusters. The specificity of all models was high, and a relatively high proportion of the total transmission events predicted by some models were true links, notably with TransPhylo, Outbreaker2, and Phybreak. Our findings may inform the choice of tools in TB transmission analyses and underscore the need for caution when interpreting transmission networks produced using probabilistic approaches.

Information

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Table 1. The transmission network reconstruction tools evaluated in this study, detailing the epidemiological features and input data type for each tested approach

Figure 1. Boxplots showing the sensitivity (green), specificity (blue), and PPV (red) of each transmission reconstruction model for predicting known transmission events in 20 simulated tuberculosis outbreaks. Only links with a probability of ≥0.5 are considered. Results for transmission links between pairs in either direction are shown for Phybreak simulations in (a) and TransPhylo+SeqGen simulations in (b). Results for transmission links predicted with the correct donor–recipient direction are shown for Phybreak simulations in (c) and TransPhylo+SeqGen simulations in (d).
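
As a rough illustration of how the three metrics in Figure 1 could be computed for a single simulated outbreak, the following Python sketch scores predicted links against known transmission events. It is not the authors' code: the function name `link_metrics`, the input layout, and the rule that an undirected pair counts as predicted when either direction reaches the threshold are all assumptions made for this example.

```python
from itertools import combinations, permutations


def link_metrics(hosts, true_links, predicted, threshold=0.5, directed=False):
    """Score predicted transmission links against known links.

    hosts      -- iterable of sampled host identifiers
    true_links -- set of (donor, recipient) tuples known to be true
    predicted  -- dict mapping (donor, recipient) to a posterior probability
    directed   -- if True, a link only counts when donor and recipient match
    """
    if directed:
        universe = set(permutations(hosts, 2))
        truth = set(true_links)
        called = {pair for pair, p in predicted.items() if p >= threshold}
    else:
        # Assumption: an undirected pair is "called" when either direction
        # reaches the threshold on its own (probabilities are not summed).
        universe = {frozenset(c) for c in combinations(hosts, 2)}
        truth = {frozenset(link) for link in true_links}
        called = {frozenset(pair) for pair, p in predicted.items() if p >= threshold}

    tp = len(called & truth)   # predicted and true
    fp = len(called - truth)   # predicted but not true
    fn = len(truth - called)   # true but not predicted
    tn = len(universe) - tp - fp - fn

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy outbreak with made-up values, for illustration only.
hosts = ["A", "B", "C", "D"]
true_links = {("A", "B"), ("B", "C"), ("B", "D")}
predicted = {("A", "B"): 0.8, ("C", "B"): 0.6, ("A", "D"): 0.55, ("B", "D"): 0.3}

undirected = link_metrics(hosts, true_links, predicted)
directed = link_metrics(hosts, true_links, predicted, directed=True)
```

In this toy case the link between B and C is predicted in the wrong direction, so it counts as a true positive in the undirected comparison but as a false positive in the directed one, which is the distinction drawn between panels (a, b) and (c, d).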

Figure 2. Example transmission networks predicted by each tested method for cluster MCLUST006 (n = 6) of Mycobacterium tuberculosis strains from British Columbia. Nodes represent sampled hosts, and edges represent the highest-probability transmission link between hosts. Edge widths are weighted by the SNP distance between connected hosts, and edges are coloured black if the posterior probability of direct transmission is ≥0.5 and grey if it is <0.5.
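
The edge selection described in the Figure 2 caption might be sketched as follows. This is a hypothetical helper, not taken from the paper: `best_links` and its input layout are invented for illustration, and it assumes that "highest-probability link" means the most probable infector for each recipient host.

```python
def best_links(predicted, threshold=0.5):
    """Keep the most probable infector per recipient and assign an edge colour.

    predicted -- dict mapping (donor, recipient) to a posterior probability
    Returns a list of (donor, recipient, probability, colour) tuples,
    ordered by recipient identifier.
    """
    best = {}
    for (donor, recipient), p in predicted.items():
        # Assumption: one incoming edge per recipient, the most probable one.
        if recipient not in best or p > best[recipient][1]:
            best[recipient] = (donor, p)
    return [
        (donor, recipient, p, "black" if p >= threshold else "grey")
        for recipient, (donor, p) in sorted(best.items())
    ]


# Made-up probabilities, for illustration only.
example = {("A", "B"): 0.8, ("C", "B"): 0.1, ("A", "C"): 0.3, ("B", "C"): 0.4}
edges = best_links(example)
```

Here host C keeps its edge from B even though the probability is below 0.5, so that edge would be drawn in grey rather than dropped.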

Table 2. Results of each transmission reconstruction model for identifying transmission links in real-world Mtb clusters in BC that are supported by case–contact data. Bolded values indicate the best-performing models.

Figure 3. Boxplots of the transmission parameters estimated by each tested method from high-probability (P ≥ 0.5) transmission events between sampled Mycobacterium tuberculosis isolates from British Columbia. (a) The SNP distance between observed hosts, and (b) the transmission interval between infection times of observed hosts. Note that SCOTTI and seqTrack do not estimate infection times.

Supplementary material

Sobkowiak et al. supplementary material 1 (File, 64.4 KB)
Sobkowiak et al. supplementary material 2 (File, 12.6 KB)