
Machine learning-based analyses support the existence of species complexes for Strongyloides fuelleborni and Strongyloides stercoralis

Published online by Cambridge University Press:  16 June 2020

Joel L. N. Barratt*
Affiliation:
Division of Parasitic Diseases and Malaria, Centers for Disease Control and Prevention, Parasitic Diseases Branch, Atlanta, USA; Oak Ridge Associated Universities, Oak Ridge, Tennessee, USA
Sarah G. H. Sapp*
Affiliation:
Division of Parasitic Diseases and Malaria, Centers for Disease Control and Prevention, Parasitic Diseases Branch, Atlanta, USA
*Author for correspondence: Joel L. N. Barratt, E-mail: jbarratt@cdc.gov; joelbarratt43@gmail.com and Sarah G. H. Sapp, E-mail: xyz6@cdc.gov

Abstract

Human strongyloidiasis is a serious disease mostly attributable to Strongyloides stercoralis and, to a lesser extent, Strongyloides fuelleborni, a parasite mainly of non-human primates. The role of animals as reservoirs of human-infecting Strongyloides is ill-defined, and whether dogs are a source of human infection is debated. Published multi-locus sequence typing (MLST) studies attempt to elucidate relationships between Strongyloides genotypes, hosts, and distributions, but typically examine relatively few worms, making it difficult to identify population-level trends. Combining MLST data from multiple studies is often impractical because the studies examine different combinations of loci, which eliminates phylogeny as a means of examining these data collectively unless hundreds of specimens are excluded. A recently described machine learning approach that facilitates clustering of MLST data may offer a solution, even for datasets that include specimens sequenced at different combinations of loci. By clustering various MLST datasets as one using this procedure, we sought to uncover associations among genotype, geography, and hosts that remained elusive when examining datasets individually. Multiple datasets comprising hundreds of S. stercoralis and S. fuelleborni individuals were combined and clustered. Our results suggest that the commonly proposed ‘two lineage’ population structure of S. stercoralis (where lineage A infects humans and dogs, and lineage B only dogs) is an over-simplification. Instead, S. stercoralis seemingly represents a species complex, including two distinct populations over-represented in dogs, and other populations vastly more common in humans. A distinction between African and Asian S. fuelleborni is also supported here, emphasizing the need to further resolve these taxonomic relationships through modern investigations.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is included and the original work is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use.
Copyright
Copyright © Centers for Disease Control and Prevention, USA, 2020. To the extent this is a work of the US Government, it is not subject to copyright protection within the United States. Published by Cambridge University Press

Fig. 1. Schematic of the Strongyloides genotyping scheme referenced here. Graphical representation of a Strongyloides sp. genotyping scheme after the description of Barratt et al. (2019b) and Jaleta et al. (2017). This scheme was expanded here to include haplotype XV of 18S HVR-I (GenBank Accession: MT436714), identified in genomes sequenced from Japanese humans by Kikuchi et al. (2016). For the purposes of this study, note that the haplotypes originally defined after the description of Barratt and colleagues were split into smaller segments as shown here.

Fig. 2. Dendrogram of clustered distances generated from genotyped Strongyloides stercoralis using machine learning (ML). This dendrogram was divided into seven distinct clusters. Branches are coloured according to their cluster number; the colours serve only to differentiate branches. Coloured peripheral bars indicate the host species from which the genotyped S. stercoralis were collected: a dog, human, chimpanzee or cat. The colour of the respective silhouettes matches the peripheral bars. For the identity of each specimen in this dendrogram refer to Supplementary File S2, Appendix part B. Cluster 5 is exclusively occupied by S. stercoralis infecting dogs. Specimens assigned to Cluster 4 were more frequently collected from dogs, though were also found in humans and cats. S. stercoralis from chimpanzees were assigned to clusters 2 and 3, though these were less frequently observed in these samples. S. stercoralis from clusters 1, 2, 3, 6 and 7 were more frequently collected from human samples than from other hosts. The geographic origin of sequenced specimens included in this dendrogram is shown in Supplementary File S1, Tab B.

Table 1. Frequency of S. stercoralis from dogs and humans assigned to each genetic cluster and their respective χ² P values
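
The kind of comparison summarized in Table 1 can be illustrated with a short sketch. It assumes, hypothetically, that each cluster is tested with a 2×2 Pearson χ² test (dogs vs humans, inside vs outside the cluster) without continuity correction; the article text here does not state the exact test design, and the counts below are invented placeholders, not values from Table 1. The function names `chi2_2x2` and `per_cluster_tests` are illustrative only.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero; the test is undefined")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def per_cluster_tests(counts):
    """Test each cluster against all other clusters pooled.

    counts maps cluster id -> (number from dogs, number from humans).
    """
    total_dogs = sum(dogs for dogs, _ in counts.values())
    total_humans = sum(humans for _, humans in counts.values())
    results = {}
    for cluster, (dogs, humans) in counts.items():
        results[cluster] = chi2_2x2(
            dogs, humans,                             # specimens inside the cluster
            total_dogs - dogs, total_humans - humans  # specimens in all other clusters
        )
    return results


# HYPOTHETICAL counts for illustration only (not the published data).
example_counts = {1: (2, 40), 4: (30, 10), 5: (12, 0)}

if __name__ == "__main__":
    for cluster, (stat, p) in per_cluster_tests(example_counts).items():
        print(f"cluster {cluster}: chi2 = {stat:.2f}, P = {p:.3g}")
```

With small expected counts in any cell, an exact test (for example Fisher's) would usually be preferred over the χ² approximation.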

Fig. 3. Dendrogram of clustered distances generated from genotyped specimens of Strongyloides fuelleborni and Strongyloides sp. from lorises using ML. This dendrogram was divided into seven distinct clusters. Branches are coloured according to their cluster number, and the coloured peripheral bars indicate the host species from which the Strongyloides specimens/sequences were derived: human (Hu), gorilla (Go), chimpanzee (Ch), baboon (Ba), orangutan (Or), long-tailed macaque (Lt), proboscis monkey (Pr), silvered leaf monkey (Sl), slow loris (Lo), or Japanese macaque (Jm). Black branches visualize the relationship between clusters. For the identity of each specimen in this dendrogram refer to Supplementary File S1, Appendix part C. Clusters 1 and 2 include specimens from Africa. Cluster 3 includes specimens from Thailand and Laos, and a single specimen from an Indian human. Cluster 4 includes specimens collected in Japan. Clusters 5, 6 and 7 include specimens from Malaysian Borneo. This dendrogram supports the view that clusters 1 and 2 include S. fuelleborni genotypes that infect humans more commonly than the Strongyloides in other clusters.

Supplementary material: PDF

Barratt and Sapp supplementary material 1 (PDF, 15.7 MB)

Supplementary material: File

Barratt and Sapp supplementary material 2 (File, 3.4 MB)