Introduction
Recent research into the human gut microbiome has vastly increased our understanding of the relationship between gut microbes and human health and disease. For instance, we now know that in adults, a low-diversity microbiota with increases in proportions of facultative anaerobes is linked to acute diarrhoea, inflammatory bowel disease, Clostridium difficile infection, metabolic syndrome and liver disease, to mention just a few conditions (Cani, Reference Cani2018; Kriss et al., Reference Kriss, Hazleton, Nusbacher, Martin and Lozupone2018). Although most gut microbiome research has focussed on prokaryotic diversity, we have also gained significant insight into the micro-eukaryotic diversity of the human gut. DNA-based methods have been instrumental to this advancement. Three important points have emerged: (1) For some common luminal intestinal parasitic protists (CLIPPs), genetic diversity is surprisingly high; still, the DNA sequence data in publicly available databases such as the NCBI database are rudimentary and hence do not reflect this amount of diversity (Stensvold et al., Reference Stensvold, Lebbad, Victory, Verweij, Tannich, Alfellani, Legarraga and Clark2011c, Reference Stensvold, Lebbad and Clark2012, Reference Stensvold, Winiecka-Krusnell, Lier and Lebbad2018; Royer et al., Reference Royer, Gilchrist, Kabir, Arju, Ralston, Haque, Clark and Petri2012; Poulsen and Stensvold, Reference Poulsen and Stensvold2016).
(2) We have come to realize that some CLIPPs are very common and often more common in gut-healthy individuals than in those with functional and inflammatory bowel diseases, contrary to previous general belief (Petersen et al., Reference Petersen, Stensvold, Mirsepasi, Engberg, Friis-Møller, Porsbo, Hammerum, Nordgaard-Lassen, Nielsen and Krogfelt2013; Andersen et al., Reference Andersen, Bonde, Nielsen and Stensvold2015; Krogsgaard et al., Reference Krogsgaard, Engsbro, Stensvold, Nielsen and Bytzer2015, Reference Krogsgaard, Andersen, Johannesen, Engsbro, Stensvold, Nielsen and Bytzer2018; Rossen et al., Reference Rossen, Bart, Verhaar, van Nood, Kootte, de Groot, D'Haens, Ponsioen and van Gool2015; Beghini et al., Reference Beghini, Pasolli, Truong, Putignani, Cacciò and Segata2017; Jokelainen et al., Reference Jokelainen, Hebbelstrup Jensen, Andreassen, Petersen, Röser, Krogfelt, Nielsen and Stensvold2017; Mirjalali et al., Reference Mirjalali, Abbasi, Naderi, Hasani, Mirsamadi, Stensvold, Balaii, Asadzadeh Aghdaei and Zali2017). (3) Robust links between CLIPPs and gut bacteria have been identified by several research teams (Stensvold and van der Giezen, Reference Stensvold and van der Giezen2018).
These three points currently stimulate interdisciplinary research across the fields of parasitology, clinical microbiology, gastroenterology and ecology. Nevertheless, compared with advances within e.g. bacteriology and virology, progress in the research into CLIPPs and their role in human health and disease is still reflected mostly in simple stool microscopy-based surveys of parasites in selected populations, and is therefore still facing some major challenges. In the following, I will try to detail the status of the three above-mentioned points, and highlight some of the limitations and challenges to the work ahead that aims to identify the significance of CLIPPs in human health and disease.
Which are the most common luminal intestinal parasitic protists?
Intestinal eukaryotes that need a host to complete their life cycles (i.e. organisms that are referred to as ‘parasites’) include both helminths and protists. More typically, the distinction is made between ‘helminths’ and ‘protozoa’, but from a taxonomical point of view, the group of organisms referred to as protozoa does not include one of the most common micro-eukaryotes, namely Blastocystis, and so, the term ‘protists’ appears more relevant and applicable than ‘protozoa’ in this context. Moreover, while some parasitic intestinal protists are invasive (e.g. sporozoa) or adhere to the mucosal lining (e.g. Giardia), quite a few genera appear to be confined mainly to the gut lumen. These include Blastocystis, Dientamoeba, Endolimax, Iodamoeba and most species of Entamoeba, and given their prevalence, they could be referred to as CLIPPs.
Contrary to the situation in developing countries, the number of carriers of helminth infestations other than those attributable to pinworm (Enterobius vermicularis) appears to be rapidly plummeting in human populations in the Western world (Verweij and van Lieshout, Reference Verweij and van Lieshout2011; Verweij, Reference Verweij2014), and also in some parts of the developing world, which would probably reflect improved hygienic standards. Still, and for incompletely known reasons, a substantial proportion of the population is colonized by CLIPPs, especially Blastocystis and Dientamoeba (Verweij and van Lieshout, Reference Verweij and van Lieshout2011; Roser et al., Reference Roser, Simonsen, Nielsen, Stensvold and Molbak2013; Krogsgaard et al., Reference Krogsgaard, Engsbro, Stensvold, Nielsen and Bytzer2015, Reference Krogsgaard, Andersen, Johannesen, Engsbro, Stensvold, Nielsen and Bytzer2018; Jokelainen et al., Reference Jokelainen, Hebbelstrup Jensen, Andreassen, Petersen, Röser, Krogfelt, Nielsen and Stensvold2017) and, to a lesser extent, by one or more of the Amoebozoa, e.g. Entamoeba coli (Bruijnesteijn van Coppenraet et al., Reference Bruijnesteijn van Coppenraet, Wallinga, Ruijs, Bruins and Verweij2009; Krogsgaard et al., Reference Krogsgaard, Andersen, Johannesen, Engsbro, Stensvold, Nielsen and Bytzer2018; Stensvold and Nielsen, Reference Stensvold and Nielsen2012; ten Hove et al., Reference ten Hove, Schuurman, Kooistra, Möller, van Lieshout and Verweij2007); these organisms will be introduced briefly below.
Blastocystis
Blastocystis is a genus comprising a perplexing variety of ribosomal lineages that are arguably separate species, judging from the amount of genetic diversity across complete nuclear ribosomal genes. So far, at least 17 ribosomal lineages, the so-called ‘subtypes’, have been acknowledged in humans, non-human primates, other mammals and birds (Alfellani et al., Reference Alfellani, Taner-Mulla, Jacob, Imeede, Yoshikawa, Stensvold and Clark2013b; Clark et al., Reference Clark, van der Giezen, Alfellani and Stensvold2013; Stensvold and Clark, Reference Stensvold and Clark2016). Also reptiles, amphibia and insects have been identified as hosts for various species of Blastocystis (Yoshikawa et al., Reference Yoshikawa, Koyama, Tsuchiya and Takami2016). While this parasite appears to be a rare or at least not so common finding in strict or moderately strict carnivores such as cats, dogs and hyenas (Ruaux and Stang, Reference Ruaux and Stang2014; Wang et al., Reference Wang, Owen, Traub, Cuttell, Inpankaew and Bielefeldt-Ohmann2014; Heitlinger et al., Reference Heitlinger, Ferreira, Thierer, Hofer and East2017; Cociancic et al., Reference Cociancic, Zonta and Navone2018; Moura et al., Reference Moura, Oliveira-Silva, Pedrosa, Nascentes and Cabrine-Santos2018; Udonsom et al., Reference Udonsom, Prasertbun, Mahittikorn, Mori, Changbunjong, Komalamisra, Pintong, Sukthana and Popruk2018), it may be more common in omni- and herbivores, including pigs, cows and sheep (Pakandl, Reference Pakandl1991; Navarro et al., Reference Navarro, Domínguez-Márquez, Garijo-Toledo, Vega-García, Fernández-Barredo, Pérez-Gracia, García, Borrás and Gómez-Muñoz2008; Ramirez et al., Reference Ramirez, Sanchez, Bautista, Corredor, Florez and Stensvold2014; Masuda et al., Reference Masuda, Sumiyoshi, Ohtaki and Matsumoto2018; Moura et al., Reference Moura, Oliveira-Silva, Pedrosa, Nascentes and Cabrine-Santos2018; Udonsom et al., Reference Udonsom, Prasertbun, Mahittikorn, Mori, Changbunjong, 
Komalamisra, Pintong, Sukthana and Popruk2018). Nine distinct ribosomal lineages, the so-called ‘subtypes’, have been isolated from humans, with subtypes 1–4 predominating. Some subtypes, e.g. ST3, even exhibit extensive within-subtype diversity that to some degree is host-specific (Alfellani et al., Reference Alfellani, Jacob, Perea, Krecek, Taner-Mulla, Verweij, Levecke, Tannich, Clark and Stensvold2013a). Colonization is more common in older children and adults than in infants and young children (El Safadi et al., Reference El Safadi, Gaayeb, Meloni, Cian, Poirier, Wawrzyniak, Delbac, Dabboussi, Delhaes, Seck, Hamze, Riveau and Viscogliosi2014; Scanlan et al., Reference Scanlan, Stensvold, Rajilić-Stojanović, Heilig, De Vos, O'Toole and Cotter2014; Poulsen et al., Reference Poulsen, Efunshile, Nelson and Stensvold2016; Salehi et al., Reference Salehi, Haghighi, Stensvold, Kheirandish, Azargashb, Raeghi, Kohansal and Bahrami2017; Scanlan et al., Reference Scanlan, Hill, Ross, Ryan, Stanton and Cotter2018), with prevalence rates reaching 100% in developing countries (El Safadi et al., Reference El Safadi, Gaayeb, Meloni, Cian, Poirier, Wawrzyniak, Delbac, Dabboussi, Delhaes, Seck, Hamze, Riveau and Viscogliosi2014). Moreover, Blastocystis may colonize the human gut for several years (Scanlan et al., Reference Scanlan, Stensvold, Rajilić-Stojanović, Heilig, De Vos, O'Toole and Cotter2014).
Dientamoeba fragilis
DNA-based methods helped overcome the diagnostic challenges related to the detection of Dientamoeba fragilis, a non-flagellated flagellate for which a cyst stage was reported only very recently (Munasinghe et al., Reference Munasinghe, Vella, Ellis, Windsor and Stark2013; Stark et al., Reference Stark, Garcia, Barratt, Phillips, Roberts, Marriott, Harkness and Ellis2014). Dientamoeba fragilis is the only known species in the genus. The first DNA-based detection methods for D. fragilis appeared in the mid-00s (Peek et al., Reference Peek, Reedeker and van Gool2004; Stark et al., Reference Stark, Beebe, Marriott, Ellis and Harkness2005a, Reference Stark, Beebe, Marriott, Ellis and Harkness2005b, Reference Stark, Beebe, Marriott, Ellis and Harkness2006; Verweij et al., Reference Verweij, Mulder, Poell, van Middelkoop, Brienen and van Lieshout2007). Since then, such methods have helped us to realize that this parasite is very common in some populations, especially in Northern Europe (Röser et al., Reference Roser, Simonsen, Nielsen, Stensvold and Molbak2013; de Jong et al., Reference de Jong, Korterink, Benninga, Hilbink, Widdershoven and Deckers-Kocken2014; Ögren et al., Reference Ögren, Dienus, Löfgren, Einemo, Iveroth and Matussek2015; Holtman et al., Reference Holtman, Kranenberg, Blanker, Ott, Lisman-van Leeuwen and Berger2017; Jokelainen et al., Reference Jokelainen, Hebbelstrup Jensen, Andreassen, Petersen, Röser, Krogfelt, Nielsen and Stensvold2017). In Denmark, D. fragilis is almost an obligate finding in children (Röser et al., Reference Roser, Simonsen, Nielsen, Stensvold and Molbak2013; Jokelainen et al., Reference Jokelainen, Hebbelstrup Jensen, Andreassen, Petersen, Röser, Krogfelt, Nielsen and Stensvold2017). 
In other regions where methods of high sensitivity are also used, such as Australia, the parasite appears to be a lot less common (Stark et al., Reference Stark, Barratt, Chan and Ellis2016); however, studies involving screening of asymptomatic individuals for D. fragilis are very rare, and so the prevalence of the parasite in individuals without symptoms in most parts of the world remains largely unknown. Apart from humans, D. fragilis has been found in non-human primates and pigs (Stark et al., Reference Stark, Phillips, Peckett, Munro, Marriott, Harkness and Ellis2008; Cacciò et al., Reference Cacciò, Sannella, Manuali, Tosini, Sensi, Crotti and Pozio2012). The diversity within the species appears very limited, and most cases of D. fragilis colonization are attributable to one of only two acknowledged genotypes (Genotype 1), no matter where sampling is performed (Stark et al., Reference Stark, Beebe, Marriott, Ellis and Harkness2005a, Reference Stark, Beebe, Marriott, Ellis and Harkness2005b; Stensvold et al., Reference Stensvold, Clark and Röser2013; Cacciò et al., Reference Cacciò, Sannella, Bruno, Stensvold, David, Guimarães, Manuali, Magistrali, Mahdad, Beaman, Maserati, Tosini and Pozio2016; Greigert et al., Reference Greigert, Abou-Bacar, Brunet, Nourrisson, Pfaff, Benarbia, Pereira, Randrianarivelojosia, Razafindrakoto, Solotiana Rakotomalala, Morel, Candolfi and Poirier2018).
Entamoeba
A number of Entamoeba species can colonize the human intestine. Infections due to the potentially highly pathogenic Entamoeba histolytica are relatively rare compared with colonization by Entamoeba dispar, Entamoeba hartmanni, and, especially, Entamoeba coli, which has been found to colonise between 20 and 30% of individuals in surveyed populations in Brazil (Aguiar et al., Reference Aguiar, Gonçalves, Sodré, Pereira, Bóia, de Lemos and Daher2007; Neres-Norberg et al., Reference Neres-Norberg, Guerra-Sanches, Blanco Moreira-Norberg, Madeira-Oliveira, Santa-Helena and Serra-Freire2014; Higa et al., Reference Higa, Cardoso, Weis, França, Pontes, Silva, Oliveira and Dorval2017; Jeske et al., Reference Jeske, Bianchi, Moura, Baccega, Pinto, Berne and Villela2018). Substantial genetic variation has been detected within E. coli, with E. coli subtype 1 and subtype 2 differing by 13% (Stensvold et al., Reference Stensvold, Lebbad, Victory, Verweij, Tannich, Alfellani, Legarraga and Clark2011c). Overall, the genetic diversity within octo-nucleated Entamoebas appears vast and still largely unaccounted for (Jacob et al., Reference Jacob, Busby, Levy, Komm and Clark2016; Elsheikha et al., Reference Elsheikha, Regan and Clark2018). Entamoeba polecki rarely infects humans; nevertheless, four subtypes have been detected with quite varying geographical distribution and host reservoirs (Stensvold et al., Reference Stensvold, Lebbad, Victory, Verweij, Tannich, Alfellani, Legarraga and Clark2011c, Reference Stensvold, Winiecka-Krusnell, Lier and Lebbad2018); all four subtypes have been found in humans (Verweij et al., Reference Verweij, Polderman and Clark2001; Stensvold et al., Reference Stensvold, Winiecka-Krusnell, Lier and Lebbad2018).
It is currently unclear to what extent non-histolytica Entamoebas contribute to the development of intestinal symptoms.
Some other protists show up in stool every now and then (often accompanied by other CLIPPs) and these include parasites belonging to ciliates and the Amoebozoa. Although the amount of documentation is scarce, it is clear that for some of these parasites, especially Iodamoeba and Endolimax, the intra-generic diversity is vast, with a maximum genetic divergence of at least 31% (Stensvold et al., Reference Stensvold, Lebbad and Clark2012; Constenla et al., Reference Constenla, Padrós and Palenzuela2014; Poulsen and Stensvold, Reference Poulsen and Stensvold2016). Endolimax nana was recently shown to colonize 28.8% of 3245 individuals attending the Evandro Chagas National Institute of Infectious Diseases, Rio de Janeiro, Brazil (Faria et al., Reference Faria, Zanini, Dias, da Silva, de Freitas, Almendra, Santana and Sousa2017). To date, no complete, annotated nuclear genome sequences have been published for CLIPPs other than Blastocystis and Entamoeba dispar.
The extensive genetic diversity documented so far within these CLIPPs has informed the taxonomic terminology, and so, depending on the availability of morphological data, the extent of genetic diversity and the SSU rDNA sequence coverage, sequences are annotated to species, subtype, ribosomal lineage or conditional lineage (Jacob et al., Reference Jacob, Busby, Levy, Komm and Clark2016). Importantly, it appears that specific taxonomic terminologies are being developed for individual genera; these rest first and foremost on pragmatic grounds.
CLIPPs in a gut ecology setting
In addition to exploring parasite diversity in the gut, it could be important to try and look to gut microbial and ecological relationships in non-human hosts and in the environment, respectively, to better understand the role of CLIPPs in human health and disease. In the field of ecology, protists have been identified as important components of terrestrial and aquatic environments where they are integral constituents of trophic chains and nutrient cycles (Bates et al., Reference Bates, Clemente, Flores, Walters, Parfrey, Knight and Fierer2013; Maritz et al., Reference Maritz, Rogers, Rock, Liu, Joseph, Land and Carlton2017). In geothermal springs, protist diversity appears to rely on pH and temperature (Oliverio et al., Reference Oliverio, Power, Washburne, Cary, Stott and Fierer2018). The introduction of Acanthamoeba into the rhizosphere of Arabidopsis thaliana leads to rapid changes in associated bacterial communities due to the grazing of the amoeba (Rosenberg et al., Reference Rosenberg, Bertaux, Krome, Hartmann, Scheu and Bonkowski2009). Gut flagellates and ciliates assist termites and ruminants in metabolizing/fermenting carbohydrates (Veira, Reference Veira1986; Ohkuma, Reference Ohkuma2008; Moon-van der Staay et al., Reference Moon-van der Staay, van der Staay, Michalowski, Jouany, Pristas, Javorský, Kišidayová, Varadyova, McEwan, Newbold, van Alen, de Graaf, Schmid, Huynen and Hackstein2014); examples are endless. The presence of protists in various niches therefore appears to be driven by a variety of host- and environment-derived factors and may in turn have a number of vital or less vital consequences for the associated microbiome, be it the host-associated gut microbiome, plant rhizosphere or terrestrial and aquatic biomes. 
This understanding has to a large extent failed to resonate with professionals in clinical microbiology and related medical fields, where CLIPPs are generally seen as ‘intruders’ and (potential) pathogens, despite the fact that most of these are most probably non-invasive and may have unknown functions of potential benefit (Parfrey et al., Reference Parfrey, Walters and Knight2011; Lukeš et al., Reference Lukeš, Stensvold, Jirků-Pomajbíková and Wegener Parfrey2015; Andersen and Stensvold, Reference Andersen and Stensvold2016).
Nevertheless, the concept of certain gut parasitic protists as ‘ecosystem engineers’ also in humans is sinking in, and studies on trans-kingdom relationships are emerging. For instance, Laforest-Lapointe and Arrieta (Reference Laforest-Lapointe and Arrieta2018) recently proposed a model for the ecological role of Blastocystis in the human gut microbiota. They suggested that Blastocystis, by preying on abundant bacterial taxa, lowers the competition for nutrients and space, leading to an increase in bacterial richness and community evenness. Indeed, several recent studies have shown that carriers of Blastocystis and other CLIPPs have gut bacterial microbiomes that differ significantly from those of individuals who do not carry these parasites; the findings were recently summarized by Stensvold and van der Giezen (Reference Stensvold and van der Giezen2018). In fact, higher diversity and higher richness are typically observed in CLIPPs-positive individuals than in those who are negative. Moreover, observations from a recent meta-analysis of metagenomics data indicated that Blastocystis carriage is linked to low body mass index (Andersen et al., Reference Andersen, Bonde, Nielsen and Stensvold2015; Beghini et al., Reference Beghini, Pasolli, Truong, Putignani, Cacciò and Segata2017), which again lends support to specific links to gut bacterial diversity. However, it remains to be identified to what extent Blastocystis is actively driving this difference, as proposed by Laforest-Lapointe and Arrieta, or whether Blastocystis is merely an indicator of specific bacterial community patterns. Stensvold and van der Giezen (Reference Stensvold and van der Giezen2018) recently hypothesized that the increased intestinal oxygen concentrations observed during gut dysbiosis may prevent Blastocystis from establishing in the gut, which would suggest a role for Blastocystis as an indicator organism.
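To make the terms richness, diversity and evenness used above concrete, the following is a minimal sketch of how they are commonly computed from per-taxon read counts. It assumes the Shannon index as the diversity measure and Pielou's index as the evenness measure; the studies cited here may have used other metrics, and the two count profiles are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa observed with at least one read."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over the observed taxa."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def pielou_evenness(counts):
    """Pielou's J' = H' / ln(S); set to 0.0 when fewer than two taxa are present."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


# Hypothetical read counts per bacterial taxon (illustrative only, not real data).
dominated = {"taxon_A": 97, "taxon_B": 1, "taxon_C": 1, "taxon_D": 1}
balanced = {"taxon_A": 25, "taxon_B": 25, "taxon_C": 25, "taxon_D": 25}

for name, profile in (("dominated", dominated), ("balanced", balanced)):
    print(name, richness(profile),
          round(shannon(profile), 3), round(pielou_evenness(profile), 3))
```

Both profiles have the same richness (four taxa), but the community dominated by a single taxon scores much lower on diversity and evenness. Under the predation model described above, grazing on the dominant taxon would be expected to shift a community from the first pattern towards the second.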
Experimental models, such as that recently proposed by Pomajbikova and colleagues (Růžková et al., Reference Růžková, Květoňová, Jirků, Lhotská, Stensvold, Parfrey and Jirků Pomajbíková2018), could be used to develop longitudinal studies on bacterial community changes after the establishment of Blastocystis colonization. Blastocystis is one of the few parasites that are readily established in culture (Clark and Stensvold, Reference Clark and Stensvold2016), and cysts induced in cultures or obtained from donor material, isolated from stool by gradient centrifugation, can be used for inoculation in order not to co-introduce bacteria that would lead to experimental bias (Rene et al., Reference Rene, Stensvold, Badsberg and Nielsen2009). Here, both eubiotic and dysbiotic animals could be used to study potential differences in colonization success rate.
The fact that some hosts (e.g. cats and dogs) are not so prone to harbouring a parasite such as Blastocystis while others (e.g. humans and artiodactyls) in the same habitat are much more likely hosts should also be explored in detail, to identify whether this boils down to diet, behaviour (exposure), and/or other factors. If all the subtypes of the parasite are globally pervasive and the overall colonization pressure of Blastocystis is strong, differences in intestinal colonization between hosts may rely – at least in part – on differences in gut microbiota composition.
It is intriguing that not only Blastocystis, but also other CLIPPs have been shown to be linked to specific microbiota patterns (Stensvold and van der Giezen, Reference Stensvold and van der Giezen2018). Studying gut microbiomes of rural Africans, Morton et al. (Reference Morton, Lynch, Froment, Lafosse, Heyer, Przeworski, Blekhman and Ségurel2015) could predict the presence/absence of Entamoeba with 79% accuracy, based on the composition of an individual's gut microbiota. Along similar lines, Xiong et al. (Reference Xiong, Yu, Dai, Zhang, Qiu and Ou2018) found that shrimp health status could be predicted with 92.4% accuracy based on eukaryotic taxon profiling.
Nucleated life within the human intestine also includes fungi. Common genera found in stool include Candida, Saccharomyces, Malassezia, Pichia and Aspergillus (Laforest-Lapointe and Arrieta, Reference Laforest-Lapointe and Arrieta2018); however, our understanding of the extent to which these genera in fact colonize the human intestinal tract or merely reflect dietary components is incomplete, and recent evidence appears to suggest that fungal colonization of the intestinal tract of healthy individuals is minimal (Auchtung et al., Reference Auchtung, Fofanova, Stewart, Nash, Wong, Gesell, Auchtung, Ajami and Petrosino2018).
The faecal eukaryome – mapping of eukaryotic diversity in vertebrate stool
As observed by e.g. Hamad et al. (Reference Hamad, Abou Abdallah, Ravaux, Mokhtari, Tissot-Dupont, Michelle, Stein, Lagier, Raoult and Bittar2018), differences in observed microbiome profiles may reflect differences in DNA extraction protocols, DNA amplification and sequencing technologies, plus queried databases (SILVA, Greengenes, RDP, NCBI, self-curated databases, etc.). So far, mapping of eukaryotic diversity in human and non-human stool samples has used mainly one of two approaches: Shotgun sequencing or amplicon-based sequencing of genomic DNA extracted from stool (Cristescu, Reference Cristescu2014). The applicability of shotgun sequencing in terms of detecting and differentiating CLIPPs is hampered by the fact that relatively few CLIPPs genomes are available for reference. Amplicon-based sequencing has typically used nuclear small subunit ribosomal DNA (18S) as the target. However, some variation in amplicon-based approaches is seen, mostly in terms of the choice of target(s) and DNA sequence data processing. The most informative regions of the 18S appear to be the V3, V4, V5 and the V9 regions (Maritz et al., Reference Maritz, Rogers, Rock, Liu, Joseph, Land and Carlton2017; Krogsgaard et al., Reference Krogsgaard, Andersen, Johannesen, Engsbro, Stensvold, Nielsen and Bytzer2018). As an example, Krogsgaard and colleagues used three different primer sets for eukaryotic DNA (G3F1/G3R1 and G6F1/G6R1 targeting the V3–V4 region of the 18S rRNA gene and G4F1/G4R1 targeting the V3–V5 region) and one set of primers for prokaryotic DNA [341F/806R (Yu et al., Reference Yu, Lee, Kim and Hwang2005)] (Krogsgaard et al., Reference Krogsgaard, Andersen, Johannesen, Engsbro, Stensvold, Nielsen and Bytzer2018). Sequences were mapped using BION (http://box.com/bion), a newly developed k-mer-based analytical semi-commercial open-source package which allows annotation to species level. 
Prokaryotic DNA sequences were mapped against the RDP 11.04 reference database, while eukaryotic DNA sequences were mapped using SILVA version 123 reference database with an improved in-house seven-tier taxonomy for eukaryotes, similar to the tiers defined for prokaryotes (phylum, class, order, family, genus, species and sequence levels).
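The general idea behind k-mer-based annotation of reads against a reference database can be sketched as follows. This is a deliberately simplified illustration and not the algorithm implemented in BION: the function names, the choice of k = 8, the scoring rule (fraction of the read's k-mers found in a reference) and the two short sequences are all assumptions made for the example.

```python
def kmers(seq, k):
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(read, references, k=8):
    """Return (label, score) for the reference sharing the largest
    fraction of the read's k-mers; (None, 0.0) if nothing is shared."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return None, 0.0
    best_label, best_score = None, 0.0
    for label, ref_seq in references.items():
        shared = len(read_kmers & kmers(ref_seq, k))
        score = shared / len(read_kmers)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Made-up sequences standing in for reference 18S fragments (not real data).
references = {
    "taxon_X": "ACGTTGCAAGCTTAGGCTAACGGTACCGTTAGC",
    "taxon_Y": "TTGACCGGATATCCGGTTAAGGCCTTAACCGGA",
}
read = "GCAAGCTTAGGCTAACGGTACC"  # an exact substring of the taxon_X reference

print(best_match(read, references))
```

A read can only be annotated as well as the database allows: if no closely related reference sequence is present, the best score stays low regardless of the algorithm, which is why the choice and curation of the queried database matter so much for eukaryotic profiling.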
Published data on differences in the eukaryome across vertebrate populations and links between bacterial and eukaryotic signatures are still scarce.
Krogsgaard et al. (Reference Krogsgaard, Engsbro, Stensvold, Nielsen and Bytzer2015) found that CLIPPs diversity was higher in healthy individuals compared with patients with irritable bowel syndrome and also observed that individuals colonized by CLIPPs typically had a higher bacterial richness and diversity than those without (Krogsgaard et al., Reference Krogsgaard, Andersen, Johannesen, Engsbro, Stensvold, Nielsen and Bytzer2018).
Heitlinger et al. (Reference Heitlinger, Ferreira, Thierer, Hofer and East2017) used 4 16S and 44 18S primers in a Fluidigm-based approach, followed by taxonomic analysis using dada2 to map eukaryotic diversity in spotted hyenas. While no differences were found in eukaryome richness, diversity, evenness or genus abundance across age groups in a population of spotted hyenas, a more diverse eukaryome was identified in high-ranking than in low-ranking animals (Heitlinger et al., Reference Heitlinger, Ferreira, Thierer, Hofer and East2017).
Maritz et al. (Reference Maritz, Rogers, Rock, Liu, Joseph, Land and Carlton2017) recently developed and evaluated an 18S rRNA assay employing Illumina-based sequencing and annotation of sequence data using locally curated as well as QIIME formatted SILVA databases with a view to detecting and differentiating protists in sewage with special emphasis on trichomonads. The team used vertebrate blocking primers to increase protist data yield (Maritz et al., Reference Maritz, Rogers, Rock, Liu, Joseph, Land and Carlton2017). Choice of primers is critical too, as evidenced by the differing outcomes in terms of e.g. Amoebozoan data obtained by Moreno et al. (Reference Moreno, Matz, Kjelleberg and Manefield2010) and Matsunaga et al. (Reference Matsunaga, Kubota and Harada2014), who both aimed at mapping eukaryotic diversity in wastewater/sludge.
The extent to which primate gut eukaryotic diversity is only rudimentarily reflected in reference databases can be exemplified by the following: In a metabarcoding study of non-human primate gut eukaryomes, only 0.01% of all SSU rDNA reads matched sequences in the Silva 123 database at a 100% threshold (Wilcox and Hollocher, Reference Wilcox and Hollocher2018). In that study, de novo operational taxonomic unit (OTU) assignment revealed 4293 eukaryotic OTUs at a 97%-identity level, and reference-based taxonomy assignment matched sequences to 2021 unique eukaryotic genera. Investigating the sewage eukaryome of sludge digesters in Japan, Matsubayashi et al. (Reference Matsubayashi, Shimada, Li, Harada and Kubota2017) found that 85% of the clones obtained by 18S rRNA gene clone library construction showed less than 97.0% sequence identity to what they termed as ‘described eukaryotes’, indicating most of the eukaryotes in anaerobic sludge digesters are largely unknown.
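What an identity threshold such as 97% means in practice can be illustrated with a minimal sketch of greedy, centroid-based OTU clustering. The sketch assumes sequences that are already aligned to equal length and uses toy data; it is not the pipeline used in the studies cited above, which rely on dedicated alignment-based tools.

```python
def identity(a, b):
    """Fraction of identical positions between two pre-aligned,
    equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches at or above
    the threshold; otherwise let it found a new OTU."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:  # no existing centroid was similar enough
            centroids.append(s)
            clusters.append([s])
    return clusters


# Toy 100-bp sequences (not real data).
a = "ACGT" * 25
b = "TT" + a[2:]          # differs from a at 2 positions (98% identity)
c = "T" * 10 + a[10:]     # differs from a at 8 positions (92% identity)

print(len(greedy_otus([a, b, c])))
```

Here the first two sequences collapse into one OTU while the third founds its own. By the same logic, a clone showing less than 97.0% identity to every described eukaryote cannot be placed in any existing unit at that threshold.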
Advancing the mapping of intestinal eukaryotic diversity: Wastewater and new sequencing technologies – the way forward?
In summary, the characterization of nuclear small subunit (SSU) ribosomal RNA genes has been the backbone of DNA-mapping the tree of life. In the field of clinical microbiology, taxon-specific genetic variation across nuclear SSU ribosomal RNA genes has been instrumental to the development of a vast variety of targeted DNA-based diagnostic methods over the past few decades (Verweij and Stensvold, Reference Verweij and Stensvold2014); however, the development and use of such diagnostics are limited by the DNA sequence data available in NCBI (Stensvold et al., Reference Stensvold, Lebbad and Verweij2011b).
The SSU rRNA gene has proved useful for the detection and differentiation of several species of parasites. For helminths, however, this gene generally appears very conserved, and mitochondrial genes or ITS data are taxonomically more informative. Likewise, ITS data appear more relevant for differentiating between non-parasitic eukaryotic organisms often found in the gut, such as yeasts and molds, and so the genes providing most taxonomic resolution differ and depend on the type of organism.
The presence of large intra-generic diversity in some parasites has spurred hypotheses on differences in pathogenicity being associated with species/subtype/genotype, and so our ability to detect and differentiate not only genera and species but also subtypes, ribosomal lineages, etc., is important. Again, while the 18S has proved particularly useful in differentiating between Blastocystis subtypes and even subtype alleles (Stensvold et al., Reference Stensvold, Alfellani and Clark2011a), this marker provides very little resolution within the species of for instance D. fragilis. For other parasites, such as a couple of genera belonging to the Amoebozoa, namely Entamoeba, Endolimax and Iodamoeba, we are only beginning to appreciate the vast extent of genetic diversity (Silberman et al., Reference Silberman, Clark, Diamond and Sogin1999; Clark, Reference Clark2000; Verweij et al., Reference Verweij, Polderman and Clark2001; Stensvold et al., Reference Stensvold, Lebbad and Clark2010, Reference Stensvold, Lebbad, Victory, Verweij, Tannich, Alfellani, Legarraga and Clark2011c; Royer et al., Reference Royer, Gilchrist, Kabir, Arju, Ralston, Haque, Clark and Petri2012; Jacob et al., Reference Jacob, Busby, Levy, Komm and Clark2016; Elsheikha et al., Reference Elsheikha, Regan and Clark2018). The work and methodological limitations involved in mapping the intra-generic diversity in these organisms have led to issues related to resolving the phylogeny among this group of organisms and left some ‘dark holes’ in publicly available databases. 
Briefly, the largest limitations here are as follows: although hypervariable regions within 18S, ITS or 28S may prove useful for studies into eukaryotic diversity, robust analysis of phylogenetic relationships, including the very delineation of novel ribosomal lineages, and optimal yield of analysis of sequence data from metagenomics or other amplicon-based sequencing studies require sequencing of complete, or near-complete, ribosomal genes. When genomic DNA extracted directly from e.g. stool is used, the application of general primers with a view to amplifying near-complete ribosomal genes often results in preferential amplification of some organisms over others. As an example, individuals colonized by Iodamoeba and/or Endolimax are typically co-colonized with Blastocystis, and because the length of the SSU rRNA gene is only 1.8 kbp in Blastocystis but 2.5 kbp or more in Iodamoeba and Endolimax, Blastocystis ribosomal genes are more likely to be amplified from faecal genomic DNA due to the shorter DNA sequence. Another limitation is related to intra-cellular variation (hypervariable regions), which makes Sanger sequencing of polymerase chain reaction (PCR) products of some sequence stretches unsuitable, e.g. due to the presence of sequence variation within homo-polymers. TA cloning of PCR products has been tried with some success, but this is relatively expensive, time-consuming and laborious (Stensvold et al., Reference Stensvold, Lebbad and Clark2012). Even next-generation sequencing methods such as Illumina do not provide much better solutions to overcoming this issue. Clearly, alternative ways to effectively obtain data are needed.
Meanwhile, Pacific Biosciences (PacBio) RS II, considered a third-generation sequencer, uses single-molecule real-time technology and can be used for sequencing of single DNA molecules in real time without prior amplification steps, enabling direct observation of DNA synthesis by DNA polymerase (Nakano et al., 2017). Importantly, this technology enables the production of long reads (typically >20 kbp with a maximum of 60 kbp) at relatively low costs (Nakano et al., 2017). Orr and colleagues used culturing and targeted PacBio RS II amplicon sequencing to expand the data on diversity within the class Diphyllatea, a group of protists that may represent one of the earliest diverging eukaryotic lineages (Orr et al., 2018). By obtaining near full-length 18S rRNA sequences in addition to mining publicly available databases, they were able to resolve the phylogeny within the class and better map the distribution of its members. The technology was also recently used for characterizing and quantifying protistan sequences from environmental samples (Jones and Kustka, 2017), and in terms of gut microbial diversity, one of the few studies using it so far is that by Myer et al. (2016), who generated data for phylogenetic analysis of rumen bacterial communities.
A limitation of this technology is the relatively high rate of errors introduced during sequencing; however, there are several ways to reduce or completely eliminate these errors using software tools and by decreasing the time the machine is used. Moreover, PacBio appears to be better at overcoming the issues related to the sequencing of hypervariable regions that e.g. Illumina sequencing may have problems with. Critics of PacBio might argue that this technology should rather be seen as an adjunctive, supportive and possibly exploratory tool that may provide a scaffold to inform and guide more sophisticated and precise analyses. Such analyses could include Illumina-based sequencing of overlapping 300–400-bp amplicons using sequence-specific primers. Nevertheless, complete and accurate de novo assemblies of Escherichia coli strains could be accomplished using data generated solely from the PacBio RS II (Powers et al., 2013). The team found that adding data from other sequencing technologies, namely Ion Torrent and MiSeq, offered no improvement over the use of PacBio data alone (Powers et al., 2013).
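One general way in which software can reduce random sequencing errors is to read the same molecule several times and take a per-position majority vote. The sketch below illustrates that idea only; it is not the procedure used in any of the studies cited here. It assumes that the reads are already aligned to equal length, and the function name and example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-column majority base of equal-length, pre-aligned reads."""
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) yields the most frequent base in this column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical repeated reads of one molecule; three carry a single error each.
passes = [
    "ACGTTGCA",
    "ACGTAGCA",  # error at index 4
    "ACCTTGCA",  # error at index 2
    "ACGTTGCA",
    "ACGTTGGA",  # error at index 6
]
print(majority_consensus(passes))
```

Because the errors fall at different positions in different reads, each is outvoted by the remaining reads. Errors that recur at the same position in most reads would not be removed this way.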
Apart from identifying the best possible technological and data processing pipelines, it is also worthwhile considering types of material for studying diversity. For instance, untreated sewage may be particularly useful in terms of detecting and mapping micro-eukaryotic diversity, since this material reflects pooled faecal samples from a large population of humans with some spill-over of material from non-human sources.
Chouari et al. (2017) investigated eukaryotic diversity in wastewater using 18S sequencing and identified 160 operational taxonomic units among 1519 analysed sequences. No less than 56.9% of the phylotypes were assigned to novel phylogenetic molecular species, showing <97% sequence similarity with their nearest affiliated representative within public databases. Similarly, Matsunaga et al. (2014) observed that 60% of their 18S rRNA gene clones obtained from DNA extracted from municipal wastewater had <97% sequence identity to described eukaryotes. In both studies, Blastocystis and Amoebozoa were detected. These studies highlight not only the vast DNA data gap in the eukaryotic tree of life, but also the relevance of using sewage as study material for investigations into eukaryotic diversity.
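The <97% criterion used in both studies can be illustrated with a minimal sketch. It assumes that the query and reference sequences are already aligned to equal length (gaps written as '-'); the function names and toy sequences are hypothetical, and the cited studies used their own alignment and clustering pipelines.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns, skipping columns that are gaps in both."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_novel(query, references, threshold=97.0):
    """True if the best identity to any reference falls below the threshold."""
    best = max(percent_identity(query, ref) for ref in references)
    return best < threshold


# Toy 100-nt reference and two hypothetical queries.
reference = "ACGT" * 25
query_close = "TG" + reference[2:]    # 2 mismatches
query_far = "TGCA" + reference[4:]    # 4 mismatches

print(percent_identity(query_close, reference), is_novel(query_close, [reference]))
print(percent_identity(query_far, reference), is_novel(query_far, [reference]))
```

In this toy case, the query with two mismatches stays above the threshold, whereas the query with four mismatches falls below it and would be counted as novel. How gaps and partial-length sequences are treated changes the identity value, so the rule should be stated explicitly in any real analysis.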
In conclusion, DNA mapping of nucleated life within the intestine and exploring it in ecological contexts are critical to furthering our understanding of gut microbial diversity and its role in health and disease. Application of more detailed reference data will allow for subtle and robust trans-kingdom analyses of gut microbes and will moreover expand our knowledge on host specificity, transmission patterns and links to clinical phenotypes. The use of genomic DNA from pooled stool, as represented e.g. by sewage, together with amplicon-based third-generation sequencing may be a way to ensure the quick acquisition of robust data to uncover the missing branches of the gut microbial eukaryotic tree.
Author ORCIDs
Christen Rune Stensvold, 0000-0002-1417-7048.
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Conflict of interest
None.
Ethical standards
Not applicable.