Rodrigo P Baptista, Garrett W Cooper, Jessica C Kissinger.
Abstract
Cryptosporidiosis is ranked sixth in the list of the most important food-borne parasites globally, and it is an important contributor to mortality in infants and the immunosuppressed. Recently, the number of genome sequences available for this parasite has increased drastically. The majority of the sequences are derived from population studies of Cryptosporidium parvum and Cryptosporidium hominis, the most important species causing disease in humans. Work with this parasite is challenging because it lacks an optimal, prolonged in vitro culture system that accurately reproduces the in vivo life cycle. This obstacle makes the cloning of isolates nearly impossible. Thus, patient isolates that are sequenced represent a population or, at times, mixed infections. Oocysts, the life cycle stage currently used for sequencing, must be considered a population even if the sequence is derived from single-cell sequencing of a single oocyst, because each oocyst contains four haploid meiotic progeny (sporozoites). Additionally, the community does not yet have a set of universal markers for strain typing that are distributed across all chromosomes. These variables pose challenges for population studies and require careful analyses to avoid biased interpretation. This review presents an overview of existing population studies, challenges, and potential solutions to facilitate future population analyses.
Keywords: cryptosporidiosis; genome evolution; mixed infections; molecular typing; population structure
Year: 2021 PMID: 34200631 PMCID: PMC8229070 DOI: 10.3390/genes12060894
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Summary of Cryptosporidium genome assembly data available in the NCBI GenBank.
| # of Genome Sequences Available | Sequencing Technology | Gene Evidence: RNAseq a | Gene Evidence: Expressed Sequence Tag Datasets | Gene Evidence: Proteomic Data | # of Genome Annotations Available |
|---|---|---|---|---|---|
| 19 | Sanger, Illumina, 454, ABI SOLiD, PacBio, ONT, HAPPY map data | Yes | Yes | Yes | 2 |
| 12 | Sanger, Illumina, Ion Torrent, 454 | Yes | Yes | Yes | 5 |
| 5 | Illumina | No | No | No | 1 |
| 3 | Illumina | No | No | No | 1 |
| 3 | Illumina | No | No | No | 1 |
| 1 | Sanger and 454 | No | No | Yes | 1 |
| 1 | Illumina | No | No | No | 1 |
| 1 | Illumina | No | No | No | 1 |
| 1 | Illumina | No | No | No | 0 |
| 1 | Illumina | No | No | No | 0 |
| 1 | Illumina | No | No | No | 0 |
| 1 | Illumina | No | No | No | 0 |
| 1 | Illumina | No | No | No | 0 |
| 1 | Illumina | No | No | No | 0 |
| 1 | Illumina, PacBio | Yes | Yes | No | 0 |
a Not available for all life cycle stages. Most data represent only the extracellular oocyst and sporozoite stages. # means number.
Figure 1. A single genetic marker is not representative of the entire genome sequence and evolution of an organism. (A) Comparative maximum likelihood topology analysis of two different Cryptosporidium genes (nt) that are usually used as markers. Dashed red lines represent differences with bootstrap values below 50%, and solid lines represent differences with bootstrap values above 80%. (B) Admixture clustering analysis of C. hominis biallelic variant sites reveals genomic variation within strains of the same gp60 subtype. The number of ancestral populations (K) was predicted by the lowest cross-validation error K value (K = 10) obtained from the admixture analysis. Each column in the graph represents an individual isolate, while each color within the column represents an ancestral population. The gp60 subtype for each isolate is indicated below the columns. GenBank and SRA accession numbers for the sequences utilized are provided in Supplemental Tables S1 and S2, and the methods are described in Supplemental Methods.
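The Figure 1 caption states that K was chosen as the value with the lowest cross-validation error. A minimal sketch of that selection step is shown below. It is not the authors' pipeline: the log-line format `CV error (K=3): 0.525`, the helper names `parse_cv_errors` and `best_k`, and the numbers in `example_log` are all assumptions made for illustration and are not values from the study.

```python
import re

# Assumed log-line format, e.g. "CV error (K=3): 0.525".
# Adjust the pattern if the clustering tool reports errors differently.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def parse_cv_errors(log_lines):
    """Collect {K: cross-validation error} from an iterable of log lines."""
    errors = {}
    for line in log_lines:
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))] = float(match.group(2))
    return errors


def best_k(errors):
    """Return the K with the lowest CV error; ties go to the smaller K."""
    if not errors:
        raise ValueError("no cross-validation errors found")
    return min(errors, key=lambda k: (errors[k], k))


# Hypothetical log lines, invented for this example only.
example_log = [
    "CV error (K=2): 0.640",
    "an unrelated log line",
    "CV error (K=3): 0.525",
    "CV error (K=4): 0.571",
]

errors = parse_cv_errors(example_log)
print(best_k(errors))
```

In practice the errors would be collected from one run per candidate K. Breaking ties toward the smaller K is a design choice made here to prefer the simpler model, not something the caption specifies.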