Yaqiong Guo, Kevin Tang, Lori A Rowe, Na Li, Dawn M Roellig, Kristine Knipe, Michael Frace, Chunfu Yang, Yaoyu Feng, Lihua Xiao.
Abstract
BACKGROUND: Cryptosporidium hominis is a dominant species for human cryptosporidiosis. Within the species, IbA10G2 is the most virulent subtype, responsible for all C. hominis-associated outbreaks in Europe and Australia, and is a dominant outbreak subtype in the United States. In recent years, IaA28R4 has become a major new subtype in the United States. In this study, we sequenced the genomes of two field specimens from each of the two subtypes and conducted a comparative genomic analysis of the obtained sequences with those from the only fully sequenced Cryptosporidium parvum genome.
Year: 2015 PMID: 25903370 PMCID: PMC4407392 DOI: 10.1186/s12864-015-1517-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of sequence data from whole genome sequencing of four specimens in comparison with data from the published C. hominis (TU502) and C. parvum (IOWA) genomes
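| Specimen (subtype) | Sequencing platform | Total bases | Reads | Contigs | Total contig length (bp) | Mean contig length (bp) | Min. contig length (bp) | Max. contig length (bp) | N50 (bp) | Coverage (fold) |
|---|---|---|---|---|---|---|---|---|---|---|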
| 30976 (IaA28R4) | Illumina Genome Analyzer IIx 100 bp paired end | 5,780,028,818 | 64,449,544 | 6,140 | 22,133,082 | 3,605 | 502 | 1,279,890 | 145,968 | 257 |
| 37999 (IbA10G2) | Illumina Genome Analyzer IIx 100 bp paired end | 2,798,259,889 | 30,886,077 | 78 | 9,054,010 | 116,077 | 510 | 1,029,232 | 406,678 | 307 |
| 33537 (IaA28R4) | 454 GS-FLX Titanium | 431,742,212 | 1,157,140 | 1,464 | 14,065,231 | 9,607 | 501 | 154,507 | 27,749 | 31 |
| 30974 (IbA10G2) | 454 GS-FLX Titanium | 382,520,957 | 1,048,412 | 443 | 8,841,752 | 19,959 | 513 | 325,032 | 78,110 | 43 |
| C. hominis TU502 | Sanger | - | - | 1,422 | 8,743,570 | 6,149 | 251 | 90,444 | 14,504 | 12 |
| C. parvum IOWA | Sanger | - | - | 18 | 9,102,324 | 504,874 | 17,388 | 1,278,458 | 1,014,526 | 13 |
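The contig-level summaries reported in a table like the one above can be derived from a plain list of contig lengths. The sketch below is a minimal illustration and not the authors' pipeline; the function name `assembly_stats` and the toy lengths are assumptions made for the example. As a sanity check on the table itself, 22,133,082 / 6,140 ≈ 3,605, which matches the value listed for specimen 30976.

```python
from typing import Dict, List


def assembly_stats(contig_lengths: List[int]) -> Dict[str, float]:
    """Summarize an assembly from its contig lengths (in bp).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)

    # N50: walking from the longest contig downward, the length of the
    # contig at which the running sum first reaches half the total length.
    running = 0
    n50 = lengths[-1]
    for length in lengths:
        running += length
        if 2 * running >= total:
            n50 = length
            break

    return {
        "contigs": len(lengths),
        "total_bp": total,
        "mean_bp": total / len(lengths),
        "min_bp": lengths[-1],
        "max_bp": lengths[0],
        "n50_bp": n50,
    }


# Toy contig lengths, invented for illustration only.
toy = assembly_stats([100, 200, 300, 400, 500])

# Mean contig length recomputed from the totals listed for specimen 30976.
mean_30976 = round(22_133_082 / 6_140)
```

Only lengths are needed for these statistics, so the same function works for any assembler output once contig lengths have been extracted.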
Coverage of four genomes sequenced in this study and sequence similarities to published C. parvum (IOWA) and C. hominis (TU502) genomes
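| Chromosome | IOWA length (bp) | Contigs | Length (bp) | Coverage (%) | Similarity to IOWA (%) | Similarity to TU502 (%) | Contigs | Length (bp) | Coverage (%) | Similarity to IOWA (%) | Similarity to TU502 (%) | Contigs | Length (bp) | Coverage (%) | Similarity to IOWA (%) | Similarity to TU502 (%) | Contigs | Length (bp) | Coverage (%) | Similarity to IOWA (%) | Similarity to TU502 (%) | Contigs | Length (bp) | Coverage (%) | Similarity to IOWA (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|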
| 1 | 875659 | 15 | 867575 | 99.08 | 96.8 | 99.79 | 1 | 873289 | 99.73 | 96.81 | 99.86 | 36 | 863586 | 98.62 | 96.84 | 99.82 | 54 | 840604 | 96 | 96.93 | 99.86 | 124 | 859754 | 98.18 | 96.88 |
| 2 | 985969 | 7 | 987017 | 100.11 | 96.79 | 99.58 | 8 | 983830 | 99.78 | 96.82 | 99.81 | 41 | 970191 | 98.4 | 96.81 | 99.77 | 69 | 944487 | 95.8 | 96.9 | 99.82 | 115 | 946071 | 95.95 | 96.73 |
| 3 | 1099352 | 13 | 1098355 | 99.91 | 96.89 | 99.62 | 13 | 1096430 | 99.73 | 96.86 | 99.79 | 40 | 1081251 | 98.35 | 96.86 | 99.75 | 87 | 1064015 | 96.78 | 96.98 | 99.81 | 158 | 1079381 | 98.18 | 96.78 |
| 4 | 1104417 | 3 | 1103687 | 99.93 | 96.76 | 99.72 | 4 | 1105075 | 100.06 | 96.76 | 99.8 | 86 | 1056974 | 95.7 | 96.81 | 99.74 | 123 | 1024384 | 92.75 | 96.93 | 99.82 | 195 | 1007110 | 91.19 | 96.77 |
| 5 | 1080900 | 13 | 1092751 | 101.1 | 96.78 | 99.74 | 11 | 1107822 | 102.49 | 96.78 | 99.84 | 71 | 1013283 | 93.74 | 96.93 | 99.77 | 118 | 940540 | 87.01 | 97.09 | 99.83 | 186 | 972978 | 90.02 | 96.79 |
| 6 | 1332857 | 5 | 1304591 | 97.88 | 96.91 | 99.76 | 2 | 1298888 | 97.45 | 96.93 | 99.86 | 66 | 1263267 | 94.78 | 97.01 | 99.76 | 124 | 1237394 | 92.84 | 97.09 | 99.83 | 192 | 1240122 | 93.04 | 96.82 |
| 7 | 1278458 | 5 | 1268482 | 99.22 | 97.19 | 99.79 | 4 | 1269257 | 99.28 | 97.2 | 99.87 | 27 | 1267106 | 99.11 | 97.18 | 99.82 | 79 | 1258429 | 98.43 | 97.25 | 99.88 | 124 | 1282777 | 100.34 | 97.18 |
| 8 | 1344712 | 3 | 1319172 | 98.1 | 96.78 | 99.76 | 2 | 1319721 | 98.14 | 96.8 | 99.83 | 57 | 1300516 | 96.71 | 96.81 | 99.78 | 113 | 1281066 | 95.26 | 96.9 | 99.81 | 175 | 1304075 | 96.98 | 96.74 |
| Total | 9102324 | 64 | 9041990 | 99.34 | 96.86 | 99.72 | 45 | 9054312 | 99.47 | 96.87 | 99.83 | 424 | 8816174 | 96.93 | 96.9 | 99.78 | 767 | 8590919 | 94.36 | 97 | 99.83 | 1269 | 8692268 | 95.49 | 96.84 |
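The percentage columns in the rows above are consistent with dividing the mapped contig length by the length of the corresponding reference chromosome. The following sketch reproduces two of the tabulated values under that reading; the function name `coverage_percent` is an assumption for illustration, and the paper does not state how the figures were computed.

```python
def coverage_percent(mapped_bp: int, reference_bp: int) -> float:
    """Mapped length as a percentage of the reference length.

    Hypothetical helper; assumes coverage = mapped / reference * 100.
    """
    if reference_bp <= 0:
        raise ValueError("reference_bp must be positive")
    return 100.0 * mapped_bp / reference_bp


# Chromosome 1, first genome in the table: 867575 bp mapped to 875659 bp.
chr1 = round(coverage_percent(867575, 875659), 2)

# Chromosome 2, first genome: mapped length slightly exceeds the reference.
chr2 = round(coverage_percent(987017, 985969), 2)

# Whole-genome total for the first genome.
total = round(coverage_percent(9041990, 9102324), 2)
```

Values above 100, as for chromosome 2, simply mean that the mapped contigs are together slightly longer than the reference chromosome.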
Figure 1. Structural organization of two Illumina-sequenced genomes of Cryptosporidium hominis compared with the eight chromosomes (numbered and separated by vertical red lines) of the published Cryptosporidium parvum genome. The color blocks (known as Locally Collinear Blocks) are conserved segments of sequence internally free from genome rearrangements, whereas the inverted white peaks within each block indicate sequence divergence between the reference C. parvum (IOWA) genome and the C. hominis genome under analysis. A. Coverage of two C. hominis genomes showing possible sequence rearrangements in chromosomes 2, 4, 5 and 6. Assembled contigs are bordered by vertical red lines. For specimen 30976, only Cryptosporidium contigs were used in mapping. B. Possible sequence rearrangements at the 5′ end of chromosome 2. C. Possible sequence rearrangements in chromosomes 4 and 5.
Figure 2. Deletion of genes in Cryptosporidium hominis genomes in comparison with Cryptosporidium parvum. A. Deletion of four genes (cgd6_5480, cgd6_5490, cgd6_5510, and cgd6_5520) at the 3′ end of chromosome 6 (probably should be the 5′ end of chromosome 5) in C. hominis. B. A major 19,048-bp deletion in chromosome 8 of the C. hominis genome, including the cgd8_680 and cgd8_690 genes. Note the ~10 kb sequence gap in C. parvum.
Coverage of two Illumina-sequenced genomes in sequence gaps of the published IOWA genome
| Chromosome | Gap size in IOWA (bp) | | |
|---|---|---|---|
| 2 | 100* | 1481 | 1481 |
| 3 | 500 | 5582 | 5576 |
| 4 | 100 (1st)* | >458 | 1450 |
| 4 | 100 (2nd)* | 2937 | 2714 |
| 5 | 100 (1st)* | 14,926 | 14,929 |
| 5 | 100 (2nd)* | 1788 | 1311 |
| 5 | 1,000 (3rd) | Not covered | 467 |
| 6 | 2,500 | 245 | 245 |
| 6 | 100* | >538 (ending with telomeric repeats) | >857 (ending with telomeric repeats) |
| 8 | 10,000 | 19,048 bp deletion spanning entire gap | 19,048 bp deletion spanning entire gap |
*Regions where inversions and translocations of sequences occurred in sequenced C. hominis genomes.
Species-specific genes in genomes of Cryptosporidium parvum and Cryptosporidium hominis
| Species | Chromosome | Fragment size (bp) | Genes |
|---|---|---|---|
| C. parvum | 8 | 19,048 | cgd8_680, cgd8_690 and other potential genes in 10,000 bp sequence gap |
| C. parvum | 6 | 15,314 | cgd6_5480, cgd6_5490, cgd6_5510, cgd6_5520 |
| C. parvum | 5 | 5,620 | cgd5_4580, cgd5_4590, cgd5_4610 |
| C. hominis | 3 | ~4800 | Chro.50011 |
Notes:
1. cgd5_4580, cgd5_4590, cgd5_4600, and cgd5_4610: four genes with similar sequences at the 3′ end of chromosome 5 in C. parvum, all annotated as members of the telomeric MEDLE family of secreted proteins. C. hominis has only one such gene here (Chro.50507, the ortholog of cgd5_4600).
2. cgd6_5480 and cgd6_5490: two genes of the telomeric MEDLE family of secreted proteins with similar sequences at the 3′ end of chromosome 6 in C. parvum. C. hominis has no such gene here. The two genes have sequences similar to the four genes above. This fragment and cgd6_5510 (ZPT) and cgd6_5520 below are located at the 5′ end of chromosome 5 in the C. hominis genomes sequenced. C. hominis specimen 37999 does not appear to have the ortholog of cgd6_5470, although 30976 clearly has it. The ortholog of cgd6_5500 is apparently translocated to an unknown chromosome in C. hominis, downstream of the ortholog of cgd5_4580.
3. cgd6_5510 (ZPT) and cgd6_5520: telomeric insulinase-like proteases with a signal peptide (the two genes have very different sequences). C. parvum has 11 such genes near the 3′ end of chromosome 3.
4. cgd8_680: a large low-complexity protein with repeats. cgd8_690: a signal peptide-containing protein with 2 Cryptosporidium-specific paralogs (cgd8_660 and its ortholog Chro.80081).
Figure 3. Cryptosporidium hominis-specific nature of Chro.50011. A. Insertion of ~4,860 bp containing the Chro.50011 gene at the 3′ end of chromosome 3 in C. hominis. B. Confirmation of the absence of the ortholog of Chro.50011 in four specimens of C. parvum by PCR analysis of three regions of the Chro.50011 gene. The faint band in PCR analysis of the 3′ end of the gene in C. parvum specimen 38416 produced a nucleotide sequence identical to Chro.50011 in C. hominis.
Figure 4. Distribution of SNPs in the Cryptosporidium hominis genome by chromosome. The number of SNPs in a sliding window of 2,000 bp with 200 bp steps across each of the eight chromosomes is shown. A. Sequence divergence between specimen 30976 of the IaA28R4 subtype and the published isolate TU502 of the IaA25R3 subtype. B. Sequence divergence between specimen 37999 of the IbA10G2 subtype and specimen 30976 of the IaA28R4 subtype.
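The Figure 4 caption describes counting SNPs in a 2,000 bp window moved in 200 bp steps along each chromosome. A minimal sketch of that kind of count follows. It assumes 0-based SNP positions, half-open windows, and that only full-length windows are reported; the function name and the toy positions are hypothetical, and the caption does not say how chromosome ends were handled.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def snp_counts_in_windows(
    snp_positions: Sequence[int],
    chrom_length: int,
    window: int = 2000,
    step: int = 200,
) -> List[Tuple[int, int]]:
    """Return (window_start, snp_count) for each full sliding window.

    Windows are half-open intervals [start, start + window) over
    0-based positions. Illustrative only; not the authors' code.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    positions = sorted(snp_positions)
    counts = []
    last_start = max(chrom_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        # Binary search gives the number of positions in [start, end).
        n = bisect_left(positions, end) - bisect_left(positions, start)
        counts.append((start, n))
    return counts


# Toy example: four SNPs on a 3,000 bp sequence (invented values).
toy_counts = snp_counts_in_windows([100, 150, 1900, 2500], 3000)
```

Because the step is one tenth of the window, each SNP contributes to up to ten consecutive windows, which smooths the profile and makes dense regions appear as broad peaks.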
Highly polymorphic loci in Cryptosporidium hominis genomes
| Locus | Contig | | C. hominis gene | C. parvum ortholog | Annotation |
|---|---|---|---|---|---|
| chr1_var1 | contig_5 | 5.0 | Chro.10024 | cgd1_150 | Hypothetical protein with a signal peptide |
| chr1_var2 | contig_5 | 9.0 | Chro.10427 | cgd1_3810 | Conserved hypothetical protein with a signal peptide |
| chr2_var1 | contig_32 | 9.5 | Chro.20050-52 | cgd2_430-450 | Mucin glycoprotein with a signal peptide |
| chr2_var2 | contig_20 | 5.0 | Intergenic downstream of Chro.20394 | Intergenic downstream of cgd2_3690 | WD repeat protein (cgd2_3690) |
| chr3_var1 | contig_59 | 5.0 | Intergenic downstream of Chro.30096 | Within cgd3_720 | Very large mucin with a signal peptide |
| chr3_var2 | contig_31 | 6.5 | Chro.30315 | cgd3_2770 | Hypothetical conserved protein |
| chr3_var3 | contig_293 + contig_255 | 7.0 | Chro.30479 | cgd3_4260 | Insulinase-like protease |
| chr6_var1 | contig_3 | 9.0 | Chro.60606 | cgd6_5270 | Hypothetical protein with a signal peptide |
| chr8_var1 | contig_11 | 5.5 | Chro.80070 | cgd8_550 | Large uncharacterized protein |
| chr8_var2 | contig_2 | 6.0 | Chro.80189 | cgd8_1610 | Sacsin-like HSP90 chaperone domain |
*Additional polymorphic genes identified by comparative analysis of other isolates with C. hominis TU502: Chro.60016 (ortholog of cgd6_60), and Chro.60138 (ortholog of cgd6_1080).
Genetic recombination in chromosome 6 of two virulent subtypes
| Specimen (subtype) | cgd6_60 | | |
|---|---|---|---|
| 30974 (IbA10G2) | IbA10G2 | IbA10G2 | IaA25R3 |
| 37999 (IbA10G2) | IaA25R3* | IbA10G2 | IaA25R3 |
| 30976 (IaA28R4) | IaA25R3 | IaA28R4 | IaA28R4 |
| 33537 (IaA28R4) | IaA25R3 | IaA28R4 | IaA28R4 |
| TU502 (IaA25R3) | IaA25R3 | IaA25R3 | IaA25R3 |
*15/16 SNPs at the 3′ end of cgd6_60 are unique in 37999.
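The table above assigns a sequence type to each of three chromosome 6 loci per specimen. One simple way to read it is to flag any specimen whose loci do not all carry the same type, since a mixed pattern is what recombination would be expected to produce. The sketch below encodes the table rows in column order; the dictionary layout and the function name `is_mosaic` are assumptions for illustration, and a mixed pattern is only suggestive rather than proof of recombination.

```python
from typing import Dict, Tuple

# Sequence types at the three loci, in the column order of the table.
# The footnoted cell for specimen 37999 is stored without its asterisk.
locus_types: Dict[str, Tuple[str, str, str]] = {
    "30974": ("IbA10G2", "IbA10G2", "IaA25R3"),
    "37999": ("IaA25R3", "IbA10G2", "IaA25R3"),
    "30976": ("IaA25R3", "IaA28R4", "IaA28R4"),
    "33537": ("IaA25R3", "IaA28R4", "IaA28R4"),
    "TU502": ("IaA25R3", "IaA25R3", "IaA25R3"),
}


def is_mosaic(types: Tuple[str, ...]) -> bool:
    """True when the loci do not all share one sequence type."""
    return len(set(types)) > 1


# Specimens with a mixed pattern across the three loci.
mosaic = sorted(name for name, types in locus_types.items() if is_mosaic(types))
```

On these rows, every specimen sequenced in the study shows a mixed pattern, while the reference TU502 is uniform across the three loci.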