Donna E Akiyoshi, Hilary G Morrison, Shi Lei, Xiaochuan Feng, Quanshun Zhang, Nicolas Corradi, Harriet Mayanja, James K Tumwine, Patrick J Keeling, Louis M Weiss, Saul Tzipori.
Abstract
Enterocytozoon bieneusi is the most common microsporidian associated with human disease, particularly in the immunocompromised population. In the setting of HIV infection, it is associated with diarrhea and wasting syndrome. Like all microsporidia, E. bieneusi is an obligate, intracellular parasite, but unlike others, it is in direct contact with the host cell cytoplasm. Studies of E. bieneusi have been greatly limited due to the absence of genomic data and lack of a robust cultivation system. Here, we present the first large-scale genomic dataset for E. bieneusi. Approximately 3.86 Mb of unique sequence was generated by paired-end Sanger sequencing, representing about 64% of the estimated 6 Mb genome. A total of 3,804 genes were identified in E. bieneusi, of which 1,702 encode proteins with assigned functions. Of these, 653 are homologs of Encephalitozoon cuniculi proteins. Only one E. bieneusi protein with assigned function had no E. cuniculi homolog. The shared proteins were, in general, evenly distributed among the functional categories, with the exception of a dearth of genes encoding proteins associated with pathways for fatty acid and core carbon metabolism. Short intergenic regions, high gene density, and shortened protein-coding sequences were observed in the E. bieneusi genome, all traits consistent with genomic compaction. Our findings suggest that E. bieneusi is a likely model for extreme genome reduction and host dependence.
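The coverage figure quoted in the abstract follows directly from the assembled sequence and the estimated genome size. The short sketch below reproduces that arithmetic; the function and variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract.
ASSEMBLED_MB = 3.86         # unique sequence generated by sequencing
ESTIMATED_GENOME_MB = 6.0   # estimated genome size


def coverage_percent(assembled_mb: float, genome_mb: float) -> float:
    """Return assembled sequence as a percentage of the estimated genome size."""
    if genome_mb <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * assembled_mb / genome_mb


coverage = coverage_percent(ASSEMBLED_MB, ESTIMATED_GENOME_MB)
print(f"Sequence coverage: {coverage:.1f}%")  # rounds to the reported ~64%
```

The same calculation applied to the E. cuniculi values in the statistics table (2.5 Mb assembled, 2.9 Mb genome) gives about 86%, matching the value listed there.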
Year: 2009 PMID: 19132089 PMCID: PMC2607024 DOI: 10.1371/journal.ppat.1000261
Source DB: PubMed Journal: PLoS Pathog ISSN: 1553-7366 Impact factor: 6.823
Figure 1. Estimation of the genome size of E. bieneusi based on pulsed-field electrophoresis analysis.
The E. bieneusi chromosomes were separated by electrophoresis (see Methods) and stained with ethidium bromide. The sizes of the chromosomal bands (lane 2; 0.92, 1.0, 1.06 Mb) were estimated using the MCID 3.0 software (Imaging Research Inc., St. Catharines, Canada). Based on densitometry analysis (ImageQuant TL 1-D analysis software; GE Healthcare Bio-Sciences Corp., Piscataway, NJ), the ratios of the bands were estimated to be 1∶4∶1. Using these ratios, the genome was estimated to be ∼6 Mb. S. cerevisiae chromosomal size standards (Bio-Rad) were included (lane 1) and their sizes (Mb) are shown.
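The genome size estimate in this caption can be reproduced as a ratio-weighted sum of the band sizes. The sketch below assumes that each unit of the densitometry ratio corresponds to one chromosome co-migrating in that band; this interpretation is an assumption, although it is consistent with the chromosome number of 6 listed in the statistics table. The function name is illustrative.

```python
# Band sizes (Mb) and densitometry ratios taken from the Figure 1 caption.
band_sizes_mb = [0.92, 1.0, 1.06]
band_ratios = [1, 4, 1]


def estimate_genome_size(sizes_mb, ratios):
    """Sum band sizes weighted by the relative intensity of each band.

    Assumption: a ratio of n means n chromosomes of about that size
    co-migrate in the band.
    """
    if len(sizes_mb) != len(ratios):
        raise ValueError("each band needs exactly one ratio")
    return sum(size * ratio for size, ratio in zip(sizes_mb, ratios))


genome_mb = estimate_genome_size(band_sizes_mb, band_ratios)
chromosome_count = sum(band_ratios)
print(f"Estimated genome size: {genome_mb:.2f} Mb across {chromosome_count} chromosomes")
```

Under that assumption the weighted sum is 5.98 Mb, in line with the reported estimate of roughly 6 Mb.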
E. bieneusi statistics with a comparison to the E. cuniculi genome.
| Feature | E. bieneusi | E. cuniculi |
| --- | --- | --- |
| Genome Size, Mb | 6 | 2.9 |
| Chromosome Number | 6 | 11 |
| Scaffolds | 1,646 | 11 |
| Scaffold N50 bp | [2,349] | NA |
| Contigs | [1,742] | NA |
| Contig N50 bp | [1,977] | NA |
| Assembled Mb | [3.86] | 2.5 |
| Sequence Coverage, % | [64] | 86 |
| G+C Content, % | [25] | 47 |
| Gene Models | [3,804] | 2,063 |
| Gene Density | [1/1,148 bases] | 1/1,025 bases |
| No. of SSU-LSU rRNA Genes | Unknown | 22 |
| No. of 5S rRNA Genes | Unknown | 3 |
| No. of tRNAs |
| 46 |
| No. and Sizes of tRNA Introns | [2 (13, 30 bp)] | 2 (16, 42 bp) |
| No. of tRNA Synthetases |
| 19 |
| No. and Sizes of Spliceosomal Introns | [19 (36–306 bp)] | 13 (23–52 bp) |
| Predicted CDS | [3,632] | 1,997 |
| Mean Intergenic Region, bp | [127] | 129 |
| Median CDS, bp | [579] | 858 |
| Mean Size of CDS, bp | [995] | 1,077 |
| Overlapping CDS? | Yes | Yes |
| No. of CDS Assigned to Functional Categories | [653 (39%)] | 884 (44%) |
Estimation based on PFGE data.
Values based on analysis of contigs >5 kb and their encoded genes. Bracketed values indicate values based on the survey data and not a complete genome.
The 5S, 5.8S-SSU and LSU rRNA genes have been identified in E. bieneusi but are located on short contigs, with very short regions of sequence upstream and downstream, suggesting that these are either surrounded by sequences that are difficult to clone into E. coli or are difficult to sequence, such as highly repeated sequences. Therefore, the copy number of these genes cannot be determined at this time.
CDS, coding sequences; NA, not applicable.
Values based on data from Table S4.
Figure 2. Comparison of the lengths of E. bieneusi proteins with their respective S. cerevisiae (Sc) and E. cuniculi (Ec) homologs.
Only E. bieneusi proteins with assigned functions that were located on the larger contigs were included in these comparisons. (A) Lengths of the E. bieneusi proteins (n = 566) relative to their S. cerevisiae homologs, expressed as a percentage. (B) Lengths of the E. bieneusi proteins (n = 580) relative to their E. cuniculi homologs, expressed as a percentage. E. bieneusi proteins that are shorter or longer than their respective homologs have percentages less than 100% or greater than 100%, respectively.
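The relative length plotted in Figure 2 is the length of an E. bieneusi protein divided by the length of its homolog, expressed as a percentage. The sketch below illustrates the calculation; the protein lengths are hypothetical values chosen for illustration and are not data from the paper, and the function names are likewise illustrative.

```python
def relative_length_percent(query_len: int, homolog_len: int) -> float:
    """Length of the query protein as a percentage of its homolog's length."""
    if homolog_len <= 0:
        raise ValueError("homolog length must be positive")
    return 100.0 * query_len / homolog_len


def classify(percent: float) -> str:
    """Label a protein as shorter, longer, or equal relative to its homolog."""
    if percent < 100.0:
        return "shorter"
    if percent > 100.0:
        return "longer"
    return "equal"


# Hypothetical (E. bieneusi length, homolog length) pairs in amino acids.
example_pairs = [(300, 400), (450, 400), (400, 400)]

results = [
    (relative_length_percent(query, homolog),
     classify(relative_length_percent(query, homolog)))
    for query, homolog in example_pairs
]
for percent, label in results:
    print(f"{percent:.1f}% -> {label}")
```

A value of 75% therefore marks a protein one quarter shorter than its homolog, which is the direction expected under genome compaction.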
Figure 3. Grouped bar graph showing the distribution percentages of E. bieneusi (blue) and E. cuniculi (red) proteins among the functional categories.
E. bieneusi proteins (653 proteins) were assigned to one of eleven functional categories listed in Katinka et al. [25]. For comparison, the distribution percentages of the E. cuniculi proteins (884 proteins) were included [25]. The corresponding gene lists for both organisms are presented in Table S4.