Jason W Abernathy, Peng Xu, Ping Li, De-Hai Xu, Huseyin Kucuktas, Phillip Klesius, Covadonga Arias, Zhanjiang Liu.
Abstract
BACKGROUND: The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease', leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs.
Year: 2007 PMID: 17577414 PMCID: PMC1906770 DOI: 10.1186/1471-2164-8-176
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
A summary of the EST analysis.
| Description | Number | Percentage |
| Total number of clones sequenced | 10,368 | |
| Total number of successful sequences | 9,769 | 94.2% |
| Number of high quality sequences | 8,432 | 86.3% (1) |
| Unique sequences | 4,706 | 55.8% (2) |
| Number of contigs | 976 | |
| Number of clones included in the contigs | 4,702 | |
| Average clones per contig | 4.82 | |
| Number of singletons | 3,730 | |
| Number of known genes | 2,518 | 53.5% (3) |
| Unique unknown genes | 2,188 | 46.5% (3) |
(1) Percentage of high quality sequences among successful sequences; (2) percentage of unique sequences among high quality sequences; (3) percentage of unique sequences.
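The percentages in this table follow from the counts by simple division. A minimal Python sketch that recomputes them is given here as a consistency check; the variable names and the `pct` helper are illustrative and not part of the original analysis.

```python
# Counts copied from the EST summary table.
clones_sequenced = 10_368
successful = 9_769
high_quality = 8_432
contigs = 976
clones_in_contigs = 4_702
singletons = 3_730
known = 2_518
unknown = 2_188


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Each contig and each singleton counts as one unique sequence.
unique = contigs + singletons

summary = {
    "successful / sequenced": pct(successful, clones_sequenced),
    "high quality / successful": pct(high_quality, successful),
    "unique / high quality": pct(unique, high_quality),
    "known / unique": pct(known, unique),
    "unknown / unique": pct(unknown, unique),
    "clones per contig": round(clones_in_contigs / contigs, 2),
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

The recomputed values agree with the table: contigs plus singletons give the 4,706 unique sequences, and clones in contigs plus singletons account for all 8,432 high quality sequences.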
The most abundant ESTs detected by EST sequencing.
| Cluster | # of Sequences | Putative identities | % of Total |
| 276 | 764 | Hypothetical protein TTHERM_02141640 from | 7.36 |
| 60 | 119 | Hypothetical protein TTHERM_02641280 from | 1.15 |
| 636 | 86 | Unknown | 0.82 |
| 602 | 78 | Unknown | 0.75 |
| 171 | 48 | Heat shock protein 90 | 0.46 |
| 83 | 39 | Zinc finger ZZ type family protein | 0.38 |
| 105 | 38 | Unknown | 0.37 |
| 279 | 35 | Heat shock protein 90 | 0.34 |
| 392 | 34 | Unknown | 0.33 |
| 354 | 31 | Heat shock protein 70 (dnaK) | 0.30 |
| 203 | 31 | Conserved hypothetical protein from | 0.30 |
| 219 | 29 | Hypothetical protein PY05925 from | 0.28 |
| 932 | 28 | Unknown | 0.27 |
| 75 | 27 | Unknown | 0.26 |
| 472 | 24 | ER type HSP70 | 0.23 |
| 351 | 23 | Unknown | 0.22 |
| 833 | 23 | Unknown protein from | 0.22 |
| 131 | 22 | Unknown | 0.21 |
| 6 | 21 | Dynein heavy chain protein | 0.20 |
| 45 | 21 | Outer surface protein from | 0.20 |
A summary of simple sequence repeats identified from the Ich ESTs. Percentages in parentheses give the share of each repeat type among all repeats.
| Total number of sequences analyzed | 8,432 |
| Number of dinucleotide repeats | 422 (68.8%) |
| Number of AC repeats | 121 |
| Number of AG repeats | 108 |
| Number of AT repeats | 56 |
| Number of CT repeats | 49 |
| Number of GT repeats | 88 |
| Number of GC repeats | 0 |
| Number of trinucleotide repeats | 145 (23.7%) |
| Number of tetranucleotide repeats | 46 (7.5%) |
| Total number of simple sequence repeats | 613 |
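To illustrate how di-, tri- and tetranucleotide repeats of this kind can be located in a sequence, the following is a minimal regex-based sketch in Python. The source does not state which tool or repeat-length thresholds were used, so the `MIN_COPIES` values and the `find_ssrs` function are assumptions chosen only for illustration. The sketch detects perfect tandem repeats and reports each motif as it appears in the read.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
# The source does not state the criteria actually used.
MIN_COPIES = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_copies=MIN_COPIES):
    """Return (start, motif, copies) for each perfect di-, tri- and
    tetranucleotide tandem repeat found in seq."""
    seq = seq.upper()
    hits = []
    for size, copies in sorted(min_copies.items()):
        # ([ACGT]{size}) captures a motif; \1{copies-1,} requires that
        # the same motif follows in tandem at least copies-1 more times.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            if size == 4 and motif[:2] == motif[2:]:
                continue  # ACAC... is a dinucleotide repeat, not a tetranucleotide one
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return hits


print(find_ssrs("TTACACACACACACGG"))
print(find_ssrs("CCATGATGATGATGATGCC"))
```

Tallying the motifs returned over a set of sequences would give per-type counts of the kind shown in the table, although different thresholds would change the totals.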
Figure 1. Venn diagram summary of sequence comparisons of the Ich ESTs with the Tetrahymena thermophila and Plasmodium falciparum genomes. A total of 4,706 unique Ich ESTs were used as queries, yielding 1,759 significant (E-value < 10^-5) hits to the T. thermophila genome and 817 to the P. falciparum genome. A total of 695 ESTs had hits to both genomes.
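The regions of the Venn diagram can be derived from the three counts in the caption by inclusion-exclusion. The short Python sketch below does this arithmetic; the derived numbers (hits to only one genome, hits to either genome, and queries with no hit) are computed here and are not stated in the caption itself.

```python
# Counts taken from the Figure 1 caption.
total_queries = 4_706      # unique Ich ESTs used as queries
hits_tetrahymena = 1_759   # significant hits to T. thermophila
hits_plasmodium = 817      # significant hits to P. falciparum
hits_both = 695            # hits to both genomes

# Inclusion-exclusion: subtract the overlap to get each exclusive region.
only_tetrahymena = hits_tetrahymena - hits_both
only_plasmodium = hits_plasmodium - hits_both
hits_either = only_tetrahymena + only_plasmodium + hits_both
no_hit = total_queries - hits_either

print(only_tetrahymena, only_plasmodium, hits_either, no_hit)
```

Under these counts, most ESTs with a P. falciparum hit also have a T. thermophila hit, and more than half of the queries have no significant hit to either genome.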
Figure 2. Pie charts of second-level gene ontology (GO) terms. Overall, 1,008 unique sequences were annotated using the Blast2GO software and included in the graphs. Each of the three GO categories is presented: cellular component (a), molecular function (b), and biological process (c).