Samuel Mazzinghy Alvarenga, Eveline Teixeira Caixeta, Bárbara Hufnagel, Flávia Thiebaut, Eunize Maciel-Zambolim, Laércio Zambolim, Ney Sussumu Sakiyama.
Abstract
Sequences potentially associated with coffee resistance to diseases were identified by in silico analyses using the database of the Brazilian Coffee Genome Project (BCGP). Keywords corresponding to plant resistance mechanisms to pathogens identified in the literature were used as baits for data mining. Expressed sequence tags (ESTs) related to each of these keywords were identified with tools available in the BCGP bioinformatics platform. A total of 11,300 ESTs were mined. These ESTs were clustered and formed 979 EST-contigs with similarities to chitinases, kinases, cytochrome P450 and nucleotide binding site-leucine rich repeat (NBS-LRR) proteins, as well as with proteins related to disease resistance, pathogenesis, hypersensitivity response (HR) and plant defense responses to diseases. The 140 EST-contigs identified through the keyword NBS-LRR were classified according to function. This classification allowed association of the predicted products of EST-contigs with biological processes, including host defense and apoptosis, and with molecular functions such as nucleotide binding and signal transducer activity. Fisher's exact test was used to examine the significance of differences in contig expression between libraries representing the responses to biotic stress challenges and other libraries from the BCGP. This analysis revealed seven contigs highly similar to catalase, chitinase, protein with a BURP domain and unknown proteins. The involvement of these coffee proteins in plant responses to disease is discussed.
Keywords: Coffea; ESTs; bioinformatics; data mining; genomics; in silico
Year: 2010 PMID: 21637594 PMCID: PMC3036153 DOI: 10.1590/s1415-47572010000400031
Source DB: PubMed Journal: Genet Mol Biol ISSN: 1415-4757 Impact factor: 1.771
The number of ESTs and their relative percentages obtained by keyword data mining in 14 keyword projects, the number of clusters (EST-contigs and singlets) formed and the number and percentage of EST-contigs with E-values < e-20 and scores > 100.
| Project | ESTs | % of ESTs | EST-contigs (total) | Singlets | EST-contigs¹ | % of EST-contigs¹ |
| Chalconesynthase | 153 | 1.35 | 5 | 8 | 5 | 0.51 |
| Chitinase | 1,855 | 16.41 | 47 | 48 | 45 | 4.59 |
| CytochromeP450 | 2,441 | 21.60 | 235 | 202 | 144 | 14.70 |
| Glucanase | 642 | 5.68 | 92 | 68 | 88 | 8.98 |
| Glucosyltransferase | 1,286 | 11.38 | 130 | 160 | 125 | 12.76 |
| HSP (Heat Shock Protein) | 240 | 2.12 | 31 | 27 | 30 | 3.06 |
| Hypersensitive | 86 | 0.76 | 8 | 4 | 8 | 0.81 |
| Importin | 532 | 4.70 | 11 | 14 | 10 | 1.02 |
| NBS-LRR | 826 | 7.30 | 160 | 243 | 140 | 14.30 |
| Pathogenesis | 979 | 8.66 | 63 | 37 | 61 | 6.23 |
| Phytoalexin | 12 | 0.10 | 3 | 2 | 3 | 0.30 |
| Polyphenoloxidase | 67 | 0.59 | 4 | 3 | 4 | 0.40 |
| Resistance | 1,864 | 16.49 | 347 | 416 | 300 | 30.64 |
| Thaumatin | 317 | 2.80 | 16 | 7 | 16 | 1.63 |
| Total | 11,300 | 100 | 1,152 | 1,239 | 979 | 100 |
¹ E-value < e-20 and score > 100.
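The footnote threshold can be sketched as a simple filter. The snippet below is only an illustration: it assumes "e-20" means 1e-20, and it applies the cutoff to the BlastX scores and E-values listed for the differentially expressed contigs in this record, which the source does not claim were filtered this way. The names `passes_threshold` and `share_percent` are hypothetical helpers, not part of the BCGP platform.

```python
# (contig, BlastX score, E-value) as listed in the differential-expression
# table of this record; used here purely as sample input.
hits = [
    ("Contig 10650", 99.4, 8.00e-19),
    ("Contig 13908", 289.0, 1.00e-76),
    ("Contig 13986", 46.2, 0.003),
    ("Contig 14592", 447.0, 1.00e-123),
    ("Contig 431", 45.8, 0.005),
    ("Contig 8478", 788.0, 0.0),
    ("Contig 9073", 915.0, 0.0),
]

MAX_EVALUE = 1e-20  # assumption: the footnote's "e-20" is read as 1e-20
MIN_SCORE = 100.0   # footnote: score > 100


def passes_threshold(score, evalue, max_evalue=MAX_EVALUE, min_score=MIN_SCORE):
    """Return True when a hit satisfies both strict inequalities of the footnote."""
    return evalue < max_evalue and score > min_score


def share_percent(part, total):
    """Percentage of `total` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


kept = [name for name, score, evalue in hits if passes_threshold(score, evalue)]
print(kept)

# Share of the 979 thresholded EST-contigs that came from the NBS-LRR project.
print(share_percent(140, 979))
```

Both conditions are strict, so a hit with a score of exactly 100 or an E-value of exactly 1e-20 would be excluded under this reading.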
Figure 1. Number of ESTs from NS1 (roots infected with nematodes), RM1 (leaves infected with leaf miner and coffee leaf rust), RX1 (stems infected with Xylella spp.) and SS1 (well-watered field plants) libraries that were present in the created projects. CHA – chalconesynthase, CHI – chitinase, CYT – cytochrome P450, GLT – glucosyltransferase, GLU – glucanase, HYP – hypersensitive, IMP – importin, NBS-LRR – nucleotide binding site-leucine rich repeat, PAT – pathogenesis, PFO – polyphenoloxidase, PHY – phytoalexin, RES – resistance and THA – thaumatin.
Figure 2. Distribution of GO terms in the Cellular Component category, level 3.
Figure 3. Distribution of GO terms in the Molecular Function category, level 3.
Figure 4. Distribution of GO terms in the Biological Process category, level 3.
Contigs differentially expressed between libraries containing the responses to biotic stress challenges and other libraries from the Brazilian Coffee Genome Project. Differential expression was confirmed by Fisher's exact test, with the p values indicated in the last column.
| Cluster | # reads | Length | BlastX | Score | E-value | GenBank Record | p value |
| Contig 10650 | 15 | 1376 | hypothetical protein [ | 99.4 | 8.00E-19 | ref|XP_002319603.1| | 0.00079 |
| Contig 13908 | 30 | 1181 | BURP domain-containing protein [ | 289 | 1.00E-76 | gb|ACD49738.1| | 0.00114 |
| Contig 13986 | 9 | 794 | cysteine protease inhibitor family protein / cystatin family protein [ | 46.2 | 0.003 | ref|NP_193383.1| | 0.0007 |
| Contig 14592 | 131 | 1290 | class III chitinase [ | 447 | 1.00E-123 | emb|CAJ43737.1| | 0.00263 |
| Contig 431 | 33 | 821 | cysteine protease inhibitor family protein / cystatin family protein [ | 45.8 | 0.005 | ref|NP_193383.1| | 0.00158 |
| Contig 8478 | 65 | 2154 | catalase [ | 788 | 0.00 | gb|ABM47415.1| | 0.00193 |
| Contig 9073 | 347 | 1819 | catalase [ | 915 | 0.00 | emb|CAA85426.1| | 0.00482 |
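For readers unfamiliar with how a Fisher's exact test yields p values like those in the last column, here is a minimal sketch in plain Python. It assumes a 2x2 layout (reads of one contig versus all other reads, in biotic-stress libraries versus the remaining libraries) and a two-sided test; the record does not state the exact table layout or sidedness the authors used, and the read counts below are hypothetical, since the per-library split of reads is not given here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (not stated in the source):
        a = reads of the contig in biotic-stress libraries
        b = reads of the contig in other libraries
        c = all other reads in biotic-stress libraries
        d = all other reads in other libraries
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: 12 of a contig's 15 reads fall in stress libraries that
# hold 5,000 of 20,000 total reads.
p_value = fisher_exact_two_sided(12, 3, 4988, 14997)
print(f"p = {p_value:.3g}")
```

Because many contigs are tested at once, an analysis along these lines would normally also consider a multiple-testing correction; the record does not say whether one was applied.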