Jorge MC Mondego, Ramon O Vidal, Marcelo F Carazzolle, Eric K Tokuda, Lucas P Parizzi, Gustavo GL Costa, Luiz FP Pereira, Alan C Andrade, Carlos A Colombo, Luiz GE Vieira, Gonçalo AG Pereira.
Abstract
BACKGROUND: Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora account for 70% and 30% of commercial production, respectively. C. arabica is an allotetraploid derived from a recent hybridization of the diploid species C. canephora and C. eugenioides. C. arabica has lower genetic diversity and yields a higher-quality beverage than C. canephora. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency.
Year: 2011 PMID: 21303543 PMCID: PMC3045888 DOI: 10.1186/1471-2229-11-30
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Figure 1. Flow diagram of bioinformatics procedures applied in .
Summary of Coffea spp. cluster datasets.
| Species | Contigs | Average contig length | Singlets | Average singlet length | Clusters | Average cluster length |
|---|---|---|---|---|---|---|
| C. arabica | 15,656 | 868 bp | 16,351 | 459 bp | 32,007 | 662 bp (ranging from 100 to 3,584 bp) |
| C. canephora | 7,710 | 832 bp | 8,955 | 494 bp | 16,665 | 663 bp (ranging from 100 to 2,988 bp) |
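The cluster totals in this table appear to be the sum of contigs and singlets in each row. The table does not state this, so it is an inference from the numbers. A minimal sketch of that check, with the counts copied from the table and placeholder row labels:

```python
# Counts copied from the summary table above; the row labels are placeholders.
rows = {
    "first_row": {"contigs": 15_656, "singlets": 16_351, "clusters": 32_007},
    "second_row": {"contigs": 7_710, "singlets": 8_955, "clusters": 16_665},
}


def clusters_match(row):
    """Return True when the cluster total equals contigs plus singlets."""
    return row["contigs"] + row["singlets"] == row["clusters"]


# Assumption being tested: clusters = contigs + singlets for every row.
consistent = {name: clusters_match(row) for name, row in rows.items()}
print(consistent)
```

Both rows satisfy the relation, which supports reading "clusters" as the combined set of assembled contigs and unassembled singlets.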
Figure 2. Distribution of the number of ESTs in contigs of .
Evaluation of CDS, 5'UTR and 3'UTR of Coffea spp.
| Species | Full length CDS sequences | 5'UTR length (median) | CDS length (median) | CDS length (mode) | 3'UTR length (median) |
|---|---|---|---|---|---|
| C. arabica | 1,189 | 160 bp | 836 bp | 479 bp | 240 bp |
| C. canephora | 518 | 134 bp | 708.5 bp | 476 bp | 229.5 bp |
Figure 3. Distribution of GC in the coding regions of .
Predicted C. arabica and C. canephora gene comparisons.
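| Species | Organism | Clusters | % of clusters |
|---|---|---|---|
| C. arabica | | 13,855 | 43.29% |
| C. arabica | | 1,846 | 5.77% |
| C. arabica | | 1,161 | 3.63% |
| C. arabica | | 643 | 2.01% |
| C. arabica | | 641 | 2.00% |
| C. arabica | | 428 | 1.34% |
| C. arabica | | 392 | 1.22% |
| C. arabica | | 149 | 0.47% |
| C. arabica | | 115 | 0.36% |
| C. arabica | | 104 | 0.32% |
| C. arabica | Others | 1,941 | 6.06% |
| C. arabica | No hits | 10,732 | 31.66% |
| C. canephora | | 7,427 | 44.57% |
| C. canephora | | 972 | 5.83% |
| C. canephora | | 639 | 3.83% |
| C. canephora | | 372 | 2.23% |
| C. canephora | | 362 | 2.17% |
| C. canephora | | 232 | 1.39% |
| C. canephora | | 225 | 1.35% |
| C. canephora | | 105 | 0.63% |
| C. canephora | | 64 | 0.37% |
| C. canephora | | 56 | 0.32% |
| C. canephora | Others | 1,231 | 7.39% |
| C. canephora | No hits | 4,980 | 29.88% |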
* Each coffee cluster was compared to all of the proteins from the organisms listed. The BLASTX E-value cutoff was set to 1e-10.
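The percentages in this table appear to be each count divided by the total number of clusters in its block. That is an inference from the numbers, not something the table states. A short sketch under that assumption, using the counts of the first block (the helper name `share_of_clusters` is hypothetical):

```python
def share_of_clusters(count, total):
    """Percentage of clusters represented by `count`, rounded to two decimals."""
    return round(100 * count / total, 2)


# Counts copied from the first twelve rows of the table (top hit through "No hits").
first_block = [13_855, 1_846, 1_161, 643, 641, 428, 392, 149, 115, 104, 1_941, 10_732]

# Assumption: the block total is the number of clusters that were compared.
total = sum(first_block)
top_share = share_of_clusters(first_block[0], total)
print(total, top_share)
```

Under this reading, the first block sums to the 32,007 clusters reported in the cluster summary table, and the top row reproduces the listed 43.29%.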
Figure 4. Comparative chart between the relative percentage of Pfam domains in .
Figure 5. Distribution of .
OrthoMCL analysis of C. arabica and C. canephora, highlighting prominent and specific families in Coffea spp.
| OrthoMCL family ID | C. arabica | C. canephora | | | | | | | Manual Annotation* |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 446 | 189 | 1402 | 808 | 2532 | 1378 | 813 | 847 | Serine-threonine kinase |
| 2 | 152 | 51 | 580 | 212 | 967 | 461 | 478 | 447 | PPR repeat protein |
| 6 | 84 | 41 | 193 | 123 | 226 | 99 | 101 | 108 | Cytochrome P450 |
| 544 | 21 | 6 | 1 | - | - | - | - | - | Cystatin |
| 453 | 14 | 4 | 1 | 7 | 3 | 1 | 1 | 1 | NBS LRR resistance protein |
| 1231 | 13 | 5 | - | - | - | - | - | - | Small secreted glycine-rich protein |
| 4031 | 10 | - | - | - | - | - | - | - | Glycine-rich protein |
| 1510 | 7 | 1 | 1 | - | 2 | 1 | 1 | 3 | UDP-glucosyltransferase |
| 2703 | 6 | 3 | - | 1 | 1 | - | 1 | - | Cysteine proteinase inhibitor like protein |
| 3289 | 6 | - | 1 | - | 2 | - | 2 | - | NBS LRR resistance protein |
| 5056 | 6 | 1 | - | 1 | - | - | - | - | Alcohol dehydrogenase |
| 2306 | 5 | 1 | - | 2 | 1 | 1 | 2 | - | Cytochrome P450 |
| 2722 | 5 | 1 | - | 1 | 1 | 2 | 1 | 1 | NBS LRR resistance protein |
| 3294 | 5 | - | 1 | - | 3 | - | 1 | 1 | Poly-A binding protein |
| 3303 | 5 | 1 | 2 | 1 | - | - | - | 1 | NADPH-dependent cinnamyl alcohol dehydrogenase |
| 3305 | 5 | 2 | 1 | 2 | - | - | - | - | Specific tissue protein 2 |
| 4049 | 5 | 2 | 1 | 1 | - | - | 1 | - | Sugar transport protein |
| 4070 | 5 | - | 1 | 1 | 3 | - | - | - | Cytochrome P450 |
| 7241 | 5 | 1 | 1 | - | - | - | 1 | - | Potato type II serine proteinase inhibitor family |
| 10956 | 5 | - | - | - | - | - | - | - | Hypothetical protein |
| 7610 | 4 | 1 | - | 1 | - | - | - | 1 | Ubiquitin-conjugating enzyme |
| 7611 | 4 | 1 | - | 1 | 1 | - | - | - | P-glycoprotein ABC |
| 7613 | 4 | - | - | 2 | 1 | - | - | - | Hexose transporter |
| 9014 | 4 | 1 | - | - | - | 1 | - | - | GH3 family protein/Indole-3-acetic acid-amido synthetase |
| 10273 | 4 | 1 | - | - | - | - | - | - | Potato type II serine proteinase inhibitor family |
| 11588 | 4 | - | - | - | - | - | - | - | Small secreted glycine-rich protein |
| 11617 | 4 | - | - | - | - | - | - | - | Hypothetical protein |
| 12384 | 4 | - | - | - | - | - | - | - | Hypothetical protein |
| 12385 | 4 | - | - | - | - | - | - | - | Defensin/gamma thionin |
| 12386 | 4 | - | - | - | - | - | - | - | Hypothetical protein |
| 7324 | 3 | 2 | - | - | 2 | - | - | - | Helix-loop-helix DNA-binding protein |
| 9019 | 3 | - | - | 1 | - | 1 | - | - | Zinc/iron transporter |
| 9830 | 3 | - | 3 | - | - | - | - | - | Eukaryotic initiation factor (eIF1)/SUI1 |
| 10271 | 3 | 1 | - | - | - | - | 1 | - | Metallothionein |
| 10276 | 3 | - | - | - | - | 1 | - | 1 | SEC14 cytosolic factor family protein |
| 10293 | 3 | - | - | 1 | 1 | - | - | - | ABC transporter |
| 10300 | 3 | 1 | - | - | 1 | - | - | - | Phytochrome B/histidine kinase |
| 10309 | 3 | 1 | - | 1 | - | - | - | - | Oxidoreductase |
| 11058 | 3 | - | 1 | 1 | - | - | - | - | ATP-binding cassette transporter |
| 11594 | 3 | - | - | - | - | - | - | 1 | |
| 11600 | 3 | - | - | - | - | - | 1 | - | Alcohol dehydrogenase |
| 11607 | 3 | 1 | - | - | - | - | - | - | CAAX amino-terminal protease |
| 11626 | 3 | 1 | - | - | - | - | - | - | Hypothetical protein |
| 13353 | 3 | - | - | - | - | - | - | - | Hypothetical protein |
| 13392 | 3 | - | - | - | - | - | - | - | GDP-D-mannose 4,6-dehydratase |
| 14410 | 3 | - | - | - | - | - | - | - | No hits found |
| 14413 | 3 | - | - | - | - | - | - | - | No hits found |
| 14414 | 3 | - | - | - | - | - | - | - | Aspartate aminotransferase superfamily protein |
| 14418 | 3 | - | - | - | - | - | - | - | HAT transposase element |
| 14420 | 3 | - | - | - | - | - | - | - | Protein translation factor SUI1 |
| 8498 | 2 | 5 | - | - | - | - | - | | Rapid Alkalinization Factor (RALF)-like protein |
| 9023 | 2 | 3 | - | - | - | 1 | - | - | GTP binding protein |
| 10281 | 2 | 3 | - | - | - | - | - | - | No hits found |
| 12371 | 2 | 2 | - | - | - | - | - | | Hydroxyproline-rich glycoprotein/extensin |
| 12375 | 2 | 2 | - | - | - | - | - | | No hits found |
| 1715 | - | 4 | 1 | 2 | 1 | 8 | - | - | Viroid polyprotein ORF4 protein |
| 6375 | - | 4 | 2 | 1 | 1 | - | - | - | NBS LRR resistance protein |
| 9679 | - | 3 | 1 | - | 1 | 1 | - | - | Replication factor A 1 |
| 10952 | - | 3 | 1 | - | - | 1 | - | - | LTR retrotransposon |
| 11055 | - | 5 | - | - | - | - | - | - | Small glycine-rich protein |
| 14392 | - | 3 | - | - | - | - | - | - | Kelch repeat-containing phosphatase |
| 14397 | - | 3 | - | - | - | - | - | - | Albumin/sulfur-rich seed storage protein |
| 14809 | - | 3 | - | - | - | - | - | - | Hypothetical protein |
| 14813 | - | 3 | - | - | - | - | - | - | Miraculin-like protein |
| 14814 | - | 3 | - | - | - | - | - | - | Invertase inhibitor |
* Annotation based on BLASTX-NR (E-value 1e-5).
Figure 6. Hierarchical clustering of coffee cDNA libraries and clusters based on EST distribution. a) C. canephora hierarchical clustering of 443 differentially expressed clusters vs. the eight cDNA library assemblies. b) C. arabica hierarchical clustering of 331 differentially expressed clusters vs. the 23 cDNA library assemblies. Hierarchical clustering was performed using a correlation matrix constructed from EST frequencies for differentially expressed C. arabica and C. canephora contigs. Black intensity designates relative transcript abundance in a given library, as inferred from EST frequency within each contig. Library abbreviations correspond to the following descriptions.
C. canephora: LF, young leaves; PP1, pericarp, all developmental stages; SE1, whole cherries, 18 and 22 weeks after pollination; SE2, whole cherries, 18 and 22 weeks after pollination; SE3, endosperm and perisperm, 30 weeks after pollination; SE4, endosperm and perisperm, 42 and 46 weeks after pollination; EC1, embryogenic calli; SH1, leaves from water deficit stressed plants; and SH3, leaves from water deficit stressed plants (drought resistant clone).
C. arabica: PC1, C. arabica non-embryogenic cell line induced with 2,4-D; CA1, non-embryogenic calli; IC1, C. arabica non-embryogenic cell line without 2,4-D; EA; EA2, C. arabica embryogenic calli; IA2, C. arabica embryogenic cell line induced with 2,4-D; PA1, primary embryogenic C. arabica calli; EM1, zygotic embryo from mature germinating seeds; SI3, germinating whole seeds; LV4, young leaves from orthotropic branches; LV5, young leaves from orthotropic branches; LV8, mature leaves from plagiotropic branches; LV9, mature leaves from plagiotropic branches; FB1, floral buds at developmental stages 1 and 2; FB2, floral buds at developmental stages 1 and 2; FB4, floral buds at developmental stages 3 and 4; FR1, floral buds, pinhead fruits, fruit developmental stages 1 and 2; FR2, floral buds, pinhead fruits, fruit developmental stages 1 and 2; SS1, well-watered field plant tissues; SH2, water-stressed plant tissues; CB1, suspension cells treated with acibenzolar-S-methyl and brassinosteroids; CS1, suspension cells under osmotic stress; AR1, leaves treated with arachidonic acid; LP1, plantlets treated with arachidonic acid; RT5, roots with acibenzolar-S-methyl; CL2, hypocotyls treated with acibenzolar-S-methyl; BP1, suspension cells treated with acibenzolar-S-methyl; RT8, root suspension cells under aluminum stress; RX1, Xylella spp.-infected stems; NS1, nematode-infected roots; and RM1, leaves infected with leaf miner and coffee leaf rust.
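The caption says clustering used a correlation matrix built from EST frequencies, but it does not name the correlation measure or the linkage rule. The sketch below is therefore only an illustration: it assumes Pearson correlation, a distance of 1 minus the correlation, and average linkage. The contig names, EST counts, and library sizes are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def average_linkage(profiles):
    """Agglomerative clustering with distance = 1 - Pearson correlation.

    profiles maps a contig name to its EST frequency in each library.
    Returns the merges in order as (members_a, members_b, distance).
    """
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)  # average linkage (assumed)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical EST counts per contig in four libraries, and library sizes.
est_counts = {
    "contig_A": [10, 0, 1, 0],
    "contig_B": [20, 1, 2, 0],
    "contig_C": [0, 8, 0, 9],
    "contig_D": [1, 9, 0, 12],
}
library_sizes = [1000, 800, 500, 1200]

# EST frequency = ESTs of a contig in a library / total ESTs in that library.
frequencies = {
    name: [count / size for count, size in zip(counts, library_sizes)]
    for name, counts in est_counts.items()
}

merges = average_linkage(frequencies)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

Contigs with similar library profiles are joined first, which is the grouping a heat map like the one described would display. In practice a library routine such as SciPy's hierarchical clustering would replace the hand-written loop.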