Dejun Li, Zhi Deng, Bi Qin, Xianghong Liu, Zhonghua Men.
Abstract
BACKGROUND: In rubber tree, bark is one of the important agricultural and biological organs. However, the molecular mechanisms underlying bark formation and development in rubber tree remain largely unknown, at least partly because bark transcriptomic and genomic information is lacking. It is therefore necessary to carry out high-throughput transcriptome sequencing of rubber tree bark to generate a large set of transcript sequences for functional characterization and molecular marker development.
Year: 2012 PMID: 22607098 PMCID: PMC3431226 DOI: 10.1186/1471-2164-13-192
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Characteristics of assembled contigs, scaffolds and unigenes
| Length (bp) | Contigs | Scaffolds | Unigenes |
| --- | --- | --- | --- |
| 75–500 | 62,918 | 30,245 | 15,576 |
| 501–1,000 | 4,821 | 5,253 | 5,143 |
| 1,001–1,500 | 831 | 1,509 | 1,504 |
| 1,501–2,000 | 205 | 449 | 448 |
| 2,001–2,500 | 26 | 73 | 71 |
| 2,501–3,000 | 6 | 7 | 8 |
| >3,000 | 3 | 5 | 6 |
| Total | 68,810 | 37,541 | 22,756 |
| N50 (bp) | 291 | 464 | 592 |
| Average length (bp) | 223 | 364 | 485 |
| Total nucleotide length (bp) | 16,017,394 | 13,656,242 | 11,046,525 |
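The N50 and average-length rows are summary statistics over the lengths of the assembled sequences. The following is a minimal sketch of how such values are commonly computed. It assumes the usual definition of N50 (the length L such that sequences of length L or longer contain at least half of all assembled bases); this record does not state which definition or tool the authors used. The function names and toy lengths are hypothetical.

```python
def n50(lengths):
    """Return N50 under the common definition (assumed, not stated in the source):
    the length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def average_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    return sum(lengths) / len(lengths)


# Toy data, not taken from the assembly described here.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))             # 400
print(average_length(toy_lengths))  # 300.0
```

Because N50 weights long sequences more heavily than the mean does, it is normally larger than the average length, as in every column of the table.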
Figure 1 Gap distribution of assembled scaffolds and unigenes. The gap distribution (%) represents the percentage of the number of N divided by the sequence length of assembled scaffold or unigene.
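The gap measure in this caption is a simple ratio, sketched below. The function name and the example sequence are hypothetical; the only assumption beyond the caption is that upper- and lower-case N both count as gap characters.

```python
def gap_percentage(sequence):
    """Percentage of N characters in a scaffold or unigene sequence."""
    seq = sequence.upper()  # assumption: treat 'n' and 'N' alike
    if not seq:
        raise ValueError("sequence must be non-empty")
    return 100.0 * seq.count("N") / len(seq)


# Toy sequence: 4 N characters out of 10 bases.
print(gap_percentage("ACGTNNNNAC"))  # 40.0
```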
Figure 2 Assessment of assembly quality. The quality of the assembly was assessed by the distribution of mapped reads within the assembled unigenes.
Figure 3 Characteristics of BLAST searches of the assembled unigenes against the Nr and Swiss-Prot protein databases. (A) E-value distribution of BLAST hits for the assembled unigenes with a cutoff of 1E-5 in the Nr database. (B) E-value distribution of BLAST hits for the assembled unigenes with a cutoff of 1E-5 in the Swiss-Prot database. (C) Similarity distribution of the top BLAST hits for the assembled unigenes with a cutoff of 1E-5 in the Nr database. (D) Similarity distribution of the top BLAST hits for the assembled unigenes with a cutoff of 1E-5 in the Swiss-Prot database.
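The E-value cutoff and "top hit" selection described in this caption can be illustrated with a short sketch. It assumes BLAST results in the standard 12-column tabular layout (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score), treats the 1E-5 cutoff as inclusive, and takes the highest bit score as the top hit. None of these details is stated in this record, and the function name and example IDs are hypothetical.

```python
def top_hits(lines, max_evalue=1e-5):
    """Keep the best-scoring hit per query among hits with E-value <= max_evalue.

    Assumes tab-separated, 12-column BLAST tabular output (an assumption,
    not something the source specifies).
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit fails the significance cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical records, not real search results.
records = [
    "unigene_1\tprotein_A\t90.0\t100\t10\t0\t1\t100\t1\t100\t2e-20\t150",
    "unigene_1\tprotein_B\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-30\t200",
    "unigene_2\tprotein_C\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t20",
]
hits = top_hits(records)
print(sorted(hits))          # ['unigene_1']
print(hits["unigene_1"][0])  # protein_B
```

Here `unigene_2` is dropped because its only hit has an E-value above the cutoff, so it would count as a unigene without a significant match.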
Summary of most abundant unigenes in the transcriptome of rubber tree bark
| Unigene | Abundance | Accession | E-value | Annotation | Source |
| --- | --- | --- | --- | --- | --- |
| 1725 | 174827 | ABK29471.1 | 6e-19 | CHK1 checkpoint-like protein | |
| 1124 | 20460 | ACG27632.1 | 9e-16 | hypothetical protein | |
| 1710* | 15171 | GR305569.1 | 3e-93 | No | |
| 22025 | 12755 | AAP42157.1 | 2e-147 | heat shock protein 70 | |
| 22071 | 11715 | AAA34124.1 | 0.0 | polyubiquitin | |
| 18835 | 10115 | AAQ08597.1 | 2e-72 | heat shock protein | |
| 48 | 9307 | AAO14118.1 | 5e-136 | ascorbate peroxidase | |
| 20505 | 8197 | XP_002512570.1 | 8e-123 | s-adenosylmethionine synthetase | |
| 20207 | 8095 | ACN30003.1 | 3e-108 | chalcone synthase | |
* The accession, E-value, annotation and source for this unigene were obtained with the blastn program against the NCBI EST database.
Figure 4 Gene Ontology classifications of assembled unigenes. 6,867 unigenes with significant similarity in nr protein databases were assigned to gene ontology classifications.
Figure 5 Histogram presentation of COG classification. All unigenes were aligned to COG database to predict and classify possible functions. Out of 16,520 unigenes with nr hits, 5,559 were assigned to 24 COG classifications.
Unigenes of MVA and MEP pathways identified in this research
| Pathway | Enzyme | Accession (number of unigenes*) |
| --- | --- | --- |
| MVA pathway | isopentenyl-diphosphate Delta-isomerase (IDI) | BAF98286.1 (2) |
| | acetyl-CoA C-acetyltransferase (AACT) | BAF98276.1 (2), ZP_08629444.1 (1), BAF98277.1 (1), AAL18924.1 (1) |
| | hydroxymethylglutaryl-CoA synthase (HMGS) | BAF98279.1 (1) |
| | hydroxymethylglutaryl-CoA reductase (HMGR) | P29057.1 (1), BAF98280.1 (1) |
| | mevalonate kinase (MVK) | AAL18925.1 (1) |
| | phosphomevalonate kinase (PMK) | BAF98284.1 (1), AAL18926.1 (1) |
| | diphosphomevalonate decarboxylase (MVD) | BAF98285.1 (2) |
| MEP pathway | 1-deoxy-D-xylulose 5-phosphate synthase (DXS); 1-deoxy-D-xylulose 5-phosphate reductoisomerase (DXR) | XP_002533688.1 (2), ABD92702.1 (1), XP_002514364.1 (2), ZP_08629200.1 (1) ABQ53937.1 (1), AAS94121.1 (1), ABD92702.1 (1) |
| | 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (CMS) | BAF98292.1 (1) |
| | 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (CMK) | BAF98293.1 (1) |
| | 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MCS) | BAF98295.1 (1) |
| | 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase (HDS) | BAF98296.1 (6) |
| | 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HDR) | ACG55683.1 (1) |
* Number of unigenes with a hit in the nr protein database.
Summary of EST-SSRs identified in rubber tree transcriptome
| Item | Number |
| --- | --- |
| Total number of sequences examined | 22,756 |
| Total size of examined sequences (bp) | 11,046,525 |
| Total number of identified EST-SSRs | 39,257 |
| Number of EST-SSR-containing sequences | 16,208 |
| Number of sequences containing more than one EST-SSR | 9,659 |
| Di-nucleotide | 27,877 |
| Tri-nucleotide | 10,490 |
| Tetra-nucleotide | 430 |
| Penta-nucleotide | 203 |
| Hexa-nucleotide | 239 |
| Hepta-nucleotide | 13 |
| Octa-nucleotide | 2 |
| Nona-nucleotide | 3 |
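The distribution of EST-SSRs based on the number of repeat units
| Number of repeat units | Di- | Tri- | Tetra- | Penta- | Hexa- | Hepta- | Octa- | Nona- | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |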
| 3 | 23,762 | 8,847 | 375 | 146 | 157 | 12 | 2 | 3 | 33,304 |
| 4 | 2,985 | 1,014 | 42 | 33 | 49 | 1 | 0 | 0 | 4,124 |
| 5 | 527 | 302 | 7 | 13 | 26 | 0 | 0 | 0 | 875 |
| 6 | 171 | 142 | 5 | 5 | 7 | 0 | 0 | 0 | 330 |
| 7 | 84 | 72 | 1 | 2 | 0 | 0 | 0 | 0 | 159 |
| 8 | 59 | 41 | 0 | 2 | 0 | 0 | 0 | 0 | 102 |
| 9 | 45 | 27 | 0 | 1 | 0 | 0 | 0 | 0 | 73 |
| 10 | 32 | 24 | 0 | 1 | 0 | 0 | 0 | 0 | 57 |
| 11 | 36 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 47 |
| 12 | 27 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 32 |
| 13 | 23 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 26 |
| 14 | 28 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 30 |
| ≥15 | 98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 98 |
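The two EST-SSR tables classify repeats by motif length (di- to nona-nucleotide) and by number of repeat units, starting at three. The sketch below shows one way such repeats could be located with a regular expression. It is an illustration only: the SSR-detection tool and exact thresholds are not given in this record, so the motif-length range of 2 to 9 and the minimum of 3 repeats are assumptions read off the tables, and `find_ssrs` and `is_primitive` are hypothetical names.

```python
import re
from collections import Counter


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit.

    For example 'ATAT' is two copies of 'AT' and 'AA' is two copies of 'A',
    so neither is reported as its own motif class.
    """
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(sequence, min_len=2, max_len=9, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Thresholds are assumptions inferred from the tables, not the authors' settings.
    """
    seq = sequence.upper()
    hits = []
    for k in range(min_len, max_len + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence, not taken from the bark transcriptome.
ssrs = find_ssrs("GGATATATCCAGCAGCAGCAGTT")
print(ssrs)  # [(2, 'AT', 3), (9, 'CAG', 4)]

# Tally by motif length, mirroring the di-/tri-/... rows of the summary table.
print(Counter(len(motif) for _, motif, _ in ssrs))
```

One limitation of this sketch is that `finditer` returns non-overlapping matches, so a repeat that overlaps a skipped non-primitive match (for example one adjacent to a homopolymer run) can be missed; dedicated SSR tools handle such cases more carefully.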
Figure 6 Frequency distribution of EST-SSRs based on motif types. Within the potential EST-SSRs, a total of 429 motif sequence types were identified. The frequency of the main motif types is shown in this figure.