Hui Ma, Zhiqiang Lu, Bingbing Liu, Qiang Qiu, Jianquan Liu.
Abstract
BACKGROUND: Corylus is renowned for its production of hazelnut and taxol. To understand the local adaptation of Chinese species and speed up breeding efforts in China, we analyzed the leaf transcriptome of Corylus mandshurica, which has a high tolerance to fungal infections and cold.
Year: 2013 PMID: 24093758 PMCID: PMC3819738 DOI: 10.1186/1471-2229-13-152
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Comparison of transcriptome assembly and coding sequence prediction for C. mandshurica and C. avellana
| | ESTs (C. mandshurica) | ESTs (C. avellana) | CDS (C. mandshurica) | CDS (C. avellana) |
| --- | --- | --- | --- | --- |
| Average Length | 580 | 532 | 431 | 377 |
| Length Range | 201 ~ 6821 | 80 ~ 5490 | 30 ~ 4890 | 42 ~ 4143 |
| Numbers | 37846 | 28255 | 37652 | 28167 |
| N50 Length | 799 | 961 | 594 | 651 |
| Sequences (longer than N50) | 8328 | 4991 | 8028 | 4945 |
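The N50 row in the table can be illustrated with a minimal sketch. It assumes the common definition of N50 (the length at which sequences of that length or longer contain at least half of all assembled bases) and a strict "longer than" comparison for the last row; the source states neither explicitly. The function names and the toy lengths are hypothetical and are not the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: walk the lengths from longest to shortest and
    return the length at which the running total first reaches half of
    the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def count_longer_than(lengths, threshold):
    """Count sequences strictly longer than the threshold (an assumption)."""
    return sum(1 for length in lengths if length > threshold)


# Toy lengths for illustration only.
toy_lengths = [201, 350, 420, 799, 1200, 2500]
toy_n50 = n50(toy_lengths)
print(toy_n50, count_longer_than(toy_lengths, toy_n50))
```

Because N50 is weighted by bases rather than by sequence count, an assembly can have a shorter average length and still a longer N50, as the two EST columns show.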
Figure 1. The length distribution of assembled ESTs from the transcriptomes of C. mandshurica and C. avellana. ESTs are counted at intervals of 80 bp, with ESTs longer than 2000 bp counted as 2000 bp. More ESTs are assembled in the transcriptome of C. mandshurica than in that of C. avellana at length intervals after 160 bp (Trinity has a minimum length of 200 bp in transcriptome assembly).
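The binning described in this caption can be sketched as follows. The 80 bp interval and the 2000 bp cap come from the caption; placing bin boundaries at multiples of 80 is an assumption, and the function name and toy lengths are hypothetical.

```python
from collections import Counter


def length_histogram(lengths, bin_width=80, cap=2000):
    """Count sequences per length interval.

    Lengths above `cap` are counted as `cap`. Each key in the result is
    the lower bound of a bin; bins are assumed to start at multiples of
    `bin_width`.
    """
    counts = Counter()
    for length in lengths:
        capped = min(length, cap)
        bin_start = (capped // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Toy lengths for illustration only.
toy_hist = length_histogram([201, 239, 240, 2000, 6821])
print(toy_hist)
```

With a 200 bp minimum length in one assembly, the bins below 160 bp are necessarily empty for that assembly, which is why the comparison in the caption starts after 160 bp.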
Figure 2. The genome contigs mapped by the transcriptomes of C. mandshurica and C. avellana. The contigs mapped uniquely by ESTs from C. mandshurica may represent novel genes specifically expressed in it, or different fragments of the same genes that were not identified in the transcriptome of C. avellana.
Figure 3. The percentage of top BLAST hits by species.
Figure 4. The distribution of sequence identity in BLAST regions for all orthologs. 98.7% of orthologs have a sequence identity of at least 90%; 87.9% of orthologs have a sequence identity of at least 97%. To exclude distantly homologous sequences, orthologs with sequence identity below 90% are discarded in the GO enrichment analysis.
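The two identity thresholds in this caption can be sketched as a simple partition. The 90% and 97% cut-offs come from the captions; the group labels, the function name, and the toy ortholog pairs are hypothetical.

```python
def partition_by_identity(orthologs, discard_below=90.0, low_below=97.0):
    """Split (ortholog_id, identity_percent) pairs into three groups.

    - "discarded": identity below `discard_below`
    - "lower_identity": identity from `discard_below` up to, but not
      including, `low_below`
    - "higher_identity": identity of `low_below` or more
    """
    groups = {"discarded": [], "lower_identity": [], "higher_identity": []}
    for ortholog_id, identity in orthologs:
        if identity < discard_below:
            groups["discarded"].append(ortholog_id)
        elif identity < low_below:
            groups["lower_identity"].append(ortholog_id)
        else:
            groups["higher_identity"].append(ortholog_id)
    return groups


# Toy data for illustration only.
toy_orthologs = [
    ("ortholog_1", 99.2),
    ("ortholog_2", 96.5),
    ("ortholog_3", 90.0),
    ("ortholog_4", 85.0),
]
groups = partition_by_identity(toy_orthologs)
print(groups)
```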
Figure 5. GO enrichment on orthologs with sequence identity lower than 97%. Direct parent GO terms in the biological process domain are displayed for simplicity. The GO terms in the biological process domain that show an increase in ESTs with lower sequence identity are related to immune response and response to stress.
Figure 6. GO enrichment on orthologs with INDELs. Direct parent GO terms in the biological process domain are displayed for simplicity. The GO terms in the biological process domain that show an increase in ESTs with INDELs are related to hormone metabolic process and response to various stimuli (including response to stress).
Figure 7. GO terms under response to stress in GO enrichment on orthologs with low sequence identity. GO terms related to defense against bacteria and fungi (including innate immune response), tolerance to cold and heat, and salt/drought/water stress show an increase in ESTs with lower sequence identity.
Figure 8. GO terms under response to stress in GO enrichment on orthologs with INDELs. GO terms related to defense against bacteria and fungi (including innate immune response), tolerance to cold and heat, and salt/drought/water stress show an increase in ESTs with INDELs.
ESTs homologous to genes involved in taxol synthesis in
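| EST | Protein GI | Identity (%) | Alignment length | Protein description |
| --- | --- | --- | --- | --- |
| comp37211_c1_seq1 | 386304248 | 36.51 | 189 | 10-deacetylbaccatin III-10-O-acetyl transferase, partial |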
| comp42386_c0_seq1 | 28558088 | 28.85 | 104 | 3′-N-debenzoyl-2′-deoxytaxol N-benzoyltransferase |
| comp70594_c0_seq1 | 339521621 | 23.93 | 422 | C-13 phenylpropanoid side chain CoA acyltransferase |
| comp70669_c0_seq1 | 28380187 | 34.95 | 432 | taxa-4(20),11(12)-dien-5alpha-ol-O-acetyltransferase |
| comp68118_c0_seq1 | 28380187 | 28.44 | 450 | taxa-4(20),11(12)-dien-5alpha-ol-O-acetyltransferase |
| comp53308_c0_seq1 | 386304662 | 38.04 | 163 | taxadienol acetyl transferase, partial |
| comp119236_c0_seq1 | 53690152 | 45.98 | 87 | taxadien-5-alpha-ol-O-acetyltransferase |
| comp68580_c0_seq1 | 53690152 | 30.06 | 173 | taxadien-5-alpha-ol-O-acetyltransferase |
| comp37211_c0_seq1 | 53690152 | 32.39 | 142 | taxadien-5-alpha-ol-O-acetyltransferase |
| comp83331_c0_seq1 | 53690152 | 28.84 | 215 | taxadien-5-alpha-ol-O-acetyltransferase |
| comp64789_c0_seq1 | 53759170 | 42.6 | 446 | taxadiene 5-alpha hydroxylase |
| comp57975_c0_seq1 | 386304485 | 50.32 | 155 | taxadiene 5-alpha hydroxylase, partial |
| comp172528_c0_seq1 | 38201489 | 36.47 | 85 | taxa-4(5),11(12)-diene synthase |
| comp193967_c0_seq1 | 15080743 | 46.03 | 63 | taxadiene synthase |
| comp63152_c1_seq1 | 386304920 | 29.69 | 128 | taxadiene synthase, partial |
| comp40035_c0_seq1 | 24266823 | 47.89 | 71 | 5-alpha-taxadienol-10-beta-hydroxylase |
| comp53405_c1_seq1 | 24266823 | 49.47 | 95 | 5-alpha-taxadienol-10-beta-hydroxylase |
| comp133851_c0_seq1 | 44903417 | 32.47 | 77 | 5-alpha-taxadienol-10-beta-hydroxylase |
| comp110423_c0_seq1 | 60459952 | 41.38 | 87 | taxane 13-alpha-hydroxylase |
| comp69534_c0_seq1 | 60459952 | 33.79 | 441 | taxane 13-alpha-hydroxylase |
| comp38773_c1_seq1 | 60459952 | 45.83 | 96 | taxane 13-alpha-hydroxylase |
| comp36415_c0_seq1 | 60459952 | 34.19 | 427 | taxane 13-alpha-hydroxylase |
| comp57975_c1_seq1 | 60459952 | 44.93 | 69 | taxane 13-alpha-hydroxylase |
| comp104139_c0_seq1 | 60459952 | 42.17 | 83 | taxane 13-alpha-hydroxylase |
| comp93979_c0_seq1 | 75297723 | 38.03 | 71 | Taxane 14b-hydroxylase |
| comp61533_c1_seq1 | 380039801 | 33.57 | 143 | taxane 14b-hydroxylase |
| comp143394_c0_seq1 | 380039801 | 29.17 | 120 | taxane 14b-hydroxylase |
| comp74596_c0_seq1 | 380039801 | 29.51 | 122 | taxane 14b-hydroxylase |
| comp84707_c0_seq1 | 380039801 | 35.22 | 230 | taxane 14b-hydroxylase |
| comp36896_c0_seq1 | 67633430 | 30.84 | 467 | taxoid 2-alpha-hydroxylase |
| comp192945_c0_seq1 | 238915468 | 43.75 | 64 | taxoid 7-beta-hydroxylase |
| comp36946_c0_seq1 | 365776087 | 55 | 60 | transcription factor WRKY |
| comp64330_c0_seq1 | 365776087 | 54.24 | 59 | transcription factor WRKY |
| comp78449_c0_seq1 | 365776087 | 66.67 | 54 | transcription factor WRKY |
| comp67132_c0_seq1 | 365776087 | 56.34 | 71 | transcription factor WRKY |
| comp59687_c0_seq1 | 365776087 | 41.22 | 131 | transcription factor WRKY |
| comp68275_c0_seq1 | 365776087 | 56.72 | 67 | transcription factor WRKY |
| comp104123_c0_seq1 | 222355764 | 29.87 | 154 | JAMYC |
| comp69212_c0_seq1 | 222355764 | 41.69 | 710 | JAMYC |
| comp69212_c0_seq2 | 222355764 | 46.72 | 259 | JAMYC |
| comp69212_c0_seq3 | 222355764 | 41.69 | 710 | JAMYC |
| comp124731_c0_seq1 | 222355764 | 100 | 27 | JAMYC |
| comp69971_c3_seq1 | 222355764 | 29.9 | 204 | JAMYC |
| comp38183_c0_seq2 | 222355764 | 30 | 160 | JAMYC |
| comp83061_c0_seq1 | 222355764 | 35.71 | 112 | JAMYC |
Protein GIs, instead of their accession numbers, are provided here for convenience in table layout. These can be queried at NCBI protein databases.
Proteins most homologous to genes involved in taxol synthesis in species outside
| Query protein GI | Fungal top hit GI | Identity (%) | Fungal species | Plant top hit GI | Identity (%) | Plant species |
| --- | --- | --- | --- | --- | --- | --- |
| 15080743 | - | - | - | 62511183 | 48.53 | |
| 38201489 | - | - | - | 62511183 | 48.66 | |
| 386304920 | - | - | - | 62511183 | 49.3 | |
| 24266823 | 56609042 | 97.99 | | 75319884 | 43.97 | |
| 44903417 | 56609042 | 99.2 | | 75319884 | 44.17 | |
| 53759170 | 56609042 | 66.38 | | 75319884 | 44.44 | |
| 60459952 | 56609042 | 63.9 | | 75319884 | 45.32 | |
| 67633430 | 56609042 | 56.34 | | 75319884 | 40.89 | |
| 75297723 | 56609042 | 60.72 | | 75319884 | 42.5 | |
| 238915468 | 56609042 | 56.2 | | 75319884 | 40 | |
| 380039801 | 56609042 | 61.12 | | 75319884 | 42.92 | |
| 386304485 | 56609042 | 66.22 | | 75319884 | 47.11 | |
| 28380187 | 62461771 | 98.39 | | 148906373 | 43.98 | |
| 28558088 | 62461771 | 60.23 | | 148906373 | 45.62 | |
| 53690152 | 62461771 | 60.83 | | 148906373 | 44.25 | |
| 339521621 | 62461771 | 60 | | 148906373 | 39.83 | |
| 386304662 | 62461771 | 98.23 | | 148906373 | 44.07 | |
| 386304248 | 169135276 | 98.45 | | 148906373 | 44.64 | |
| 365776087 | - | - | - | 167859869 | 43.18 | |
| 222355764 | - | - | - | 148906957 | 46.99 | |
* Ozonium sp. BT2 and fungal sp. BT2 are the same fungal species [49,51].
Columns 2-4 show top hit protein information from fungi; columns 5-7 show top hit protein information from plants. Protein queries were searched with BLAST against the NCBI nonredundant protein database, and for each of the two designated sources the protein hit with the highest sequence identity was recorded. Protein GIs, instead of their accession numbers, are provided here for convenience in table layout. These can be queried at NCBI protein databases. Only three hit proteins were identified from fungi, possibly because genome data for taxol-producing fungi are lacking.
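The selection step described in this note (keeping, per query, the hit with the highest sequence identity from each source) can be sketched as below. The tuple layout, the source labels, the function name, and the toy hits are hypothetical; how hits were actually assigned to fungi or plants is not stated in the source.

```python
def top_hits_by_source(hits):
    """Keep, for each (query, source) pair, the hit with the highest identity.

    `hits` is an iterable of (query_id, subject_id, identity_percent, source)
    tuples, where `source` is a label such as "fungi" or "plants".
    Ties keep the first hit encountered.
    """
    best = {}
    for query_id, subject_id, identity, source in hits:
        key = (query_id, source)
        if key not in best or identity > best[key][1]:
            best[key] = (subject_id, identity)
    return best


# Toy hits for illustration only.
toy_hits = [
    ("query_A", "fungal_hit_1", 97.99, "fungi"),
    ("query_A", "fungal_hit_2", 61.00, "fungi"),
    ("query_A", "plant_hit_1", 43.97, "plants"),
    ("query_B", "plant_hit_2", 48.53, "plants"),
]
top = top_hits_by_source(toy_hits)
print(top)
```

A query with no hit from a given source simply has no entry for that pair, which corresponds to the "-" cells in the table.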