Xian-Ge Hu, Hui Liu, YuQing Jin, Yan-Qiang Sun, Yue Li, Wei Zhao, Yousry A El-Kassaby, Xiao-Ru Wang, Jian-Feng Mao.
Abstract
Platycladus orientalis, of the family Cupressaceae, is a conifer widespread throughout China that is extensively used in ecological reforestation, horticulture, and medicine. Transcriptome assemblies are needed for this ecologically important conifer to understand the genes underpinning adaptation and the complex traits targeted by breeding programs. To enrich the species' genomic resources, de novo transcriptome sequencing was performed using Illumina paired-end sequencing. In total, 104,073,506 high-quality sequence reads (approximately 10.3 Gbp) were obtained and assembled into 228,948 transcripts and 148,867 unigenes longer than 200 nt. Quality assessment using CEGMA showed that the transcriptomes obtained were mostly complete for highly conserved core eukaryotic genes. Based on similarity searches against known proteins, 62,938 (42.28% of all unigenes), 42,158 (28.32%), and 23,179 (15.57%) unigenes had homologs in the Nr, GO, and KOG databases, respectively. In addition, 25,625 (17.21%) unigenes were mapped to 322 pathways by BLASTX comparison against the KEGG database, and 1,941 unigenes involved in environmental signaling and stress response were identified. We also identified 43 putative functional terpene synthase (TPS) gene loci and compared them with TPSs from other species. Additionally, 5,296 simple sequence repeats (SSRs), assigned to 142 motif types, were identified in 4,715 unigenes. This is the first report of a complete transcriptome analysis of P. orientalis. These resources provide a foundation for further studies of adaptation mechanisms and molecular-based breeding programs.
Year: 2016 PMID: 26881995 PMCID: PMC4755536 DOI: 10.1371/journal.pone.0148985
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of the properties of the assembled P. orientalis transcripts and unigenes.
| | Transcripts | Unigenes |
|---|---|---|
| No. of reads ≥ 200 nt | 228,948 | 148,867 |
| No. of reads ≥ 500 nt | 111,828 | 49,330 |
| No. of reads ≥ 1000 nt | 73,298 | 28,822 |
| N50 (nt) | 1,755 | 1,320 |
| N90 (nt) | 388 | 259 |
| Total length (nt) | 216,674,972 | 102,175,229 |
| Max length (nt) | 27,201 | 27,201 |
| Min length (nt) | 201 | 201 |
| Average length (nt) | 946.39 | 686.35 |
| Sequencing depth (mean ± SD) | 101±32735 | |
| Median sequencing depth | 35 | |
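The N50, N90, and average-length rows above follow from the list of assembled sequence lengths. The sketch below uses the conventional definition of Nxx (the length L such that sequences of length L or longer hold at least xx% of all assembled bases). The record does not say which tool the authors used to compute these values, so the function name `nxx` and the toy lengths are illustrative assumptions only.

```python
def nxx(lengths, fraction=0.5):
    """Return the Nxx statistic for a list of sequence lengths.

    Conventional definition (assumed here, not taken from the source):
    sort lengths from longest to shortest and return the length at which
    the running total first reaches `fraction` of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length


# Hypothetical toy assembly, not data from the paper.
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
toy_n50 = nxx(toy_lengths, 0.5)
toy_n90 = nxx(toy_lengths, 0.9)

# Average length is total length divided by sequence count; these inputs
# are the totals and counts reported in the table.
mean_transcript_len = round(216_674_972 / 228_948, 2)
mean_unigene_len = round(102_175_229 / 148_867, 2)
```

With the table's totals and counts, the two means come out at the reported 946.39 nt and 686.35 nt, which is a quick consistency check on the summary rows.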
Fig 1. Length distributions of all unigenes.
Unigene homology searches against the protein databases.
| Database | Unigenes | Percentage |
|---|---|---|
| Nr | 62,938 | 42.28% |
| Pfam | 58,566 | 39.34% |
| KOG | 23,179 | 15.57% |
| GO | 42,158 | 28.32% |
| KEGG | 25,625 | 17.21% |
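The percentage column can be reproduced by dividing each database's annotated-unigene count by the total of 148,867 unigenes. The short sketch below does this arithmetic; the variable names are illustrative, and the counts are copied from the table.

```python
TOTAL_UNIGENES = 148_867  # unigenes longer than 200 nt, from the assembly summary

# Annotated unigene counts per database, as reported in the table.
annotated = {
    "Nr": 62_938,
    "Pfam": 58_566,
    "KOG": 23_179,
    "GO": 42_158,
    "KEGG": 25_625,
}

# Share of all unigenes with a hit in each database, in percent.
percentages = {
    db: round(100 * count / TOTAL_UNIGENES, 2) for db, count in annotated.items()
}

for db, pct in percentages.items():
    print(f"{db}: {pct:.2f}%")
```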
Fig 2. Distribution of the top BLASTX hits for unigenes in the Nr database.
Fig 3. GO classification of P. orientalis unigenes.
Fig 4. KOG classification of P. orientalis unigenes.
Fig 5. KEGG classification of the assembled unigenes.
A total of 25,625 unigenes had BLASTX hits, and 24,295 of them were assigned to five KEGG biochemical pathway categories: cellular processes (A), environmental information processing (B), genetic information processing (C), metabolism (D), and organismal systems (E).
Fig 6. Phylogenetic tree of the putative TPSs from the P. orientalis transcriptome and representative characterized TPSs from a broad range of plant lineages.
Nine TPS subfamilies (groups) were reconstructed, two of which (TPS-conifer and TPS-sm) were recognized for the first time.
Fig 7. Metabolic pathway of the circadian rhythm for the unigenes identified in P. orientalis.
Each box represents the substance involved in each section of the pathway. The red boxes represent substances assigned at least one unigene.
Simple sequence repeats (SSRs) identified in the P. orientalis transcriptome.
| Item | Value |
|---|---|
| No. of unigenes longer than 1 kb | 28,822 |
| Total nucleotides screened (kb) | 26,360 |
| No. of unigenes containing SSRs | 4,715 |
| No. of identified SSR loci | 5,296 |
| SSR motif types | 142 |
| Frequency of SSR in transcriptome | 1/1.3kb |
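As a rough illustration of how SSR loci like those counted above can be located, the sketch below scans a sequence for tandem repeats of di- to hexanucleotide motifs with a regular expression. The record does not name the SSR-detection tool or its thresholds, so the function `find_ssrs`, the uniform `min_repeats=5` cutoff, and the example sequence are all assumptions for illustration. Mononucleotide repeats are not handled.

```python
import re

# Motif length -> class name; mononucleotide repeats are deliberately left out.
MOTIF_NAMES = {2: "Di", 3: "Tri", 4: "Tetra", 5: "Penta", 6: "Hexa"}


def find_ssrs(seq, min_repeats=5):
    """Return (class, motif, repeat_count, start) tuples for perfect SSRs.

    `min_repeats` is an assumed uniform threshold, not a value from the source.
    """
    seq = seq.upper()
    hits = []
    for k, name in MOTIF_NAMES.items():
        # A k-mer followed by at least (min_repeats - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each locus is reported under its smallest motif class only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((name, motif, len(match.group(0)) // k, match.start()))
    return hits


# Hypothetical test sequence: (AT)6 at position 3 and (GAA)5 at position 19.
example = "GGC" + "AT" * 6 + "CCGT" + "GAA" * 5 + "TTC"
example_hits = find_ssrs(example)
```

Dedicated SSR tools typically allow a different minimum repeat count per motif length, so real counts depend strongly on the thresholds chosen.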
Frequency of simple sequence repeats (SSR) in the transcriptome of P. orientalis.
| Motif length / repeat number | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | >12 | Total | % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Di | 836 | 171 | 86 | 50 | 53 | 40 | 40 | 3 | | 1279 | 70.31 |
| Tri | 340 | 116 | 59 | 7 | | | | | | 522 | 28.70 |
| Tetra | 10 | 6 | | | | | | | | 16 | 0.88 |
| Penta | 2 | | | | | | | | | 2 | 0.11 |
| Hexa | | | | | | | | | | | |
| Total | 1188 | 293 | 145 | 57 | 53 | 40 | 40 | 3 | | 1819 | |
| % | 65.31 | 16.11 | 7.97 | 3.13 | 2.91 | 2.20 | 2.20 | 0.16 | | | |