Tao Zhou, Chen Chen, Yue Wei, Yongxia Chang, Guoqing Bai, Zhonghu Li, Nazish Kanwal, Guifang Zhao.
Abstract
Dipteronia (order Sapindales) is an endangered genus endemic to China and has two living species, D. sinensis and D. dyeriana. The plants are closely related to the genus Acer, which is also classified in the order Sapindales. Evolutionary studies on Dipteronia have been hindered by the paucity of information on their genomes and plastids. Here, we used next-generation sequencing to characterize the transcriptomes and complete chloroplast genomes of both Dipteronia species. A comparison of the transcriptomes of both species identified a total of 7814 orthologs. Estimation of selection pressures using Ka/Ks ratios showed that only 30 of 5435 orthologous pairs had a ratio significantly >1, i.e., showing positive selection. However, 4041 orthologs had a Ka/Ks < 0.5 (p < 0.05), suggesting that most genes had likely undergone purifying selection. Based on orthologous unigenes, 314 single-copy nuclear genes (SCNGs) were identified. Through a combination of de novo and reference-guided assembly, plastid genomes were obtained; that of D. sinensis was 157,080 bp and that of D. dyeriana was 157,071 bp. Both plastid genomes encoded 87 protein-coding genes, 40 tRNAs, and 8 rRNAs; no significant differences were detected in the size, gene content, and organization of the two plastomes. We used the whole chloroplast genomes to determine the phylogeny of D. sinensis and D. dyeriana and confirmed that the two species were highly divergent. Overall, our study provides comprehensive transcriptomic and chloroplast genomic resources, which will be valuable for future evolutionary studies of Dipteronia.
Keywords: Dipteronia; chloroplast genome; phylogenetic relationship; positive selection; purifying selection; transcriptome
Year: 2016 PMID: 27790228 PMCID: PMC5061820 DOI: 10.3389/fpls.2016.01512
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
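The Ka/Ks thresholds quoted in the abstract (a ratio significantly above 1 for positive selection, a ratio below 0.5 with p < 0.05 for purifying selection) can be illustrated with a minimal sketch. The function name `classify_selection`, its parameters, and the "intermediate" and "not significant" labels are hypothetical and are not taken from the paper; how the authors computed Ka, Ks, and the p-values is not stated in this record.

```python
def classify_selection(ka, ks, p_value, alpha=0.05):
    """Label one orthologous pair from its Ka/Ks ratio.

    Thresholds follow the abstract: Ka/Ks > 1 suggests positive selection,
    Ka/Ks < 0.5 suggests purifying selection. Everything else here
    (names, extra labels, the alpha default) is an illustrative assumption.
    """
    if ks <= 0:
        # The ratio is undefined when there are no synonymous substitutions.
        return "undefined"
    if p_value >= alpha:
        return "not significant"
    ratio = ka / ks
    if ratio > 1:
        return "positive selection"
    if ratio < 0.5:
        return "purifying selection"
    return "intermediate"


# Toy values, not data from the study.
print(classify_selection(ka=0.30, ks=0.10, p_value=0.01))
print(classify_selection(ka=0.02, ks=0.10, p_value=0.01))
```

Pairs with a ratio between 0.5 and 1 fall outside both categories named in the abstract, which is why the sketch gives them a separate label.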
Summary of statistics for the transcriptomes of D. sinensis and D. dyeriana.
| Total number of reads | 40,615,432 | 53,620,610 |
| Total number of transcripts | 91,340 | 101,628 |
| Total number of unigenes | 52,351 | 53,983 |
| Min length of unigenes (bp) | 201 | 201 |
| Max length of unigenes (bp) | 14,265 | 14,906 |
| N50 of unigenes (bp) | 1351 | 1519 |
| Mean length of unigenes (bp) | 749 | 809 |
| Mapping rates of unigenes | 74.6% | 76.8% |
Mapping rates were generated by mapping clean reads to the assembled unigenes using the Bowtie mode of RSEM 1.2.29.
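For readers unfamiliar with the N50 statistic reported in the table, the following is a minimal sketch of the usual definition: the length L such that sequences of length L or longer together contain at least half of the total assembled bases. The function name `n50` and the toy lengths are illustrative assumptions; the record does not say which tool the authors used to compute it.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half of the
        # total is the N50.
        if running >= half_total:
            return length


# Toy unigene lengths, not data from the study.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```

Because N50 is weighted by sequence length, it is typically larger than the mean length, as in the table (1351 vs. 749 bp and 1519 vs. 809 bp).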
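Annotation information of D. sinensis and D. dyeriana unigenes.
| Database | Annotated unigenes | Percentage (%) | Annotated unigenes | Percentage (%) |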
| COG | 10,637 | 20.32 | 9411 | 17.43 |
| GO | 25,591 | 48.88 | 23,003 | 42.61 |
| KEGG | 7182 | 13.72 | 6225 | 11.53 |
| Swiss-Prot | 23,321 | 44.55 | 20,936 | 38.78 |
| Nr | 30,689 | 58.62 | 27,738 | 51.38 |
| All | 30,834 | 58.90 | 27,796 | 51.49 |
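The percentage columns in the annotation table appear to be the annotated count divided by the total number of unigenes for the same transcriptome (52,351 and 53,983 in the statistics table). The short check below recomputes them under that assumption; the variable and function names are hypothetical, and the two columns are referred to only by position because this record does not label them.

```python
# Total unigenes per transcriptome, in the column order of the tables.
TOTAL_UNIGENES = (52351, 53983)

# Annotated unigene counts per database, same column order.
ANNOTATED = {
    "COG": (10637, 9411),
    "GO": (25591, 23003),
    "KEGG": (7182, 6225),
    "Swiss-Prot": (23321, 20936),
    "Nr": (30689, 27738),
    "All": (30834, 27796),
}


def annotation_percentages(annotated, totals):
    """Return {database: (pct_first_column, pct_second_column)}."""
    return {
        db: tuple(round(100 * n / t, 2) for n, t in zip(counts, totals))
        for db, counts in annotated.items()
    }


for db, pcts in annotation_percentages(ANNOTATED, TOTAL_UNIGENES).items():
    print(db, pcts)
```

Every recomputed value matches the table to two decimal places, which supports reading the second and fourth numeric columns as percentages of all unigenes.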
Figure 1. Comparison of gene ontology (GO) term distributions between D. sinensis and D. dyeriana. GO terms were annotated according to three main categories (biological process, cellular component, molecular function) and 63 sub-categories.
Figure 2. Clusters of orthologous groups (COG) classifications for D. sinensis and D. dyeriana. All unigenes were aligned to the COG database to predict and classify possible functions.
Summary of the two complete chloroplast genomes of D. sinensis and D. dyeriana.
| Feature | D. sinensis | D. dyeriana |
| Total cp DNA size (bp) | 157,080 | 157,071 |
| Length of large single copy (LSC) region (bp) | 85,455 | 85,529 |
| Length of inverted repeat (IR) region (bp) | 26,766 | 26,730 |
| Length of small single copy (SSC) region (bp) | 18,093 | 18,082 |
| Total GC content (%) | 37.8 | 38.0 |
| GC content of LSC (%) | 35.9 | 36.1 |
| GC content of IR (%) | 42.7 | 42.8 |
| GC content of SSC (%) | 32.1 | 32.5 |
| Total number of genes | 135 | 135 |
| Protein-coding | 87 (8) | 87 (8) |
| tRNA | 40 (7) | 40 (7) |
| rRNA | 8 (4) | 8 (4) |
The numbers in parentheses indicate the number of genes duplicated in the IR regions.
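The region lengths in the table are consistent with the usual quadripartite plastome layout, in which the total length is the LSC plus the SSC plus two copies of the IR. The sketch below checks that arithmetic, along with the gene totals, using the values from the table; the function name `plastome_length` is a hypothetical helper, and the assignment of the first column to D. sinensis follows the genome sizes given in the abstract.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + 2 * IR (bp)."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) from the table.
d_sinensis = plastome_length(lsc=85455, ssc=18093, ir=26766)
d_dyeriana = plastome_length(lsc=85529, ssc=18082, ir=26730)
print(d_sinensis, d_dyeriana)  # prints 157080 157071

# Gene totals: protein-coding + tRNA + rRNA.
total_genes = 87 + 40 + 8
print(total_genes)  # prints 135
```

Both totals match the reported genome sizes of 157,080 bp and 157,071 bp, and the gene categories sum to the reported 135 genes.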
Figure 3. Circular gene map of the D. sinensis and D. dyeriana chloroplast genomes. The genes lying outside of the outer circle are transcribed clockwise, while those inside the circle are transcribed counterclockwise. Small single copy (SSC), large single copy (LSC), and inverted repeats (IRa, IRb) are indicated.
Figure 4. mVISTA percent identity plot comparing the two Dipteronia chloroplast genomes. The top line shows genes in order (transcriptional direction indicated by arrows). The sequence similarity of the aligned regions between Dipteronia species and Acer buergerianum subsp. ningpoense is shown as horizontal bars indicating the average percent identity between 50 and 100% (shown on the y-axis of the graph). The x-axis represents the coordinate in the chloroplast genome. Genome regions are color coded as protein-coding (exon), tRNA or rRNA, and conserved noncoding sequences (CNS).
Figure 5. Maximum likelihood phylogeny of the nine Sapindales species based on the complete plastid genome sequences. The numbers associated with the nodes are bootstrap support and posterior probability values.