Yanci Yang, Juan Zhu, Li Feng, Tao Zhou, Guoqing Bai, Jia Yang, Guifang Zhao.
Abstract
Fagaceae is one of the largest and most economically important families within Fagales. Given the incongruence between inferences from plastid and nuclear genes in previous phylogenetic studies of Fagaceae, we assess the performance of plastid phylogenomics in this complex family. We sequenced and assembled four complete plastid genomes (Fagus engleriana, Quercus spinosa, Quercus aquifolioides, and Quercus glauca) using a reference-guided assembly approach. The 12 other published Fagaceae plastid genomes were retrieved for genomic analyses (repeats, sequence divergence, and codon usage) and phylogenetic inference. The genomic analyses reveal that plastid genomes in Fagaceae are conserved. Comparing the phylogenetic relationships of the key genera in Fagaceae inferred from datasets partitioned by codon position and by gene function, we found that the dataset comprising the first two codon positions recovered nearly all relationships with high support. This result suggests that codon composition bias has a strong influence on phylogenetic inference in Fagaceae. Our study not only provides a basic understanding of Fagaceae plastid genomes but also demonstrates the effectiveness of plastid phylogenomics in resolving relationships within this intractable family.
Keywords: Fagaceae; codon composition bias; phylogenomics; plastid genome; topological incongruence
Year: 2018 PMID: 29449857 PMCID: PMC5800003 DOI: 10.3389/fpls.2018.00082
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Accessions in this study with taxonomic, collection locality, Illumina read, and coverage information.
| Group | / | / | / | ||
| Group | / | / | / | ||
| Group | / | / | / | ||
| Group | / | / | / | ||
| Group | / | / | / | ||
| Group | / | / | / | ||
| Group | Panzhihua, Sichuan, China | 788,550 | 616x | ||
| Group | Dali, Yunnan, China | 766,767 | 591x | ||
| Group | Chenshan Botanical Garden, Shanghai, China | 427,422 | 329x | ||
| Group | / | / | / | ||
| / | / | / | |||
| / | / | / | |||
| / | / | / | |||
| / | / | / | |||
| / | / | / | |||
| Wuhan Botanical Garden, Wuhan, China | 362,613 | 281x |
Characteristics of Fagaceae plastid genomes.
| 161,304 | 90,541 | 19,025 | 51,738 | 137 | / | 89 | 40 | 8 | |
| 161,150 | 90,444 | 19,054 | 51,652 | 134 | / | 86 | 40 | 8 | |
| 161,153 | 90,457 | 19,044 | 51,652 | 134 | / | 86 | 40 | 8 | |
| 161,072 | 90,341 | 19,045 | 51,686 | 134 | / | 86 | 40 | 8 | |
| 161,237 | 90,461 | 19,048 | 51,728 | 134 | / | 86 | 40 | 8 | |
| 161,077 | 90,387 | 19,056 | 51,634 | 134 | / | 86 | 40 | 8 | |
| 161,225 | 90,535 | 19,000 | 51,690 | 134 | / | 86 | 40 | 8 | |
| 161,156 | 90,441 | 18,997 | 51,718 | 134 | / | 86 | 40 | 8 | |
| 160,798 | 90,229 | 18,907 | 51,662 | 134 | / | 86 | 40 | 8 | |
| 160,988 | 90,352 | 18,954 | 51,682 | 128 | 87 | 30 | 8 | ||
| 160,799 | 90,432 | 18,995 | 51,372 | 130 | 83 | 37 | 8 | ||
| 160,603 | 90,249 | 18,976 | 51,378 | 131 | 83 | 39 | 8 | ||
| 160,647 | 90,394 | 18,995 | 51,258 | 132 | / | 84 | 40 | 8 | |
| 161,020 | 90,596 | 19,160 | 51,264 | 134 | / | 87 | 39 | 8 | |
| 159,938 | 89,374 | 19,292 | 51,272 | 128 | / | 81 | 39 | 8 | |
| 158,346 | 87,667 | 18,895 | 51,784 | 131 | / | 83 | 40 | 8 |
The 4 newly generated plastid genomes were marked in .
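The first four numeric columns of the table above are consistent with a quadripartite plastid genome: total length followed by the LSC, SSC, and combined IR lengths. The column headers did not survive in this record, so that labelling is an assumption inferred from the values themselves. A minimal Python sketch to check the relationship (total = LSC + SSC + IR) for every row:

```python
# Rows copied from the table above, in bp.
# ASSUMPTION: columns are (total, LSC, SSC, IR with both copies combined);
# the original headers are missing from this record.
rows = [
    (161304, 90541, 19025, 51738),
    (161150, 90444, 19054, 51652),
    (161153, 90457, 19044, 51652),
    (161072, 90341, 19045, 51686),
    (161237, 90461, 19048, 51728),
    (161077, 90387, 19056, 51634),
    (161225, 90535, 19000, 51690),
    (161156, 90441, 18997, 51718),
    (160798, 90229, 18907, 51662),
    (160988, 90352, 18954, 51682),
    (160799, 90432, 18995, 51372),
    (160603, 90249, 18976, 51378),
    (160647, 90394, 18995, 51258),
    (161020, 90596, 19160, 51264),
    (159938, 89374, 19292, 51272),
    (158346, 87667, 18895, 51784),
]


def region_sum_matches(total, lsc, ssc, ir):
    """Return True when the three regions add up to the total length."""
    return total == lsc + ssc + ir


# Collect any row whose regions do not add up to the reported total.
mismatches = [row for row in rows if not region_sum_matches(*row)]
print(f"{len(rows)} rows checked, {len(mismatches)} mismatches")
```

A check of this kind is a quick way to catch transcription errors when a table is copied out of a paper.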
Figure 1. Comparison of the LSC, IR, and SSC border regions among the Fagaceae plastid genomes. Numbers above the gene features indicate the distance from the end of the gene to the region boundary. The features are not drawn to scale.
GC content of sequences in Fagaceae plastid genomes.
| 53 | 36.8 | 38.6 | 46.3 | 38.3 | 31.0 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.4 | 30.9 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.4 | 30.9 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.4 | 31.0 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.4 | 30.9 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.4 | 31.0 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.4 | 31.0 | |
| 53 | 36.8 | 38.6 | 46.4 | 38.3 | 31.0 | |
| 53 | 36.9 | 38.6 | 46.5 | 38.4 | 31.0 | |
| 53 | 36.9 | 38.6 | 46.5 | 38.4 | 31.0 | |
| 53 | 36.8 | 38.5 | 46.4 | 38.3 | 30.9 | |
| 53 | 36.8 | 38.5 | 46.3 | 38.3 | 30.9 | |
| 53 | 36.7 | 38.6 | 46.4 | 38.4 | 31.0 | |
| 53 | 36.7 | 38.5 | 46.3 | 38.3 | 30.9 | |
| 53 | 37.0 | 38.8 | 46.6 | 38.5 | 31.3 | |
| 53 | 37.1 | 38.5 | 46.5 | 38.3 | 30.8 |
GCg: GC content of whole genome; GC.
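The table above appears to report GC content for the whole genome and for coding sequences, including values per codon position. That reading is an assumption, because the column headers are missing and the footnote is truncated. As a rough illustration of how such values can be computed, the following Python sketch calculates GC percentage overall and at each codon position for an in-frame coding sequence. The function names and the toy sequence are hypothetical and are not taken from the study.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def gc_by_codon_position(cds):
    """GC percentage at codon positions 1, 2, and 3 of an in-frame CDS.

    Trailing bases that do not form a complete codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    cds = cds[:usable]
    # Slicing with a step of 3 picks out every base at one codon position.
    return tuple(gc_percent(cds[i::3]) for i in range(3))


# Hypothetical toy sequence (five codons), for illustration only.
toy_cds = "ATGGCTAAAGGTTAA"
gc1, gc2, gc3 = gc_by_codon_position(toy_cds)
print(f"overall={gc_percent(toy_cds):.1f} GC1={gc1:.1f} GC2={gc2:.1f} GC3={gc3:.1f}")
```

In real use the input would be the concatenated protein-coding sequences of one plastid genome rather than a toy string.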
Analyses of repeat elements in Fagaceae plastid genomes.
| Complete plastid genomes | 145 (53/50/37/5) | 8,459/2,572,513 | 199 (198/1) | 12,832/2,572,513 | 96 (96/0) | 6,150/2,572,513 | 27,441/2,572,513 |
| LSC | 40 (20/12/4/4) | 2,219/1,442,900 | 97 (97/0) | 6,081/1,442,900 | 33 (33/0) | 2,050/1,442,900 | 10,350/1,442,900 |
| SSC | 11 (3/6/1/1) | 634/304,443 | 21 (20/1) | 1,670/304,443 | 18 (18/0) | 1,090/304,443 | 3,394/304,443 |
| IR | 94 (30/32/32/0) | 5,606/825,170 | 81 (81/0) | 5,081/825,170 | 45 (45/0) | 3,010/825,170 | 13,697/825,170 |
| Intergenic spacer regions | 79 (25/17/33/4) | 4,451/828,186 | 95 (95/0) | 6,289/828,186 | 60 (60/0) | 4,094/828,186 | 14,834/828,186 |
| Introns | 28 (26/0/1/1) | 1,388/284,012 | 41 (40/1) | 2,849/284,012 | 20 (20/0) | 1,096/284,012 | 5,333/284,012 |
| Genes | 38 (2/33/3/0) | 2,620/1,460,315 | 63 (63/0) | 3,694/1,460,315 | 16 (16/0) | 960/1,460,315 | 6,714/1,460,315 |
Numbers of different length repeats are given in brackets.
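The last column of the repeat table can be read as total repeat length over total sequence length for each region, summed across the 16 genomes. That interpretation is an assumption, since the column headers are missing here. Under it, the share of each region occupied by repeats can be compared with a short Python sketch:

```python
# Values copied from the last column of the table above.
# ASSUMPTION: each pair is (total repeat length, total region length) in bp,
# pooled over the 16 plastid genomes.
repeat_totals = {
    "LSC": (10350, 1442900),
    "SSC": (3394, 304443),
    "IR": (13697, 825170),
}

# Percentage of each region covered by repeats.
density = {
    region: 100.0 * repeat_bp / region_bp
    for region, (repeat_bp, region_bp) in repeat_totals.items()
}

for region, pct in sorted(density.items(), key=lambda item: item[1]):
    print(f"{region}: {pct:.2f}% of sequence in repeats")
```

Normalising by region length in this way makes the regions comparable, since the LSC is several times longer than the SSC.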
Figure 2. Sequence identity plot comparing the 16 Fagaceae plastid genomes, with Q. rubra as the reference. The y-axis represents percent identity, ranging from 50 to 100%. Coding and noncoding regions are marked in purple and pink, respectively.
List of the 76 common protein-coding genes divided into five functional groups.
| Gene expression | |
| Photosynthetic apparatus | |
| Photosynthetic metabolism | |
| Miscellaneous | |
| Unknown |
Numbers in parentheses indicate the genes duplicated in the IR regions.
Sites and models in ML and BI analyses for each dataset.
| Dataset | Sites | ML model | BI model |
| 76 common protein-coding genes | 72,235 | GTR+G | GTR+I+G |
| Codon positions 1 + 2 | 48,176 | GTR+G | TVM+I+G |
| Codon position 3 | 24,353 | GTR+G | GTR+G |
| Gene expression | 18,856 | GTR+G | GTR+I+G |
| Photosynthetic apparatus | 16,970 | GTR+G | GTR+G |
| Photosynthetic metabolism | 18,817 | GTR+G | TVM+I+G |
| Miscellaneous | 3,786 | GTR+G | TVM+G |
| Unknown | 13,914 | GTR+G | GTR |
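The codon-position datasets listed above separate the first two codon positions from the third. As a minimal sketch of that partitioning, assuming an in-frame alignment of coding sequences, the Python below splits each aligned sequence into a positions 1 + 2 string and a position 3 string. The function name and the two toy taxa are hypothetical, and this is not the authors' pipeline.

```python
def split_codon_positions(aligned_cds):
    """Split an in-frame aligned CDS into (positions 1+2, position 3).

    Trailing columns that do not form a complete codon are ignored.
    """
    usable = len(aligned_cds) - len(aligned_cds) % 3
    pos12 = []
    pos3 = []
    for i in range(0, usable, 3):
        pos12.append(aligned_cds[i:i + 2])  # first and second positions
        pos3.append(aligned_cds[i + 2])     # third position
    return "".join(pos12), "".join(pos3)


# Hypothetical toy alignment; the taxa differ only at third codon positions.
alignment = {
    "taxon_A": "ATGGCTAAA",
    "taxon_B": "ATGGCCAAG",
}

partitions = {name: split_codon_positions(seq) for name, seq in alignment.items()}
for name, (pos12, pos3) in partitions.items():
    print(name, "pos1+2:", pos12, "pos3:", pos3)
```

In the toy alignment the two taxa are identical at positions 1 + 2 and differ only at position 3, which illustrates why the two partitions can carry different phylogenetic signal.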
Figure 3. Fagaceae phylogeny based on ML and BI analyses of 76 protein-coding genes. The ML topology is shown, with bootstrap support values and posterior probability values listed at each node.
Figure 4. Fagaceae phylogeny based on ML and BI analyses of the first two codon positions of protein-coding genes. The ML topology is shown, with bootstrap support values and posterior probability values listed at each node. Dashes denote nodes contradicted by the BI trees with posterior probability values < 0.50.