Yan-xia Sun, Michael J Moore, Ai-ping Meng, Pamela S Soltis, Douglas E Soltis, Jian-qiang Li, Heng-chang Wang.
Abstract
The early-diverging eudicot order Trochodendrales contains only two monospecific genera, Tetracentron and Trochodendron. Although an extensive fossil record indicates that the clade is perhaps 100 million years old and was widespread throughout the Northern Hemisphere during the Paleogene and Neogene, the two extant genera are both narrowly distributed in eastern Asia. Recent phylogenetic analyses strongly support a clade of Trochodendrales, Buxales, and Gunneridae (core eudicots), but complete plastome analyses do not resolve the relationships among these groups with strong support. However, plastid phylogenomic analyses have not included data for Tetracentron. To better resolve basal eudicot relationships and to clarify when the two extant genera of Trochodendrales diverged, we sequenced the complete plastid genome of Tetracentron sinense using Illumina technology. The Tetracentron and Trochodendron plastomes possess the typical gene content and arrangement that characterize most angiosperm plastid genomes, but both genomes have the same unusual ∼4 kb expansion of the inverted repeat region to include five genes (rpl22, rps3, rpl16, rpl14, and rps8) that are normally found in the large single-copy region. Maximum likelihood analyses of an 83-gene, 88-taxon angiosperm data set yield a tree topology identical to that of previous plastid-based trees, and moderately support the sister relationship between Buxaceae and Gunneridae. Molecular dating analyses suggest that Tetracentron and Trochodendron diverged between 44 and 30 million years ago, which is congruent with the fossil record of Trochodendrales and with previous estimates of the divergence time of these two taxa. We also characterize 154 simple sequence repeat loci from the Tetracentron sinense and Trochodendron aralioides plastomes that will be useful in future studies of population genetic structure for these relict species, both of which are of conservation concern.
Year: 2013 PMID: 23577110 PMCID: PMC3618518 DOI: 10.1371/journal.pone.0060429
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Map of the Tetracentron sinense plastid genome.
Figure 2. Map of the Trochodendron aralioides plastid genome.
Basic characteristics of the Tetracentron sinense and Trochodendron aralioides plastid genomes.
| Characteristic | Tetracentron sinense | Trochodendron aralioides |
| --- | --- | --- |
| total genome length | 164467 | 165945 |
| IR length | 30231 | 30744 |
| SSC length | 19539 | 18974 |
| LSC length | 84466 | 85483 |
| total length of coding sequence | 94699 | 95168 |
| total length of noncoding sequence | 69768 | 70777 |
| overall G/C content | 38.1% | 38.0% |
All values given are in base pairs (bp), unless otherwise noted.
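The region lengths in this table can be cross-checked against the quadripartite structure of a plastid genome, in which the total length equals LSC + SSC + 2 × IR. The following is a minimal sketch of that check. The dictionary name `PLASTOMES` and the helper names are hypothetical, and the assignment of the first value column to Tetracentron sinense and the second to Trochodendron aralioides is an assumption based on the order of the names in the table caption.

```python
# Region lengths in bp, copied from the table above.
# Assumption: first value column = Tetracentron sinense,
# second value column = Trochodendron aralioides (caption order).
PLASTOMES = {
    "Tetracentron sinense": {
        "total": 164467, "IR": 30231, "SSC": 19539, "LSC": 84466,
        "coding": 94699, "noncoding": 69768,
    },
    "Trochodendron aralioides": {
        "total": 165945, "IR": 30744, "SSC": 18974, "LSC": 85483,
        "coding": 95168, "noncoding": 70777,
    },
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: one LSC, one SSC, and two copies of the IR."""
    return lsc + ssc + 2 * ir


def is_consistent(p: dict) -> bool:
    """True if the regions and the coding/noncoding split both sum to the total."""
    regions_ok = quadripartite_length(p["LSC"], p["SSC"], p["IR"]) == p["total"]
    split_ok = p["coding"] + p["noncoding"] == p["total"]
    return regions_ok and split_ok


for name, p in PLASTOMES.items():
    print(name, is_consistent(p))
```

Both checks hold for the values as tabulated, which also supports reading each IR length as the length of a single repeat copy.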
Figure 3. Comparison of the IR junctions in Tetracentron and Trochodendron.
The principal noncoding regions contributing to the size difference between the Tetracentron and Trochodendron plastid genomes.
| Spacer region or intron | Tetracentron sinense | Trochodendron aralioides | Length difference |
| --- | --- | --- | --- |
| | 870 | 1308 | 438 |
| | 1529 | 1797 | 268 |
| | 505 | 658 | 153 |
| | 957 | 1316 | 359 |
| | 1199 | 1309 | 110 |
| | 1146 | 754 | −392 |
| | 440 | 325 | −115 |
| * | 865 | 972 | 107 |
All sizes are in base pairs. The only locus residing in the IR is marked with an asterisk (*).
List of genes present in the plastid genomes of Tetracentron sinense and Trochodendron aralioides.
| Group of genes | | Name of genes |
| --- | --- | --- |
| Protein synthesis and DNA replication | Ribosomal RNAs | |
| | Transfer RNAs | |
| | Ribosomal proteins, small subunit | |
| | Ribosomal proteins, large subunit | |
| | RNA polymerase | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | Cytochrome b6/f | |
| | ATP synthase | |
| | NADH dehydrogenase | |
| | Large subunit of Rubisco | |
| Miscellaneous proteins | Subunit of acetyl-CoA carboxylase | |
| | c-type cytochrome synthesis gene | |
| | Envelope membrane protein | |
| | Protease | |
| | Translational initiation factor | |
| | Maturase | |
| Genes of unknown function | Hypothetical conserved coding frame | |
Genes with introns are marked with asterisks (*).
Comparisons of the protein-coding genes of Tetracentron and Trochodendron.
| Gene | Length in Tetracentron (bp) | Length in Trochodendron (bp) | Number of nucleotide differences | Proportion of nucleotide differences | Number of indel differences |
| --- | --- | --- | --- | --- | --- |
| | 102 | 102 | 0 | 0 | 0 |
| | 111 | 111 | 0 | 0 | 0 |
| | 129 | 129 | 0 | 0 | 0 |
| | 252 | 252 | 0 | 0 | 0 |
| | 120 | 120 | 0 | 0 | 0 |
| | 123 | 123 | 0 | 0 | 0 |
| | 117 | 117 | 0 | 0 | 0 |
| | 108 | 108 | 0 | 0 | 0 |
| | 288 | 288 | 0 | 0 | 0 |
| | 279 | 279 | 0 | 0 | 0 |
| | 468 | 468 | 0 | 0 | 0 |
| | 399 | 399 | 0 | 0 | 0 |
| | 825 | 825 | 1 | 0.00121 | 0 |
| | 657 | 657 | 1 | 0.00152 | 0 |
| | 504 | 504 | 1 | 0.00198 | 0 |
| | 501 | 501 | 1 | 0.00249 | 0 |
| | 369 | 369 | 1 | 0.00271 | 0 |
| | 6879 | 6897 | 19 | 0.00276 | 1 |
| | 1533 | 1533 | 5 | 0.00326 | 0 |
| | 507 | 507 | 2 | 0.00394 | 0 |
| | 201 | 201 | 1 | 0.00498 | 0 |
| | 189 | 189 | 1 | 0.00529 | 0 |
| | 2253 | 2253 | 12 | 0.00533 | 0 |
| | 186 | 186 | 1 | 0.00538 | 0 |
| | 372 | 372 | 2 | 0.00538 | 0 |
| | 1062 | 1062 | 6 | 0.00565 | 0 |
| | 354 | 354 | 2 | 0.00565 | 0 |
| | 2049 | 2070 | 12 | 0.00586 | 1 |
| | 1524 | 1524 | 9 | 0.00591 | 0 |
| | 486 | 480 | 3 | 0.00625 | 1 |
| | 477 | 477 | 3 | 0.00629 | 0 |
| | 1062 | 1062 | 7 | 0.00659 | 0 |
| | 963 | 963 | 7 | 0.00727 | 0 |
| | 3213 | 3213 | 24 | 0.00747 | 0 |
| | 132 | 132 | 1 | 0.00758 | 0 |
| | 2205 | 2205 | 17 | 0.00771 | 0 |
| | 1422 | 1422 | 11 | 0.00774 | 0 |
| | 246 | 246 | 2 | 0.00813 | 0 |
| | 246 | 246 | 2 | 0.00813 | 0 |
| | 1095 | 1095 | 9 | 0.00822 | 0 |
| | 606 | 606 | 5 | 0.00825 | 0 |
| | 234 | 234 | 2 | 0.00855 | 0 |
| | 1497 | 1497 | 13 | 0.00868 | 0 |
| | 690 | 690 | 6 | 0.0087 | 0 |
| | 114 | 114 | 1 | 0.00877 | 0 |
| | 111 | 111 | 1 | 0.00901 | 0 |
| | 1428 | 1428 | 13 | 0.0091 | 0 |
| | 648 | 648 | 6 | 0.00926 | 0 |
| | 744 | 744 | 7 | 0.00941 | 0 |
| | 609 | 609 | 6 | 0.00985 | 0 |
| | 303 | 303 | 3 | 0.0099 | 0 |
| | 402 | 402 | 4 | 0.00995 | 0 |
| | 966 | 966 | 10 | 0.01035 | 0 |
| | 1527 | 1527 | 16 | 0.01048 | 0 |
| | 1491 | 1491 | 16 | 0.01073 | 0 |
| | 822 | 858 | 9 | 0.01095 | 1 |
| | 363 | 363 | 4 | 0.01102 | 0 |
| | 90 | 90 | 1 | 0.01111 | 0 |
| | 531 | 531 | 6 | 0.0113 | 0 |
| | 4137 | 4146 | 50 | 0.01209 | 1 |
| | 1503 | 1503 | 18 | 0.01264 | 0 |
| | 711 | 711 | 9 | 0.01266 | 0 |
| | 222 | 222 | 3 | 0.01351 | 0 |
| | 543 | 543 | 8 | 0.01473 | 0 |
| | 555 | 555 | 9 | 0.01622 | 0 |
| | 1536 | 1536 | 25 | 0.01628 | 0 |
| | 306 | 303 | 5 | 0.0165 | 1 |
| | 303 | 303 | 5 | 0.0165 | 0 |
| | 1182 | 1182 | 20 | 0.01692 | 0 |
| | 555 | 555 | 10 | 0.01805 | 0 |
| | 273 | 273 | 5 | 0.01832 | 0 |
| | 105 | 105 | 2 | 0.01905 | 0 |
| | 417 | 417 | 9 | 0.02158 | 0 |
| | 1014 | 1014 | 24 | 0.02367 | 0 |
| | 162 | 162 | 4 | 0.02469 | 0 |
| | 227 | 227 | 6 | 0.02622 | 0 |
| | 2223 | 2223 | 61 | 0.02744 | 0 |
| | 5688 | 5691 | 195 | 0.0345 | 6 |
| | 114 | 114 | 5 | 0.04386 | 0 |
Genes are ranked from lowest to highest proportion of nucleotide differences.
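The proportion column can be approximated as the number of nucleotide differences divided by a gene length. The sketch below assumes the denominator is the shorter of the two gene lengths; this is an assumption, not a method stated in the source. It reproduces many of the tabulated values, including rows where the two lengths differ, but not all of them (for example the row with six indels), so the authors' denominator may have been an aligned, gap-free length instead. The function name is hypothetical.

```python
def proportion_of_differences(len_a: int, len_b: int, n_diff: int) -> float:
    """Differences per site, using the shorter gene length as the denominator.

    Assumption: the shorter length stands in for the number of comparable
    (gap-free) sites; the source does not state how the column was computed.
    """
    shorter = min(len_a, len_b)
    if shorter <= 0:
        raise ValueError("gene lengths must be positive")
    return round(n_diff / shorter, 5)


# Rows taken from the table above: (length 1, length 2, differences).
rows = [(825, 825, 1), (486, 480, 3), (2049, 2070, 12), (822, 858, 9)]
for len_a, len_b, n_diff in rows:
    print(len_a, len_b, n_diff, proportion_of_differences(len_a, len_b, n_diff))
```

Rounding to five decimal places mirrors the precision used in the table.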
Figure 4. Amount of sequence divergence between the protein-coding genes of Tetracentron and Trochodendron.
Figure 5. Sequence identity plot between Trochodendron and Tetracentron.
Exon and intron lengths (bp) in plastid genes containing introns in Tetracentron sinense and Trochodendron aralioides, respectively.
| Gene | Exon 1 | Intron 1 | Exon 2 | Intron 2 | Exon 3 |
| --- | --- | --- | --- | --- | --- |
| | 37/37 | 35/35 | | | |
| | 24/24 | 698/698 | 48/48 | | |
| | 35/35 | 444/442 | 50/50 | | |
| | 39/39 | 583/585 | 37/37 | | |
| | 42/42 | 954/954 | 35/35 | | |
| | 38/38 | 794/794 | 35/35 | | |
| | 6/6 | 793/797 | 642/642 | | |
| | 8/8 | 704/709 | 496/496 | | |
| | 145/145 | 727/724 | 410/410 | | |
| | 553/553 | 1106/1084 | 542/542 | | |
| | 777/777 | 700/700 | 756/756 | | |
| | 391/391 | 671/674 | 434/434 | | |
| | 9/9 | 865/972 | 402/402 | | |
| | 114/114 | | 232/232 | 538/536 | 26/26 |
| | 432/432 | 728/714 | 1617/1638 | | |
| | 71/71 | 682/710 | 292/292 | 659/650 | 246/246 |
| | 124/124 | 734/725 | 230/230 | 731/758 | 153/153 |
| | 40/40 | 831/844 | 227/227 | | |
The rps12 gene is trans-spliced, and hence the length of intron 1 is unknown.
A/T content (%) of different regions in Tetracentron and Trochodendron.
| Region | Tetracentron | Trochodendron |
| --- | --- | --- |
| overall | 61.86 | 61.98 |
| LSC | 63.50 | 63.74 |
| IR | 57.63 | 57.83 |
| SSC | 67.84 | 67.48 |
| Protein-coding regions | 61.58 | 61.53 |
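A/T content is the percentage of A and T bases among all unambiguous bases, so the overall values here should be the complement of the overall G/C content reported in the genome characteristics table (38.1% and 38.0%). The sketch below shows one way to compute A/T content from a sequence string and to run that cross-check. The function name and the dictionaries are hypothetical, and the assignment of the first value column to Tetracentron is an assumption based on the caption order.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (seq.count("A") + seq.count("T")) / counted


# Overall values copied from the tables in this document.
# Assumption: first value column = Tetracentron, second = Trochodendron.
reported_at = {"Tetracentron": 61.86, "Trochodendron": 61.98}
reported_gc = {"Tetracentron": 38.1, "Trochodendron": 38.0}

for name in reported_at:
    # G/C is given to one decimal place, so allow a rounding margin of 0.05.
    gap = abs((100.0 - reported_gc[name]) - reported_at[name])
    print(name, "consistent" if gap <= 0.05 + 1e-9 else "inconsistent")
```

For both genomes the reported overall A/T and G/C values agree within rounding.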
Distribution of SSR loci in the plastid genomes of Tetracentron and Trochodendron.
| Base | Length | Position in plastid genome |
| --- | --- | --- |
| SSR loci in Tetracentron sinense | | |
| A | 10 | 2085–2094 7164–7173 9478–9487 17266–17275 39220–39229 47812–47821 58880–58889 69930–69939 124816–124825 136417–136426 141648–141657 |
| | 11 | 9611–9621 46892–46902 47147–47157 50813–50823 75797–75807 80873–80883 82302–82312 133069–133079 160432–160442 |
| | 12 | 217–228 49977–49988 50332–50343 118899–118910 162450–162461 163452–163463 163940–163951 |
| | 14 | 65157–65170 |
| | 15 | 38842–38856 |
| | 17 | 39891–39907 |
| | 18 | 74838–74855 |
| | 22 | 72886–72907 |
| T | 10 | 5266–5275 6724–6733 9153–9162 19332–19341 54468–54477 63461–63470 67706–67715 107277–107286 112508–112517 117373–117382 118300–118309 121204–121213 126456–126465 130614–130623 |
| | 11 | 7004–7014 7679–7689 13144–13154 31361–31371 37925–37935 47779–47789 67810–67820 76013–76023 88492–88502 |
| | 12 | 55307–55318 71723–71734 84983–84994 85471–85482 86473–86484 118884–118895 119027–119038 |
| | 13 | 13902–13914 |
| | 14 | 72926–72939 |
| AT | 10 | 1734–1743 20833–20842 50404–50413 63181–63190 |
| | 12 | 4862–4873 12996–13007 114822–114833 |
| | 14 | 60686–60699 |
| TA | 10 | 34083–34092 34111–34120 114741–114750 |
| | 14 | 49132–49145 |
| TAAA | 20 | 46875–46894 |
| SSR loci in Trochodendron aralioides | | |
| A | 10 | 118854–118863 126258–126267 142993–143002 163821–163830 18142–18151 40389–40398 41060–41069 51091–51100 6136–6145 68969–68978 76681–76690 86529–86538 |
| | 11 | 134406–134416 16427–16437 30306–30316 39963–39973 51490–51500 70911–70921 81823–81833 9789–9799 |
| | 12 | 10420–10431 48058–48069 48322–48333 |
| | 13 | 164932–164944 |
| | 16 | 161805–161820 73777–73792 75726–75741 |
| | 15 | 46189–46203 |
| | 17 | 214–230 83299–83315 9304–9320 |
| T | 10 | 108427–108436 120424–120433 121028–121037 122665–122674 131951–131960 164891–164900 20189–20198 40375–40387 48933–48942 53154–53163 53339–53348 5700–5709 6030–6039 68604–68613 72934–72943 83282–83291 87599–87608 |
| | 11 | 127885–127895 14709–14719 55604–55614 57547–57557 |
| | 12 | 50271–50282 |
| | 13 | 73814–73826 86485–86497 |
| | 14 | 76896–76909 |
| | 15 | 48889–48903 |
| | 16 | 89609–89624 |
| AT | 10 | 1724–1733 51556–51565 64459–64468 |
| | 12 | 4921–4932 4943–4954 4984–4995 4998–5009 5044–5055 5085–5096 5099–5110 5145–5156 5186–5197 5200–5211 |
| | 18 | 73275–73292 |
| TA | 10 | 1738–1747 21689–21698 |
| TAA | 18 | 5016–5033 5218–5235 |
| C | 10 | 55999–56008 |
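Loci of this kind can be located by scanning a sequence for tandem repeats of a short motif. The sketch below is a generic regular-expression scan, not the software or thresholds used in the study. The function name `find_ssrs` and the default minimum repeat counts (10 for mononucleotide, 5 for dinucleotide, 4 for trinucleotide, and 3 for tetranucleotide motifs) are assumptions chosen for illustration. Positions are reported 1-based and inclusive, matching the start–end style of the table.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3}


def find_ssrs(seq, min_repeats=None):
    """Return (motif, length_bp, start, end) tuples, 1-based inclusive."""
    min_repeats = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n_min in sorted(min_repeats.items()):
        # One motif copy, followed by at least n_min - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # since those runs are already reported under the shorter motif.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((motif, match.end() - match.start(),
                         match.start() + 1, match.end()))
    return hits


example = "GC" + "A" * 12 + "GCGC" + "AT" * 5 + "C"
for hit in find_ssrs(example):
    print(hit)
```

Because the scan reports the motif in the phase in which it is first encountered, the same dinucleotide repeat can appear as either AT or TA, as in the table.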
Figure 6. A maximum likelihood tree determined by GARLI (−ln L = −1095466.026) for the 83-gene, 88-taxon data set.
Numbers associated with branches are ML bootstrap support values. Error bars around nodes correspond to the 95% highest posterior density intervals of divergence times estimated with the program BEAST using 6 fossil calibrations. Eo = Eocene, Mi = Miocene, Ol = Oligocene, Pa = Paleocene, Pl = Pliocene.
Numbers of genes (including genes that span IR/SC junctions) in the IR regions of early-diverging eudicots.
| Basal eudicot lineages | Species | Genes in IR region | cp genome size (bp) |
| --- | --- | --- | --- |
| Ranunculales | | 20 | 155129 |
| | | 19 | 159924 |
| | | 19 | 156599 |
| Proteales | | 18 | 163206 |
| | | 19 | 161791 |
| Sabiales | | 18 | 160357 |
| Buxales | | 18 | 159010 |
| Trochodendrales | Tetracentron sinense | 24 | 164467 |
| | Trochodendron aralioides | 24 | 165945 |