Nadja Korotkova, Lars Nauheimer, Hasmik Ter-Voskanyan, Martin Allgaier, Thomas Borsch.
Abstract
Plastid genomes exhibit different levels of sequence variability, depending on the kind of genomic region. Genes are usually more conserved, while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximally variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species limits in the genus Pyrus (Rosaceae), a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC-trnV, trnR-atpA, ndhF-rpl32, psbM-trnD, and trnQ-rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similarly low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF-rpl32 and trnK-rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. Sequencing whole plastid genomes to find markers for evolutionary analyses is therefore particularly useful when overall genetic distances are low.
Year: 2014 PMID: 25405773 PMCID: PMC4236126 DOI: 10.1371/journal.pone.0112998
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
GenBank accession numbers and references for the plastid genomes used in this study.
| Species | GenBank accession number | Reference |
| Pyrus spinosa | HG737342 | this study |
| Pyrus pyrifolia | NC015996 | Terakami et al. |
| | NC021431 | Yang et al. |
| | NC021430 | Yang et al. |
| | NC010362 | Greiner et al. |
| | EU262887 | Greiner et al. |
| | NC015608 | Besnard et al. |
| | NC015401 | Besnard et al. |
Sequence statistics for the four genome pairs compared.
| Genome pair | p-distance | Aligned length [bp] | Length difference [bp] | SNPs | Indels |
| Pyrus | 0.00145 | 160607 | 227 | 230 | 173 |
| | 0.00294 | 156091 | 30 | 458 | 112 |
| | 0.00122 | 165952 | 1690 | 199 | 173 |
| | 0.0008 | 155833 | 79 | 124 | 62 |
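The statistics in this table can be derived from a pairwise alignment. The following is a minimal sketch, not the authors' pipeline: it assumes that the p-distance is the number of substitutions divided by the number of ungapped, unambiguous alignment columns, and that each uninterrupted run of gaps in one sequence counts as one indel. Neither definition is stated in this section; the function name `pairwise_stats` is hypothetical. As a rough plausibility check, 230 SNPs over 160607 aligned positions gives about 0.00143, close to the reported 0.00145, which would fit a denominator that excludes gapped positions.

```python
def pairwise_stats(seq_a: str, seq_b: str) -> dict:
    """Count SNPs and indel events in a pairwise alignment and derive the p-distance.

    Assumptions (not taken from the source): gapped or ambiguous columns are
    excluded from the p-distance denominator, and a contiguous run of gaps
    in one sequence is counted as a single indel event.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    valid = set("ACGT")
    snps = 0
    compared = 0
    indels = 0
    prev_gap = None  # which sequence carried a gap in the previous column

    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # gap-only column carries no information for this pair
        gap = "a" if a == "-" else ("b" if b == "-" else None)
        if gap is not None:
            if gap != prev_gap:
                indels += 1  # a new gap run starts here
        elif a in valid and b in valid:
            compared += 1
            if a != b:
                snps += 1
        prev_gap = gap

    p_distance = snps / compared if compared else 0.0
    return {
        "snps": snps,
        "indels": indels,
        "compared": compared,
        "p_distance": p_distance,
    }


if __name__ == "__main__":
    # Toy alignment, not real plastid data.
    print(pairwise_stats("ACGT--ACGTA", "ACCTGGAC-TA"))
```

Inversions, which the abstract also counts among potentially informative characters, are not detected by this column-wise scan and would need separate handling.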
Figure 1. Circular representation of the plastid genome pair in Pyrus.
Shown are consensus sequences of compared species pairs of Pyrus spinosa and P. pyrifolia with their differing p-distances, numbers of SNPs and indels across the consensus. Radial grey highlights show the regions in focus of study with their names. Circular graphs from outside to inside: outermost circle with ticks for every 1,000 bp (small) and 10,000 bp (big) indicates part of genome, single copy regions in light grey and inverted repeats in dark grey; bands show locations of genes (blue), tRNAs (yellow) and rRNAs (red); the three outermost histograms display p-distances (blue), number of SNPs (green) and indels (orange) per spacer region; innermost graph shows number of SNPs (green histogram), indels (orange histogram), and AT content relative to the whole consensus (black line graph) of 500 bp long parts of the whole consensus.
Figure 2. Circular representation of the plastid genome pair in Cymbidium.
Shown are consensus sequences of compared species pairs of Cymbidium tortisepalum and C. sinense with their differing p-distances, numbers of SNPs and indels across the consensus. Radial grey highlights show the regions in focus of study with their names. Circular graphs from outside to inside: outermost circle with ticks for every 1,000 bp (small) and 10,000 bp (big) indicates part of genome, single copy regions in light grey and inverted repeats in dark grey; bands show locations of genes (blue), tRNAs (yellow) and rRNAs (red); the three outermost histograms display p-distances (blue), number of SNPs (green) and indels (orange) per spacer region; innermost graph shows number of SNPs (green histogram), indels (orange histogram), and AT content relative to the whole consensus (black line graph) of 500 bp long parts of the whole consensus.
Ranking and comparison of p-distances and differences in the four plastid genome pairs.
| Rank | Region | Aligned length [bp] | PICs (SNPs/Indels) | p-distance [×10⁻³] | Region | Aligned length [bp] | PICs (SNPs/Indels) | p-distance [×10⁻³] | Region | Aligned length [bp] | PICs (SNPs/Indels) | p-distance [×10⁻³] | Region | Aligned length [bp] | PICs (SNPs/Indels) | p-distance [×10⁻³] |
| 1 | | 184 | 6 (5/1) | 37.88 | | 366 | 6 (5/1) | 14.04 | | 381 | 15 (9/6) | 36.73 | | 170 | 5 (4/1) | 23.81 |
| 2 | | 149 | 4 (3/1) | 22.06 | | 259 | 4 (3/1) | 13.04 | | 134 | 2 (2/0) | 14.93 | | 243 | 6 (5/1) | 21.1 |
| 3 | | 760 | 24 (12/12) | 20.34 | | 613 | 17 (7/10) | 12.15 | | 332 | 5 (4/1) | 12.16 | | 112 | 3 (2/1) | 18.02 |
| 4 | | 909 | 20 (10/10) | 13.61 | | 629 | 8 (6/2) | 9.93 | | 172 | 3 (2/1) | 11.9 | | 715 | 12 (10/2) | 14.33 |
| 5 | | 1078 | 20 (12/8) | 11.41 | | 345 | 4 (3/1) | 8.7 | | 408 | 5 (4/1) | 9.9 | | 702 | 11 (10/1) | 14.27 |
| 6 | | 459 | 5 (5/0) | 10.89 | | 122 | 1 (1/0) | 8.2 | | 577 | 6 (5/1) | 8.68 | | 354 | 5 (5/0) | 14.12 |
| 7 | | 974 | 9 (8/1) | 8.38 | | 635 | 6 (5/1) | 7.9 | | 355 | 6 (3/3) | 8.52 | | 447 | 8 (6/2) | 13.51 |
| 8 | | 905 | 10 (6/4) | 8.3 | | 129 | 1 (1/0) | 7.75 | | 932 | 7 (7/0) | 8.26 | | 326 | 5 (4/1) | 12.31 |
| 9 | | 268 | 6 (2/4) | 7.81 | | 146 | 1 (1/0) | 6.85 | | 2615 | 23 (12/11) | 5.59 | | 174 | 2 (2/0) | 11.49 |
| 10 | | 403 | 4 (3/1) | 7.59 | | 168 | 1 (1/0) | 5.95 | | 397 | 4 (2/2) | 5.13 | | 186 | 2 (2/0) | 10.75 |
| 11 | | 137 | 1 (1/0) | 7.3 | | 676 | 5 (4/1) | 5.93 | | 976 | 8 (4/4) | 5.06 | | 1322 | 15 (14/1) | 10.6 |
| 12 | | 145 | 2 (1/1) | 6.99 | | 180 | 1 (1/0) | 5.56 | | 216 | 1 (1/0) | 4.63 | | 899 | 12 (9/3) | 10.06 |
| 13 | | 448 | 4 (3/1) | 6.79 | | 185 | 1 (1/0) | 5.41 | | 219 | 2 (1/1) | 4.59 | | 209 | 4 (2/2) | 9.66 |
| 14 | | 1235 | 11 (8/3) | 6.51 | | 230 | 1 (1/0) | 4.35 | | 515 | 4 (2/2) | 4.37 | | 314 | 3 (3/0) | 9.55 |
| 15 | | 156 | 1 (1/0) | 6.41 | | 233 | 1 (1/0) | 4.29 | | 462 | 2 (2/0) | 4.33 | | 105 | 1 (1/0) | 9.52 |
| 16 | | 1003 | 9 (6/3) | 6.01 | | 257 | 1 (1/0) | 3.89 | | 939 | 5 (4/1) | 4.26 | | 107 | 2 (1/1) | 9.43 |
| 17 | | 526 | 3 (3/0) | 5.7 | | 287 | 1 (1/0) | 3.48 | | 249 | 2 (1/1) | 4.03 | | 657 | 7 (6/1) | 9.15 |
| 18 | | 569 | 6 (3/3) | 5.33 | | 1191 | 9 (4/5) | 3.46 | | 926 | 3 (3/0) | 3.24 | | 335 | 5 (3/2) | 9.01 |
| 19 | | 1241 | 8 (6/2) | 4.94 | | 610 | 4 (2/2) | 3.36 | | 669 | 3 (2/1) | 3.01 | | 222 | 2 (2/0) | 9.01 |
| 20 | | 413 | 4 (2/2) | 4.89 | | 300 | 1 (1/0) | 3.33 | | 348 | 1 (1/0) | 2.87 | | 697 | 7 (6/1) | 8.62 |
| 21 | | 207 | 2 (1/1) | 4.83 | | 323 | 1 (1/0) | 3.1 | | 758 | 4 (2/2) | 2.67 | | 1171 | 12 (10/2) | 8.62 |
| 22 | | 218 | 2 (1/1) | 4.67 | | 348 | 1 (1/0) | 2.87 | | 761 | 3 (2/1) | 2.65 | | 236 | 2 (2/0) | 8.47 |
| 23 | | 651 | 3 (3/0) | 4.62 | | 728 | 3 (2/1) | 2.75 | | 788 | 2 (2/0) | 2.54 | | 479 | 4 (4/0) | 8.37 |
| 24 | | 909 | 7 (4/3) | 4.48 | | 367 | 2 (1/1) | 2.74 | | 804 | 4 (2/2) | 2.5 | | 120 | 1 (1/0) | 8.33 |
| 25 | | 225 | 1 (1/0) | 4.48 | | 947 | 2 (2/0) | 2.11 | | 412 | 5 (1/4) | 2.45 | | 121 | 1 (1/0) | 8.26 |
| 26 | | 451 | 3 (2/1) | 4.44 | | 960 | 3 (2/1) | 2.11 | | 984 | 4 (2/2) | 2.05 | | 122 | 1 (1/0) | 8.2 |
| 27 | | 242 | 2 (1/1) | 4.29 | | 1020 | 2 (2/0) | 1.96 | | 1114 | 4 (2/2) | 1.89 | | 741 | 6 (6/0) | 8.1 |
| 28 | | 472 | 4 (2/2) | 4.29 | | 1216 | 3 (2/1) | 1.68 | | 1441 | 6 (2/4) | 1.43 | | 268 | 3 (2/1) | 7.49 |
| 29 | | 1216 | 8 (5/3) | 4.13 | | 638 | 1 (1/0) | 1.57 | | 720 | 1 (1/0) | 1.39 | | 835 | 10 (6/4) | 7.26 |
| 30 | | 514 | 3 (2/1) | 3.9 | | 1461 | 3 (2/1) | 1.38 | | 775 | 4 (1/3) | 1.34 | | 1119 | 12 (8/4) | 7.24 |
The regions are sorted according to p-distances.
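The ranking and the cross-lineage comparison described in the abstract (which regions recur among the top-ranked ones in every genome pair) can be sketched as follows. This is an illustrative sketch only: the class `Region`, the functions `top_regions` and `shared_top_regions`, and all region and genus labels in the example are hypothetical placeholders, not names or data from the study. PICs are taken as SNPs plus indels, as in the table header.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str          # hypothetical label for a spacer or intron
    snps: int
    indels: int
    p_distance: float  # substitutions per compared site

    @property
    def pics(self) -> int:
        """Potentially informative characters: SNPs plus indels."""
        return self.snps + self.indels


def top_regions(regions: list[Region], n: int = 30) -> list[Region]:
    """Return the n regions with the highest p-distance, most variable first."""
    return sorted(regions, key=lambda r: r.p_distance, reverse=True)[:n]


def shared_top_regions(per_genus: dict[str, list[Region]], n: int = 30) -> set[str]:
    """Names of regions that appear among the top n in every genome pair."""
    name_sets = [{r.name for r in top_regions(rs, n)} for rs in per_genus.values()]
    return set.intersection(*name_sets) if name_sets else set()


if __name__ == "__main__":
    # Toy values for illustration; not taken from the table.
    data = {
        "genus_x": [
            Region("spacer_1", 5, 1, 0.030),
            Region("spacer_2", 3, 1, 0.020),
            Region("spacer_3", 1, 0, 0.010),
        ],
        "genus_y": [
            Region("spacer_2", 4, 2, 0.050),
            Region("spacer_4", 2, 0, 0.040),
            Region("spacer_1", 1, 0, 0.001),
        ],
    }
    print([r.name for r in top_regions(data["genus_x"], n=2)])
    print(shared_top_regions(data, n=2))
```

Sorting by p-distance alone favours short regions with few substitutions, so reporting the PIC count next to the rank, as the table does, helps judge how much signal a region actually carries.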
Figure 4. Circular representation of the plastid genome pair in Olea.
Shown are consensus sequences of compared species pairs of Olea europaea and O. woodiana with their differing p-distances, numbers of SNPs and indels across the consensus. Radial grey highlights show the regions in focus of study with their names. Circular graphs from outside to inside: outermost circle with ticks for every 1,000 bp (small) and 10,000 bp (big) indicates part of genome, single copy regions in light grey and inverted repeats in dark grey; bands show locations of genes (blue), tRNAs (yellow) and rRNAs (red); the three outermost histograms display p-distances (blue), number of SNPs (green) and indels (orange) per spacer region; innermost graph shows number of SNPs (green histogram), indels (orange histogram), and AT content relative to the whole consensus (black line graph) of 500 bp long parts of the whole consensus.
Genomic regions proposed for evolutionary analyses in Pyrus and primers for their amplification.
| Region | Amplified fragment | Primer name | Primer sequence | Reference |
| | 900 bp | ndhC–F | | Goodson et al. |
| | | PYRtrnV–150R | | this study |
| | 1000 bp | trnR–F | | this study |
| | | atpA–180R | GGAACRAACGGYTATCTTGATTC | this study |
| | 1350 bp | PYRpsbM–F | | this study |
| | | PYRtrnD–R | | this study |
| | 900 bp | trnQ (UUG) | | Shaw et al. |
| | | rps16x1 | | Shaw et al. |
| | 1300 bp | PYR–rps3F | | this study |
| | | PYR–rpl16R | | this study |
Figure 5. Mutational dynamics in group II introns.
a) Schematic consensus structure of plastid group II introns based on Michel et al. (1989). Roman numerals indicate the six domains. b) Alignment and predicted RNA secondary structure for domain IV of the atpF intron in Cymbidium, Pyrus, Oenothera and Olea. The apparently non-homologous sequence blocks are placed separately in the alignment. There are no substitutions or length mutations in Pyrus and Cymbidium; the structures shown are therefore identical in the two species compared. The secondary structures shown for Oenothera and Olea are consensus structures. Two nucleotide blocks at the 5′ and 3′ ends, indicated by thick blue bars, are conserved across all taxa and homologous in primary sequence and secondary structure. These conserved sequence blocks form the stem of the domain, while variation occurs in the terminal stem-loop part of the domain. c) Alignment and predicted RNA secondary structures of domain IV of the rpl16 intron. For clarity, only the part of the domain with positions variable within genera is shown; “[-]” marks the omitted stem-loop elements. The apparently non-homologous sequence blocks are placed separately in the alignment. Positions where variation occurs within a genus are marked with arrows. See text for more explanation.
Figure 3. Circular representation of the plastid genome pair in Oenothera.
Shown are consensus sequences of compared species pairs of Oenothera parviflora and O. argillicola with their differing p-distances, numbers of SNPs and indels across the consensus. Radial grey highlights show the regions in focus of study with their names. Circular graphs from outside to inside: outermost circle with ticks for every 1,000 bp (small) and 10,000 bp (big) indicates part of genome, single copy regions in light grey and inverted repeats in dark grey; bands show locations of genes (blue), tRNAs (yellow) and rRNAs (red); the three outermost histograms display p-distances (blue), number of SNPs (green) and indels (orange) per spacer region; innermost graph shows number of SNPs (green histogram), indels (orange histogram), and AT content relative to the whole consensus (black line graph) of 500 bp long parts of the whole consensus.
Identification of most variable plastid regions based on pairwise genome comparisons across angiosperms.
| Reference | Taxa studied | Markers found as most variable |
| Daniell et al. | Asterids: | |
| Timme et al. | Asterids: | |
| Shaw et al. | Angiosperms: Asterids: | |
| Doorduin et al. | Asterids: | |
| Gargano et al. | Asterids: | |
| Yang et al. | Monocots: | |
| Dong et al. | Angiosperms: | |
| Ku et al. | Asterids: | |
| Ku et al. | Asterids: | |
| Särkinen & George | Asterids: | |
Figure 6. AT content of indels and areas around substitutions.
Boxplot representation of the AT content in different types of indels (polyN, short sequence repeats (SSR) and other indels) on the left side and in areas with different sizes around all substitutions (SNPs) in the genome on the right side for a) Pyrus spinosa and P. pyrifolia, b) Cymbidium tortisepalum and C. sinense, c) Oenothera parviflora and O. argillicola and d) Olea europaea and O. woodiana. The cross in each boxplot indicates the mean of the distribution; the thick line refers to the median. The dotted line shows the AT content of the whole consensus sequence.
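The AT-content measures used in the figures (per 500 bp part of the consensus relative to the whole consensus, and in areas of different sizes around each substitution) can be approximated as below. This is a sketch under stated assumptions rather than the authors' procedure: windows are non-overlapping, the area around a SNP is a symmetric flank that includes the SNP position itself, and non-ACGT characters are ignored. The function names are hypothetical.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T among unambiguous bases; 0.0 if there are none."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "AT" for b in bases) / len(bases)


def windowed_at_deviation(seq: str, size: int = 500) -> list[float]:
    """AT content of consecutive non-overlapping windows minus the
    AT content of the whole sequence (assumed meaning of 'relative')."""
    overall = at_content(seq)
    return [at_content(seq[i:i + size]) - overall for i in range(0, len(seq), size)]


def flank_at(seq: str, pos: int, flank: int) -> float:
    """AT content of the area spanning `flank` bases on each side of the
    0-based position `pos`, clipped at the sequence ends."""
    start = max(0, pos - flank)
    return at_content(seq[start:pos + flank + 1])


if __name__ == "__main__":
    # Toy sequences, not plastid data.
    print(windowed_at_deviation("AAAAGGGG", size=4))
    print(flank_at("GGATAGG", pos=3, flank=1))
```

Collecting `flank_at` values over all SNP positions for several flank sizes yields the kind of distributions a boxplot can summarise and compare against the overall AT content.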