Mary M Guisinger, Timothy W Chumley, Jennifer V Kuehl, Jeffrey L Boore, Robert K Jansen.
Abstract
Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.
Year: 2010 PMID: 20091301 PMCID: PMC2825539 DOI: 10.1007/s00239-009-9317-3
Source DB: PubMed Journal: J Mol Evol ISSN: 0022-2844 Impact factor: 2.395
Table 1 Comparison of major features of Typha and nine grass plastid genomes
| | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Size (bp) | 161,572 | 136,584 | 135,199 | 136,462 | 135,282 | 134,525 | 141,182 | 140,754 | 134,545 | 140,384 |
| LSC length (bp) | 89,140 | 80,546 | 79,447 | 80,600 | 79,972 | 80,592 | 83,048 | 82,688 | 80,348 | 82,352 |
| SSC length (bp) | 19,652 | 12,740 | 12,668 | 12,704 | 12,428 | 12,335 | 12,544 | 12,502 | 12,791 | 12,536 |
| IR length (bp) | 26,390 | 21,649 | 21,542 | 21,579 | 21,441 | 20,799 | 22,795 | 22,782 | 20,703 | 22,748 |
| Total number of genes<sup>a</sup> | 131 | 128<sup>d</sup> | 128<sup>e</sup> | 128<sup>d</sup> | 128 | 128 | 128 | 128<sup>d</sup> | 128 | 128 |
| Number of genes duplicated in IR<sup>b</sup> | 18 | 18 | 18 | 18 | 18 | 18 | 18 | 18 | 18 | 18 |
| Number of genes with introns | 18 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 |
| % GC content | 33.8 | 37.4 | 37.5 | 37.3 | 37.2 | 37.9 | 37.4 | 37.4 | 37.2 | 37.4 |
| % Coding<sup>c</sup> | 57.1 | 53.6 | 55.5 | 56.7 | 55.1 | 55.6 | 54.0 | 52.1 | 55.5 | 54.6 |
<sup>a</sup> Only includes named genes and ycfs, not orfs; both ycf15 and ycf68 were not included based on comparisons reported in Raubeson et al. (2007)
<sup>b</sup> Does not include rps12, which is split between the LSC and IR
<sup>c</sup> Includes protein-coding genes, tRNAs, and rRNAs
<sup>d</sup> Annotations for Agrostis stolonifera (NC_008591), Hordeum vulgare subsp. vulgare (NC_008590), and Sorghum bicolor (NC_008602) in Saski et al. (2007) incorrectly included two extra tRNAs (trnM-cau between trnG-ucc and trnT-ggu; trnfM-cau between trnR-ucu and rps14). These have been deleted in the calculation of number of genes
<sup>e</sup> Bortiri et al. (2008) reported 136 putatively functional genes; the difference in the number reported here is due to annotation errors in the Brachypodium distachyon (NC_011032) genome sequence. These have been deleted in the calculation of number of genes
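The size rows of the table above can be cross-checked against the usual quadripartite layout of a plastid genome: one large single-copy region, one small single-copy region, and two copies of the inverted repeat. The sketch below is only an illustrative consistency check; the names `table_columns` and `expected_size` are hypothetical, and the tuples are the table's ten genome columns read left to right.

```python
# (total size, LSC, SSC, IR) in bp, one tuple per genome column, left to right.
table_columns = [
    (161572, 89140, 19652, 26390),
    (136584, 80546, 12740, 21649),
    (135199, 79447, 12668, 21542),
    (136462, 80600, 12704, 21579),
    (135282, 79972, 12428, 21441),
    (134525, 80592, 12335, 20799),
    (141182, 83048, 12544, 22795),
    (140754, 82688, 12502, 22782),
    (134545, 80348, 12791, 20703),
    (140384, 82352, 12536, 22748),
]


def expected_size(lsc, ssc, ir):
    """Total length implied by one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# Collect the 1-based column numbers whose reported size disagrees with the sum.
mismatches = [
    col
    for col, (size, lsc, ssc, ir) in enumerate(table_columns, start=1)
    if expected_size(lsc, ssc, ir) != size
]
print(mismatches)
```

An empty list means every reported genome size equals LSC + SSC + 2 × IR.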
Fig. 1 Extent of the inverted repeat (IR) in 10 Poales plastid genomes. Selected genes or portions of genes are indicated by gray boxes above or below the genome. Gene and IR lengths are not to scale (see Table 1 for Poales IR lengths)
Fig. 2 Multipip analyses (Schwartz et al. 2003) showing overall sequence similarity of plastid genomes based on complete genome alignment. Levels of sequence similarity are indicated by black (75–100%), gray (50–75%), and white (<50%). a Comparison of 10 members of Poales, using Typha latifolia as the reference genome. Arrows indicate gene/intron losses and deletions; partial duplication of ycf1 is due to IR expansion. b Comparison of nine Poaceae genomes using Hordeum vulgare as the reference genome. Arrows indicate deletions; the 995 bp deletion is present twice because it is in the IR
Variation in accD, ycf1, and ycf2 in Poales (length in bp/percent divergence relative to Typha)
| Taxon | accD | ycf1 | ycf2 |
|---|---|---|---|
| | 1509 bp | 5508 bp | 6882 bp |
| | 0 bp | 851 bp/24.3% | 1380 bp/15.0% |
| | 0 bp | 851 bp/24.1% | 1412 bp/16.4% |
| | 134 bp/30% | 855 bp/24.0% | 1413 bp/16.3% |
| | 132 bp/32.5% | 845 bp/23.4% | 1314 bp/16.5% |
| | 253 bp/29.2% | 845 bp/24.3% | 698 bp/17.5% |
| | 0 bp | 863 bp/25.0% | 2063 bp/18.3% |
| | 0 bp | 863 bp/25.1% | 2061 bp/18.0% |
| | 0 bp | 837 bp/24.1% | 704 bp/17.6% |
| | 0 bp | 867 bp/25.6% | 2089 bp/19.1% |
Fig. 3 Percent identity plot (Elnitski et al. 2002). a Typha latifolia compared to Hordeum vulgare. Numbers along the x-axis indicate the coordinates for Typha and along the y-axis for Hordeum. INV inversion. b Hordeum vulgare compared to Zea mays. Numbers along the x-axis indicate the coordinates for Hordeum and along the y-axis for Zea
Fig. 4 ML tree of 47 taxa for 73 protein-coding genes (−lnL = 568622.59691). MP analysis was generally congruent, but topological differences are shown in the inset. Bootstrap values are shown at nodes for ML/MP, and only one statistic is reported where the values are the same, except in the eurosid clade, where ML values are shown on the full tree and MP values on the inset. The Poales clade is shaded and genomic changes within Poales are indicated by black bars. Subfamilies sampled are shown (EHR Ehrhartoideae, POO Pooideae, PAN Panicoideae)
Fig. 5 Sample trees from codeml analyses showing rate acceleration (dN or dS) for three plastid genes. a, b Large subunit ribosomal protein L32. c, d Small subunit ribosomal protein S11. e, f Photosystem II protein J. The Poaceae clade is shaded
Branch comparisons of dN/dS, dN, and dS for gene groups
| Gene groups | Branch comparisons | dN/dS | dN | dS |
|---|---|---|---|---|
| All 73 genes | Branch leading to Poaceae vs. all other branches | 0.0984 | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. internal Poaceae | 0.0872 | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. non-Poaceae monocots | 0.0904 | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. other angiosperms | 0.1103 | <0.0001** | <0.0001** |
| | Internal Poaceae vs. non-Poaceae monocots | na | <0.0001** | <0.0001** |
| | Internal Poaceae vs. other angiosperms | 0.6577 | <0.0001** | <0.0001** |
| | Non-Poaceae monocots vs. other angiosperms | 0.7123 | 0.8426 | 0.6613 |
| Photosynthetic apparatus | Branch leading to Poaceae vs. all other branches | 0.0001** | 0.2746 | <0.0001** |
| | Branch leading to Poaceae vs. internal Poaceae | 0.0001** | 0.0053* | <0.0001** |
| | Branch leading to Poaceae vs. non-Poaceae monocots | 0.0001** | 0.6835 | <0.0001** |
| | Branch leading to Poaceae vs. other angiosperms | 0.0001** | 0.4589 | <0.0001** |
| | Internal Poaceae vs. non-Poaceae monocots | na | <0.0001** | <0.0001** |
| | Internal Poaceae vs. other angiosperms | 0.0116 | <0.0001** | <0.0001** |
| | Non-Poaceae monocots vs. other angiosperms | 0.0223 | 0.2702 | 0.9622 |
| Gene expression | Branch leading to Poaceae vs. all other branches | <0.0001** | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. internal Poaceae | <0.0001** | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. non-Poaceae monocots | <0.0001** | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. other angiosperms | <0.0001** | <0.0001** | <0.0001** |
| | Internal Poaceae vs. non-Poaceae monocots | na | <0.0001** | <0.0001** |
| | Internal Poaceae vs. other angiosperms | <0.0001** | <0.0001** | <0.0001** |
| | Non-Poaceae monocots vs. other angiosperms | <0.0001** | 0.1513 | 0.5967 |
| Photosynthetic metabolism ( | Branch leading to Poaceae vs. all other branches | 0.0123 | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. internal Poaceae | 0.0411 | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. non-Poaceae monocots | 0.0427 | <0.0001** | <0.0001** |
| | Branch leading to Poaceae vs. other angiosperms | 0.0065 | <0.0001** | <0.0001** |
| | Internal Poaceae vs. non-Poaceae monocots | na | <0.0001** | <0.0001** |
| | Internal Poaceae vs. other angiosperms | 0.1186 | <0.0001** | <0.0001** |
| | Non-Poaceae monocots vs. other angiosperms | 0.1574 | 0.7511 | 0.8547 |
P-values were generated using Wilcoxon rank sum tests, and asterisks show values that remain significant after correction for multiple comparisons using Holm’s method, i.e., sequential Bonferroni correction (** α = 0.01, * α = 0.05). The value ‘na’ is due to model parameters in PAML analyses
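Holm's sequential Bonferroni correction, named in the note above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function name `holm_reject` is hypothetical, the seven example P-values are the dN entries (second P-value column) of the photosynthetic apparatus rows with `<0.0001` replaced by 0.0001 as an upper bound, and treating those seven tests as one family is an assumption, because the size of the family actually corrected is not stated.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: return True for each rejected hypothesis."""
    m = len(p_values)
    # Indices of the P-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest P-value (k = 0..m-1) is compared with alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, every larger P-value fails as well
    return reject


# Illustrative family of seven tests (assumption: 0.0001 stands in for "<0.0001").
example_p = [0.2746, 0.0053, 0.6835, 0.4589, 0.0001, 0.0001, 0.2702]
print(holm_reject(example_p, alpha=0.05))
print(holm_reject(example_p, alpha=0.01))
```

Under these assumptions the value 0.0053 is rejected at α = 0.05 but not at α = 0.01, which matches the single-asterisk marking it carries in the table.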
Fig. 6 Average dN/dS, dN, and dS values per gene plotted across the length of the plastid genome using the grass gene order. Values for the branch leading to Poaceae (circles), internal Poaceae branches (“x”s), non-Poaceae monocot branches (triangles), and other angiosperm branches (squares) were compared. For values of dN/dS, black squares show both non-Poaceae monocot and other angiosperm branches due to PAML model parameters. Note that the scales are different for dN/dS, dN, and dS plots
The degree of rate acceleration for the ratio of nonsynonymous to synonymous substitutions (dN/dS) on the branch leading to Poaceae relative to other branches in the phylogeny
| Gene name/gene group | Leading to Poaceae | Internal Poaceae | Increase dN/dS | Other monocots | Increase dN/dS | Other angiosperms | Increase dN/dS |
|---|---|---|---|---|---|---|---|
| atp | 0.2678 | 0.1077 | | 0.1077 | | 0.0985 | |
| ccsA | 0.2234 | 0.1945 | | 0.1945 | | 0.2481 | |
| cemA | 1.6600 | 0.2016 | | 0.2016 | | 0.3203 | |
| clpP | 0.6711 | 0.1513 | | 0.1513 | | 0.3187 | |
| matK | 0.3096 | 0.3502 | | 0.3502 | | 0.3820 | |
| ndh | 0.1652 | 0.1483 | | 0.1483 | | 0.1281 | |
| pet | 0.0149 | 0.0790 | | 0.0790 | | 0.0780 | |
| psa | 0.0607 | 0.1137 | | 0.1137 | | 0.1336 | |
| psb | 0.0532 | 0.1025 | | 0.1025 | | 0.0776 | |
| rbcL | 0.0391 | 0.0963 | | 0.0963 | | 0.0889 | |
| rpl | 0.3400 | 0.1655 | | 0.1655 | | 0.2034 | |
| rps | 0.5426 | 0.1766 | | 0.1766 | | 0.1923 | |
| rpo | 0.3207 | 0.1553 | | 0.1553 | | 0.1853 | |
| Average | | | | | | | |
Asterisks show P-values that are significant after correction for multiple comparisons using Holm’s method, i.e., sequential Bonferroni correction (** α = 0.01, * α = 0.05)
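The degree of acceleration in the dN/dS table above can be illustrated as a simple fold change. This sketch assumes that "increase" means the ratio of the value on the branch leading to Poaceae to the value on a comparison branch; that definition, the function name `fold_change`, and the choice of rows are assumptions made for illustration, and the input numbers are copied from the table.

```python
# dN/dS per gene or gene group, copied from the table:
# (leading to Poaceae, other monocots, other angiosperms)
dnds = {
    "atp": (0.2678, 0.1077, 0.0985),
    "cemA": (1.6600, 0.2016, 0.3203),
    "clpP": (0.6711, 0.1513, 0.3187),
    "pet": (0.0149, 0.0790, 0.0780),
    "rps": (0.5426, 0.1766, 0.1923),
}


def fold_change(leading, reference):
    """Ratio of the leading-branch value to a reference-branch value (assumed definition)."""
    return leading / reference


for name, (leading, monocots, angiosperms) in dnds.items():
    vs_monocots = fold_change(leading, monocots)
    vs_angiosperms = fold_change(leading, angiosperms)
    print(f"{name}: {vs_monocots:.2f}x vs. other monocots, "
          f"{vs_angiosperms:.2f}x vs. other angiosperms")
```

A ratio above 1 indicates a higher dN/dS on the branch leading to Poaceae; a ratio below 1, as for the pet genes, indicates the opposite.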
The degree of rate acceleration for nonsynonymous substitutions (dN) on the branch leading to Poaceae relative to other branches in the phylogeny
| Gene name/gene group | Leading to Poaceae | Internal Poaceae | Increase dN | Other monocots | Increase dN | Other angiosperms | Increase dN |
|---|---|---|---|---|---|---|---|
| atp | 0.0424 | 0.0018 | | 0.0073 | | 0.0069 | |
| ccsA | 0.0540 | 0.0049 | | 0.0244 | | 0.0249 | |
| cemA | 0.1715 | 0.0040 | | 0.0141 | | 0.0238 | |
| clpP | 0.2117 | 0.0025 | | 0.0085 | | 0.0197 | |
| matK | 0.0900 | 0.0114 | | 0.0386 | | 0.0378 | |
| ndh | 0.0378 | 0.0030 | | 0.0109 | | 0.1281 | |
| pet | 0.0026 | 0.0016 | | 0.0054 | | 0.0053 | |
| psa | 0.0107 | 0.0019 | | 0.0073 | | 0.0068 | |
| psb | 0.0099 | 0.0022 | | 0.0055 | | 0.0043 | |
| rbcL | 0.0128 | 0.0026 | | 0.0069 | | 0.0064 | |
| rpl | 0.0962 | 0.0040 | | 0.0108 | | 0.0130 | |
| rps | 0.0784 | 0.0032 | | 0.0106 | | 0.0120 | |
| rpo | 0.0656 | 0.0041 | | 0.0116 | | 0.0146 | |
| Average | | | | | | | |
Asterisks show P-values that are significant after correction for multiple comparisons using Holm’s method, i.e., sequential Bonferroni correction (** α = 0.01, * α = 0.05)
The degree of rate acceleration for synonymous substitutions (dS) on the branch leading to Poaceae relative to other branches in the phylogeny
| Gene name/gene group | Leading to Poaceae | Internal Poaceae | Increase dS | Other monocots | Increase dS | Other angiosperms | Increase dS |
|---|---|---|---|---|---|---|---|
| atp | 0.1868 | 0.0179 | | 0.0679 | | 0.0718 | |
| ccsA | 0.2416 | 0.0251 | | 0.1255 | | 0.1003 | |
| cemA | 0.1033 | 0.0199 | | 0.0698 | | 0.0743 | |
| clpP | 0.3155 | 0.0164 | | 0.0559 | | 0.0617 | |
| matK | 0.2907 | 0.0327 | | 0.1101 | | 0.0990 | |
| ndh | 0.2280 | 0.0244 | | 0.0872 | | 0.0858 | |
| pet | 0.1847 | 0.0190 | | 0.0676 | | 0.0653 | |
| psa | 0.2004 | 0.0199 | | 0.0686 | | 0.0649 | |
| psb | 0.1875 | 0.0188 | | 0.0527 | | 0.0571 | |
| rbcL | 0.3270 | 0.0276 | | 0.0716 | | 0.0720 | |
| rpl | 0.2773 | 0.0246 | | 0.0715 | | 0.0820 | |
| rps | 0.1652 | 0.0195 | | 0.0690 | | 0.0693 | |
| rpo | 0.2085 | 0.0250 | | 0.0740 | | 0.0778 | |
| Average | | | | | | | |
Asterisks show P-values that are significant after correction for multiple comparisons using Holm’s method, i.e., sequential Bonferroni correction (** α = 0.01, * α = 0.05)