J Chris Blazier, Tracey A Ruhlman, Mao-Lun Weng, Sumaiyah K Rehman, Jamal S M Sabir, Robert K Jansen.
Abstract
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as little as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.
Year: 2016 PMID: 27087667 PMCID: PMC4834550 DOI: 10.1038/srep24595
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of conserved domain database (CDD) search results for Annonaceae and Passifloraceae data sets.
| Annonaceae Comparison | N-terminal | dimer | β | β′ | nt identity (%) | aa identity (%) | ORF length (bp) |
|---|---|---|---|---|---|---|---|
| | Y | Y | Y | Y | 57.0 | 40.7 | 1,035 |
| | Y | Y | Y | Y | 74.5 | 63.8 | 1,020 |
| | Y | Y | Y | Y | 89.0 | 86.6 | 1,020 |
| | Y | Y | Y | Y | 55.9 | 38.5 | 1,143 |
| | Y | Y | Y | Y | 100.0 | 100.0 | 1,002 |
| | Y | Y | Y | Y | 90.2 | 85.4 | 1,017 |
| | Y | Y | Y | Y | 91.3 | 89.8 | 1,014 |
| | Y | Y | Y | Y | 91.8 | 89.4 | 1,014 |
| | Y | Y | Y | Y | 85.7 | 80.2 | 1,017 |
| | Y | Y | Y | Y | 92.0 | 88.9 | 1,023 |
| | Y | Y | Y | Y | 91.8 | 87.9 | 1,017 |
| | Y | Y | Y | Y | 86.9 | 80.8 | 1,011 |
| | Y | Y | Y | Y | 91.9 | 87.8 | 1,029 |
| | Y | Y | Y | Y | 89.8 | 85.3 | 1,017 |
| | Y | Y | Y | Y | 53.6 | 37.4 | 1,071 |
| | Y | Y | Y | Y | 93.4 | 88.8 | 1,005 |
| | Y | Y | Y | Y | 93.8 | 90.6 | 1,017 |
| | Y | Y | Y | Y | 93.3 | 89.1 | 1,017 |
| | Y | Y | Y | Y | 100.0 | 100.0 | 1,017 |
| | Y | Y | Y | Y | 91.9 | 88.0 | 996 |
| | Y | Y | Y | Y | 80.7 | 71.0 | 891 |
Predictions of the PEP α subunit N-terminus, homodimer interface, beta and beta prime interfaces are indicated (Y = Yes, N = No). The pairwise identity of each sequence with the outgroups Populus or Chloranthus is given for nucleotide (nt) and amino acid (aa) alignments. Generic names in the Annonaceae comparison are in bold; other genera represent related families of magnoliids or the outgroup Chloranthus. P. = Passiflora; species in bold in the Passifloraceae comparison are members of the genus Passiflora; other genera are related families of rosids.
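As a rough illustration of how a pairwise identity value like those tabulated above could be computed from two aligned sequences, here is a minimal Python sketch. The function name `pairwise_identity` and the decision to skip gapped columns are assumptions made for illustration; the source does not state how gaps were treated or which software produced the reported values.

```python
def pairwise_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences.

    Assumption (not from the source): alignment columns where either
    sequence has a gap are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

The same function works for nucleotide or amino acid alignments, since it only compares characters column by column.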
Summary of conserved domain database (CDD) search results for Pelargonium data set.
| Outgroup and Geraniales | N-terminal | dimer | β | β′ | nt identity (%) | aa identity (%) | ORF length (bp) |
|---|---|---|---|---|---|---|---|
| | | | | | 100.0 | 100.0 | 1,014 |
| | | | | | 92.2 | 85.8 | 1,020 |
| | | | | | 91.3 | 84.4 | 1,020 |
| | | | | | 84.0 | 73.2 | 1,014 |
| | | | | | 77.8 | 65.4 | 1,089 |
| | | | | | 46.0 | 31.6 | 885 |
| | | | | | 46.0 | 31.6 | 885 |
| | | | | | 46.2 | 31.9 | 885 |
| | | | | | 46.0 | 31.6 | 885 |
| | | | | | 46.3 | 31.6 | 885 |
| | | | | | 45.1 | 32.0 | 858 |
| | | | | | 44.9 | 31.4 | 855 |
| | | | | | 46.2 | 32.2 | 885 |
| | | | | | 46.2 | 31.6 | 885 |
| | | | | | 44.0 | 34.1 | 828 |
| | | | | | 48.8 | 33.8 | 945 |
| | | | | | 46.0 | 33.0 | 879 |
| | | | | | 45.1 | 31.0 | 864 |
| | | | | | 46.2 | 32.7 | 885 |
| | | | | | 34.1 | 24.4 | 750 |
| | | | | | 32.6 | 23.5 | 708 |
| | | | | | 35.1 | 25.2 | 714 |
| | | | | | 46.0 | 29.8 | 912 |
| | | | | | 46.2 | 30.1 | 912 |
| | | | | | 32.3 | 25.0 | 1,701 |
| | | | | | 31.6 | 25.1 | 1,737 |
| | | | | | 30.1 | 19.9 | 1,773 |
| | | | | | 35.0 | 23.0 | 1,788 |
| | | | | | 35.3 | 21.1 | 1,788 |
| | | | | | 36.0 | 20.5 | 1,788 |
| | | | | | 35.4 | 21.1 | 1,788 |
| | | | | | 35.5 | 21.8 | 1,782 |
| | | | | | 35.5 | 21.4 | 1,866 |
| | | | | | 33.7 | 19.3 | 1,788 |
| | | | | | 34.8 | 33.5 | 702 |
| | | | | | 31.8 | 15.9 | 1,737 |
| | | | | | 30.7 | 14.8 | 1,794 |
| | | | | | 33.6 | 20.6 | 1,560 |
| | | | | | 32.0 | 15.9 | 1,737 |
| | | | | | 30.7 | 15.3 | 1,794 |
| | | | | | 33.0 | 20.6 | 1,554 |
| | | | | | 31.9 | 15.7 | 1,737 |
| | | | | | 30.3 | 14.9 | 1,815 |
| | | | | | 33.4 | 21.1 | 1,566 |
| | | | | | 31.9 | 15.9 | 1,737 |
| | | | | | 30.6 | 14.9 | 1,794 |
Predictions of the PEP α subunit N-terminus, homodimer interface, beta and beta prime interfaces are indicated (Y = Yes, N = No). The pairwise identity of each sequence with outgroup Eucalyptus is given for nucleotide (nt) and amino acid (aa) alignments.
Figure 1. Alignment of PEP promoter regions.
(A) Alignment of promoter region for rbcL in three species with functional PEP (Nicotiana tabacum, Arabidopsis thaliana, Pelargonium x hortorum) and one lacking PEP (Cuscuta obtusiflora). (B) Alignment of promoter region for psbA in three species with functional PEP (N. tabacum, A. thaliana, P. x hortorum) and one lacking PEP (C. obtusiflora). The conserved −10 and −35 elements are indicated by block arrows and the transcription start site is indicated by a red box (+1).
Figure 2. Representative maximum likelihood trees and dN/dS values for the nine taxa of Annonaceae.
(A) Likelihood scores for the matK and rpoA trees were −5638.2661 lnL and −5506.0125 lnL, respectively. Branches in bold are members of Annonaceae. Bootstrap values greater than 50 are shown at the nodes. Scale bar indicates non-synonymous substitutions per codon. (B) Histogram of dN/dS values for seven genes for the Annonaceae. For each gene, dN/dS values (y axis) are given for all branches of interest: the branch leading to the family, the internal branch to Annona/Asimina, and the terminal branches to Annona, Asimina, and Cananga. Only one ratio was marginally >1, the terminal branch to Asimina for matK (dN/dS = 1.0069).
Figure 3. Representative maximum likelihood trees and dN/dS values for 12 taxa of Passifloraceae.
(A) Likelihood scores for the matK and rpoA trees were −7243.3426 lnL and −4810.9045 lnL, respectively. Branches in bold are members of Passifloraceae. Bootstrap values greater than 50 are shown at the nodes. Scale bar indicates non-synonymous substitutions per codon. (B) Histogram of dN/dS ratios for seven genes for the Passifloraceae. For each gene, dN/dS values (y axis) are given for all branches of interest: the branch leading to the family as well as all internal and terminal branches. The primary branch of interest is the terminal branch to P. biflora, the only species with a divergent rpoA gene. The terminal branch to P. quadrangularis for rpoC1 has a dN/dS value >1, but this is likely an artifact, as the branch length is extremely short. The lack of a bar for rbcL is due to a dS value of 0.
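The missing bar for a dS of 0 and the "marginally >1" ratio mentioned in these captions both follow from how the ratio is defined. The Python sketch below shows the arithmetic only; the function names and the tolerance of 0.05 are hypothetical choices for illustration, not the authors' method, and a real analysis would rely on a likelihood-based test rather than a fixed cutoff.

```python
from typing import Optional


def dn_ds_ratio(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is 0 and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def describe_ratio(omega: Optional[float], tol: float = 0.05) -> str:
    """Label a ratio relative to 1.

    `tol` is an illustrative assumption, not a statistical threshold.
    """
    if omega is None:
        return "undefined (dS = 0)"
    if omega < 1 - tol:
        return "below 1 (consistent with purifying selection)"
    if omega > 1 + tol:
        return "above 1"
    return "close to 1"
```

For example, `describe_ratio(dn_ds_ratio(0.01, 0.0))` reports the undefined case, which is why no bar can be drawn for such a branch.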
Figure 4. Maximum likelihood tree generated for all 46 rpoA ORFs from 26 Pelargonium species with likelihood score −21428.281249 lnL.
Species in clade C2 contain two (P. spinosum and P. endlicherianum), three (four species from section Ciconium) or six (P. transvaalense) rpoA paralogs. Bootstrap values greater than 50 are shown at the nodes; values of 100 are indicated by asterisks. Scale bar indicates non-synonymous substitutions per codon. The constraint tree (inset) does not contain clade C2 taxa as these species all contain multiple rpoA sequences.
Taxon sampling by data set.
| Magnoliids/Chloranthales | Accession numbers |
|---|---|
| | KU563738 |
| | KU645794, KU645799, KU645804, KU645810, KU645815, KU645820, KU645825 |
| | NC_004993 |
| | KU645791, KU645796, KU645801, KU645806, KU645812, KU645817, KU645822 |
| | NC_009598 |
| | NC_008456 |
| | NC_008326 |
| | NC_015892 |
| | NC_008457 |
| | NC_015308 |
| | NC_012224 |
| | KU645792, KU645797, KU645802, KU645808, KU645813, KU645818, KU645823 |
| | NC_010433 |
| | EU002528, KF224983, HM850223, EU002248, GQ998560, GQ998561, GQ998562 |
| | EU017067, KU645807, EU017069, EU017092, EU017096, EU017121, EU017122 |
| | JX661956, JX662765, JX664062, JX662034, JX663490, JX664953, JX662679 |
| | KU645790, KU645795, KU645800, KU645805, KU645811, KU645816, KU645821 |
| | KU645791, KU645796, KU645801, KU645806, KU645812, KU645817, KU645822 |
| | NC_009143 |
| | NC_016736 |
| | JX664965, JX664074, JX663502, JX662777, JX662690, JX662046, JX661965 |
| | NC_008115 |
| | NC_021101 |
| | NC_023256 |
| | NC_007957 |
| | NC_023260 |
| | KM527888 |
| | KM527887 |
| | KM527896 |
| | KM527897 |
| | NC_023261 |
| | KM527891 |
| | KM527893 |
| | KM527894 |
| | KU535486-KU535492 |
| | KM459517 |
| | KM459516 |
| | KM527892 |
| | KU535493-KU535499 |
| | KU535500-KU535506 |
| | KM527889 |
| | KM527898 |
| | KM527895 |
| | KM527899 |
| | KU535507-KU535513 |
| | KU535514-KU535522 |
| | KU535523-KU535530 |
| | KM527900 |
| | KU535531-KU535539 |
| | KU535540-KU535548 |
| | KU535549-KU535557 |
| | NC_008454 |
Single accession numbers represent complete plastomes and seven accession numbers are for taxa with individual sequences for each gene.
Figure 5. Histogram of dN/dS ratios for seven genes for Geraniaceae.
In addition to the MAFFT results presented here, three other alignment algorithms were used (see Table S2). For each gene, dN/dS values are given for all branches of interest: the branches leading to the family (Geraniaceae), to Pelargonium, to clades A/B, to clade A, to clade B, and to clade C1.
Revised naming system and basic statistics for P. x hortorum and other sect. Ciconium rpoA ORFs.
| Old ORF name | New ORF name | Length in bp/aa | | | | Pairwise identity (%) | Identical sites (%) |
|---|---|---|---|---|---|---|---|
| ORF574 | ORF597 | 1794/597 | 1815/604 | 1794/597 | 1794/597 | 99.40 | 98.80 |
| ORF365 | ORF578 | 1737/578 | 1737/578 | 1737/578 | 1737/578 | 99.50 | 99 |
| ORFs332+221 | ORF521 | 1566/521 | 1554/517 | 702/233* | 1560/519 | 92.70 | 87 |
*The P. alchemilloides ORF521 homolog ends after 702 bp but is otherwise in frame through the conserved stop codon after 1470 bp/490 aa.
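The bp/aa pairs in this table are consistent with each ORF length including its stop codon, so that the protein length is one less than the codon count (for example, 1794 bp gives 598 codons and 597 aa). A minimal Python check of that arithmetic follows; the function name `orf_protein_length` and its `includes_stop` flag are hypothetical, introduced only for this illustration.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Amino acid count encoded by an ORF of `orf_bp` nucleotides.

    Assumes the ORF is in frame (length divisible by 3). When
    `includes_stop` is True, the final codon is a stop codon and does
    not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons
```

The footnote's 1470 bp/490 aa figure corresponds to `includes_stop=False`, since it describes the in-frame stretch that precedes the conserved stop codon.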
Gene conversion events detected by ORGCONV.
| Converted Sequence | Donor | Start | End | P-value (L/N) | P-value (L-N) |
|---|---|---|---|---|---|
| P_alchemilloides_ORF597 | P_alchemilloides_ORF521 | 130 | 713 | 1.13E-07 | 5.14E-06 |
| P_quinquelobatum_ORF597 | P_quinquelobatum_ORF521 | 119 | 713 | 2.30E-10 | 1.09E-08 |
| P_tongaense_ORF597 | P_tongaense_ORF521 | 124 | 713 | 5.52E-10 | 2.74E-08 |
| Pxhortorum_ORF578 | Pxhortorum_ORF521 | 223 | 300 | 1.31E-03 | 5.20E-03 |
| Pxhortorum_ORF597 | Pxhortorum_ORF521 | 669 | 713 | 1.03E-03 | 1.77E-02 |
The donor and acceptor of each putative gene conversion event are given along with the coordinates of the converted region and the p-value of the conversion event.
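To compare the sizes of the converted regions listed above, the start and end coordinates can be turned into tract lengths. The Python sketch below copies the coordinates from the table; treating them as inclusive, 1-based positions is an assumption, because the source does not specify the coordinate convention used by ORGCONV.

```python
# (converted sequence, donor, start, end) as listed in the table above
EVENTS = [
    ("P_alchemilloides_ORF597", "P_alchemilloides_ORF521", 130, 713),
    ("P_quinquelobatum_ORF597", "P_quinquelobatum_ORF521", 119, 713),
    ("P_tongaense_ORF597", "P_tongaense_ORF521", 124, 713),
    ("Pxhortorum_ORF578", "Pxhortorum_ORF521", 223, 300),
    ("Pxhortorum_ORF597", "Pxhortorum_ORF521", 669, 713),
]


def tract_length(start: int, end: int) -> int:
    """Length of a converted tract, assuming inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def tract_lengths(events):
    """Map each converted sequence name to its tract length."""
    return {name: tract_length(start, end) for name, _donor, start, end in events}
```

Under the inclusive assumption, the three longest tracts all end at position 713 and differ only in where they begin.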
Gene conversion events detected by manual count from an alignment of all 12 ORFs from the four Pelargonium section Ciconium species.