Antonina Kalkus, Joy Barrett, Theyjasvi Ashok, Brian R Morton.
Abstract
The codon usage of the Angiosperm psbA gene is atypical for flowering plant chloroplast genes but similar to the codon usage observed in highly expressed plastid genes from some other Plantae lineages, particularly Chlorobionta. The pattern of codon bias in these genes is suggestive of selection for a set of translationally optimal codons, but the degree of bias towards these optimal codons is much weaker in the flowering plant psbA gene than in highly expressed plastid genes from lineages such as certain green algal groups. Two scenarios have been proposed to explain these observations. One is that the flowering plant psbA gene is currently under weak selective constraints for translation efficiency; the other is that there are no current selective constraints and we are observing the remnants of an ancestral codon adaptation that is decaying under mutational pressure. We test these two models using simulation studies that incorporate the context-dependent mutational properties of plant chloroplast DNA. We first reconstruct ancestral sequences and then simulate their evolution in the absence of selection on codon usage by using mutation dynamics estimated from intergenic regions. The results show that psbA has a significantly higher level of codon adaptation than expected, while other chloroplast genes are within the range predicted by the simulations. These results suggest that there have been selective constraints on the codon usage of the flowering plant psbA gene during Angiosperm evolution.
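The simulation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `mutation_weights`, its placeholder numbers, the uniform choice of sites, and the toy ancestral sequence are all assumptions made for illustration. The sketch only shows the general idea of proposing substitutions whose probabilities depend on the two flanking bases and accepting them only when the encoded amino acid is unchanged.

```python
import random

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}


def mutation_weights(left, base, right):
    """Hypothetical context-dependent weights for base -> each other base.

    Placeholder numbers only: transitions are favoured, and more strongly
    so when both flanking bases are A or T. Real values would be estimated
    from intergenic regions.
    """
    ts_weight = 3.0 if (left in "AT" and right in "AT") else 1.5
    return {b: (ts_weight if TRANSITION[base] == b else 1.0)
            for b in "ACGT" if b != base}


def translate(seq):
    """Translate an in-frame coding sequence to one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def simulate_synonymous(ancestor, n_substitutions, rng, max_attempts=100_000):
    """Apply n accepted synonymous substitutions to an in-frame sequence."""
    seq = list(ancestor)
    accepted = 0
    for _ in range(max_attempts):
        if accepted == n_substitutions:
            break
        # Simplification: sites are drawn uniformly; only interior sites
        # are used so that both flanking bases exist.
        pos = rng.randrange(1, len(seq) - 1)
        weights = mutation_weights(seq[pos - 1], seq[pos], seq[pos + 1])
        new_base = rng.choices(list(weights), weights=list(weights.values()))[0]
        start, offset = pos - pos % 3, pos % 3
        old_codon = "".join(seq[start:start + 3])
        new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
        if CODON_TABLE[new_codon] == CODON_TABLE[old_codon]:
            seq[pos] = new_base  # synonymous change: accept
            accepted += 1
    if accepted < n_substitutions:
        raise RuntimeError("could not place the requested substitutions")
    return "".join(seq)


# Toy ancestral sequence (not real data); each seed gives one replicate.
toy_ancestor = "ATGGCTCTTGGTCGTTCTCCAACT"
replicates = [simulate_synonymous(toy_ancestor, 5, random.Random(seed))
              for seed in range(3)]
print(replicates)
```

Running many such replicates and computing a codon adaptation statistic on each gives a null distribution against which the value of the extant sequence can be compared.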
Year: 2021 PMID: 34699531 PMCID: PMC8570520 DOI: 10.1371/journal.pcbi.1009535
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Context-dependent properties of the MC2020 model.
| Context | G+C TS:TV | A+T TS:TV | Sub. Rate | GC:AT Rate | Equil. G+C | Equil. C-G | Equil. T-A |
|---|---|---|---|---|---|---|---|
|  | 0.84 | 0.33 | 0.76 | 2.50 | 21.3 | -32.7 | -32.6 |
|  | 2.23 | 0.41 | 0.53 | 2.03 | 26.3 | 3.4 | 3.8 |
|  | 0.99 | 0.32 | 0.77 | 2.20 | 26.4 | -2.8 | -3.5 |
|  | 1.95 | 1.16 | 0.47 | 2.30 | 30.0 | -0.27 | -14.4 |
|  | 4.26 | 1.50 | 0.63 | 1.82 | 34.1 | -38.4 | -23.4 |
|  | 1.72 | 0.71 | 0.78 | 2.01 | 32.1 | 25.3 | -21.2 |
|  | 1.12 | 0.82 | 0.49 | 2.94 | 25.5 | -24.3 | -33.9 |
|  | 3.07 | 2.80 | 0.66 | 1.57 | 40.8 | 40.6 | 16.0 |
|  | 3.06 | 1.24 | 1.01 | 1.08 | 46.4 | -6.1 | 11.0 |
|  | 1.41 | 2.89 | 0.61 | 1.95 | 36.6 | 5.2 | 15.1 |
§ C(N_N) refers to the complement of the matrix. In these cases the two complementary contexts were combined as described in the text.
† TS:TV at sites where the outgroup is a G/C or A/T.
†† Substitutions/site × 100. GC:AT Rate gives the ratio of the rate of substitution from an ancestral G or C to the rate of substitution from an ancestral A or T.
‡ Equilibrium G+C for the stationary vector of the matrix (see text).
‡‡ Equilibrium C-G and T-A skews for the stationary vector of the matrix.
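The equilibrium quantities in the table footnotes can be illustrated with a short sketch that finds the stationary vector of a 4×4 substitution matrix by power iteration and derives G+C content and skews from it. The matrix values below are hypothetical, and the skew definitions used here, (C−G)/(C+G) and (T−A)/(T+A) expressed as percentages, are an assumption; the source defers its exact definitions to the main text.

```python
def stationary_vector(P, tol=1e-12, max_iter=100_000):
    """Power iteration: apply pi <- pi P until pi stops changing."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


def equilibrium_summary(P, order="ACGT"):
    """Return (G+C %, C-G skew %, T-A skew %) at equilibrium.

    Skew definitions are assumed, not taken from the source.
    """
    pi = dict(zip(order, stationary_vector(P)))
    gc = 100 * (pi["G"] + pi["C"])
    cg_skew = 100 * (pi["C"] - pi["G"]) / (pi["C"] + pi["G"])
    ta_skew = 100 * (pi["T"] - pi["A"]) / (pi["T"] + pi["A"])
    return gc, cg_skew, ta_skew


# Hypothetical row-stochastic matrix; rows and columns are ordered A, C, G, T.
# P[i][j] is the per-step probability that base i is replaced by base j.
example_matrix = [
    [0.970, 0.005, 0.020, 0.005],  # from A
    [0.010, 0.940, 0.010, 0.040],  # from C
    [0.040, 0.010, 0.940, 0.010],  # from G
    [0.005, 0.020, 0.005, 0.970],  # from T
]
print(equilibrium_summary(example_matrix))
```

In this example the rate of leaving G or C is twice the rate of leaving A or T, so the equilibrium composition is A+T rich, which is the qualitative pattern the GC:AT Rate and Equil. G+C columns describe.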
Fig 1. Simulation results for the ML-AA ancestral sequence of psbA under five models: MC with synonymous substitutions only, MC with nonsynonymous substitutions, MC2020 with synonymous substitutions only, MC2020 with nonsynonymous substitutions, and K2P with synonymous substitutions only.
The four MC and MC2020 models are shown in black and the K2P model in red.
Fig 2. Simulation results for the MC2020 model with synonymous substitutions only for the four reconstructed ancestral sequences of psbA: MP-High (red), MP-Low (green), and ML-AA and ML-DNA (both black).
Fig 3. Simulation results for the ML-AA ancestor of each gene under the MC2020 model with synonymous substitutions only (black) and with nonsynonymous substitutions included (red). Standard deviations are not shown for clarity of presentation.
Fig 4. Comparisons of simulation results for each gene using the MC2020 model, with synonymous substitutions only, applied to each of the four ancestral sequences, with the extant sequence CAI values (blue). The ancestral sequences that yielded the highest and lowest simulation lines, determined by the CAI value of the ancestral sequences, are indicated and colored red and green respectively. For each gene the two plots are for the minimum (A) and maximum (B) number of mutations estimated from one of the reconstructed ancestral sequences (see text).
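The CAI values compared in Fig 4 can be illustrated with a minimal sketch of the Codon Adaptation Index in its usual form: the geometric mean of per-codon relative adaptiveness weights, where each weight is a codon's count in a reference set divided by the count of the most used synonymous codon. The reference counts below are made up, and the pseudocount of 0.5 for unobserved codons is an assumed convention; the record does not say which reference set or zero-count handling the authors used.

```python
import math
from collections import defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def relative_adaptiveness(reference_counts, pseudocount=0.5):
    """w(codon) = count / count of the most used synonymous codon."""
    by_aa = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        by_aa[aa].append(codon)
    weights = {}
    for aa, codons in by_aa.items():
        if aa == "*" or len(codons) == 1:
            continue  # stop codons, Met and Trp carry no codon-choice signal
        # Assumed convention: unobserved codons get a small pseudocount.
        counts = {c: max(reference_counts.get(c, 0), pseudocount) for c in codons}
        top = max(counts.values())
        for c in codons:
            weights[c] = counts[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of weights over the informative codons of seq."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("sequence has no informative codons")
    return math.exp(sum(logs) / len(logs))


# Made-up reference counts for two amino acids (Ala and Leu), illustration only.
toy_reference = {"GCT": 10, "GCA": 5, "CTT": 8, "TTA": 2}
toy_weights = relative_adaptiveness(toy_reference)
print(cai("ATGGCTCTT", toy_weights), cai("ATGGCATTA", toy_weights))
```

A sequence built only from the most used codon of each amino acid scores 1, and the score falls as less favoured synonymous codons are used, which is why CAI can serve as the summary statistic tracked along the simulated lines.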