Paweł Błażej, Dorota Mackiewicz, Małgorzata Grabińska, Małgorzata Wnętrzak, Paweł Mackiewicz.
Abstract
Mutation is considered a spontaneous and random process and is an important component of evolution because it generates genetic variation. On the other hand, mutations can be deleterious, leading to non-functional genes and energetically costly repairs. One can therefore expect mutational pressure to be optimized to generate genetic diversity and preserve genetic information at the same time. To check whether empirical mutational pressures are optimized in this way, we compared matrices of nucleotide mutation rates derived from bacterial genomes with their best possible alternatives, which minimized or maximized the costs of amino acid replacements associated with differences in physicochemical properties (e.g. hydropathy and polarity). The empirical nucleotide substitution matrices and the costs of amino acid replacements are independent: the matrices were derived from sites free of selection on amino acid properties, and the costs were based only on amino acid physicochemical properties, without any information about mutation at the nucleotide level. The results indicate that the empirical mutational matrices tend to minimize the costs of amino acid replacements. This implies that bacterial mutational pressures can evolve to reduce the consequences of amino acid substitutions. However, the optimization is not complete, which allows some genetic variability to be generated.
Year: 2017 PMID: 28432324 PMCID: PMC5430830 DOI: 10.1038/s41598-017-01130-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Comparison of the costs of amino acid replacements for two physicochemical properties, hydropathy (x-axis) and polarity (y-axis), generated by: random starting matrices (start); empirical matrices (empirical); matrices maximizing hydropathy and minimizing polarity (MaxMin); matrices minimizing hydropathy and maximizing polarity (MinMax); and matrices maximizing (Max) or minimizing (Min) both costs.
Figure 2. As in Fig. 1.
Figure 3. As in Fig. 1.
Relative minimal distances of the empirical matrices from the two DNA strands in bacterial genomes to the respective Pareto fronts of matrices maximizing hydropathy and minimizing polarity (MaxMin), minimizing hydropathy and maximizing polarity (MinMax), and maximizing (Max) or minimizing (Min) both costs.
| Genome | Leading strand: Max | Leading strand: MaxMin | Leading strand: MinMax | Leading strand: Min | Lagging strand: Max | Lagging strand: MaxMin | Lagging strand: MinMax | Lagging strand: Min |
|---|---|---|---|---|---|---|---|---|
| | 0.662 | 0.090 | 0.093 | 0.155 | 0.700 | 0.078 | 0.118 | 0.104 |
| | 0.632 | 0.052 | 0.088 | 0.229 | 0.674 | 0.087 | 0.058 | 0.181 |
| | 0.650 | 0.086 | 0.136 | 0.128 | 0.648 | 0.076 | 0.167 | 0.109 |
| | 0.667 | 0.077 | 0.122 | 0.135 | 0.656 | 0.073 | 0.166 | 0.105 |
| | 0.642 | 0.062 | 0.187 | 0.110 | 0.396 | 0.045 | 0.187 | 0.371 |
| | 0.639 | 0.091 | 0.179 | 0.091 | 0.649 | 0.045 | 0.178 | 0.128 |
| | 0.674 | 0.075 | 0.093 | 0.157 | 0.673 | 0.080 | 0.086 | 0.161 |
| | 0.687 | 0.081 | 0.086 | 0.146 | 0.699 | 0.080 | 0.082 | 0.140 |
| | 0.693 | 0.075 | 0.112 | 0.120 | 0.699 | 0.093 | 0.101 | 0.106 |
The distances were calculated in the final 2000th step of simulations.
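The four values for each strand in the table above sum to approximately 1, which suggests that each minimal distance was divided by the sum of the four minimal distances. The sketch below illustrates that reading. The Euclidean metric, the function names and the toy coordinates are assumptions made for illustration, not details taken from the paper.

```python
import math

def min_distance(point, front):
    """Smallest Euclidean distance from `point` to any point on `front`."""
    return min(math.dist(point, p) for p in front)

def relative_min_distances(point, fronts):
    """Minimal distance to each front, divided by the sum over all fronts."""
    raw = {name: min_distance(point, pts) for name, pts in fronts.items()}
    total = sum(raw.values())
    return {name: d / total for name, d in raw.items()}

# Hypothetical (hydropathy cost, polarity cost) coordinates, not values from the paper.
fronts = {
    "Max": [(9.0, 9.0), (10.0, 8.0)],
    "MaxMin": [(9.0, 1.0), (10.0, 2.0)],
    "MinMax": [(1.0, 9.0), (2.0, 10.0)],
    "Min": [(1.0, 1.0), (2.0, 0.5)],
}
empirical = (3.0, 3.0)

rel = relative_min_distances(empirical, fronts)
for name, value in rel.items():
    print(f"{name}: {value:.3f}")
```

Under this normalization a small value means the empirical matrix lies close to that front relative to the other three.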
Figure 4. Comparison of the costs of amino acid replacements for two selected physicochemical properties, generated by: random starting matrices (start); empirical matrices (empirical); matrices maximizing one property and minimizing the other (MaxMin) and vice versa (MinMax); and matrices maximizing (Max) or minimizing (Min) both costs.
Figure 5. As in Fig. 4.
Figure 6. As in Fig. 4.
Figure 7. Biplot of the Principal Component Analysis based on the probability rates of the empirical matrices and of the matrices from the Pareto fronts that minimized or maximized both physicochemical costs of amino acid replacements. A covariance matrix was used to calculate the principal components.
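As a rough illustration of covariance-based PCA, the sketch below centres the data without rescaling each variable to unit variance, which is what distinguishes it from correlation-based PCA. It then extracts the first principal component by power iteration. The toy rows, the three-rate layout and the function names are hypothetical. The paper's actual input would be the rates of each matrix.

```python
import math

def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (one row per matrix)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, centered

def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of `cov` by power iteration."""
    d = len(cov)
    v = [1.0 + 0.1 * k for k in range(d)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    return v, eigenvalue

# Hypothetical toy data: each row is one matrix flattened to three rates.
rows = [
    [0.10, 0.20, 0.05],
    [0.12, 0.25, 0.05],
    [0.08, 0.15, 0.06],
    [0.14, 0.30, 0.04],
]
cov, centered = covariance_matrix(rows)
pc1, var1 = first_component(cov)
# Coordinates of each matrix along the first principal component.
scores = [sum(c[j] * pc1[j] for j in range(len(pc1))) for c in centered]
```

In a biplot, `scores` would give the positions of the matrices along one axis and the entries of `pc1` would give the loadings of the individual rates.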
The transition probability matrix P describing mutational pressure in the leading DNA strand of the Escherichia coli genome.
| | A | T | G | C |
|---|---|---|---|---|
| A | 0.7600 | 0.0594 | 0.1394 | 0.0412 |
| T | 0.0452 | 0.7828 | 0.0508 | 0.1212 |
| G | 0.1534 | 0.0481 | 0.7720 | 0.0265 |
| C | 0.0368 | 0.2491 | 0.0290 | 0.6852 |
The nucleotide in the first column (row label) is replaced by the nucleotide in the first row (column label), so each row sums to 1 up to rounding.
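The rows of the E. coli table sum to 1 (up to rounding), so the sketch below treats each row as the nucleotide being replaced and finds the distribution that P leaves unchanged by repeated multiplication. Power iteration and the function name are choices made for this illustration, not the authors' stated method.

```python
# Transition probabilities for the E. coli leading strand, copied from the table.
# Row = original nucleotide, column = replacement; order A, T, G, C.
P = [
    [0.7600, 0.0594, 0.1394, 0.0412],
    [0.0452, 0.7828, 0.0508, 0.1212],
    [0.1534, 0.0481, 0.7720, 0.0265],
    [0.0368, 0.2491, 0.0290, 0.6852],
]

def stationary_by_power_iteration(P, tol=1e-12, max_iter=100_000):
    """Iterate pi <- pi P (renormalized) until pi stops changing."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)  # rows are rounded, so renormalize each step
        nxt = [x / total for x in nxt]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")

pi = stationary_by_power_iteration(P)
for base, freq in zip("ATGC", pi):
    print(f"{base}: {freq:.3f}")
```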
Substitution rate matrix Q for the unrestricted model of nucleotide substitutions (UNREST).
| | A | T | G | C |
|---|---|---|---|---|
| A | — | | | |
| T | | — | | |
| G | | | — | |
| C | | | | — |
The diagonal entries of Q are set so that each row sums to 0. The nucleotide stationary distribution $\pi = (\pi_A, \pi_T, \pi_G, \pi_C)$ is given by the set of equations $\pi Q = 0$ under the constraint $\pi_A + \pi_T + \pi_G + \pi_C = 1$.
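The two conditions above translate directly into a small linear system. The sketch below fills the diagonal of Q so that rows sum to 0, then solves πQ = 0 with one equation replaced by the normalization constraint. The off-diagonal rates are stand-ins borrowed from the off-diagonal entries of the E. coli matrix P, an assumption for illustration only, since the fitted UNREST rates are not listed here. The function name is likewise hypothetical.

```python
def stationary_from_rates(off_diag):
    """off_diag[i][j] is the rate i -> j for i != j; diagonal entries are ignored."""
    n = len(off_diag)
    Q = [[off_diag[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])  # diagonal chosen so that the row sums to 0
    # pi Q = 0 is Q^T pi^T = 0; replace the last equation by sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    b = [0.0] * n
    A[n - 1] = [1.0] * n
    b[n - 1] = 1.0
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    pi = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][c] * pi[c] for c in range(i + 1, n))
        pi[i] = (b[i] - s) / A[i][i]
    return Q, pi

# Stand-in rates (order A, T, G, C), NOT the paper's fitted Q.
rates = [
    [0.0, 0.0594, 0.1394, 0.0412],
    [0.0452, 0.0, 0.0508, 0.1212],
    [0.1534, 0.0481, 0.0, 0.0265],
    [0.0368, 0.2491, 0.0290, 0.0],
]
Q, pi = stationary_from_rates(rates)
```

Replacing one of the four balance equations is necessary because the rows of Q sum to 0, which makes the system πQ = 0 rank-deficient on its own.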
Nucleotide stationary distributions generated by the matrices from the leading and lagging DNA strands of the studied genomes.
| Genome | Leading strand: A | Leading strand: T | Leading strand: G | Leading strand: C | Lagging strand: A | Lagging strand: T | Lagging strand: G | Lagging strand: C |
|---|---|---|---|---|---|---|---|---|
| | 0.356 | 0.273 | 0.229 | 0.141 | 0.273 | 0.356 | 0.141 | 0.229 |
| | 0.317 | 0.488 | 0.137 | 0.059 | 0.488 | 0.317 | 0.059 | 0.137 |
| | 0.245 | 0.252 | 0.282 | 0.222 | 0.225 | 0.227 | 0.290 | 0.259 |
| | 0.234 | 0.214 | 0.293 | 0.260 | 0.253 | 0.252 | 0.253 | 0.242 |
| | 0.247 | 0.328 | 0.247 | 0.179 | 0.268 | 0.308 | 0.207 | 0.217 |
| | 0.222 | 0.305 | 0.244 | 0.229 | 0.305 | 0.222 | 0.229 | 0.244 |
| | 0.295 | 0.308 | 0.207 | 0.190 | 0.327 | 0.272 | 0.238 | 0.163 |
| | 0.407 | 0.393 | 0.121 | 0.080 | 0.353 | 0.450 | 0.087 | 0.110 |
| | 0.326 | 0.420 | 0.123 | 0.131 | 0.301 | 0.402 | 0.094 | 0.203 |
Figure 8. The workflow of the SPEA2 algorithm.
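The figure itself is not reproduced here, so the sketch below follows the standard published description of SPEA2 rather than the authors' implementation. It covers only the strength and raw-fitness step, with both objectives minimized. The sample points and function names are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def spea2_raw_fitness(points):
    """Strength S(i): how many points i dominates.
    Raw fitness R(i): sum of the strengths of all points that dominate i."""
    n = len(points)
    strength = [sum(dominates(points[i], points[j]) for j in range(n))
                for i in range(n)]
    raw = [sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
           for i in range(n)]
    return strength, raw

# Hypothetical (cost 1, cost 2) pairs for five candidate matrices.
points = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (5.0, 5.0)]
strength, raw = spea2_raw_fitness(points)
print(strength)  # [1, 2, 1, 1, 0]
print(raw)       # [0, 0, 0, 2, 5]
```

Points with a raw fitness of 0 are non-dominated and form the current approximation of the Pareto front. Full SPEA2 additionally adds a density term based on the distance to the k-th nearest neighbour and maintains a fixed-size archive, both of which are omitted here.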