Abstract
Negative selection against protein instability is a central influence on the evolution of proteins. Protein stability is maintained over evolutionary time despite changes in the underlying sequences. An empirical all-site stability-based model of evolution was developed to focus on the selection of residues arising from their contributions to protein stability. In this model, site rates could vary. A structure-based method was used to predict stationary frequencies of hemoglobin residues based on their propensity to promote protein stability at a site. Sites with destabilizing residues were shown to change more rapidly in hemoglobins than sites with stabilizing residues. For diverse proteins the results were consistent with stability-based selection. Maximum likelihood studies with hemoglobins supported the stability-based model over simple Poisson-based methods. These observations are consistent with suggestions that purifying selection to maintain protein structural stability plays a dominant role in protein evolution.
Keywords: evolutionary model; negative selection; protein stability; protein structure
Year: 2009 PMID: 19812731 PMCID: PMC2747123 DOI: 10.4137/ebo.s3120
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Structures and sequences.
| Human α hemoglobin (PDB code: 1A3N_A), hemAhum |
| Human β hemoglobin (PDB code: 1A3N_B), hemBhum |
| Bovine α hemoglobin (PDB code: 1HDA_A), hemAbov |
| Bovine β hemoglobin (PDB code: 1HDA_B), hemBbov |
| Bar-neck goose α hemoglobin, |
| Bar-neck goose β hemoglobin, |
| Sea lamprey monomeric hemoglobin V, |
| Clam hemoglobin, |
| Ascaris hemoglobin, |
| Lamprey hemoglobin 1, |
| Lamprey hemoglobin 2, |
| Lamprey hemoglobin 3, |
| Zebrafish α hemoglobin, |
| Zebrafish β1 hemoglobin, |
| Shark hemoglobin α, |
| Shark hemoglobin β, |
Accession codes for protein sequences or PDB codes for structures (which encompass protein sequences).
Figure 1. Selection for stability in hemoglobins. Open circles, mean distribution of ΔΔG values of residues observed to occur in 9 hemoglobins (subject to selection); closed circles, mean distribution of ΔΔG values of all possible substitutions, estimated for 9 hemoglobins (unselected). ΔΔG was estimated by Modeller-FoldX analysis of protein structures.
Figure 2. Relationship between stability energy and frequency of observance. Stability is plotted against frequency of occurrence. The normalized observed residue stability energy (ΔΔG) distribution for 9 hemoglobins (filled circles) is shown. Also shown is the non-linear least-squares fitted exponential curve used to estimate the distribution.
Figure 3. Stability-promoting sites of hemoglobin. The structure of human α hemoglobin is shown with residues predicted to have stationary frequencies greater than the unselected value (0.05) shaded. These sites are predicted to stabilize the protein more than the average site. They are potentially under selection and are dispersed throughout the helices and loops of the protein.
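The exponential relationship between stability energy and frequency (Figure 2) and the 0.05 unselected value (Figure 3) can be illustrated with a minimal sketch. It assumes that the stationary frequency of a residue at a site is proportional to exp(−decay·ΔΔG), that positive ΔΔG means destabilizing, and that `decay` stands in for the fitted exponential constant, whose value is not given here. The function name and the ΔΔG values are hypothetical and only serve the illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
UNSELECTED = 1.0 / len(AMINO_ACIDS)  # 0.05, the frequency with no selection


def stationary_frequencies(ddg_by_residue, decay=1.0):
    """Map per-residue ddG (kcal/mol) at one site to frequencies summing to 1.

    Assumption: frequency is proportional to exp(-decay * ddG). `decay` is a
    hypothetical placeholder for the fitted exponential constant.
    """
    weights = {aa: math.exp(-decay * ddg) for aa, ddg in ddg_by_residue.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


# Made-up site: leucine is stabilizing (-1 kcal/mol), all others are neutral.
site_ddg = {aa: 0.0 for aa in AMINO_ACIDS}
site_ddg["L"] = -1.0

freqs = stationary_frequencies(site_ddg)
# Residues whose predicted frequency exceeds the unselected value of 0.05.
favored = [aa for aa, f in freqs.items() if f > UNSELECTED]
```

When every residue at a site has the same ΔΔG, the sketch returns 0.05 for each of the 20 amino acids, which matches the unselected value used as the shading threshold in Figure 3.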
Jensen-Shannon Divergence of stationary amino acid probability profiles from diverse hemoglobins.
| Protein | Human α: divergence | Human α: control | Human β: divergence | Human β: control |
|---|---|---|---|---|
| Human α | [0] | 0.280 | 0.076 | 0.308 |
| Human β | 0.076 | 0.281 | [0] | 0.297 |
| Bovine α | 0.051 | 0.262 | 0.071 | 0.290 |
| Bovine β | 0.119 | 0.264 | 0.085 | 0.270 |
| Goose α | 0.080 | 0.260 | 0.094 | 0.272 |
| Goose β | 0.134 | 0.308 | 0.103 | 0.329 |
| Lamprey | 0.109 | 0.318 | 0.104 | 0.325 |
| Clam | 0.093 | 0.291 | 0.114 | 0.316 |
| Ascaris | 0.230 | 0.258 | 0.252 | 0.270 |
Control: profile sites were randomized to eliminate site-specific correlations. For all pairs except those involving Ascaris, the differences between test and control were significant (P < 0.01) by bootstrap tests. A divergence of 0 indicates identity.
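The divergence and control columns above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the table: it assumes base-2 logarithms, assumes a profile is a list of per-site amino acid distributions, and assumes the profile-level value is the mean of per-site divergences. All function names are hypothetical.

```python
import math
import random


def jensen_shannon(p, q):
    """Jensen-Shannon divergence with base-2 logs: 0 for identical inputs, at most 1."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_profile_divergence(profile_a, profile_b):
    """Average per-site divergence over two aligned profiles (assumed aggregation)."""
    pairs = list(zip(profile_a, profile_b))
    return sum(jensen_shannon(p, q) for p, q in pairs) / len(pairs)


def randomized_control(profile_a, profile_b, rng):
    """Shuffle the site order of one profile to break site-specific correspondence."""
    shuffled = list(profile_b)
    rng.shuffle(shuffled)
    return mean_profile_divergence(profile_a, shuffled)


# Toy two-letter alphabet and two sites, for illustration only.
profile_a = [[0.7, 0.3], [0.2, 0.8]]
profile_b = [[0.7, 0.3], [0.2, 0.8]]
identical = mean_profile_divergence(profile_a, profile_b)
control = randomized_control(profile_a, profile_b, random.Random(0))
```

Identical profiles give a divergence of 0, as in the bracketed diagonal entries of the table, while the shuffled control is never lower than 0.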
Rate differences between high and low stability sites.
| Taxa | Low stability | High stability | Rate ratio |
|---|---|---|---|
| Human α, β | 1.550 | 0.324 | 4.779 |
| Cow α, β | 1.789 | 0.333 | 5.368 |
| Goose α, β | 1.684 | 0.536 | 3.144 |
| Zebrafish α, β | 1.174 | 0.538 | 2.180 |
| Human α/Zebrafish α Hemoglobin | 0.852 | 0.268 | 3.175 |
Low stability: ratio of residues changing to residues not changing at sites ranked as low stability. Only sites where both homologs shared low stability were scored.
High stability: ratio of residues changing to residues not changing at sites ranked as high stability. Only sites where both homologs shared high stability were scored.
Rate ratio: ratio of column #1 to column #2.
Significance: P < 0.01 by bootstrap resampling.
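The column definitions above amount to simple arithmetic, sketched here with a bootstrap check added. The sketch assumes each scored site is recorded as a boolean (`True` if the residue differs between the two homologs); the resampling scheme, the function names, and the counts are hypothetical and are not the data behind the table.

```python
import random


def change_ratio(changed_flags):
    """Ratio of sites that changed to sites that did not change."""
    changed = sum(changed_flags)
    unchanged = len(changed_flags) - changed
    return changed / unchanged if unchanged else float("inf")


def rate_ratio(low_flags, high_flags):
    """Column #1 divided by column #2: low-stability ratio over high-stability ratio."""
    high = change_ratio(high_flags)
    return change_ratio(low_flags) / high if high else float("inf")


def bootstrap_fraction_not_faster(low_flags, high_flags, n_rep=1000, seed=0):
    """Fraction of resamples in which low-stability sites do NOT change faster.

    Sites are resampled with replacement within each class (assumed scheme).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        low = rng.choices(low_flags, k=len(low_flags))
        high = rng.choices(high_flags, k=len(high_flags))
        if change_ratio(low) <= change_ratio(high):
            hits += 1
    return hits / n_rep


# Made-up counts: 6 of 10 low-stability sites changed, 2 of 10 high-stability sites.
low_sites = [True] * 6 + [False] * 4
high_sites = [True] * 2 + [False] * 8
ratio = rate_ratio(low_sites, high_sites)  # (6/4) / (2/8)
fraction = bootstrap_fraction_not_faster(low_sites, high_sites)
```

A small `fraction` would indicate that the faster change at low-stability sites is robust to resampling; with counts this small the estimate is noisy, so the toy numbers only demonstrate the mechanics.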
Comparison of stabilization of sites by observed or unselected residues.
| PDB Code | Protein | ΔΔG (kcal/mol) | Rate ratio |
|---|---|---|---|
| 1A3N | α-hemoglobin | 0.880 | 2.98 |
| 1W6Z | lysozyme | 1.224 | 4.34 |
| 1LKK | LCK | 0.964 | 3.18 |
| 1LZO | triosephosphate isomerase | 1.232 | 4.39 |
| 1FNF | fibronectin(repeat) | 1.112 | 3.80 |
| 1KNB | adenovirus fiber protein | 0.914 | 3.00 |
| 1WBA | bean albumin | 1.039 | 3.48 |
| 1ABR | abrin | 1.037 | 3.47 |
| 1ICN | fatty acid binding protein | 1.042 | 3.49 |
| 1IHF | integration host factor | 0.662 | 2.21 |
| 1RRG | ARF-1 | 0.975 | 3.22 |
| 1OCT | OCT-1 | 0.614 | 2.09 |
| 1PUE | ETS | 1.018 | 3.39 |
| 1THV | thaumatin | 1.463 | 5.78 |
| 2AK3 | adenylate kinase | 0.940 | 3.09 |
| 1A2P | barnase | 1.469 | 5.88 |
| 1X1R | m-RAS | 1.101 | 3.75 |
| 1COF | cofilin | 1.276 | 4.63 |
| 1FIM | MIF | 0.968 | 3.20 |
| 1RCI | ferritin | 0.857 | 2.80 |
| 1TTB | transthyretin | 0.725 | 2.39 |
| 1ONR | transaldolase B | 1.261 | 4.54 |
| 1XIK | ribonucleotide diphosphate reductase | 1.166 | 4.05 |
| 1MUP | pheromone binding protein | 0.590 | 2.54 |
| 1NHK | nucleoside diphosphate kinase | 1.273 | 4.61 |
| 1SVP | Sindbis virus capsid | 0.725 | 2.39 |
ΔΔG: average per-site difference in stability between unselected residues and observed residues, in kcal/mol. Values were calculated using the Modeller-FoldX method.
Rate ratio: predicted average per-site ratio of evolutionary rates for unselected residues vs. observed residues. Values were calculated using Eq. 4.
Comparison of models in ML analysis.
(a) ML with ES model

| Model | ln L |
|---|---|
| ES model-based | −2489.534 |
| Null | −2498.238 |
| Unselected | −2499.096 |

(b) Bootstrap test of ES model

| Comparison | AIC statistic | P |
|---|---|---|
| ES model-based / Null | 23.7 | <0.001 |
| ES model-based / Unselected | 21.7 | <0.001 |

Null: all sites set to average amino acid frequencies.
Unselected: all sites set to a frequency of 0.05.
AIC statistic: calculated with 100 bootstrap replicates.
Figure 4. ES-based phylogenetic reconstruction. ES model-derived distances based on protein stability predictions were used in Neighbor-Joining phylogenetic inference of vertebrate hemoglobins. Numbers beside nodes represent support levels from 100 bootstrap replications. The tree was rooted with lamprey hemoglobin. Note that the lamprey, α hemoglobin, and β hemoglobin splits are correctly determined.
Figure 5. Observed differences vs. distance for metazoan hemoglobins. Sequence differences between hemoglobin proteins (see Fig. 4) were recorded and compared with model predictions of distance (arbitrary units). Unfilled circles, Poisson model; filled circles, ES model. Distances for a given degree of sequence difference are greater for the ES model.
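For orientation on the Poisson baseline in Figure 5, the standard Poisson correction converts a fraction of differing sites p into a distance d = −ln(1 − p). The sketch below assumes this textbook form; the exact Poisson implementation behind the figure is not stated here, and the function name is hypothetical. The ES model distances are not reproduced, because they depend on site-specific stationary frequencies.

```python
import math


def poisson_distance(p_diff):
    """Poisson-corrected distance d = -ln(1 - p) for a fraction p of differing sites."""
    if not 0.0 <= p_diff < 1.0:
        raise ValueError("p_diff must be in [0, 1)")
    return -math.log(1.0 - p_diff)


# Distances grow faster than the raw difference as sequences diverge.
distances = {p: poisson_distance(p) for p in (0.1, 0.3, 0.5, 0.7)}
```

Because the correction accounts for multiple substitutions at one site, the distance always equals or exceeds the observed fraction of differences, and the gap widens at high divergence.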