Daniel M Weinreich, Yinghong Lan, Jacob Jaffe, Robert B Heckendorn.
Abstract
The effect of a mutation on the organism often depends on what other mutations are already present in its genome. Geneticists refer to such mutational interactions as epistasis. Pairwise epistatic effects have been recognized for over a century, and their evolutionary implications have received theoretical attention for nearly as long. However, pairwise epistatic interactions themselves can vary with genomic background. This is called higher-order epistasis, and its consequences for evolution are much less well understood. Here, we assess the influence that higher-order epistasis has on the topography of 16 published, biological fitness landscapes. We find that, on average, the influence of epistasis on fitness landscape topography declines with epistatic order, and suggest that notable exceptions to this trend may deserve experimental scrutiny. We conclude by highlighting opportunities for further theoretical and experimental work dissecting the influence that epistasis of all orders has on fitness landscape topography and on the efficiency of evolution by natural selection.
Keywords: Fitness landscapes topography; Higher-order epistasis; NK landscape; Natural selection; Sequence space combinatorics
Year: 2018 PMID: 29904213 PMCID: PMC5986866 DOI: 10.1007/s10955-018-1975-3
Source DB: PubMed Journal: J Stat Phys ISSN: 0022-4715 Impact factor: 1.548
Fig. 1 Analytic pipeline, illustrated with data from Palmer et al. [45]. a For each dataset, published fitness data (or a suitable proxy) were first converted to the corresponding epistatic terms using the Fourier–Walsh transformation (Eq. 1). b The explanatory power of a succession of models using only the m largest epistatic terms in absolute value was compared with the published data. For a given value of m, these models provably have the greatest explanatory power (smallest residual variance) of any model with exactly m parameters (Appendix). The symbols plotted represent the epistatic order (Sect. 1.2) of each successive parameter added to the model. c The rank correlation coefficient (Kendall's τ) between the empirical sequence of epistatic orders and that of our naïve expectation (Eq. 2) was computed. In cases where experimental variance was reported, these sequences were truncated as soon as the remaining model variance was less than the experimental variance. For the data shown, that truncation occurred after the epistatic term. Finally, statistical significance was assessed by a permutation test that asked whether the observed sequence of epistatic orders was significantly different from random. For the data shown (red arrow), the observed value of τ (0.1921) was smaller than only the 3639 largest of the 10⁵ values obtained by the permutation test, yielding P = 0.03639.
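The three steps in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: Eq. 1 and Eq. 2 are not reproduced here, so the 1/2^L normalization of the transform, the choice to skip the zeroth-order (mean) term, the use of an ascending sort of the orders as the "naïve expectation", and the tie-unadjusted form of Kendall's τ are all assumptions. The function names are hypothetical, and the toy landscape is invented for illustration.

```python
import numpy as np

def walsh_hadamard(w):
    """Walsh-Hadamard transform of 2**L fitness values, scaled by 1/2**L (assumed normalization)."""
    e = np.asarray(w, dtype=float).copy()
    n = e.size
    assert n > 0 and (n & (n - 1)) == 0, "need one value per genotype of L biallelic loci"
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            a = e[start:start + h].copy()
            b = e[start + h:start + 2 * h].copy()
            e[start:start + h] = a + b
            e[start + h:start + 2 * h] = a - b
        h *= 2
    return e / n

def ranked_terms(w):
    """Rank epistatic terms by absolute value; return indices, orders, residual variance fraction."""
    e = walsh_hadamard(w)
    idx = np.arange(1, e.size)  # index 0 is the landscape mean, skipped here (assumption)
    ranked = idx[np.argsort(-np.abs(e[idx]), kind="stable")]
    # Epistatic order of a term = number of loci it involves = set bits in its index.
    orders = np.array([bin(int(k)).count("1") for k in ranked])
    explained = np.cumsum(e[ranked] ** 2) / np.sum(e[idx] ** 2)
    return ranked, orders, 1.0 - explained

def kendall_tau_a(x, y):
    """(concordant - discordant) / number of pairs, with no correction for ties."""
    x, y = np.asarray(x), np.asarray(y)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = x.size
    return np.sum(np.triu(dx * dy, k=1)) / (n * (n - 1) / 2)

def permutation_p(orders, n_perm=2000, seed=0):
    """Fraction of shuffled order sequences whose tau exceeds the observed tau."""
    rng = np.random.default_rng(seed)
    expected = np.sort(orders)  # assumed stand-in for the naive expectation
    observed = kendall_tau_a(orders, expected)
    null = np.array([kendall_tau_a(rng.permutation(orders), expected)
                     for _ in range(n_perm)])
    return observed, float(np.mean(null > observed))

# Toy 3-locus additive landscape (invented): genotype g carries mutation i if bit i is set.
effects = [1.0, 0.5, 0.25]
w = np.array([sum(s for i, s in enumerate(effects) if (g >> i) & 1) for g in range(8)])
ranked, orders, residual = ranked_terms(w)
tau, p = permutation_p(orders)
```

On a purely additive landscape such as this one, only the three first-order terms are nonzero, so they are ranked first and the residual variance fraction reaches zero after three terms.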
Analyses of published combinatorially complete empirical and simulated (NK) fitness landscapes, sorted by P value associated with Kendall’s τ
| Phenotype [citation] | Number of loci | Number of maxima | Number of epistatic terms significantly different from zero | Kendall’s τ | P value |
|---|---|---|---|---|---|
| Log[ | 6 | 4 | 41 | 0.6202 | <0.00001*** |
| Log[diploid S | 6 | 4 | 3 | 0.6667 | <0.00001*** |
| Log[ | 6 | 1 | N.D. | 0.7566 | <0.00001*** |
| Avian lysozyme thermostability [ | 3 | 1 | 4 | 0.5 | <0.00001*** |
| | 5 | 1 | N.D. | 0.3333 | <0.00001*** |
| Log[relative fitness among | 4 | 1 | N.D. | 0.7527 | 0.00001*** |
| Log[HIV replicative capacity on CCR5+ cells] [ | 5 | 3 | 25 | 0.5703 | 0.00002*** |
| Log[cefotaxime MIC of | 5 | 1 | 29 | 0.5490 | 0.00011** |
| Log[relative viability among fruit fly mutants] [ | 5 | 3 | N.D. | 0.4896 | 0.00027** |
| Log[cefalexin MIC of | 4 | 1 | N.D. | 0.5376 | 0.00448 |
| | 5 | 2 | N.D. | 0.5714 | 0.00474 |
| Log[relative fitness among LTEE | 5 | 2 | 30 | 0.3486 | 0.01023 |
| | 5 | 2 | N.D. | 0.4428 | 0.0156 |
| Log[relative colony growth rate among | 5 | 4 | 10 | 0.4387 | 0.02002 |
| Percent production of 5-epi-aristolochene by sesquiterpene synthase mutants [ | 6 | 10 | N.D. | 0.1974 | 0.02639 |
| Log[pyrimethamine IC50 of | 4 | 2 | 14 | 0.4337 | 0.03133 |
| Log[IC75 of | 6 | 2 | 55 | 0.1921 | 0.03639 |
| | 5 | 5 | N.D. | 0.1735 | 0.1442 |
| Mammalian glucocorticoid receptor cortisol sensitivity [ | 4 | 4 | N.D. | 0.1075 | 0.30182 |
| Log[MIC of | 4 | 3 | N.D. | 0.0430 | 0.41356 |
| | 5 | 7 | N.D. | −0.1632 | 0.86161 |
Maximum possible value is 2
Uncorrected P value from permutation test (n = 10⁵ replicates). Bonferroni-corrected P values: *** ≤ 0.001; 0.001 < ** ≤ 0.01; 0.01 < * ≤ 0.05
No data because no experimental variance estimates provided with this dataset
No data because simulated fitness landscapes have no experimental variance
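The "Number of maxima" column can be illustrated with a short sketch. It assumes that a maximum is a genotype whose fitness strictly exceeds that of every single-mutant neighbour, and that loci are biallelic so genotypes can be indexed by the bits of an integer; the function name and the two toy landscapes are hypothetical and not taken from the paper.

```python
def count_local_maxima(w):
    """Count genotypes fitter than all L single-mutant neighbours (strict inequality assumed)."""
    n = len(w)
    n_loci = n.bit_length() - 1
    assert n == 1 << n_loci, "need one fitness value per genotype of L biallelic loci"
    count = 0
    for g in range(n):
        # Flipping bit i of g gives the neighbour that differs at locus i only.
        if all(w[g] > w[g ^ (1 << i)] for i in range(n_loci)):
            count += 1
    return count

# Toy 2-locus landscapes (invented), indexed 00, 01, 10, 11.
additive = [0.0, 1.0, 0.5, 1.5]         # single peak at 11
reciprocal_sign = [1.0, 0.0, 0.0, 1.0]  # peaks at 00 and 11

n_additive = count_local_maxima(additive)
n_reciprocal = count_local_maxima(reciprocal_sign)
```

The additive toy landscape has one maximum, while the landscape with reciprocal sign epistasis has two.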
Published combinatorially complete fitness landscapes not examined here
| Phenotype [citation] | Number of loci | Number of genotypes |
|---|---|---|
| Cycloguanil IC50 of | 3 | 2 |
| Pyrimethamine IC50 of | 3 | 2 |
| MIC against pyrimethamine of | 3 | 2 |
| Pyrimethamine IC50 of | 4 | 2 |
| MIC of TEM | 4 | 2⁴ |
| | 4 | 4 |
| Relative fitness among LTEE | 5 | 2 |
| Relative fitness among LTEE | 5 | 2 |
| HIV replicative capacity on CXCR4+ cells | 5 | 2 |
| Transcription factor/response element specificity in an ancient steroid hormone receptor [ | 5 | 4² × 2³ |
| MIC of TEM- | 6 | 2 |
| Diploid | 6 | 2 |
| Percent production of minor products by sesquiterpene synthase mutants | 6 | 2 |
| Percent production of 4-EE by sesquiterpene synthase mutants | 6 | 2 |
| Percent production of PSD by sesquiterpene synthase mutants | 6 | 2 |
| 104 mouse DNA-binding proteins’ affinity for 10 bp binding motifs [ | 10 | 4¹⁰ |
| GFP affinity for 10 nucleotide base pair binding motifs [ | 10 | 4¹⁰ |
| Affinity of 1032 DNA-binding proteins spanning eukaryotic diversity against 10 nucleotide base pair binding motifs [ | 10 | 4¹⁰ |
Written as the product of cardinalities across loci
Another phenotype from this system is included in Table 1
In total, 16 × 15 distinct β-lactam compounds = 240 observations are reported in this study
This study examined all combinations of 4 nucleotides at two key positions in the DNA response element together with all combinations of two amino acids at three key positions in the transcription factor
In total, 1,048,576 × 104 DNA-binding proteins = 109,051,904 observations are reported in this study
In total, 1,048,576 × 1,032 DNA-binding proteins = 1,082,130,432 observations are reported in this study
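The genotype and observation counts in these footnotes follow from multiplying the number of alleles at each locus. A small sketch of that arithmetic, using only figures stated in the footnotes (the function and variable names are hypothetical):

```python
from math import prod

def n_genotypes(cardinalities):
    """Number of genotypes in a combinatorially complete landscape: product of alleles per locus."""
    return prod(cardinalities)

# Ten positions with 4 nucleotides each.
dna_motif = n_genotypes([4] * 10)

# 4 nucleotides at two response-element positions, 2 amino acids at three positions.
receptor = n_genotypes([4, 4, 2, 2, 2])

# Observations = genotypes x number of proteins or compounds assayed.
obs_mouse = dna_motif * 104
obs_eukaryote = dna_motif * 1032
obs_lactam = 16 * 15
```

These reproduce the totals quoted above: 1,048,576 genotypes per 10-position motif, and 109,051,904, 1,082,130,432 and 240 observations respectively.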
Fig. 2 Distribution of uncorrected P values among 16 empirical datasets. Under any null model, P values are expected to be uniformly distributed (black bars; note both axes are log-transformed). Instead, observed P values (grey bars) are sharply skewed toward small values.
Average epistatic influence on fitness landscape topography as a function of epistatic order in select datasets
| Epistatic order | Aggregate reduction in residual variance | Number of epistatic terms significantly different from zero | Mean reduction in residual variance per epistatic term |
|---|---|---|---|
| (a) | Log[IC75 of | ||
| First | 0.279 | 5 | **0.056** |
| Second | 0.266 | 12 | 0.022 |
| Third | 0.233 | 18 | 0.013 |
| Fourth | 0.144 | 11 | 0.013 |
| Fifth | 0.0685 | 6 | 0.011 |
| Sixth | 0.0065 | 1 | 0.0065 |
| (b) | Mammalian glucocorticoid receptor cortisol sensitivity [ | ||
| First | 0.171 | 4 | 0.043 |
| Second | 0.405 | 6 | 0.067 |
| Third | 0.420 | 4 | **0.105** |
| Fourth | 0.004 | 1 | 0.004 |
| (c) | Log[MIC of | ||
| First | 0.353 | 4 | 0.088 |
| Second | 0.278 | 6 | 0.043 |
| Third | 0.279 | 4 | 0.070 |
| Fourth | 0.091 | 1 | **0.091** |
| (d) | |||
| First | 0.027 | 5 | 0.005 |
| Second | 0.315 | 10 | 0.032 |
| Third | 0.402 | 10 | 0.040 |
| Fourth | 0.241 | 5 | **0.048** |
| Fifth | 0.015 | 1 | 0.015 |
Largest value for each dataset shown in bold
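The last column of this table appears to be the aggregate reduction in residual variance divided by the number of terms at that order; this reading is an assumption, checked here only against the rows of dataset (a) that list all three values. A minimal sketch, with hypothetical variable names:

```python
# Dataset (a): aggregate reduction in residual variance and term counts, orders 1 to 6.
aggregate = [0.279, 0.266, 0.233, 0.144, 0.0685, 0.0065]
n_terms = [5, 12, 18, 11, 6, 1]

# Mean reduction per term at each order (assumed definition: aggregate / count).
means = [a / k for a, k in zip(aggregate, n_terms)]

# Zero-based index of the order with the largest mean contribution.
largest = max(range(len(means)), key=means.__getitem__)
```

For dataset (a) the computed means round to the listed 0.022, 0.013, 0.013, 0.011 and 0.0065 for orders two through six, and the largest per-term contribution comes from the first-order terms.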