Characteristics of the new phenotypic variation introduced via mutation have broad implications in evolutionary and medical genetics. Standardized estimates of this mutational variance, VM, span 2 orders of magnitude, but the causes of this remain poorly resolved. We investigated estimate heterogeneity using 2 approaches. First, meta-analyses of ∼150 estimates of standardized VM from 37 mutation accumulation studies did not support a difference among taxa (which differ in mutation rate) but provided equivocal support for differences among trait types (life history vs morphology, predicted to differ in mutation rate). Notably, several experimental factors were confounded with taxon and trait, and further empirical data are required to resolve their influences. Second, we analyzed morphological data from an experiment in Drosophila serrata to determine the potential for unintentional heterogeneity among environments in which phenotypes were measured (i.e. among laboratories or time points) or transient segregation of mutations within mutation accumulation lines to affect standardized VM. Approximating the size of an average mutation accumulation experiment, variability among repeated estimates of (accumulated) mutational variance was comparable to variation among published estimates of standardized VM. This heterogeneity was (partially) attributable to unintended environmental variation or within line segregation of mutations only for wing size, not wing shape traits. We conclude that sampling error contributed substantial variation within this experiment, and infer that it will also contribute substantially to differences among published estimates. We suggest a logistically permissive approach to improve the precision of estimates, and consequently our understanding of the dynamics of mutational variance of quantitative traits.
The magnitude of the per-generation increase in genetic variance due to spontaneous mutation (VM) is important for a wide range of genetic and evolutionary phenomena, including the maintenance of quantitative genetic variance (Lynch 1988; Barton and Turelli 1989; Johnson and Barton 2005). Much of our understanding of VM comes from mutation accumulation (MA) experiments, where populations diverge phenotypically due solely to the neutral fixation of new mutations (Mukai 1964; Halligan and Keightley 2009). Reviews of MA experiments in a range of traits and taxa have reported that mutation increases phenotypic variance in quantitative traits by 10⁻⁴–10⁻² times the environmental variance of the trait, or 0.02–5.1% of the trait mean per generation (Houle ; Lynch ; Halligan and Keightley 2009). Differences in VM may cause differences in the magnitude of standing quantitative genetic variation and, ultimately, in rates of phenotypic evolution (Houle 1998; Lynch ; Houle ; Walsh and Lynch 2018). However, the causes of variation among estimates of VM, and thus the evolutionary interpretation of this variability, are not well resolved.
Mutation rate is known to vary widely among species (reviewed in Katju and Bergthorsson 2019), with further opportunity for differences in per-generation mutation number arising through differences in ploidy, genome size, and/or effective population size (Lynch ; Lynch 2010; Sung ). Marked variation in mutation rate has also been observed within species, both among replicated MA experiments (i.e. different founder genotypes: Ness ; Sung ; Schrider ; Ho ) and among lines within a single MA panel (Huang ; Ho ).
Resulting differences in mutation number may explain variation in VM estimates, such as the 4-fold difference in the mutational variance of body size estimated for different MA experiments in Caenorhabditis elegans (Azevedo ; Estes ; Ostrow ).
Traits have also been hypothesized to differ in the magnitude of VM due to differences in mutation rate, arising from differences in the number of contributing loci. Specifically, life history traits are hypothesized to be affected by more loci than morphological traits (Houle 1991, 1992, 1998; Houle ; Merilä and Sheldon 1999). The magnitude of VM depends not only on the rate of mutation, but also on the effects of mutations, and the relationship between rate and effect size is not well characterized. Besnard demonstrated that the high mutational variance (and relatively rapid evolution) of a vulval phenotype in nematodes was due to a broad mutational target size, rather than large-effect mutations. Whether trait types differ systematically in mutational target size is difficult to assess, as a full catalog of causal loci is unknown for most quantitative traits (Barton and Keightley 2002; Mackay ; Yang ; Rockman 2012). Indeed, emerging evidence that diverse traits, including morphology, are all highly polygenic (Yang ; Boyle ) suggests that differences in the distribution of mutational effect sizes (Simons ), rather than simply in the number of contributing loci, might cause heterogeneity in estimates of VM among traits.
Comparison among trait types is complicated by differences among them in variability and measurement scale, which may influence standardized values. Low mutational heritability (h²M = VM/VE, where VE is the environmental variance) of life-history traits relative to morphological traits has been attributed to greater environmental variance (larger VE) for life-history traits, rather than lower VM (Houle ).
Thus, comparison on the coefficient of mutational variance scale (CVM = 100 × √VM/X̄, where X̄ is the trait mean) reveals a different picture, one of greater mutational variance in life history than morphological traits, consistent with the prediction of greater mutational target size (Houle ).
Other contributions to variation in the magnitude of VM might be revealed by consideration of the MA experimental design itself. The timeframe over which mutations accumulate might influence estimates of VM. When MA lines are established from a homozygous (heterozygous) base population, estimates of VM will be downwardly (upwardly) biased before 6Ne generations (Lynch and Hill 1986). However, VM is typically estimated after >6Ne generations, suggesting limited contribution of ancestral variation to variability of VM. Conversely, long-running MA experiments might underestimate VM when the cumulative effect of low-fitness mutations causes line extinction, or within-line selection against further accumulation (Lynch ; Estes ; McGuigan and Blows 2013). A decline in VM over time has been observed in some studies (Mackay ), but not in others (García-Dorado ; Hall ).
Stochastic sampling from the distribution of mutational effects could also introduce temporal heterogeneity among estimates of VM. For example, among-line variance estimated before vs after a line(s) fixed a large-effect mutation(s) could result in inference of a much larger per-generation increase in variance at the second time point relative to the first. Transient within-line segregation of mutations might generate variability in estimates, for example by causing temporary inflation of within-line variance (VE), impacting power to detect among-line variance, and potentially biasing estimates of h²M (downward) and CVM (upward; see Hoffmann ).
Notably, several studies in nematodes have suggested that within-line variance increased over the duration of the MA experiment (Baer 2008; Baer ; Braendle ), which may contribute to a pattern of lower estimated VM in longer-running MA experiments.
Environmental context within which MA lines are assayed could also contribute to variation among VM estimates. Several studies have considered the effect of replicable, experimenter-imposed changes in the environment, including in temperature (Wayne and Mackay 1998), light (Kavanaugh and Shaw 2005), and density (Fry ). Although the magnitude of VM often varies under such environmental manipulations, there is only weak evidence for predictable patterns, such as novel or stressful environments increasing the magnitude of VM (Kondrashov and Houle 1994; Martin and Lenormand 2006). Even in carefully controlled laboratory experiments, factors such as food quality or quantity, light, humidity and diurnal timing of collection will vary among individuals or lines within a phenotyping assay, and among assays conducted in different laboratories or at different times within the same laboratory. Such variation may impact estimates of standardized VM through inflation of within-line variance, similar to the effect of transient, within-line segregation of new mutations. Furthermore, if MA lines differ in their response to this unintended environmental heterogeneity, then genotype by environment (G×E) variance could contribute variation among MA lines, and variability among estimates of VM, a potential source of variation that has received little attention (but see García-Dorado ).
Here, we combined 2 approaches to investigate causes of variability in estimates of mutational variance.
Given that it has been over 20 years since this variability was broadly documented and investigated (Houle ; Houle 1998; Lynch ), we first conducted a meta-analysis to update tests of the previously implicated causal factors of taxon (Lynch and Walsh 1998; Lynch ; Halligan and Keightley 2009) and trait type (Houle ; Houle 1998). We had intended to examine how the number of generations affected VM (Lynch and Hill 1986; Mackay ), but MA duration was confounded with taxon (detailed in the Results). Second, we conducted a new empirical experiment in Drosophila serrata, in which we repeatedly estimated the among-line (mutational) variance to investigate whether unintended environmental heterogeneity or transient within-line segregation of mutations can contribute variation among estimates. After accounting for these effects within the data, we finally quantified the magnitude of variation among estimates from a set of 10 wing shape traits to characterize the magnitude of variation among estimates within a trait category.
Methods
Meta-analysis of empirical estimates of mutational variance
Literature search
We extracted all studies in 7 reviews of mutational variance: Lynch (1988), Keightley ), Houle ), Lynch and Walsh (1998), Lynch ), Halligan and Keightley (2009), and Walsh and Lynch (2018). We then searched the Web of Science database on 11/12/2019 at 4:38 p.m. AEST for journal article document types meeting the topic criteria of “MA” and (varia* or “mutat* coefficient*”) and published between 1998 and 2019. These years overlapped Halligan and Keightley (2009) (fitness traits only) and Walsh and Lynch (2018) (brief update on Lynch and Walsh 1998), allowing us to capture papers that may have been excluded from those reviews, as well as those published subsequently.
Further details on the papers identified and preliminary handling steps can be found in Supplementary Fig. 1. For the 473 unique papers identified, we screened titles and abstracts, then the full text, for relevance, applying 4 strict criteria, retaining only studies where the estimates of mutational variance were: (1) quantitative; (2) from spontaneous MA; (3) from MA environmental conditions; and (4) not re-reporting of previously published estimates. We excluded 6 studies of transcriptomic data as the number of traits was much larger than for other trait categories.
Meta-analysis data collection
For each of the 65 papers retained after applying the above criteria, mutational parameter estimates were extracted (as described in Supplementary Table 1), associated with taxon and trait identifiers, and details of the experimental design. Twenty papers not reporting error for the mutational parameters were excluded (Supplementary Table 1c). Following initial qualitative assessments of data, we excluded 5 studies (15 traits) due to low representation of taxon type (1 vertebrate, Mus musculus; 1 alga, Chlamydomonas reinhardtii; and 2 non-Drosophila insects, Daktulosphaira vitifoliae and Nasonia vitripennis), and 1 study due to low representation of trait type: mitotic cell division traits (Supplementary Table 1b).
Where possible, we extracted (or calculated from provided information) both the coefficient of mutational variance (CVM = 100 × √VM/X̄, where X̄ is the trait mean) and mutational heritability (h²M = VM/VE, where VE is the environmental variance) for each trait. As detailed below, estimates were weighted by the inverse of their standard error (SE) in the meta-analysis. Where SEs were not reported for h²M or CVM, but were reported for VM and VE or X̄, we used a sampling approach to obtain them. We sampled from N(θ, V) 10,000 times, using the rnorm function in R [v. 3.6.1], where θ and V were respectively the reported parameter value and its SE. We then calculated CVM or h²M for each of these simulated samples, and obtained the SE of this sample of estimates. CVM is undefined for samples with negative values of VM; to ensure unbiased estimates of the magnitude of error we calculated CVM as: 100 × sign(VM) × √|VM|/X̄. This sampling approach was used to estimate the error for 28% of the h²M estimates and 61% of the CVM estimates analyzed (Supplementary Table 1a). Two studies (17 estimates) were excluded due to nonsensically large SE estimates, while a further 3 estimates (from 3 studies) were excluded due to nonsensical scaled parameter estimates (detailed in Supplementary Table 1b).
Two extreme values (>3 SD) of h²M and 2 of CVM were excluded from the analyses (Supplementary Table 1b). There were 11 cases with extremely small SE (>5 IQR below the median); notably, 6 of these came from studies where confidence intervals (CIs) were constrained to be positive, suggesting that this boundary condition had reduced the SE estimate, inflating meta-analysis weights for traits where the mutational variance was not supported. These outliers were excluded from analyses, although results and conclusions were qualitatively consistent when they were included.
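The sampling approach described above for obtaining an SE on the CVM scale can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the function names and example values are hypothetical, and drawing the trait mean as well as VM from normal distributions is an assumption of this sketch.

```python
import math
import random
import statistics


def signed_cv(vm: float, mean: float) -> float:
    """Coefficient of mutational variance (% of the mean) that keeps the sign of V_M.

    Negative draws of V_M have no real square root, so the root is taken on
    |V_M| and the sign is restored afterwards.
    """
    return 100.0 * math.copysign(math.sqrt(abs(vm)), vm) / mean


def sampled_se_of_cv(vm, se_vm, mean, se_mean, n_draws=10_000, seed=1):
    """Approximate the SE of CV_M from reported estimates and their SEs.

    Assumption of this sketch: V_M and the trait mean are drawn independently
    from normal distributions centred on the reported values.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        vm_draw = rng.gauss(vm, se_vm)        # may fall below zero
        mean_draw = rng.gauss(mean, se_mean)  # assumed to stay well above zero
        draws.append(signed_cv(vm_draw, mean_draw))
    # The SD of the simulated CV_M values serves as the SE of the estimate.
    return statistics.stdev(draws)


# Hypothetical reported values: V_M = 0.04 (SE 0.01), trait mean = 10 (SE 0.1).
se_cv = sampled_se_of_cv(0.04, 0.01, 10.0, 0.1)
```

Keeping the sign of negative draws, rather than discarding or truncating them, avoids shrinking the spread of the simulated values and therefore the SE.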
Predictor variables for the meta-analysis
Estimates came from 11 species and, based on the distribution of estimates, we defined 4 taxon categories (Fig. 1a): Daphnia (Daphnia pulex only); Drosophila (Drosophila melanogaster [n = 68] and D. serrata [n = 5]); Plant (Arabidopsis thaliana [n = 12], Amsinckia douglasiana [n = 2] and Amsinckia gloriosa [n = 1]); and Nematode (C. elegans [n = 62], C. brenneri [n = 4], C. briggsae [n = 8], C. remanei [n = 5], and Oscheius myriophila [n = 5]). We differ from previous studies (Houle , 1998) in considering size of juveniles as a morphological (not growth) trait. Reflecting more recent publications, we defined a physiology category (33% of estimates; Fig. 1b), which included locomotive, enzymatic and metabolic activity traits (Supplementary Table 2); these traits may differ from life-history or morphological traits in mutational target size or environmental sensitivity. We assigned the relatively well-represented life-history traits (52% of estimates; Fig. 1b) to more narrowly defined subcategories: total fitness, survival, productivity, and a miscellaneous category (capturing traits such as development time, phenology, longevity and mating success, which were individually less well represented) (Fig. 1b; Supplementary Table 2).
Fig. 1.
The distribution of published estimates of mutational variance across taxon (a) and trait (b) categories. The number of studies (first value in brackets) and estimates (second value) per category are shown. See Methods (and Supplementary Table 2) for details of the categories and Supplementary Table 1 for the studies.
Meta-analyses of mutational variance estimates
We implemented mixed model analyses via PROC MIXED in SAS v9.4 (SAS Institute Inc., Cary, NC), using restricted maximum likelihood (REML) and applying the Satterthwaite approximation to correct the denominator degrees of freedom, to fit the model:
$$Y = \mu + \mathrm{Taxon} + \mathrm{Trait} + \mathrm{Study} + \varepsilon \qquad (1)$$
where Y was the vector of published estimates (either h²M or CVM), μ was the grand mean, and the categorical predictors of taxon and trait (defined above) were fit as fixed effects. Estimates were weighted by the inverse of the SE of h²M or CVM, obtained as detailed above. Study was fit as a random effect, accounting for nonindependence among estimates within a paper (1–29 estimates per study; median = 3). Studies reporting multiple estimates varied widely in whether these were estimates from different trait types, species (strains), sexes, or time points. Likely reflecting this, most variation not accounted for by the fixed effects was observed at the residual, not study, level (99% for h²M; 85% for CVM). Similarly, while some studies shared the same MA lines (Supplementary Table 3), fitting a further random effect to account for this nonindependence did not explain any variation, a likely consequence of both the unbalanced design (only some studies share lines) and the relative variation of estimates. We investigated different options for fitting heterogeneous residuals (e.g. allowing separate estimation of residuals for studies grouped depending on the number of estimates reported), but interpretation of the fixed effects (taxon and trait) was consistent across all investigated models, and we report results only from model (1) above.
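As a simplified illustration of the inverse-SE weighting described above, the sketch below computes weighted means per category. It is not the mixed model itself: it ignores the random Study effect and REML estimation, and the function name and example values are hypothetical.

```python
from collections import defaultdict


def weighted_category_means(records):
    """Return the inverse-SE weighted mean estimate for each category.

    records: iterable of (category, estimate, se) tuples.
    Each estimate receives weight 1/SE, so precise estimates count for more.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # category -> [sum(w * y), sum(w)]
    for category, estimate, se in records:
        weight = 1.0 / se
        sums[category][0] += weight * estimate
        sums[category][1] += weight
    return {cat: num / den for cat, (num, den) in sums.items()}


# Hypothetical estimates: (trait category, estimate, SE).
example = [
    ("morphology", 1.0, 1.0),
    ("morphology", 3.0, 0.5),
    ("life history", 2.0, 0.25),
]
means = weighted_category_means(example)
```

In this example the second morphology estimate has half the SE of the first, so it receives twice the weight and pulls the category mean toward 3.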
Variation in estimates of among-line variance within the same taxon and trait type: an experiment in D. serrata
To what extent do differences in magnitude among estimates of VM reflect differences in mutation number and/or effect sizes (correlated with the above-investigated proxies of taxon and trait), vs factors such as mutations segregating within MA lines or environmental dependency of mutational effects? We conducted a further experiment to address this question. Drosophila serrata is a member of the montium species group, endemic to Australia and Papua New Guinea, which has been extensively used in quantitative genetic research, including study of mutational variance (e.g. McGuigan and Blows 2013; Hine ; Dugand ). A panel of 200 MA lines was founded from one of the D. serrata reference genome panel (DsRGP) lines described in Reddiex . These MA lines were each maintained by brother-sister inbreeding for 20 generations, following protocols established by McGuigan to minimize selection. Genome-wide heterozygosity was very low (0.3%) in the DsRGP line that founded the MA lines, and among-line variance for wing traits (defined below) was not statistically supported in the first generation of the MA (S. Chenoweth, pers. comm.).
As detailed below, we applied an experimental design to this MA panel that allowed us to generate repeated estimates of the magnitude of among-line variance over 6 sequential generations, and to characterize the relative contribution to differences among these sequential estimates of (1) mutations segregating within the MA lines or (2) unintentional variation in environment. We randomly chose 42 of the MA lines for this investigation based on the median number of MA lines in the reviewed published studies (see Results). The number of MA lines determines the degrees of freedom for the among-line variance, and this value (42) allows us to consider the other 2 effects against a relevant level of sampling error.
Quantitative genetic parameters are associated with large sampling errors (Klein ; Klein 1974; Lynch and Walsh 1998), and the relatively low signal (i.e. few genetic differences) among MA lines will make mutational variance particularly vulnerable to “noisy” estimation; as such, it is important to document the potential for statistical sampling error to contribute to the observed variation among published estimates.
There were 3 key aspects of the experimental design that allowed us to test whether segregating variation or unintended environmental variation could explain differences among repeated estimates of among-line variance. First, we increased the population size within each MA line to a minimum of 12 males and 12 females (Fig. 2a). Empirical evidence suggests that population sizes as low as 10 may be sufficient to prevent fixation of mutations (Estes ; Katju ; Luijckx ). Therefore, we expect no ongoing fixation of mutations among lines during this experiment, and the repeated estimates of among-line variance to be true replicate samples of the same mutations (but we also test this assumption, as detailed below). We note that these changes in census population size complicate calculation of a per-generation rate of increase in phenotypic variance (Lynch and Hill 1986; Lynch and Walsh 1998); here, we instead focus on the among-line variance, VL, and do not interpret a per-generation rate of change.
Fig. 2.
Schematic of design (a) and phenotypes (b) from a manipulative experiment in D. serrata. (a) 42 MA lines (evolved through 20 generations of brother-sister mating) each founded 2 sublines: Small (S; 12 virgin males and 12 virgin females) and Large (L; 144 virgin males and 144 virgin females, distributed evenly among 12 vials). These 84 lines (S and L subline per 42 MA lines) were maintained at these census population sizes for 6 generations (only 2 shown here). Each generation, all emergent flies from the 12 vials per L subline were pooled prior to virgin collection. For S sublines, 2 vials were established each generation; the focal vial contributed offspring to the next generation, while the replicate vial (gray shaded) did not. Each generation, 1 wing was sampled from each of 6 randomly chosen males from each of 2 vials per line (focal and replicate vials for S; 2 randomly chosen vials for L). (b) Wing size and shape were characterized from landmarks recorded on an image of each wing: proximal (1) and distal (2) intersections of the radial vein; distal intersections of the medial (3), cubital (4), and distal (5) veins; and the posterior (6, 7) and anterior (8, 9) cross-veins. ILD traits were described by their end-point landmarks (e.g. ILD1.2 was the distance between landmark 1 and landmark 2).
The second key aspect of the experimental design was to manipulate the mutation-selection-drift dynamics within an MA line; this was achieved by imposing 2 substantially different population sizes on sublines of each of the 42 MA lines (N = 24 vs 288 flies, referred to hereafter as small, S, and large, L, population size treatments: Fig. 2a). Segregating variants within MA lines (i.e. mutations that have not yet been fixed or lost) could cause transient inflation of among- and/or within-line variance (VE), impacting both the estimation and scaling of VM, and this manipulation allowed us to determine the magnitude of this effect.
The treatments contrast deterministic evolution of mutations with relatively strong (s > ∼0.038: Ne ∼ 13) vs weak (s > ∼0.003: Ne ∼ 158) fitness effects, based on s = 1/(2Ne) (Wright 1931; Kimura 1983) and genomic estimates of Ne in MA lines of D. melanogaster maintained similarly to our small population size treatment (10 males and 10 females: Huang ). The S and L treatments therefore differed in the opportunity for new mutations to increase in frequency within a line, and thus in the expected magnitude of within-line variance.
The final key aspect of the experimental design was the repeated measures themselves, allowing us to observe the effect of environmental variation on among-line variance. If the phenotypic effects of a mutation are context-dependent (i.e. exhibit G×E variance), then unintended differences in assay conditions could contribute heterogeneity among estimates when phenotypic data are collected at different time points (or in different laboratories). We randomly sampled the average environmental conditions present within our laboratory by repeatedly sampling the lines (genotypes) over 6 consecutive generations.
Thus, our experiment consisted of applying 2 population size treatments (S, L) to each of 42 lines (derived from a classical MA experiment, with low among-line variation), where these 84 lines were maintained under the same conditions (12 flies per sex per vial founding each generation, with S and L differing in the number of vials) for 6 generations (Fig. 2a). As detailed below, we consider 11 wing shape and size traits. This allows us to understand the general influences of segregating variation, environment and sampling error for a set of related morphological traits. After accounting for the 3 factors that are the main focus of the investigation, we also determine whether the magnitude of VM varies among these traits, allowing insight into the potential magnitude of differences in mutational variance among traits within the same category (morphology).
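The fitness-effect thresholds quoted above for the S and L treatments follow from the relationship between the selection coefficient and the effective population size, and can be checked with a few lines of arithmetic. The function name below is hypothetical; the Ne values are those given in the text.

```python
def selection_threshold(ne: float) -> float:
    """Selection coefficient above which selection outweighs drift: s = 1 / (2 * Ne)."""
    return 1.0 / (2.0 * ne)


s_small = selection_threshold(13)    # S treatment, Ne ~ 13
s_large = selection_threshold(158)   # L treatment, Ne ~ 158

print(f"S treatment: s > {s_small:.4f}")
print(f"L treatment: s > {s_large:.4f}")
```

The two thresholds differ by roughly an order of magnitude, which is the contrast in selection efficacy that the two population size treatments were designed to impose.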
Data collection
Each generation, 12 males (6 from each of 2 rearing vials) from each of the 84 sublines were randomly sampled for wing phenotypes (Fig. 2). Wings were mounted on microscope slides and photographed using a Leica MZ6 microscope camera and the software LAS EZ v2.0.0 (Leica Microsystems Ltd, Switzerland). A total of 5,135 wings were landmarked for 9 positions, defined by wing vein and margin intersections (Fig. 2b), using the software tpsDIG2 (Rohlf 2013). Wings were evenly distributed across treatments (2,583 in S and 2,552 in L) and generations (∼425 per generation, per treatment). Landmarks were aligned using a General Procrustes fit in tpsRelw (Rohlf 2007). Centroid size (CS), the square root of the sum of squared deviations of the coordinates from the centroid (Rohlf 1999), was recorded as a metric of wing size. The aligned X-Y coordinates for each landmark were then used to calculate 10 inter-landmark distances (ILDs) (Fig. 2b). ILD scores were rescaled prior to analysis (multiplied by 100) to aid model convergence. Outliers >3.0 SD from the mean were removed for each of the 11 traits (10 ILDs and size) (329 of the 56,485 total measures).
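The two kinds of wing measurement described above, centroid size and inter-landmark distance, can be illustrated with a short sketch. This is not the tps software pipeline: Procrustes alignment is omitted, the function names are hypothetical, and the coordinates are made up for illustration.

```python
import math


def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks from their centroid."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in landmarks))


def inter_landmark_distance(landmarks, i, j, scale=100.0):
    """Euclidean distance between landmarks i and j (1-based), multiplied by `scale`.

    The default scale of 100 mirrors the rescaling applied to ILD scores in the text.
    """
    (x1, y1), (x2, y2) = landmarks[i - 1], landmarks[j - 1]
    return scale * math.hypot(x2 - x1, y2 - y1)


# Made-up configuration: four landmarks at the corners of a 2 x 2 square.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
cs = centroid_size(square)
ild_1_2 = inter_landmark_distance(square, 1, 2)
```

In the study, centroid size is a metric of wing size, whereas ILDs are calculated on Procrustes-aligned coordinates and so describe shape.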
Analyses of variation in among-line variance estimates
Our experimental design allows us to repeatedly estimate variance among MA lines under conditions where we expect the number of mutations fixed among the lines, and their phenotypic effects, to be constant, and thus to investigate other potential causes of variability in estimates. We first treat the data from each generation and population size treatment as independent experiments of similar size (number of lines and individuals sampled per line) to typical MA experiments. To estimate among-line variance from these 12 “experiments” for each of the 11 traits, we fit the following model using REML in PROC MIXED in SAS v9.4 (SAS Institute Inc., Cary, NC):
$$y_{klm} = \mu + \mathrm{Line}_k + \mathrm{Vial}_{l(k)} + \varepsilon_{m(kl)} \qquad (2)$$
where y_klm was the trait value for the mth wing (individual), from the lth vial, within the kth line, and μ was the mean value of these observations; Line and replicate rearing Vial (nested within Line) were fit as random effects, along with the among-individual variation (residual error, ε). We used REML-MVN sampling (Meyer and Houle 2013; Houle and Meyer 2015; Sztepanacz and Blows 2017) to estimate CIs, sampling 10,000 times from N(θ̂, V) using the rnorm function in R [v. 3.6.1], where θ̂ was the vector of REML random effect parameter estimates, and V was their inverse Fisher information matrix. We similarly estimated the CIs for the trait mean, sampling based on least-squares mean and SE estimates output from model (2). The samples of random effect variances were not constrained to the parameter space (i.e. could be negative), allowing inference of statistical support when the lower 5% confidence limit did not encompass zero (a 1-tailed test); this approach is equivalent to a log likelihood ratio test (LRT) (Dugand ). Here, we are interested in general patterns of variability among these 12 “experiments,” and thus do not correct for multiple testing.
As detailed in the Results, substantial heterogeneity in magnitude was observed among the 12 replicate estimates of VL per trait. We considered the potential contribution to this heterogeneity from unintentional heterogeneity in the culture conditions among sampling time points (generations) or between replicate measures of the same MA line within a generation (the S and L treatments).
First, to determine if simple effects of variability in culture conditions on trait scale could account for the variability of among-line (mutational) variance, we placed estimates on a heritability (VL/VE, where VE was the sum of among- and within-vial variances) or coefficient of variance (100 × √VL/X̄) scale, and calculated confidence intervals by applying these equations to each of the 10,000 samples described above (and applying the sign correction for coefficients of variance as detailed in the meta-analysis methods). We further explored the relationship between VL and the scaling parameters by regressing the 12 estimates of VL on the corresponding estimates of VE or trait mean.
Second, we asked whether mutational effects changed in response to the unintended changes in culture conditions, with such G×E causing differences among sequential estimates of VL. We followed García-Dorado in treating different generations as different environments to formally test the null hypothesis that there was no G×E variance within the S or L treatments, using PROC MIXED and REML to fit:
$$y = \mu + G + \mathrm{Line} + G \times \mathrm{Line} + \mathrm{Vial} + \varepsilon \qquad (3)$$
where the fixed effect of generation (G) accounted for differences in trait mean among generations and the random effect of G(eneration) × Line estimated the variation in genetic effects among generations (where generations represent different local environments). For the components of VE (i.e. Vial and residual), generation-specific effects were modeled (using the GROUP statement) to account for among-generation heterogeneity in the magnitude of VE. This model was applied to each trait within each population size treatment, and statistical support for G × Line (and for Line) was determined using log-LRTs (0.5 d.f.: Self and Liang 1987; Littell ) to compare model (3) to reduced models that did not fit G × Line (or did not fit Line). We applied the Benjamini-Hochberg method (Benjamini and Hochberg 1995) to correct for multiple hypothesis testing (within each random effect), using a conservative 5% false discovery rate (FDR). Sampling based on the REML variance estimate and the Fisher information matrix, as detailed above, was used to estimate CIs for plotting.
While nonzero generation by line variance could reveal the presence of environment-specific mutational effects, it could alternatively be explained by changes in the frequency at which mutations were segregating within or among lines. In contrast to environmental heterogeneity, we expect these evolutionary processes to systematically differ between the 2 population size treatments due to the different efficacy of selection in the S vs L sublines, and the independent sampling of mutations in the sublines after they were established. Differences between S and L are predicted to increase with increasing time since divergence. For each of the 11 traits, we analyzed all data (from both L and S) collected within a single generation, using PROC MIXED and REML to fit:
where treatment was fit as a fixed effect to account for differences in trait mean between L and S panels of sublines within that generation. Vial and residual are as described for model (2). At the among-line level, we took advantage of the paired subline design to model the between treatment variance-covariance matrix. We employed LRT to test 2 hypotheses. First, we determined whether, for these analyses within a generation, there was support for differences between treatments in the magnitude of VE. Mutations that are segregating (i.e. occur at frequencies other than 0 or 1) within an MA line will contribute to variation both among vials and in the residual. We compared a model in which one (common to both Treatments) among Vial variance and one residual variance were estimated to a model in which Treatment-specific variances were modeled at both levels (fit using the GROUP statement). Second, we tested whether the 2 copies of the MA lines had diverged from one another by testing the hypothesis that the correlation between the paired sublines was <1.00 (implemented using a PARMS statement). To correct for multiple hypothesis testing (within each hypothesis), we employed an FDR correction as described above.
Finally, as there was little support for varying mutational effects (no G×E) or number (no divergence between S and L) contributing to the apparent heterogeneity among repeated estimates per trait (detailed in Results), we use our data to revisit the question of whether traits inherently differ from one another in the magnitude of mutational variance. We obtained a single estimate of VL per trait by using PROC MIXED in SAS to fit:
where all effects are as described above, including the fixed effects of population size treatment (T), generation (G), and their interaction (T × G), as well as the random effects of Line, Generation by Line, Vial and residual. We obtained REML-MVN CIs for each parameter, as described above. To test whether observed differences in VL among traits were due to differences among them in scale, we took the among-line (VL) estimates from model (5) and regressed them on the corresponding estimates of environmental variance or on the squared trait mean. These regressions were applied to the REML parameter estimates, and to each of 10,000 samples of these parameters to determine statistical significance (95% CI of slope did not include zero).
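The REML-MVN interval construction described in these Methods can be illustrated with a minimal sketch. It is written in Python using only the standard library, not the R and SAS tools used for the analyses, and every number in it is hypothetical: `theta_hat` stands for a vector of REML variance estimates (among-line, among-vial, residual) and `cov` for their inverse Fisher information matrix. The helper names (`cholesky`, `reml_mvn_samples`, `quantile`) are introduced here for illustration only. Draws are deliberately not truncated at zero, so that a lower 5% bound above zero can be read as 1-tailed support for the among-line variance.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def reml_mvn_samples(theta_hat, cov, n_samples=10_000, seed=1):
    """Draw parameter vectors from N(theta_hat, cov); draws may be negative."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    size = len(theta_hat)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(size)]
        samples.append([
            theta_hat[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i in range(size)
        ])
    return samples


def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    below = math.floor(position)
    above = math.ceil(position)
    return ordered[below] + (ordered[above] - ordered[below]) * (position - below)


# Hypothetical REML estimates: among-line, among-vial, residual variance.
theta_hat = [0.40, 0.10, 1.00]
# Hypothetical inverse Fisher information (sampling covariance of theta_hat).
cov = [[0.0400, -0.0020, 0.0000],
       [-0.0020, 0.0100, -0.0010],
       [0.0000, -0.0010, 0.0025]]

samples = reml_mvn_samples(theta_hat, cov)
line_draws = [draw[0] for draw in samples]
lower_5 = quantile(line_draws, 0.05)
upper_95 = quantile(line_draws, 0.95)
supported = lower_5 > 0.0  # 1-tailed support for among-line variance
print(f"90% CI for among-line variance: [{lower_5:.3f}, {upper_95:.3f}]")
```

The same draws can be passed through any function of the parameters (for example a ratio of variances) before taking quantiles, which is how intervals for scaled estimates follow from a single set of samples.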
Results
Meta-analysis of published mutational variance estimates
Our final meta-analysis data set consisted of 154 estimates of h2M and 148 estimates of CV. These estimates of h2M ranged from 2.50 × 10−5 to 1.02 × 10−2, while CV ranged from 0.13 to 7.32. We predicted that differences in genome size and/or genomic mutation rate would cause differences in the magnitude of mutational variance among taxa. However, there was no statistical support for a difference in mutational variance among the taxon categories (h2M: F3,24.2 = 2.28, P = 0.1044; CV: F3,17.5 = 1.24, P = 0.3261), although estimates from Plants were markedly lower than estimates from Daphnia and Drosophila (Fig. 3a). We predicted that differences among traits in the number of contributing loci would cause differences in the magnitude of mutational variance. However, trait categories only differed in the magnitude of CV (F5,37.1 = 3.86, P = 0.0064), not h2M (F5,85.9 = 0.40, P = 0.8497) (Fig. 4). Overall, these factors (taxon and trait category) accounted for 1.64% of the variation in estimates of h2M and 9.88% of variation among CV estimates.
Fig. 3.
Variation in estimates of (a) mutational heritability and (b) coefficient of mutational variance across taxon categories. Plotted are the least-squares mean estimate (±SE) from the analyses of model (1). The number of studies (and estimates) analyzed for each category are shown. The dashed line indicates the global mean value.
Fig. 4.
Variation in estimates of (a) mutational heritability and (b) coefficient of mutational variance across trait categories. Plotted are the least-squares mean estimate (±SE) from the analyses of model (1). The number of studies (and estimates) analyzed for each category are shown. The dashed line indicates the global mean value.
Although not statistically supported, it is notable that the among-trait trend did not follow predictions for h2M: fitness traits had the largest average h2M, not the lowest as expected (Fig. 4a). Following Houle , we also analyzed V (fit model (1) to ). There was no statistical support for a difference among traits in the magnitude of CV (F5,25.2 = 1.32, P = 0.2887); morphology (average CV = 7.4) and physiology (71.6) differed the most, with life history traits having intermediate values (e.g. fitness = 42.6) (Supplementary Fig. 2a). For CV, the statistically supported differences did follow the predicted pattern, with morphological traits having the smallest CV and fitness the largest (Fig. 4b). Survival notably had lower CV than productivity and fitness (Fig. 4b), although surviving to reproduce was a component of fitness. Physiological traits had a similar magnitude of CV to morphological traits, lower than any life history trait category (Fig. 4b).
While the lack of observed difference in scaled estimates of VM among taxa may reflect a true commonality among species in this important evolutionary parameter, aspects of the MA design also differed markedly among taxa. 
Estimates from Plants were derived from MA experiments that were of short duration (maximum 25 generations) relative to other taxa (median 44, 75, and 214 for Drosophila, Daphnia and Nematode, respectively) (Supplementary Fig. 2b). As mutations arise independently in each MA line, the number of MA lines maintained may also influence the number of mutations that an MA experiment of similar duration could sample, and the potential for sampling of rarer mutational effects (i.e. from the tails of the distribution of effect sizes) to influence estimates; while the median number of MA lines was similar in Nematode (43), Plant (50), and Drosophila (52), it was substantially lower in Daphnia (8) (Supplementary Fig. 2c). While mutational variance was estimated for multiple types of traits in every taxon, most data from Plants were for fitness, while most data from Daphnia were for morphological traits, and Drosophila and Nematode were the only 2 taxon categories that contributed estimates for physiological traits (Supplementary Fig. 2d).
Variation in estimates of among-line variance within the same taxon and trait type
Within the D. serrata experiment, we first determined the heterogeneity in among-line variance (VL) under the assumption that mutation number and effects were constant, treating the 12 VL estimates per trait as independent MA experiments. There was substantial variation among the 12 estimates per trait, with some differences of over an order of magnitude (Fig. 5; Supplementary Table 4). Notably, the smallest estimate of VL for size (CS) was four times lower than the largest estimate, comparable to the 4-times difference among reported mutational variance estimates for body size in C. elegans (Azevedo ; Estes ; Ostrow ). Predictably, given this heterogeneity in magnitude of effect (i.e. in VL), there was also inconsistent statistical support for the presence of VL for most traits, despite consistent sample sizes in each of the 12 “experiments” (Fig. 5; Supplementary Table 4). Thus, we might draw very different conclusions about the magnitude of VL for a trait, depending on which “experiment” we had conducted (Fig. 5).
Fig. 5.
Among-line variance estimates across 6 generations in an experiment in D. serrata. Variances were estimated independently for each trait (panel; see Fig. 2b for trait definitions) in each generation (x-axis) for each of the 2 population size treatments (Small: solid circles; Large: open circles). Plotted are the REML point estimate, and the REML-MVN 90% CIs. The dashed horizontal line indicates 0; estimates for which the lower CI did not overlap zero were interpreted as statistically supported. Where REML estimates of among-line variance were zero, CI were not estimated.
Due to the changes in N within this experiment, we do not place these VL estimates on a per-generation scale (i.e. do not calculate VM). However, there is no trend for VL to increase through time (i.e. no signal of ongoing divergence through fixation of mutations), or to diverge between the different population size (N) treatments (Fig. 5) (addressed further below). Therefore, calculating VM is not expected to eliminate the heterogeneity in estimates.
Reporting mutational variance estimates as h2M or CV facilitates comparison among estimates by accounting for inherent differences in scale. Although here the 12 estimates come from the same trait, scale differences may still arise through typical effects of any unintended variation in culture conditions (occurring among generations or between the replicate S and L sublines) on nongenetic trait variance (VE) or mean. Both VE and the trait mean varied substantially among the 12 repeated estimates for all traits (Supplementary Figs. 3 and 4; Supplementary Table 4). However, this variation in VE and trait mean was independent of the observed variation in VL; regressing the 12 estimates of VL on their corresponding estimate of VE or trait mean supported only 1 slope (ILD3.7, VL on mean) as statistically different from zero (although this did not remain significant following FDR correction) (Fig. 6; Supplementary Table 5). 
Consistent with this pervasive independence of VL from the scaling factors for these repeated measures of the same trait, when the 12 estimates were placed on either a heritability (Supplementary Fig. 5; Supplementary Table 4) or coefficient of variance scale (Supplementary Fig. 6; Supplementary Table 4), the variation among them was of a similar magnitude to that observed for VL itself (i.e. the variation plotted in Fig. 5). Plotting the scaled estimates (h2 or CV) against their respective numerator (VL or √VL) and denominator (VE or mean) illustrates the predominant contribution from variation in VL to variation in the scaled estimates (Supplementary Fig. 7). Overall, the 12 estimates of VL are more variable than the corresponding estimates of VE or trait mean, with variability of the scaled estimates (h2 or CV) more similar to VL than to their respective scaling factor (Supplementary Fig. 8).
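As an illustration of the two scalings discussed above, the following Python sketch places an among-line variance on a heritability scale (among-line variance divided by environmental variance) and on a coefficient of variance scale, and summarizes a set of repeated estimates by their cv (standard deviation/mean). Several details are assumptions made here rather than statements of the analysis: the factor of 100 in the coefficient of variance, the signed square root used so that negative REML-MVN draws stay negative (the sign correction itself is specified in the meta-analysis methods, not here), and the use of the sample standard deviation. Function names and input values are hypothetical.

```python
import math
import statistics


def heritability_scale(v_line, v_env):
    """Among-line variance scaled by environmental (vial + residual) variance."""
    return v_line / v_env


def coefficient_of_variance(v_line, trait_mean):
    """100 * sqrt(V) / mean, with a signed square root (assumed sign correction)."""
    signed_root = math.copysign(math.sqrt(abs(v_line)), v_line)
    return 100.0 * signed_root / trait_mean


def cv_of_estimates(estimates):
    """Variability among repeated estimates: sample standard deviation / mean."""
    return statistics.stdev(estimates) / statistics.mean(estimates)


# Hypothetical values for one trait in one generation and treatment.
print(heritability_scale(0.5, 2.0))         # 0.25
print(coefficient_of_variance(4.0, 50.0))   # 4.0
print(coefficient_of_variance(-4.0, 50.0))  # -4.0 (negative draw stays negative)
print(cv_of_estimates([1.0, 2.0, 3.0]))     # 0.5
```

Applying the first two functions to each REML-MVN draw, rather than only to the point estimates, gives intervals on the scaled estimates directly.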
Fig. 6.
Among-line variance estimates for D. serrata wing traits plotted as a function of trait mean or variance. The 12 estimates of among-line variances for each of the 11 wing traits (panels; see Fig. 2b for trait definitions) are plotted against the corresponding (i.e. same generation and treatment) squared trait mean (bottom x-axis, black symbols) or environmental variance (summed among and within vial variances; top x-axis, gray symbols). All regression statistics are reported in Supplementary Table 4; only the effect of mean2 on VL of ILD3.7 was significant at P < 0.05, although it does not remain significant after applying a 5% FDR correction.
To compare the variability among published estimates of h2M and CV in similar morphological traits (excluding bristle traits: 19 estimates, 8 each from Daphnia and Nematode, 3 Drosophila and 1 Plant) to the variability among the VL estimates for D. serrata wing traits, we calculated the coefficient of variance (cv = standard deviation/mean) for each set of estimates. The cv of all 19 published h2M (0.80) and CV (0.82) estimates was above the median cv of the 11 D. serrata traits on the observed VL scale (0.55), VE-scale (h2: 0.54) or mean-scale (CV: 0.39), but nonetheless within the same range: cv of VL (VE-scaled; mean-scaled) ranged from 0.30 (0.37; 0.15) up to 0.92 (0.85; 0.62) across the 11 traits (Supplementary Fig. 8). Within taxa, cv of published h2M (CV) estimates ranged from 0.29 (0.08) for the 3 Drosophila estimates up to 0.97 (0.38) in Daphnia, with a median of the 3 within-taxon (Drosophila, Daphnia, and Nematode) cvs of 0.64 (0.24). Thus, overall, the heterogeneity among repeated VL estimates in D. serrata is of a similar magnitude to the variation among published estimates of the same trait type.
Having established that variation in the magnitude of VL is not a simple consequence of varying scale (VE or mean), we investigated other putative causes. 
In addition to the general effects on scale, unintended differences in culture conditions among generations could also affect VL if mutations had context-dependent effects on the trait (i.e. G×E), as characterized by generation by among-line variance. G×E was statistically supported in only 4 cases, with only 2 remaining significant at a 5% FDR (LRT for CS in L: χ2 = 13.9, P = 0.0001; LRT for ILD2.5 in L: χ2 = 11.1, P = 0.0005) (Fig. 7a). Thus, for these 2 traits, the analysis suggests that mutational effects, and the magnitude of among-line variance, may depend on the specific conditions under which the traits were assayed.
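The testing procedure behind these results (an LRT for a single variance component, followed by Benjamini-Hochberg correction) can be sketched in Python as follows. The sketch assumes that the "0.5 d.f." reference distribution is the usual 50:50 mixture of a point mass at zero and a chi-square with 1 d.f., so that the P-value is half the 1 d.f. upper-tail probability; the function names are hypothetical, and the P-values passed to the FDR step are made up for illustration.

```python
import math


def lrt_pvalue_boundary(statistic):
    """P-value for one variance component tested on its boundary (zero).

    Assumes a 50:50 mixture of chi-square(0) and chi-square(1); the
    chi-square(1) upper tail at x equals erfc(sqrt(x / 2)).
    """
    if statistic <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(statistic / 2.0))


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans: True where the test is retained at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= fdr * rank / m:
            largest_rank = rank  # step-up: keep the largest rank that passes
    retained = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            retained[index] = True
    return retained


# The statistic reported for CS in the L treatment gives P of about 0.0001.
print(round(lrt_pvalue_boundary(13.9), 4))

# Hypothetical P-values for four tests of the same random effect.
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.5]))
```

Because the procedure is step-up, a test whose own P-value exceeds its rank threshold can still be retained when a larger P-value passes its threshold, as in `benjamini_hochberg([0.03, 0.04])`.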
Fig. 7.
Estimates of variance from an experiment in D. serrata. a) Among-line by generation (G×E) variance and (b) among-line variance estimated for 11 D. serrata wing traits (x-axis), in 2 different population size treatments (Small: solid circles; Large: open circles). Plotted are the REML point estimates (from model 3) and the REML-MVN 90% CIs. The dashed horizontal line indicates 0; statistical significance was inferred where the lower 5% CI did not overlap 0. After applying a conservative 5% FDR correction, 2 estimates in a) and 21 in b) remained significant (indicated by an asterisk).
Ongoing mutation-drift-selection processes could contribute to variation among the 12 estimates, where the S and L treatments are expected to differ in the potential effects of these processes on both within- and among-line variance. Segregating variation within a line will contribute to the estimate of VE, and we determined whether the S and L treatments differed in the magnitude of VE, analyzing each of the 11 traits within each of the 6 generations separately. Eight of the 66 estimates of VE differed significantly between S and L at P < 0.05, but only one remained significant at 5% FDR (CS in generation 5; Supplementary Table 6). There was no statistical support for the S and L sublines founded by each of the 42 original MA lines to have diverged from one another in the mutations they carried, either through initial sampling when lines were founded, or through (near) fixation of mutations arising after establishment of the sublines. Specifically, the among-line correlation between S and L sublines was not statistically distinguishable from 1.0 for any trait in any generation (Supplementary Table 6).
Finally, we obtained a single estimate of among-line variance for each trait to determine whether the magnitude of VL was consistent among the 10 wing shape traits, which are expected to share a genetic basis, and developmental pathways (e.g. Mezey ; Neto-Silva ). The among-trait heterogeneity (i.e. nonoverlapping CIs: Fig. 8a; cv = 0.77) was larger than the median variability among the repeated estimates per trait (cv = 0.55; see above), and comparable to the variability among published estimates of mutational variance in morphological traits (cv = 0.80, detailed above). Variation among shape traits (excluding size) in VE accounted for ∼18% (95% CI: 0.049–0.383) of this variation (β = 0.064 [95% CI: 0.050–0.106]), but variation in trait mean did not account for any (β = 5.3 × 10−7 [−0.74 × 10−8–1.86 × 10−6]; R2 = 0.003 [95% CI: <0.001–0.027]) (Fig. 8b). Establishing whether these differences are informative of the inherent genetic architecture, or are a manifestation of the stochastic nature of mutation, requires repeating the estimation using either the same or a different genetic background to determine if consistent differences among traits persist. When wing size was also considered, overall scale influenced VL, with much of the variation among estimates accounted for by the scaling factors (VE: R2 = 0.995 [0.978–0.998]; Mean2: R2 = 0.921 [0.903–0.925]), suggesting that there was little difference in the magnitude of underlying mutational variance between wing size and the shape traits (Fig. 8b).
Fig. 8.
Among-line variances for 11 wing traits in D. serrata. a) Among-line variance (VL) REML estimates (and 90% CI) (model 5; see Fig. 2b for trait definitions) are plotted. Dashed line indicates 0. b) REML estimates of among-line variance (points in panel a) were plotted against the corresponding estimate of environmental variance (black circles, top x-axis) or mean squared (gray squares, bottom x-axis) of the trait. Regression results are reported in text.
Discussion
Although numerous estimates of mutational variance have been published, it remains unclear what contributes to the ∼2 orders of magnitude difference among these estimates. Our meta-analytic investigation provided some support for a difference among trait types in the magnitude of mutational variance, but also revealed substantial confounding between potential causal factors. Analyses of data from a manipulative experiment in D. serrata suggest that, for the morphological traits under consideration, factors such as unintended heterogeneity in environmental conditions or transient segregation of mutations within MA lines may contribute little to the variation among estimates. Given this experimental design, and the evidence that mutation number and effect did not typically cause differences among repeated estimates, we conclude that substantial variability among repeated estimates of the among-line variance must reflect sampling error. Below we discuss the specific outcomes and limitations of both approaches, and the implications our analyses have for future work characterizing mutational input to quantitative genetic variation.
Effects of taxon and trait type on the magnitude of mutational variance
Given the ∼4-times higher per site mutation rate (Katju and Bergthorsson 2019), and slightly larger genome of A. thaliana relative to C. elegans, we predicted (assuming the same mutational effect sizes) ∼5 times more mutational variance in the Plant than Nematode taxon categories. However, the meta-analysis did not support a difference among taxa in the magnitude of mutational variance, and the observed (strong but nonsignificant) pattern in h2M contradicted this rank prediction, with Plants (to which Arabidopsis contributed most estimates) having substantially smaller h2M than other taxa (Fig. 3a). Taxon categories differed substantially in the number of generations (Supplementary Fig. 2b), and the number of genomes (MA lines) (Supplementary Fig. 2c) sampled. However, scaling predicted genomic mutation by this opportunity for mutation also fails to predict the trend, with Daphnia MA experiments predicted to sample the fewest mutations but observed to have the largest CV (Fig. 3b). We suggest that further MA experiments, decoupling the confounded effects of MA duration and trait type from taxon, are warranted to determine whether VM does vary among taxa. Advances in accessibility of genome data provide substantial scope for such experiments to explicitly estimate relevant genomic parameters (e.g. frequency spectra for different types of mutations across putatively causal genes) alongside the phenotypic variation generated by those mutations (Katju and Bergthorsson 2019). Furthermore, given evidence that epigenetic mutations arise more frequently than genetic mutations (e.g. van der Graaf ; Beltran ), we suggest that the potential contribution of epimutations to patterns of heterogeneity of VM should be explicitly assessed in future studies.
Houle (see also Lynch and Walsh 1998; Lynch ) concluded that life-history traits had lower h2M and higher CV than morphological traits, a pattern that is also observed in standing genetic variation (e.g. Houle 1992; Hansen ; but see Hoffmann ). 
However, this conclusion was not supported by our analysis, where the trend was for fitness and productivity to have the highest h2M (nonsignificant) as well as highest CV (Fig. 4). As expected given this shared pattern, h2M and CV estimates were positively correlated (Spearman’s correlation coefficient: 0.309, N = 117, P = 0.0007). Although with reverse rank (life-history traits having lowest values), standing genetic variation estimates have also been reported to be positively correlated between the 2 scales (h2 and CV) when biologically uninformative CV estimates were excluded (Hoffmann ). Garcia-Gonzalez highlight the potential for skewed data distributions to inflate (deflate) CV, an issue that may be particularly relevant to estimates of CV. While strong bias toward mutations that decrease mean fitness has been reported (Halligan and Keightley 2009), bias in other traits is less well-established. If trait types differ in the magnitude of directional bias of mutational effects, this may also result in differences in skew, and exaggerate differences between trait types on the CV scale. Again, resolution of the key question of whether differences among traits in h2M and CV reflect differences in mutation number and/or effect size may depend on further genomic data.
The contributions of unintended environmental variation, mutation-drift-selection processes, and sampling error to variation in the magnitude of mutational variance
We observed substantial variation among repeated estimates of VL for each of 11 wing traits measured in D. serrata (Fig. 5), resulting in variation among scaled (h2 or CV) estimates that was of comparable magnitude to the differences observed among published estimates. Although both VE and trait mean also varied among repeated measures (Supplementary Figs. 3 and 4), this heterogeneity was substantially less than that observed for VL (or h2 or CV) (Supplementary Fig. 8). Given the evidence that mutational effects can vary among environments (Kondrashov and Houle 1994; Martin and Lenormand 2006), we tested the effects on the magnitude of VL resulting from unintentional and undocumented minor changes in culture conditions (e.g. density, humidity, or temperature), such as may occur among phenotype assays conducted at different times or in different laboratories. Variation in mutational effects among phenotypic assays (generations) was supported in only 2 cases (Fig. 7a). García-Dorado also found evidence of G×E among consecutive generations for 1 (sternopleural bristle count) of 4 traits investigated. Notably, in D. serrata, wing size, which might be particularly sensitive to variation in energy availability (or competing energetic demands) (Cavicchi ; Bitner-Mathé and Klaczko 1999), exhibited the strongest G×E (Fig. 7a). 
Our results, and those of García-Dorado, suggest that changes in mutational effects with environment may contribute to heterogeneity among published estimates of some traits, which may reflect differences in trait environmental sensitivity, or potentially in the covariation of environmental sensitivity and mutational effect size (Lynch ; García-Dorado ).
The mutation-drift process itself may also contribute to variability among published estimates due to effects on both the within-line variance (transient inflation leading to increased magnitude of VE but not VL) and among-line variance (transient contribution to VL of additive or dominant mutations that are subsequently lost via random sampling). We introduced an ∼order of magnitude difference in census size in paired sets of MA sublines to manipulate the mutation-drift-selection processes. However, analyses did not support an effect of population size on either within- or among-line variation; size was again an exception, where there was some evidence that relaxed selection allowed the S treatment to accumulate greater within-line variance (Supplementary Table 6). The effect of segregating variation can be expected to be greater at smaller population sizes (e.g. when N = 2, mutations can reach within-line frequency of 75% before being lost by drift) than considered here, and so may play a greater role in explaining variation among estimates from classical MA breeding designs. Nonetheless, this factor did not account for the substantial heterogeneity that was observed among the 12 estimates per trait within the current study.
Rejecting general contributions from environmental variation and transient segregation of mutations as explanations of the heterogeneity among the 12 repeated estimates of VL for the wing shape traits, we conclude that the observed variability is largely the consequence of sampling error. Lynch suggested that a substantial part of the order of magnitude range of estimates reported for D. melanogaster may be due to sampling error. Here, we observed the magnitude of heterogeneity among the 12 repeated estimates of VL to be similar to the heterogeneity among published, scaled estimates of mutational variance in morphological traits, consistent with their prediction. We observed that VL estimates varied markedly more than the other estimated parameters (Supplementary Fig. 8), as expected given that quantitative genetic parameters are associated with relatively large sampling errors. Notably, VE was more variable among the 12 estimates than trait mean was, which may lead to greater variability among estimates of h2M than CV (Supplementary Fig. 8). We designed this experiment to mimic an average MA sample size, and considered traits expected to have relatively low experimental noise (residual variation) and small effect size (due to the relatively few generations; see e.g. Vassilieva ). While traits and MA panels will vary in their vulnerability to sampling error, we nonetheless suggest that greater consideration must be given to the consequences of this error when designing experiments. The heterogeneity among repeated estimates resulted in the total confidence range for each trait spanning a far greater region than suggested by the error estimated for each repeated estimation of VL (Fig. 5), indicating that within-study estimates of error do not fully capture the uncertainty in estimates.
The sequential repeated-measures experimental design provided greater statistical control over the experimental noise, allowing us to consistently detect statistically significant mutational variance in all traits (Fig. 8a), including in traits for which very few of the 12 estimates were distinguishable from 0 (e.g. ILD2.8; Fig. 5). While increasing sample sizes within a generation is likely to have similarly improved estimate precision, this can be logistically prohibitive in some systems. 
Given these limits, our analysis highlights the potential benefits of short-term repeated measures (sequential generations) to improve estimate precision, and power to detect small effects. Repeated measures of lines at relatively large generation intervals have also been utilized to estimate VM as the slope of the regression of among-line variance on generation (Vassilieva ; Houle and Nuzhdin 2004; McGuigan ), which may also improve estimation.
Understanding the contribution that mutations make to evolutionary and genetic phenomena relies on accurate estimates of the phenotypic variance generated by new mutation. Our meta-analysis of empirical estimates of mutational variance was unsuccessful in clearly resolving causes of variation due to confounding of predictors, and inconsistent patterns. Our manipulative experiment suggested that sampling error may contribute substantially to estimate variability, and demonstrated that repeated measures over few (e.g. sequential) generations provide a simple but effective approach to address this and improve inference. Overall, further empirical studies are needed to fully assess how both general and study specific factors influence VM estimates, where improved precision and replicability in estimates will consequently advance broader evolutionary questions such as those addressing the maintenance of quantitative genetic variance (Barton and Turelli 1989; Johnson and Barton 2005; Walsh and Lynch 2018).
Data availability
Both analyzed datasets are available at doi: 10.6084/m9.figshare.14913051. Supplemental material is available at GENETICS online.
Authors: Jian Yang; Beben Benyamin; Brian P McEvoy; Scott Gordon; Anjali K Henders; Dale R Nyholt; Pamela A Madden; Andrew C Heath; Nicholas G Martin; Grant W Montgomery; Michael E Goddard; Peter M Visscher Journal: Nat Genet Date: 2010-06-20 Impact factor: 38.330
Authors: Robert J Dugand; J David Aguirre; Emma Hine; Mark W Blows; Katrina McGuigan Journal: Proc Natl Acad Sci U S A Date: 2021-08-03 Impact factor: 11.205