Christos Vlachos, Robert Kofler.
Abstract
Evolve and Resequencing (E&R) studies allow us to monitor adaptation at the genomic level. By sequencing evolving populations at regular time intervals, E&R studies promise to shed light on some of the major open questions in evolutionary biology, such as the repeatability of evolution and the molecular basis of adaptation. However, data interpretation, statistical analysis and the experimental design of E&R studies increasingly require simulations of evolving populations, a task that is difficult to accomplish with existing tools, which may i) be too slow, ii) require substantial reformatting of data, iii) not support an adaptive scenario of interest or iv) not sufficiently capture the biology of the model organism used. Therefore, we developed MimicrEE2, a multi-threaded Java program for genome-wide forward simulations of evolving populations. MimicrEE2 enables the convenient usage of available genomic resources, supports biological particulars of model organisms frequently used in E&R studies and offers a wide range of adaptive models (selective sweeps, polygenic adaptation, epistasis). Due to its user-friendly and efficient design, MimicrEE2 will facilitate simulations of E&R studies even for small labs with limited bioinformatics expertise or computational resources. Additionally, the scripts provided for executing MimicrEE2 on a computer cluster permit the coverage of even a large parameter space. MimicrEE2 runs on any computer with Java installed. It is distributed under the GPLv3 license at https://sourceforge.net/projects/mimicree2/.
Year: 2018 PMID: 30114186 PMCID: PMC6112681 DOI: 10.1371/journal.pcbi.1006413
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.779
Fig 1. Flow diagrams showing the order of events occurring at each generation during simulations with MimicrEE2.
A separate diagram is shown for each model of selection (i.e. mode) supported by MimicrEE2. A) In the w-mode, the fitness of each individual is computed directly from the selection coefficients of the SNPs present in its genome. The mating success of an individual scales with its fitness. B) In the qt-mode, MimicrEE2 first computes the phenotypic value of each individual based on the effect sizes of the SNPs and some environmental variance. It then performs truncating selection, where the individuals with the most pronounced phenotypic values are culled. C) In the qff-mode, MimicrEE2 computes the phenotypic values of a quantitative trait and maps these values to fitness using a fitness function (e.g. a Gaussian fitness function for stabilizing selection). D) Events occurring during clonal evolution, using the w-mode as an example. Most importantly, clones do not mate but generate identical copies of themselves (with the exception of de novo mutations). In the flow diagrams, yellow indicates migrants and the width of the circles indicates the population size. Optional events are shown in square brackets.
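The per-individual computations described for the three modes can be sketched in Python as follows. This is a minimal illustration only, not MimicrEE2's actual implementation (which is written in Java). All function and parameter names are hypothetical, and the multiplicative combination of per-SNP fitness effects, the additive trait with Gaussian environmental noise, and the `cull_highest` switch are assumptions made for the example.

```python
import math
import random


def w_mode_fitness(genotype, sel_coeffs, h=0.5):
    """w-mode sketch: fitness from selection coefficients.

    genotype: copies (0, 1 or 2) of the selected allele at each SNP.
    Assumption: per-SNP fitness is 1, 1 + h*s or 1 + s, combined multiplicatively.
    """
    w = 1.0
    for copies, s in zip(genotype, sel_coeffs):
        if copies == 1:
            w *= 1.0 + h * s
        elif copies == 2:
            w *= 1.0 + s
    return w


def phenotype(genotype, effect_sizes, env_sd, rng):
    """qt/qff-mode sketch: additive genotypic value plus environmental noise.

    Assumption: the environmental deviation is Gaussian with mean 0.
    """
    genotypic_value = sum(c * a for c, a in zip(genotype, effect_sizes))
    return genotypic_value + rng.gauss(0.0, env_sd)


def truncating_selection(phenotypes, keep_fraction, cull_highest=False):
    """qt-mode sketch: return the indices of the surviving individuals.

    cull_highest=False removes the individuals with the lowest phenotypes,
    cull_highest=True removes those with the highest phenotypes.
    """
    n_keep = max(1, int(round(keep_fraction * len(phenotypes))))
    ranked = sorted(range(len(phenotypes)),
                    key=lambda i: phenotypes[i],
                    reverse=not cull_highest)
    return sorted(ranked[:n_keep])


def gaussian_fitness(z, optimum, width):
    """qff-mode sketch: Gaussian fitness function for stabilizing selection."""
    return math.exp(-((z - optimum) ** 2) / (2.0 * width ** 2))


if __name__ == "__main__":
    rng = random.Random(42)
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(4)] for _ in range(10)]
    effects = [0.5, 1.0, 0.25, 0.75]
    traits = [phenotype(g, effects, env_sd=0.3, rng=rng) for g in genotypes]
    print("survivors:", truncating_selection(traits, keep_fraction=0.8))
    print("fitness:", [round(gaussian_fitness(z, 3.0, 1.0), 3) for z in traits])
```

In the w-mode and qff-mode sketches the returned fitness would then be used to weight mating success, whereas in the qt-mode sketch the surviving individuals would mate at random; how MimicrEE2 samples parents is not specified here.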
Fig 2. Simulation of truncating selection for starvation resistance in D. melanogaster.
A) Manhattan plot showing the significance (CMH test) of allele frequency differences between the founder and evolved populations, which were subject to truncating selection for 40 generations (10 replicates). Four loci in genes known to contribute to starvation resistance were picked as targets of selection (big black dots, gene names in italics) and an effect size was assigned to each (in brackets). A hemizygous X-chromosome was simulated in males. B) We used a different recombination rate for females (red) and males (blue). C) Nucleotide diversity of the 205 DGRP haplotypes used as founder population.
Comparison of tools for genome-wide forward simulations of evolving populations.
MimicrEE1 (mim1) [11], MimicrEE2 (mim2, this study), forqs [28], quantiNemo (qNemo) [29], SLiM2 [30], FFPopSim [31].
| feature | mim2 | mim1 | forqs | qNemo | SLiM2 | FFPopSim |
|---|---|---|---|---|---|---|
| run time in hours | 0.93 | 0.92 | 72 (1.81) | 6.9 | 1.01 | 13.8 |
| required RAM in GB | 4.7 | 4.7 | 28 (15) | 61 | 2.4 | 23 |
| quantitative traits | + | - | + | + | o | o |
| selection coefficients | + | + | + | - | + | + |
| ploidy (d: diploid, h: haploid) | h/d | d | d | h/d | h/d | h |
|  | +/+ | -/- | +/+ | +/+ | +/+ | +/o |
| variable recombination map | + | + | + | + | o | - |
| sex (♀, ♂, ⚥) / hemizygous sex chromosomes | +/+ | -/- | +/- | +/- | +/+ | +/- |
| direct usage of genomic resources | + | + | - | - | - | - |
| complex epistasis | + | - | - | + | o | + |
| diminishing returns epistasis | + | - | - | - | o | o |
| truncating selection (ts) / temporarily variable ts | +/+ | -/- | +/- | -/- | o/o | o/o |
| disruptive / stabilizing selection | +/+ | -/- | -/+ | -/+ | o/o | o/o |
| adaptation to a moving optimum | + | - | o | + | o | o |
| gene-environment interactions / spatial model | -/- | -/- | -/- | -/- | +/+ | -/- |
| multi-threading / support for computer cluster | +/+ | +/- | -/- | -/- | -/- | -/- |
| output compatible with E&R tools (sync, fasta) | + | o | - | - | o | o |
1 requires substantial coding in Eidos or Python, such as implementation of a file parser
2 does not require conversion of genotypes into binary (01) or concatenation of chromosomes into one superscaffold
3 fitness values may be provided for all combinations of genotypes at pairs of loci
4 the effect size of QTLs may vary, not the position and shape of the fitness function
5 in brackets: haplotypes are requested as output
6 features are referring to the haploid_highd class which allows genome-wide simulations (>20 loci)
Fig 3. Validation of MimicrEE2.
A) Allele frequency distribution of 10,000 SNPs with an initial frequency of 0.5 after 50 generations of genetic drift (N = 250) compared to theoretical expectations (dashed line). B) Trajectories of 50 selected loci (grey lines; s = 0.1, h = 0.5) compared to the theoretical expectation (dashed line). C) Response to selection (R; box plots based on 100 replicates) of a quantitative trait (QTLs = 10, h² = 0.5) compared to theoretical expectations (dashed line). D) Decay of linkage disequilibrium between two initially linked loci (D = 0.25) due to recombination (r = 0.05). We simulated 100 replicates (grey lines) and show theoretical expectations (dashed line).
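Theoretical expectations of the kind shown as dashed lines in panels B to D can be computed from standard population-genetic formulas, sketched below in Python. The caption does not state which expressions the authors used, so the choice of formulas is an assumption: the deterministic single-locus recursion with genotype fitnesses 1 + s, 1 + hs and 1 for panel B, the breeder's equation R = h²S for panel C, and the geometric decay D_t = D_0(1 - r)^t for panel D. The function names, the starting frequency and the selection differential are hypothetical.

```python
def selection_trajectory(p0, s, h, generations):
    """Deterministic frequency of a selected allele over time.

    Assumes genotype fitnesses 1 + s, 1 + h*s and 1, random mating
    and an infinite population (no drift).
    """
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        q = 1.0 - p
        mean_fitness = p * p * (1.0 + s) + 2.0 * p * q * (1.0 + h * s) + q * q
        p = (p * p * (1.0 + s) + p * q * (1.0 + h * s)) / mean_fitness
        trajectory.append(p)
    return trajectory


def breeders_response(h2, selection_differential):
    """Breeder's equation: response R = h^2 * S."""
    return h2 * selection_differential


def ld_decay(d0, r, generations):
    """Expected linkage disequilibrium: D_t = D_0 * (1 - r)^t."""
    return [d0 * (1.0 - r) ** t for t in range(generations + 1)]


if __name__ == "__main__":
    # s, h, h2, D and r follow the caption; the starting frequency of 0.05,
    # the 100 generations and the selection differential of 1.0 are
    # arbitrary illustration values.
    print(selection_trajectory(p0=0.05, s=0.1, h=0.5, generations=100)[-1])
    print(breeders_response(h2=0.5, selection_differential=1.0))
    print(ld_decay(d0=0.25, r=0.05, generations=10)[-1])
```

Comparing simulated replicates against such curves is a common sanity check for a forward simulator: systematic deviations from the expectation point to a bug in the corresponding step (selection, recombination or trait computation).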