Mathias Currat (1,2), Miguel Arenas (3,4), Claudio S Quilodràn (1), Laurent Excoffier (5,6), Nicolas Ray (7,8)
1. Laboratory of Anthropology, Genetics and Peopling History, Department of Genetics and Evolution - Anthropology Unit, University of Geneva, Geneva 1205, Switzerland.
2. Institute of Genetics and Genomics in Geneva (IGE3), University of Geneva, Geneva 1211, Switzerland.
3. Department of Biochemistry, Genetics and Immunology, Vigo 36310, Spain.
4. Biomedical Research Center (CINBIO), University of Vigo, Vigo 36310, Spain.
5. Computational and Molecular Population Genetics Laboratory, Institute of Ecology and Evolution, University of Bern, Bern 3012, Switzerland.
6. Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland.
7. Institute of Global Health, GeoHealth Group, University of Geneva, Geneva 1205, Switzerland.
8. Institute for Environmental Sciences, University of Geneva, Geneva 1205, Switzerland.
Abstract
SUMMARY: SPLATCHE3 simulates genetic data under a variety of spatially explicit evolutionary scenarios, extending previous versions of the framework. The new capabilities include long-distance migration, spatially and temporally heterogeneous short-scale migrations, alternative hybridization models, simulation of serial samples of genetic data and a large variety of DNA mutation models. These implementations have been applied independently in various studies but are grouped together in the current version. AVAILABILITY AND IMPLEMENTATION: SPLATCHE3 is written in C++ and is freely available for non-commercial use from the website http://www.splatche.com/splatche3. It includes console versions for Linux, macOS and Windows and a user-friendly GUI for Windows, as well as detailed documentation and ready-to-use examples.
1 Introduction
The simulation of evolutionary scenarios with computer programmes relies on a combination of mathematical models, each defined by a series of parameters. It is a powerful tool for understanding the mechanisms that shape biodiversity. When compared with deterministic mathematical models, computer simulations have the advantage of considering stochastic processes that better mimic the evolution of species (e.g. Klopfstein ). In particular, they allow hypothesis testing by comparing empirical data to those simulated under models representing such hypotheses. When combined with approximate Bayesian computation approaches, they also allow selecting among alternative evolutionary scenarios and estimating parameters of those scenarios (Bertorelle ; Csillery ). They can also be used for comparing and evaluating different analytical tools (e.g. Lopes ; Westesson and Holmes, 2009).
The spatial and temporal dynamics of populations, such as movements in space over time, demographic variation and interactions among populations, play an important role in evolution. The genetic consequences of those various factors are difficult to disentangle, and spatially explicit simulators have been developed for understanding the effects of these combined processes. By spatially explicit, we mean that the simulation takes into account the geographic position of populations. To the best of our knowledge, the programmes SPLATCHE (Currat ) and EASYPOP (Balloux, 2001) were the first spatially explicit computer simulators of genetic diversity available to the scientific community. Other spatially explicit simulation programmes pre-dating SPLATCHE were not distributed (Barbujani ; Rendine ). SPLATCHE was initially designed to study the impact of spatial and ecological information on molecular diversity and it modelled migration as movements on a 2D stepping-stone lattice (Kimura, 1953). Briefly, it consists of a forward simulation of population demography and migration, followed by a backward coalescent simulation. In the coalescent step, the ancestry of a sample of gene lineages taken from one or several populations is simulated back to the most recent common ancestor of these lineages. The genetic diversity of the sample is then generated by adding mutations over the simulated coalescent tree. SPLATCHE can handle spatial and environmental complexity through the use of population carrying capacities (e.g. linked to available environmental parameters), migration rates (i.e. directional gene flow) and frictions (i.e. dispersal constraints in different environments) based on user-specified raster maps that can change over time.
SPLATCHE and its sequel SPLATCHE2 (Ray ) have been used in many studies, including studies on human origin (Ray ) and evolution (Currat and Excoffier, 2005; Lao ; Mona ), on the genetic effects of past range shifts (Banks ; Currat ; Francois ; Gehara ; Ray ; Reid ; Schneider ; Sjodin and Francois, 2011) and climate changes (Brown and Knowles, 2012; Brown ), on admixture between species (Currat ; Durand ) including humans and Neanderthals (Currat and Excoffier, 2004, 2011), as well as on other diverse species (Francois ; White ).
Since the publication of SPLATCHE2 in 2010, new capabilities have been implemented following suggestions from users and authors to provide more realistic simulations and to mimic additional evolutionary scenarios. As a result, we present here a new version of the programme that brings together several innovative developments.
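The forward demographic step on a 2D stepping-stone lattice, as outlined in the introduction, can be illustrated with a minimal Python sketch. It is a deterministic simplification that tracks expected densities only, whereas a stochastic simulator would draw random numbers of migrants. The discrete logistic growth rule, the equal split of emigrants among the four neighbouring demes, and all function and parameter names are assumptions made for illustration; they are not taken from the SPLATCHE source code.

```python
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # north, south, west, east


def logistic_growth(n, r, k):
    """One generation of discrete logistic growth (assumed rule); empty if K is 0."""
    if k <= 0:
        return 0.0
    return n * (1.0 + r * (k - n) / k)


def forward_step(density, capacity, r, m):
    """Grow every deme, then send a fraction m/4 of it to each existing neighbour."""
    rows, cols = len(density), len(density[0])
    grown = [[logistic_growth(density[i][j], r, capacity[i][j])
              for j in range(cols)] for i in range(rows)]
    new = [row[:] for row in grown]
    for i in range(rows):
        for j in range(cols):
            per_direction = grown[i][j] * m / 4.0
            for di, dj in NEIGHBOURS:
                ni, nj = i + di, j + dj
                # Migrants that would leave the lattice simply stay at home.
                if 0 <= ni < rows and 0 <= nj < cols:
                    new[i][j] -= per_direction
                    new[ni][nj] += per_direction
    return new


def simulate_expansion(rows, cols, origin, n0, k, r, m, generations):
    """Range expansion from a single deme; returns the density grid of each generation."""
    density = [[0.0] * cols for _ in range(rows)]
    density[origin[0]][origin[1]] = float(n0)
    capacity = [[float(k)] * cols for _ in range(rows)]
    history = [density]
    for _ in range(generations):
        density = forward_step(density, capacity, r, m)
        history.append(density)
    return history


if __name__ == "__main__":
    history = simulate_expansion(5, 5, (2, 2), n0=100, k=500, r=0.5, m=0.2,
                                 generations=200)
    print([round(x, 1) for x in history[-1][0]])
```

In a two-step scheme of this kind, the densities recorded at each generation are the information on which a backward coalescent step would then be conditioned.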
2 New capabilities implemented in SPLATCHE3
2.1 Long-distance dispersal
In its previous version, the programme could only simulate migration between neighbouring demes under a stepping-stone migration model. SPLATCHE3 implements three new demographic models allowing the simulation of long-distance dispersal (LDD) events. The three LDD models differ in the arrival positions of the LDD events, which can be set to target (i) any deme (occupied or empty), (ii) empty demes only or (iii) occupied demes only. Other user-defined parameters are the proportion of LDD events, the shape of the Gamma distribution used to sample the distance of an LDD event, and the maximum distance of an LDD event. The effects of these models and associated parameters on genetic diversity during range expansion have been explored in Ray and Excoffier (2010). They have also been used to explore the human colonization of Eurasia (Alves ) and the Americas (Branco ).
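As a rough illustration of how the LDD parameters listed above could interact, the following Python sketch draws the distance of an LDD event from a Gamma distribution, rejects draws above the maximum distance, and accepts the arrival deme according to one of the three target modes. The scale parameter, the uniform random direction, the rejection scheme and all names are hypothetical choices made for this example; the actual sampling procedure used by SPLATCHE3 may differ.

```python
import math
import random

MODES = ("any", "empty", "occupied")


def is_ldd_event(rng, proportion):
    """Decide whether a migration event is an LDD event (assumed Bernoulli draw)."""
    return rng.random() < proportion


def sample_ldd_distance(rng, shape, scale, max_distance):
    """Draw from Gamma(shape, scale) and reject distances above max_distance."""
    while True:
        d = rng.gammavariate(shape, scale)
        if d <= max_distance:
            return d


def ldd_target(rng, origin, occupied, rows, cols, shape, scale,
               max_distance, mode="any", max_tries=100):
    """Return the arrival deme (row, col) of one LDD event, or None if none is found."""
    if mode not in MODES:
        raise ValueError("mode must be one of %r" % (MODES,))
    for _ in range(max_tries):
        d = sample_ldd_distance(rng, shape, scale, max_distance)
        angle = rng.uniform(0.0, 2.0 * math.pi)  # assumed uniform direction
        target = (origin[0] + round(d * math.sin(angle)),
                  origin[1] + round(d * math.cos(angle)))
        if target == origin:
            continue
        if not (0 <= target[0] < rows and 0 <= target[1] < cols):
            continue
        is_occupied = target in occupied
        if (mode == "any"
                or (mode == "empty" and not is_occupied)
                or (mode == "occupied" and is_occupied)):
            return target
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    occupied = {(5, 5), (5, 6), (6, 5), (10, 10)}
    if is_ldd_event(rng, proportion=0.5):
        print(ldd_target(rng, (5, 5), occupied, 20, 20, shape=2.0, scale=2.0,
                         max_distance=8.0, mode="empty"))
```

Restricting arrivals to empty demes mimics colonization ahead of the expansion front, whereas restricting them to occupied demes only adds gene flow within the already colonized range.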
2.2 Interspecific hybridization
SPLATCHE3 can simulate two interacting populations occupying the same area. Each population is modelled as a layer of interconnected demes spread over space, and interactions between populations (competition and/or admixture) occur between demes located at the same position in their respective layers. The model of admixture implemented in the previous version of the programme was criticized on the grounds that introgression was assumed to be asymmetric (Zhang, 2014). To address this point, we have implemented an alternative model of admixture that is more realistic for simulating hybridization between species. Indeed, the previous model was originally designed to represent the assimilation of individuals from one population into the other, which affected the size of both populations. In the new model, gene flow is simulated in a more symmetric fashion and does not modify population densities. We renamed the previous admixture model as ‘assimilation’ and the new one as ‘hybridization’. Three variants of each model are provided in SPLATCHE3: (i) without inter-specific competition, (ii) with density-dependent inter-specific competition and (iii) with density-independent inter-specific competition. The new hybridization model was described in detail in Excoffier and used to study hybridization between wildcats and domestic cats in Nussberger .
2.3 Heterogeneous migration rates in space and time
In the previous version of the programme, a single migration rate could be assigned to each population layer. In SPLATCHE3, it is now possible to assign different migration rates toward each direction (anisotropic migration) and individually for each deme. It is also possible to vary these individual migration rates over time. This feature has been used to simulate population range contractions caused by glacial periods (Arenas , 2013; Mona ).
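One way to picture migration rates that differ by direction, by deme and through time is a schedule that stores per-deme overrides from given generations onward. The Python sketch below is only an assumed data structure for illustration: the class, its methods and the direction labels are hypothetical and do not describe the input format or internal representation used by SPLATCHE3.

```python
from bisect import bisect_right

DIRECTIONS = ("north", "south", "east", "west")


class MigrationSchedule:
    """Per-deme, per-direction migration rates that can change over generations."""

    def __init__(self, default_rates):
        self._default = self._checked(default_rates)
        self._times = []  # sorted generations at which a new rate map starts
        self._maps = []   # one {deme: rates} dict per entry of _times

    @staticmethod
    def _checked(rates):
        if set(rates) != set(DIRECTIONS):
            raise ValueError("rates must have exactly the keys %r" % (DIRECTIONS,))
        if any(v < 0 for v in rates.values()) or sum(rates.values()) > 1:
            raise ValueError("rates must be non-negative and sum to at most 1")
        return dict(rates)

    def set_rates(self, generation, deme_rates):
        """From `generation` on, use `deme_rates`; demes not listed use the default."""
        checked = {deme: self._checked(r) for deme, r in deme_rates.items()}
        idx = bisect_right(self._times, generation)
        if idx > 0 and self._times[idx - 1] == generation:
            self._maps[idx - 1] = checked
        else:
            self._times.insert(idx, generation)
            self._maps.insert(idx, checked)

    def rates(self, generation, deme):
        """Return the directional rates in effect for `deme` at `generation`."""
        idx = bisect_right(self._times, generation) - 1
        if idx < 0:
            return self._default
        return self._maps[idx].get(deme, self._default)


if __name__ == "__main__":
    uniform = {"north": 0.05, "south": 0.05, "east": 0.05, "west": 0.05}
    schedule = MigrationSchedule(uniform)
    # From generation 100, deme (0, 0) migrates mostly southward (anisotropic).
    schedule.set_rates(100, {(0, 0): {"north": 0.0, "south": 0.15,
                                      "east": 0.02, "west": 0.02}})
    print(schedule.rates(150, (0, 0)))
```

A southward bias of this kind is the sort of setting one might use to mimic a range contraction during a glacial period, although the rates shown here are arbitrary.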
2.4 Full disappearance of a population within a deme
SPLATCHE3 resets the population size to 0 when the carrying capacity of a deme is set to 0, allowing simulation of the full disappearance of a population (e.g. due to lack of resources caused by a climatic change). This capability was used in previous studies (Arenas , 2013; Mona ).
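The reset rule can be stated in a few lines of Python. This is a minimal sketch under the assumption that densities and carrying capacities are stored as equally sized grids; the function name is hypothetical. Demes whose density exceeds a reduced but non-zero carrying capacity are deliberately left untouched here, because the text only describes the reset for a carrying capacity of 0.

```python
def apply_capacity_map(density, capacity):
    """Return a new density grid in which demes with zero carrying capacity are emptied."""
    return [[0 if k == 0 else n for n, k in zip(density_row, capacity_row)]
            for density_row, capacity_row in zip(density, capacity)]


if __name__ == "__main__":
    density = [[120, 80], [40, 0]]
    new_capacity = [[500, 0], [0, 500]]  # e.g. a map loaded after a climatic change
    print(apply_capacity_map(density, new_capacity))
```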
2.5 Genetic data in serial sampling
Since the first successful retrieval of molecular material from ancient remains (Higuchi ; Paabo, 1985), the production of ancient DNA (aDNA) has increased at an almost exponential pace (e.g. Leonardi ), mainly thanks to rapid advances in sequencing techniques and bioinformatics data processing (e.g. Orlando ). SPLATCHE3 can now simulate genetic data for serial samples through the implementation of the serial coalescent algorithm (Rodrigo and Felsenstein, 1999). This capability of SPLATCHE3 may be used to study aDNA and has already been used in two studies investigating population continuity through time (Silva , 2018). To add aDNA post-mortem features, users need to develop their own scripts (e.g. in bash or R) and apply them to the DNA sequences that SPLATCHE3 outputs in ARLEQUIN format (Excoffier and Lischer, 2010).
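The idea behind a serial coalescent can be sketched for the simplest possible case, a single population of constant size, which is an assumption made here to keep the example short (SPLATCHE3 applies it to spatially structured populations). Going backward in time, lineages coalesce at rate k(k-1)/(2N) for k active lineages and N gene copies, and older samples join the set of active lineages only once their sampling age has been reached. Function and variable names are illustrative.

```python
import random


def serial_coalescent(sample_ages, pop_size, rng):
    """Simulate a genealogy for genes sampled at different ages.

    sample_ages: age of each sampled gene, in generations before present.
    pop_size: number of gene copies in the (assumed constant) population.
    Returns (tmrca, events) with events as (time, child_a, child_b, parent).
    """
    if not sample_ages:
        raise ValueError("at least one sample is required")
    pending = sorted(enumerate(sample_ages), key=lambda item: item[1])
    pos = 0                    # index of the next sample not yet added
    t = pending[0][1]          # start at the age of the youngest sample
    active = []                # labels of lineages present at time t
    events = []
    next_label = len(sample_ages)
    while True:
        # Add every sample whose age has been reached.
        while pos < len(pending) and pending[pos][1] <= t:
            active.append(pending[pos][0])
            pos += 1
        k = len(active)
        if k == 1 and pos == len(pending):
            return t, events
        next_sample_time = pending[pos][1] if pos < len(pending) else float("inf")
        if k < 2:
            t = next_sample_time  # a single lineage waits for older samples
            continue
        wait = rng.expovariate(k * (k - 1) / (2.0 * pop_size))
        if t + wait >= next_sample_time:
            # No coalescence before the next sampling time; by memorylessness
            # the waiting time is redrawn with the enlarged set of lineages.
            t = next_sample_time
            continue
        t += wait
        a, b = rng.sample(active, 2)
        active.remove(a)
        active.remove(b)
        active.append(next_label)
        events.append((t, a, b, next_label))
        next_label += 1


if __name__ == "__main__":
    ages = [0, 0, 0, 50, 50, 200]  # three modern and three ancient genes
    tmrca, events = serial_coalescent(ages, pop_size=1000, rng=random.Random(42))
    print(len(events), tmrca >= max(ages))
```

With n samples the genealogy always contains n - 1 coalescent events, and the time to the most recent common ancestor can never be younger than the oldest sample.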
2.6 Multiple mutation models for DNA evolution
The previous version of the programme assumed a single rate of change among nucleotide states when simulating DNA sequence evolution (Jukes and Cantor, 1969). However, more complex patterns of change among nucleotide states are observed in empirical data (Arbiza ) and can be considered to provide more realistic evolutionary analyses (Lemmon and Moriarty, 2004). SPLATCHE3 implements models in which the four nucleotide frequencies and the six relative rates of change among nucleotide states are taken into account. These parameters allow the specification of the most commonly used DNA models of evolution, from the simple Jukes and Cantor model (JC) (Jukes and Cantor, 1969) to the complex generalised time-reversible model (GTR) (Tavaré, 1986).
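To show how four nucleotide frequencies and six relative rates define a substitution model, the sketch below builds the standard GTR instantaneous rate matrix, in which the rate from state i to state j is the relative rate of the pair multiplied by the frequency of j, and rescales it to one expected substitution per unit time. The function name, the dictionary-based input and the normalization are choices made for this illustration and do not reflect how SPLATCHE3 reads or stores these parameters.

```python
BASES = "ACGT"
PAIRS = ("AC", "AG", "AT", "CG", "CT", "GT")


def gtr_rate_matrix(rates, freqs):
    """Build a normalized GTR rate matrix (rows and columns ordered A, C, G, T).

    rates: relative exchange rates keyed by the six pairs in PAIRS.
    freqs: equilibrium frequencies keyed by base, summing to 1.
    """
    if set(rates) != set(PAIRS):
        raise ValueError("rates must have exactly the keys %r" % (PAIRS,))
    if abs(sum(freqs[b] for b in BASES) - 1.0) > 1e-9:
        raise ValueError("nucleotide frequencies must sum to 1")
    q = [[0.0] * 4 for _ in range(4)]
    for i, a in enumerate(BASES):
        for j, b in enumerate(BASES):
            if i != j:
                pair = a + b if a < b else b + a
                q[i][j] = rates[pair] * freqs[b]
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    # Mean substitution rate at equilibrium, used to rescale the matrix.
    mean_rate = -sum(freqs[a] * q[i][i] for i, a in enumerate(BASES))
    return [[x / mean_rate for x in row] for row in q]


if __name__ == "__main__":
    # Jukes and Cantor as a special case: equal rates and equal frequencies.
    jc = gtr_rate_matrix({p: 1.0 for p in PAIRS}, {b: 0.25 for b in BASES})
    print([round(x, 4) for x in jc[0]])
```

Setting all six rates equal with equal frequencies recovers the JC model, while unequal values move towards the general GTR case; intermediate models are obtained by constraining subsets of the rates to be equal.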
2.7 macOS version
SPLATCHE3 includes a console executable for macOS, which was not available in previous versions of the programme.
3 Conclusion
SPLATCHE3 simulates genetic data under a large combination of environmental and demographic features not available in other spatially explicit simulators. Its main strength is the combination of flexibility and efficiency (through the use of the backward coalescent algorithm), which allows substantial gains in computing time when compared with alternative forward individual-based simulators (see Arenas, 2012 and references therein). The great flexibility of forward individual-based simulations comes at a cost in computing power and time because they simulate all the individuals of all populations, whereas only a small number of individuals and populations are usually analysed. Forward simulations combined with the coalescent framework are much more efficient because they only simulate genetic information for the sample and its ancestral lineages. This approach is implemented in all versions of the programme SPLATCHE, including SPLATCHE3. Its implementation in C++ offers a faster computation speed than the alternative programme PHYLOGEOSIM coded in JAVA (Dellicour ). A simple model of range expansion during 1000 generations from a single deme in a layer composed of 25–1600 demes (carrying capacity = 300–1000, one DNA sequence of 1000 bp) results in running times 2–40 times faster with SPLATCHE3 than with PHYLOGEOSIM using a 3.6 GHz CPU station running on Linux Ubuntu. Other implementations of the coalescent, such as FASTSIMCOAL (Excoffier ; Excoffier and Foll, 2011) or ms (Hudson, 2002), potentially allow performing spatially explicit simulations by specifying an arbitrary migration matrix between all pairs of populations, but for models with complex geographic features the migration matrix becomes huge and is difficult to set up.
SPLATCHE3 is a flexible simulator that allows the investigation of a large variety of evolutionary scenarios in a reasonable computational time. The simulation of full genomes would be too time consuming to explore complex evolutionary scenarios with SPLATCHE3. However, it is possible to simulate thousands or tens of thousands of independent or partly linked loci and use them as a proxy for full genome data through the computation of summary statistics (e.g. Currat and Excoffier, 2011; Silva ). Running time is mostly affected by the number of simulated generations, the number of demes and the number of loci. In addition, applying recombination and LDD requires additional computational time and memory. SPLATCHE3 is able to integrate various sources of information, such as population genetics, demography, migration, environment and molecular evolution. It allows one to study the effects of population dynamics on the genetic diversity of populations by considering detailed and realistic elements of the landscape that can vary over time. Elements such as continental contours, rivers and mountains can be set to prevent migrations (entirely or partly), while environmental factors (e.g. vegetation types, coastline and deserts) are typically linked to different carrying capacities of demes. The inclusion in SPLATCHE3 of LDD, as well as of refined models of hybridization and mutation, opens up new possibilities of research in evolutionary and conservation genetics, especially when combined with the direct testimony of past genetic diversity offered by aDNA.