How Linked Selection Shapes the Diversity Landscape in Ficedula Flycatchers.

Agnes Rettelbach¹, Alexander Nater²,³, Hans Ellegren².

Abstract

There is an increasing awareness that selection affecting linked neutral sites strongly influences how diversity is distributed across the genome. In particular, linked selection is likely involved in the formation of heterogeneous landscapes of genetic diversity, including genomic regions with locally reduced effective population sizes that manifest as dips in diversity, and "islands" of differentiation between closely related populations or species. Linked selection can be in the form of background selection or selective sweeps, and a long-standing quest in population genetics has been to unveil the relative importance of these processes. Here, we analyzed the theoretically expected reduction of diversity caused by linked selection in the collared flycatcher (Ficedula albicollis) genome and compared this with population genomic data on the distribution of diversity across the flycatcher genome. By incorporating data on recombination rate variation and the density of target sites for selection (including both protein-coding genes and conserved noncoding elements), we found that background selection can explain most of the observed baseline variation in genetic diversity. However, positive selection was necessary to explain the pronounced local diversity dips in the collared flycatcher genome. We confirmed our analytical findings by comprehensive simulations. Therefore, our study demonstrates that even though both background selection and selective sweeps contribute to the heterogeneous diversity landscape seen in this avian system, they play different roles in shaping it.

Keywords: background selection; diversity; flycatcher; selective sweeps

Year: 2019 PMID: 30872320 PMCID: PMC6499528 DOI: 10.1534/genetics.119.301991

Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562

Natural selection reduces genetic diversity at neutral sites mainly through two key processes of population genetics: hitchhiking (Smith and Haigh 1974; Kaplan ; Fay and Wu 2000) and background selection (BGS) (Charlesworth ; Hudson and Kaplan 1995; Nordborg ). In hitchhiking, positive selection for an advantageous variant also selects for the genetic background (haplotype) on which the beneficial allele resides. In BGS, purifying selection against recurring deleterious mutations decreases the diversity at linked neutral sites. An advantageous allele is expected to quickly increase in frequency and reach fixation in the population, making it a potent diversity-reducing force in large genomic regions linked to (hitchhiking with) the site under positive selection. Removal of deleterious variants is typically a slower process if the fitness effects are small and/or the effective population size is low (Crow and Kimura 1970). While a disadvantageous allele still segregates in the population, recombination breaks up linkage to nearby variants and thereby narrows the region in which genetic diversity is reduced by purifying selection. Importantly, BGS only removes haplotypes on which deleterious alleles reside, leaving variants carried by other haplotypes free to segregate. Therefore, single episodes of purifying selection do not have the same diversity-reducing effect as positive selection (e.g., Stephan 2010). However, since the distribution of fitness effects is typically strongly biased toward deleterious mutations (Eyre-Walker and Keightley 2007), the combined effect of selection against many deleterious mutations means that BGS is likely to have an effect on genetic diversity that is as strong as, or even stronger than, that of selective sweeps.
Elucidating the relative importance of these two types of linked selection, and under which circumstances one or the other process dominates, has been a challenging question (Charlesworth 1996; Kim and Stephan 2000; McVicker ; Comeron 2014; Elyashiv ). The feasibility of obtaining large-scale population genomic data, together with data on recombination rate variation and detailed genome annotation, now offers exciting possibilities to understand the processes underlying linked selection. If linked selection is prevalent and acts genome-wide, this should be visible as correlations between diversity and factors affecting the extent of linked selection. First, genomic regions with high recombination rates are expected to experience less linked selection since, as indicated above, recombination decouples linked loci and restricts the area of effect of a selected mutation (e.g., Kaplan ; Nordborg ). Thus, diversity should be positively correlated with recombination rate [see for example Mackay ]. Second, regions with a high density of potential targets for selection are expected to experience more linked selection simply because selection occurs more often in such regions. Thus, diversity should be negatively correlated with the density of target sites for selection. Indeed, such correlations have been found in several organisms (e.g., Begun and Aquadro 1992; Nachman 2001; Tenaillon ; Nordborg ; Cutter and Payseur 2013; Burri ). Third, mutation rate variation influences the number of newly arising mutations under purifying selection and thus also the local strength of linked selection. Given these relationships and with access to appropriate data, it should be possible to predict genetic diversity under varying influence of BGS and selective sweeps (Charlesworth 1996; McVicker ; Comeron 2014, 2017; Elyashiv ).
Linked selection is of particular relevance in the context of speciation (Seehausen ; Wolf and Ellegren 2017). During speciation with gene flow, divergent selection at loci underlying ecologically relevant traits or genetic incompatibilities will hinder gene flow in the vicinity of such loci. In turn, this will lead to localized signals of increased differentiation and reduced diversity, and the formation of so-called speciation islands (Nosil 2008; Nosil ; Feder and Nosil 2010). However, it has recently been recognized that linked selection can produce similar patterns even in the absence of gene flow (Charlesworth 1998; Noor and Bennett 2009; Turner and Hahn 2010; Cruickshank and Hahn 2014; Burri ). Linked selection will locally reduce the effective population size (N_e), and thereby not only reduce diversity but also lead to elevated measures of relative differentiation between diverging lineages. If the extent of linked selection varies across the genome, so too will the extent of differentiation. The significance of genomic islands of differentiation in speciation is thus disputed [see Ravinet for a review].
Ficedula flycatchers represent a well-studied model system for speciation research (Lundberg and Alatalo 1992) and constitute one of the most prominent examples of species with distinct differentiation islands (which fully coincide with the location of dips in genetic diversity) spread across the genome (Ellegren ; Burri ). Relative differentiation (FST) between these recently (<1–2 MY; Nater ) diverged species is negatively correlated with recombination rate, indicating a role of linked selection in generating heterogeneous diversity/differentiation landscapes (Burri ).
Here, we aim to disentangle the forces responsible for shaping the genomic diversity landscape of Ficedula flycatchers by modeling the expected impact of linked selection under various scenarios of positive and purifying selection. We compare these results with extensive genome-wide resequencing data and augment the analyses with simulations. We benefit from access to recombination rate data obtained from linkage analysis (Kawakami ) and genome annotation, including not only protein-coding genes but also conserved noncoding elements (Craig ). While we find that BGS is generally sufficient to explain the baseline levels of genetic diversity in the flycatcher genome, our study suggests that selective sweeps are necessary to generate the most pronounced diversity dips.

Methods

Genomic and population genomic data

Estimates of nucleotide diversity (π) in nonoverlapping 200-kb windows of the collared flycatcher (Ficedula albicollis), obtained from whole-genome resequencing of extensive population samples (79 individuals from four populations), were taken from Burri . To estimate site frequency spectrum-based statistics, an approach was used that integrates over genotype likelihoods as implemented in the software ANGSD (Korneliussen ). This method accounts for genotype uncertainty and considerably improves local estimates of π for low- and medium-coverage data. Additionally, the sequence data were carefully filtered to avoid biases caused by poorly aligned reads or low sequencing coverage. Briefly, repetitive regions in the reference genome were masked with a custom flycatcher-specific repeat library. Only sites fulfilling the following criteria were considered for calculation of π: minimum mapping quality of 1, minimum base quality of 20, site coverage across all individuals < 5 SD above the mean coverage, and a minimum read coverage of 5× per individual with a minimum of 10 callable genotypes per population. We used a discrete-time hidden Markov model implemented in the “HiddenMarkov” R package (https://CRAN.R-project.org/package=HiddenMarkov) to classify the window-based π estimates into background regions and diversity valleys. For the observed process dependent on the two hidden states (background and diversity valley), we assumed two normal distributions with SD fixed to the SD of the empirical distribution of π values. We then optimized the means of the distributions for the two hidden states with the Baum–Welch algorithm (Baum ) with a maximum number of 1000 iterations. After parameter estimation, we used the Viterbi algorithm to find the most likely sequence of hidden states and identify genomic regions with a predicted diversity valley state.
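
The Viterbi decoding step described above can be illustrated with a minimal Python sketch. It is not the R "HiddenMarkov" workflow used in the study: the state means, transition probabilities, and toy π values below are hypothetical, and the Baum–Welch estimation of the means is skipped (the means are simply supplied). Only the shared SD, fixed to the SD of the observed values, follows the description in the text.

```python
import numpy as np

def viterbi_two_state(pi_windows, means, sd, trans, start):
    """Most likely hidden-state path for a Gaussian-emission HMM (log space)."""
    x = np.asarray(pi_windows, dtype=float)
    means = np.asarray(means, dtype=float)
    # Log density of every window under every state; all states share one SD.
    log_emit = (-0.5 * ((x[:, None] - means[None, :]) / sd) ** 2
                - np.log(sd * np.sqrt(2.0 * np.pi)))
    log_trans = np.log(np.asarray(trans, dtype=float))
    n, k = log_emit.shape
    score = np.empty((n, k))
    back = np.zeros((n, k), dtype=int)
    score[0] = np.log(np.asarray(start, dtype=float)) + log_emit[0]
    for t in range(1, n):
        # cand[i, j]: best score ending in state i at t-1, then moving to j.
        cand = score[t - 1][:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(k)] + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical window-based pi values with a run of low-diversity windows.
pi_toy = np.array([0.0045, 0.0047, 0.0044, 0.0012, 0.0010,
                   0.0011, 0.0013, 0.0046, 0.0048])
states = viterbi_two_state(
    pi_toy,
    means=[0.0045, 0.0011],          # state 0 = background, state 1 = valley
    sd=pi_toy.std(),                 # SD fixed to the empirical SD
    trans=[[0.9, 0.1], [0.1, 0.9]],  # assumed transition probabilities
    start=[0.5, 0.5],
)
print(states)
```

Windows labeled 1 form the predicted diversity valley; with real data the means would first be optimized rather than set by hand.
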
Collared flycatcher recombination rate estimates in cM/Mb for 200-kb windows were taken from Kawakami and were based on a high-density genetic map containing 4302 markers. To generate this map, 609 collared flycatcher individuals from a four-generation pedigree were genotyped with a custom-designed SNP array for 37,262 polymorphic loci. The use of recombination rate estimates from a pedigree-based linkage analysis instead of higher-resolution LD-based recombination maps (e.g., Kawakami ) has the advantage that they, in contrast to the latter, are not affected by selection [i.e., it allows the direct estimation of recombination rate (r)]. Coordinates for conserved regions in the F. albicollis genome, including noncoding regions, were taken from Craig . Exon information was obtained from the Ensembl annotation of the collared flycatcher genome assembly version FicAlb 1.4.

Expected reduction of π due to linked selection

We compared three different models to calculate the expected deviation from neutral diversity due to selection. Model 1 was with only BGS, model 2 was model 1 plus recent sweeps, and model 3 was model 2 plus ancient sweeps. For an overview of model parameters, see Table 1.

Table 1

Model parameters

Parameter	Value	Reference
π₀	0.0048	This study
µ	4.6 × 10⁻⁹	Smeds et al. (2016)
r	Per window	Kawakami et al. (2014)
N_e	450,000	Nater et al. (2015)
T	320,000	Nater et al. (2015)
u_d	Per window	Depends on dcs, Craig et al. (2018)
s_d	DFE	Bolívar et al. (2018)
s_b	0.1/0.05/0.01	Variable model parameter
α	1/0.2	Variable model parameter

DFE: distribution of fitness effects. dcs: density of conserved sites.

Under neutrality, nucleotide diversity π₀ = 4N_eμ, where N_e is the effective population size and μ the mutation rate. Mutations at neutral sites are assumed to have no effect on fitness. We used equations from Hudson and Kaplan (1995) and Nordborg to calculate the expected deviation from neutral diversity due to BGS, B = π/π₀ (McVicker ). B at a focal neutral site x can be approximated by
$$B(x) = \frac{\pi}{\pi_0} \approx \exp\left(-\sum_i \frac{u_d\, s_d}{\bigl(s_d + r(x,i)\,(1 - s_d)\bigr)^2}\right) \qquad (1)$$
where π₀ is π without selection and with free recombination, the sum is over all selected sites, and u_d is the deleterious mutation rate per site, which is the neutral mutation rate times the proportion of selection targets. s_d is the selection coefficient against heterozygotes, and r(x,i) is the recombination probability between the focal neutral site and the selected site i. For s_d, the distribution of fitness effects of new mutations at nonsynonymous sites estimated for collared flycatchers was used, but with distinct values instead of intervals [13% Ns = −1, 9% Ns = −10, 14% Ns = −100, and 64% Ns = −500; compare to Bolívar ]. As it would be too computationally intensive to calculate B for every site along a chromosome, we implemented a window-based approach with chromosomes divided into nonoverlapping 1-kb windows. To calculate genetic distances between two sites, all sites (neutral and selected) were assumed to be located at the midpoint of their respective window. B for each window was thus obtained by considering a neutral site in the midpoint and calculating the influence of selected sites in all linked windows. The smaller the window size, the more exact this approach. To compare the model outcome with π from the data, which was available in 200-kb windows (matching the recombination rate data), mean B for each 200-kb window was calculated as the mean over its 1-kb windows.
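
As an illustration of the window-based calculation, the following Python sketch evaluates the BGS term for a set of window midpoints. It is not the authors' C++ implementation. It assumes the Nordborg-style expression exp(−Σ u_d s_d / (s_d + r(1 − s_d))²), interprets "Ns" as N_e·s with N_e = 450,000, converts map distance to recombination probability with Haldane's map function, and lets conserved sites in the focal window contribute with r = 0; all of these are assumptions, and the function names are hypothetical.

```python
import numpy as np

N_E = 450_000   # long-term effective population size (Table 1)
MU = 4.6e-9     # mutation rate per site per generation (Table 1)
# Discrete DFE classes as (weight, |N_e * s|), taken from the text.
DFE = [(0.13, 1.0), (0.09, 10.0), (0.14, 100.0), (0.64, 500.0)]

def haldane(d_morgan):
    """Recombination probability for a map distance in Morgans (assumed map function)."""
    return 0.5 * (1.0 - np.exp(-2.0 * np.abs(d_morgan)))

def bgs_b(pos_morgan, n_conserved, mu=MU, n_e=N_E, dfe=DFE):
    """B at each window midpoint, summing over conserved sites in all windows."""
    pos = np.asarray(pos_morgan, dtype=float)
    n_sel = np.asarray(n_conserved, dtype=float)
    r = haldane(pos[:, None] - pos[None, :])   # r[x, i] between midpoints
    exponent = np.zeros(len(pos))
    for weight, ns in dfe:
        s = ns / n_e
        # Contribution of one selected site in window i to focal window x.
        per_site = mu * weight * s / (s + r * (1.0 - s)) ** 2
        exponent += (per_site * n_sel[None, :]).sum(axis=1)
    return np.exp(-exponent)

# Five hypothetical 1-kb windows; only the middle one holds conserved sites.
b_toy = bgs_b([0.0, 1e-5, 2e-5, 3e-5, 4e-5], [0, 0, 200, 0, 0])
print(b_toy)
```

For a chromosome whose number of 1-kb windows is a multiple of 200, `b.reshape(-1, 200).mean(axis=1)` would give the per-200-kb means that are compared with the data.
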
As a measure of the density of sites under purifying selection (subsequently dcs), we used the number of conserved sites (Craig ) per window and assumed that all mutations occurring at these sites were deleterious. As π estimates from the data include sites under selection, the model has to include direct selection as well. For each window, the allele frequencies at conserved sites were assumed to be in mutation–selection balance. The expected frequency for additive deleterious mutations is q = μ/s_d (Crow and Kimura 1970) and, for each conserved site in the window, π = 2q(1 − q). Using the distribution of fitness effects of new mutations estimated for collared flycatchers (Bolívar ) and a mutation rate of 4.6 × 10⁻⁹ (Smeds ), we estimated mean π = 0.0006 for sites under purifying selection. Note that this is simply a model assumption.
Following Elyashiv , the effect of positive selection can be incorporated into the model. The rate of coalescence at a neutral position x due to a selective sweep at a selected position j is
$$S(x,j) = \frac{\alpha}{T}\, e^{-r(x,j)\,\tau}$$
where T is the length of the lineage for which substitutions are considered, α the fraction of substitutions that are beneficial, s_b is the selection coefficient, and τ is the expected time to fixation of a positively selected mutation, which depends on N_e and s_b [see Elyashiv ]. As candidates for sites that have undergone a selective sweep, we used nonsynonymous substitutions fixed in the collared flycatcher since the split from pied flycatcher (F. hypoleuca) for recent sweeps or from red-breasted flycatcher (F. parva, with F. hyperythra as an outgroup) for ancient sweeps. We set the divergence time between collared flycatcher and pied flycatcher to 320,000 generations, and between collared flycatcher and red-breasted flycatcher to 2,500,000 generations (Nater ). We used an estimate of the long-term N_e in collared flycatcher of 450,000 (Nater ).
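
The mean diversity at conserved sites quoted above can be checked with a short calculation. The sketch below assumes the standard mutation–selection balance frequency q = μ/s_d for additive deleterious mutations and an expected heterozygosity of 2q(1 − q), and it interprets the DFE classes "Ns" as N_e·s_d with N_e = 450,000; the function name is hypothetical.

```python
N_E = 450_000   # long-term effective population size (Table 1)
MU = 4.6e-9     # mutation rate per site per generation (Table 1)
# Discrete DFE classes as (weight, |N_e * s_d|), taken from the text.
DFE = [(0.13, 1.0), (0.09, 10.0), (0.14, 100.0), (0.64, 500.0)]

def mean_pi_conserved(mu=MU, n_e=N_E, dfe=DFE):
    """DFE-weighted mean heterozygosity at sites in mutation-selection balance."""
    total = 0.0
    for weight, ns in dfe:
        s_d = ns / n_e             # selection coefficient of this DFE class
        q = mu / s_d               # equilibrium frequency of the deleterious allele
        total += weight * 2.0 * q * (1.0 - q)
    return total

pi_conserved = mean_pi_conserved()
print(f"{pi_conserved:.4f}")  # prints 0.0006
```

Under these assumptions the weakly selected class (Ns = −1) contributes most of the total, which is why the mean stays well above the value expected for strongly selected sites alone.
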
In general, we used s_b = 0.1 and α = 1 for the recent sweeps, thus treating all nonsynonymous substitutions as strong sweeps, as a contrast to the BGS-only model. For the ancient sweeps, we generally used α = 0.2 [∼0.18 estimated for flycatcher (Bolívar )]. However, we also investigated the effect on the diversity landscape when using different values for α and s_b. Following Elyashiv , total π is
$$\pi(x) = \frac{2\mu}{2\mu + \dfrac{1}{2N_e B(x)} + \sum_j S(x,j)}$$
where B(x) is the BGS term and S(x,j) the sweep-induced rate of coalescence defined above. To calculate the BGS parameter B from the data (B_d = π/π₀), we assumed a neutral π₀ = 0.0048 (the genome-wide average is 0.0041). This was obtained by averaging over all 200-kb windows with dcs < 0.05 and recombination rate >3 cM/Mb, which should represent the genomic regions least influenced by linked selection. Importantly, while the choice of π₀ affects B_d, it does not influence the correlation between B_d and B calculated from the model (B_m), as it is only a scaling parameter. For five chromosomes, recombination rate was not known for some windows at the chromosome ends. For chromosome 8, these were 22 200-kb windows, which were excluded from all analyses. For chromosomes 1, 6, 18, and 28, this only concerned one to three windows, for which recombination rates were interpolated from adjacent windows. Regions with interpolated recombination rates are marked in Supplemental Material, Figure S1.
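
How background selection and sweeps combine into an expected diversity level can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it uses an Elyashiv-style expression π = 2μ / (2μ + 1/(2N_e B) + Σ S), a per-sweep rate S = (α/T)·exp(−rτ), and the approximation τ ≈ 2 ln(2N_e s_b)/s_b for the fixation time. The source defers the definition of τ to Elyashiv, so that formula in particular is an assumption, and all function names are hypothetical.

```python
import math

N_E = 450_000      # long-term effective population size (Table 1)
MU = 4.6e-9        # mutation rate per site per generation (Table 1)
T_RECENT = 320_000 # generations since the split from pied flycatcher (Table 1)

def fixation_time(n_e, s_b):
    """Assumed approximation for the sojourn time of a beneficial allele."""
    return 2.0 * math.log(2.0 * n_e * s_b) / s_b

def sweep_rate(r_values, alpha, t_gen, n_e, s_b):
    """Summed coalescence rate from candidate sweeps at recombination distances r."""
    tau = fixation_time(n_e, s_b)
    return sum(alpha / t_gen * math.exp(-r * tau) for r in r_values)

def expected_pi(mu, n_e, b, s_rate):
    """Expected diversity given the BGS term b and the total sweep rate."""
    return 2.0 * mu / (2.0 * mu + 1.0 / (2.0 * n_e * b) + s_rate)

# No linked selection at all (B = 1, no sweeps).
pi_neutral = expected_pi(MU, N_E, 1.0, 0.0)
# One strong recent sweep (s_b = 0.1, alpha = 1) right at the focal site ...
pi_at_sweep = expected_pi(MU, N_E, 1.0, sweep_rate([0.0], 1.0, T_RECENT, N_E, 0.1))
# ... or at recombination distance r = 0.01 from it.
pi_near_sweep = expected_pi(MU, N_E, 1.0, sweep_rate([0.01], 1.0, T_RECENT, N_E, 0.1))
print(pi_at_sweep / pi_neutral, pi_near_sweep / pi_neutral)
```

The two printed ratios are the modeled B for the focal site; the reduction fades quickly with recombination distance, which is why sweeps have their largest footprint in low-recombination regions.
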

Simulations

To validate the models, we used individual-based simulations in fwdpp 0.4 (Thornton 2014) with the following scenario. We assumed a constant population size of 2000 diploid individuals. The mutation rate was scaled accordingly to get a similar π0 as the data. The probability of a recombination event at a certain position was determined by the local recombination rate, taken from Kawakami . Individuals for reproduction were sampled proportionally to their fitness, with a constant population size (soft selection). Mating occurred by random pairing of gametes after mutation and recombination. As above, the density of conserved sites in a window was used to infer the probability of deleterious mutations. We simulated entire chromosomes corresponding to the sizes in the collared flycatcher reference genome and ran simulations for 6N generations. To calculate π, we drew a random sample of 1% of the individuals from the population.
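
The simulation settings can be made concrete with a small Python sketch. It assumes that the mutation rate was rescaled through π₀ = 4Nμ so that the small simulated population reproduces the neutral diversity of the data (the text only says the rate was "scaled accordingly"), and it shows one common estimator of π from sampled haplotypes. The haplotype matrix and the function name are hypothetical, and this is not the fwdpp code used in the study.

```python
import numpy as np

PI0 = 0.0048    # neutral diversity assumed for the data (Table 1)
N_SIM = 2000    # simulated diploid population size

mu_sim = PI0 / (4 * N_SIM)   # rescaled mutation rate, assuming pi0 = 4 * N * mu
generations = 6 * N_SIM      # run length of 6N generations
sample_size = N_SIM // 100   # 1% of the individuals

def nucleotide_diversity(haplotypes, n_sites):
    """Mean pairwise differences per site from a 0/1 haplotype matrix.

    Rows are sampled haplotypes, columns are segregating sites;
    n_sites is the total number of sites in the window.
    """
    h = np.asarray(haplotypes)
    n = h.shape[0]
    k = h.sum(axis=0)   # derived-allele count per segregating site
    return float((2.0 * k * (n - k) / (n * (n - 1))).sum() / n_sites)

# Tiny hypothetical sample: 4 haplotypes, 2 segregating sites, window of 2 sites.
pi_toy = nucleotide_diversity([[0, 1], [1, 1], [0, 0], [1, 0]], n_sites=2)
print(mu_sim, generations, sample_size, pi_toy)
```
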

Data availability

All data used for this study were previously published. See Kawakami , Burri and Craig . Supplemental Material and C++ code used for the simulations and calculation of Bm are available at Figshare: https://figshare.com/s/5ee30bae0acfad563865.

Results

Correlations between π and recombination rate, and density of conserved sites

The association between nucleotide diversity and rate of recombination in the collared flycatcher is best explained by a curvilinear relationship (a power regression fitted better than a linear regression, which gave r² = 0.02; Figure 1a). Nucleotide diversity increased with recombination rate up to ≈1.5 cM/Mb and then leveled off. The variance in π among windows was higher at low than at high recombination rates. As expected, π was negatively correlated with the density of conserved sites (dcs; r² = 0.22, linear regression, Figure 1b). The correlation was weaker when exon density was used as a proxy for the density of target sites for selection, indicating an important role of selection in noncoding regions (compare with Craig ). However, note that the correlation is not exclusively due to linked selection but also due to direct selection. We also fitted a multiple linear regression of π on dcs and recombination rate (linear factors). Plots of the distribution of π, recombination rate, and dcs along chromosomes readily demonstrate that the overall shape of the diversity landscape resembles the recombination landscape, with diversity valleys corresponding to regions with low recombination, while fine-scale differences in π tend to be due to variation in dcs (Figure S1).

Figure 1

Relationship between mean nucleotide diversity and (a) recombination rate in cM/Mb, and (b) proportion of conserved sites (r² = 0.22) for 200-kb windows.

Modeling of linked selection

The genome-wide correlation between B_m from model 1 (BGS) and B_d in 200-kb windows is shown in Figure 2a. Correlations for individual chromosomes are given in Table S1. An example of the relationship between B_d and B_m using model 1 along a chromosome (chromosome 12) is shown in Figure 2b (for other chromosomes, see Figure S5). While the baseline of the modeled diversity landscape closely matches the data, the modeled reduction in the diversity valley of this chromosome deviates strongly from the observed one. This was the case for basically all diversity valleys, of which there usually are one or two per chromosome [see Ellegren and Burri ]. Higher mutation rates or lower recombination rates than the estimates used could explain some of the difference between B_m and B_d in valleys (see File S1).

Figure 2

(a) Genome-wide correlation between B_m (the modeled deviation from neutral diversity) and B_d (B calculated from data) under two different linked-selection models. The gray line marks identity for better orientation. Colored lines are linear regressions, with blue showing only background selection (model 1) and green showing BGS and recent sweeps (model 2). (b) A comparison between B_d (black) and B_m calculated with BGS (model 1, blue line) and BGS plus recent sweeps (model 2, green), with s_b = 0.1 and α = 1, for chromosome 12. BGS, background selection.

If we include recent strong positive selection as indicated by nonsynonymous substitutions fixed in the collared flycatcher lineage since its split from pied flycatcher (model 2), the overall correlation between B_m and B_d for s = 0.1 improves (Figure 2a). As the number of sweeps modeled is likely to be higher than the actual number, model 2 underestimates B more often and has higher variance than model 1, but the correlation is much better than with model 1. This was consistent for most chromosomes (Table S1). B_m now showed deep and pronounced diversity valleys (Figure 2b). With lower s, the valley in chromosome 12 is less deep and the correlation is lower, but still higher than with BGS alone (BGS alone, r² = 0.55; s = 0.01, r² = 0.58; s = 0.05, r² = 0.81; s = 0.1, r² = 0.85).
Model 3 incorporates sweeps at sites of substitutions since the split from the more distantly related red-breasted flycatcher. The overall correlation using α = 0.2 for the ancient sweeps was weaker than with both model 1 and model 2. Nevertheless, as exemplified for the terminal diversity valley of chromosome 12, some dips were only explained by model 3 (Figure 3). For some chromosomes, model 3 in fact gave the best fit (e.g., in chromosome 19, where model 2 gave only r = 0.03; Figure 4 and Table S1). This is important since it means that while sweeps are needed to explain diversity valleys, they do not necessarily have to be recent. Using α = 1 for the ancient sweeps, as for the recent sweeps, lowers the correlation for chromosome 12 (r² = 0.51 vs. 0.76 with α = 0.2).

Figure 3

A comparison between B_d (black) and B_m calculated with background selection plus recent (α = 1) and ancient (α = 0.2) sweeps (model 3, purple). The gray line represents B under model 1. Parameters for recent sweeps as in Figure 2.

Figure 4

A comparison between B_d (black) and B_m from model 1 (blue), model 2 (green), and model 3 (purple) for chromosome 19. Parameters are as in Figure 2.

To summarize, which of the three models of linked selection best explains the observed local reductions in diversity differs along the genome. While BGS largely explains the base level of genomic diversity, the π valleys are explained better with either recent or ancient sweeps. Across all chromosomes, 56% of the π valleys were explained by model 2 (including recent sweeps), 26% by model 3 (also including ancient sweeps), and 9% by model 1 (only BGS), while 9% were not covered by any of the models. See Figure S5 for a comparison across all chromosomes.
Individual-based simulations of diversity along one example chromosome confirmed our analytical results. Even though the general shape of the diversity landscape and also the base level of simulated B closely matched the analytical expectation under BGS, the diversity valley was not visible if we only allowed for deleterious mutations (Figure 5, blue line). If, in addition, beneficial mutations occurred, a diversity valley emerged (Figure 5, green line), though not as pronounced as in the empirical data. Importantly, the valley formed even though beneficial mutations occurred randomly all over the chromosome and not only in the valley region.

Figure 5

Simulation of chromosome 12, population size N = 2000, run for 6N generations, selection coefficient for deleterious mutations , averaged over 40 runs. Blue line: only background selection; green line: background selection plus 1% beneficial mutations with . The black line shows B calculated from the empirical data.

Discussion

Both the simulations and the analytical results show that BGS can explain a large part of the base-level and small-scale variation of genetic diversity in the flycatcher genome. The correlation between our BGS model and π from the data lies within the range of correlations found in similar studies in human (McVicker ) and Drosophila (Comeron 2014; Elyashiv ). On the other hand, the deep diversity valleys present in the data cannot be explained with BGS alone. When we included selective sweeps at candidate sites in the model or when we added 1% positively selected mutations to the simulations, we generally obtained a better fit to the empirical data. However, positive selection in the model needs to be relatively high to generate valleys as pronounced as observed in the data. There are alternative explanations to why BGS alone, with the parameters used herein, does not explain the full magnitude of variation in diversity levels. Since the rate and fitness effects of newly arising deleterious mutations are key parameters in determining the strength of BGS, a major limitation of this study might be the restriction to consider only point mutations. Other mutation events, such as short insertions or deletions, might contribute a substantial part of the total input of deleterious mutations (e.g., Mills ). Additionally, we assumed a distribution of fitness effects for deleterious mutations that was estimated based on nonsynonymous sites and might not accurately represent the distribution of effects of other mutation types. However, as discussed in File S1, while other reasons than selective sweeps, like under- or overestimation of used parameters (density of functional sites, mutation and recombination rate), or demography, may contribute to the depth of diversity valleys, we found none that would likely explain it exclusively. 
It is interesting to consider that while forces other than linked selection could increase the valley depth, they would also likely distort the correlation of π with the linked-selection landscape formed by recombination and the density of functional sites. Additionally, there are several modes of selection, like balancing selection, that our models did not consider. As we only used nonsynonymous substitutions as candidate sites for positive selection, it is also possible that additional sweeps affected noncoding regions, which might result in diversity valleys not explained by any of our models. These are possible explanations for the valleys that our models could not explain.
Our results concerning the relative contributions of BGS and hitchhiking to linked selection go in the same direction as the findings of Elyashiv in Drosophila melanogaster. For 100-kb windows, they found a correlation between π and B with BGS + sweeps of 0.44, compared to 0.42 with only BGS. The difference between the models with and without sweeps was thus larger in flycatchers than in Drosophila. Notably, the valleys in the heterogeneous diversity landscape so characteristic of flycatchers were better explained with sweeps at sites of nonsynonymous substitutions in the flycatcher lineage than with BGS alone. The simulations showed that even with a random location of sweeps (i.e., not restricted to sites of nonsynonymous substitutions), diversity valleys were better captured than with BGS alone (see Figure 5). This can be explained by the higher impact of positive selection in regions with low recombination. With a higher overall effect of linked selection, recombination rate has a higher impact on the diversity landscape, leading to the formation of pronounced diversity valleys in low-recombination regions.
In general, model 3 (with recent and ancient sweeps) gave a worse fit than model 2 (recent sweeps). This is because more false sweeps are introduced, producing valleys that do not fit the data and negatively affect the correlation. Although the model with recent sweeps was better than other models in explaining diversity valleys, in some cases only the model that also incorporated ancient sweeps could explain the presence of such valleys (see Figure 3 and Figure 4). That sweeps do not need to be recent or specific to the focal species to generate locally reduced diversity levels is consistent with the finding that the locations of valleys are often conserved among Ficedula species (Burri ). Most diversity valleys are located at chromosome ends and/or at the position of presumed centromeres (Ellegren ; Burri ). Sweeps could occur relatively frequently in these regions due to a meiotically driven arms race in centromeres or telomeres (Henikoff ; Malik and Henikoff 2009). A role for positive selection, although not an exclusive one, in shaping the heterogeneous flycatcher diversity landscape is also consistent with the observation that Fay and Wu’s H statistic often shows signatures of positive selection in diversity valleys (Burri ).
Contrary to the expectations for speciation islands, our findings indicate that diversity valleys in flycatchers are not a direct consequence of locally reduced gene flow. However, the reduced effective population size caused by the effects of linked selection in these regions might still promote the rapid build-up of drift-induced Dobzhansky–Muller incompatibilities or other isolating mechanisms during an allopatric phase, thereby reducing subsequent gene flow in secondary contact (Dobzhansky 1936; Muller 1942). Thus, although reduced gene flow can be safely rejected as the cause of differentiation peaks in flycatchers, fully rejecting them as “speciation islands” might be premature.
Our results indicate that positive selection contributes to the base variation of genetic diversity across the genome rather than producing classical outliers. Recombination rate and the (spatial and temporal) abundance of selected mutations determine how selection influences nearby neutral sites. This is true for deleterious and beneficial mutations, though the impact varies. Thus, it is quite reasonable to conclude that recombination rate and functional site density form a linked-selection scaffold, which determines the impact of linked selection in a certain region of the genome. This is the shape that we can calculate with equation 1, while the total impact depends on selection strength and sign. This may be one explanation why selective sweeps are so hard to find: they are embedded in the general linked-selection landscape. But it is important to keep in mind that the reverse is also true: if there are diversity valleys in the data, it does not necessarily mean that there is something special happening; the impact of linked selection might just be stronger in this region. This is also true for summary statistics that depend on linked selection. Note that even for windows without sites under selection we do not find B = 1, which means that the whole genome is influenced by linked selection. Thus, we should be careful when using intergenic regions as a neutral baseline. Our results thus add to the general demand for using BGS as a null hypothesis for diversity levels (Comeron 2017).

43 in total

1. Speciation with gene flow could be common.

Authors: Patrik Nosil
Journal: Mol Ecol Date: 2008-04-10 Impact factor: 6.185

2. Measures of divergence between populations and the effect of forces that reduce variability.

Authors: B Charlesworth
Journal: Mol Biol Evol Date: 1998-05 Impact factor: 16.240

Review 3. Making sense of genomic islands of differentiation in light of speciation.

Authors: Jochen B W Wolf; Hans Ellegren
Journal: Nat Rev Genet Date: 2016-11-14 Impact factor: 53.242

4. The effect of deleterious mutations on neutral molecular variation.

Authors: B Charlesworth; M T Morgan; D Charlesworth
Journal: Genetics Date: 1993-08 Impact factor: 4.562

Review 5. Genomic signatures of selection at linked sites: unifying the disparity among species.

Authors: Asher D Cutter; Bret A Payseur
Journal: Nat Rev Genet Date: 2013-03-12 Impact factor: 53.242

Review 6. Islands of speciation or mirages in the desert? Examining the role of restricted recombination in maintaining species.

Authors: M A F Noor; S M Bennett
Journal: Heredity (Edinb) Date: 2009-12 Impact factor: 3.821

Review 7. Genomics and the origin of species.

Authors: Ole Seehausen; Roger K Butlin; Irene Keller; Catherine E Wagner; Janette W Boughman; Paul A Hohenlohe; Catherine L Peichel; Glenn-Peter Saetre; Claudia Bank; Ake Brännström; Alan Brelsford; Chris S Clarkson; Fabrice Eroukhmanoff; Jeffrey L Feder; Martin C Fischer; Andrew D Foote; Paolo Franchini; Chris D Jiggins; Felicity C Jones; Anna K Lindholm; Kay Lucek; Martine E Maan; David A Marques; Simon H Martin; Blake Matthews; Joana I Meier; Markus Möst; Michael W Nachman; Etsuko Nonaka; Diana J Rennison; Julia Schwarzer; Eric T Watson; Anja M Westram; Alex Widmer
Journal: Nat Rev Genet Date: 2014-03 Impact factor: 53.242

8. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers.

Authors: Reto Burri; Alexander Nater; Takeshi Kawakami; Carina F Mugal; Pall I Olason; Linnea Smeds; Alexander Suh; Ludovic Dutoit; Stanislav Bureš; Laszlo Z Garamszegi; Silje Hogner; Juan Moreno; Anna Qvarnström; Milan Ružić; Stein-Are Sæther; Glenn-Peter Sætre; Janos Török; Hans Ellegren
Journal: Genome Res Date: 2015-09-09 Impact factor: 9.043

9. Background selection as baseline for nucleotide variation across the Drosophila genome.

Authors: Josep M Comeron
Journal: PLoS Genet Date: 2014-06-26 Impact factor: 5.917

10. Biased Inference of Selection Due to GC-Biased Gene Conversion and the Rate of Protein Evolution in Flycatchers When Accounting for It.

Authors: Paulina Bolívar; Carina F Mugal; Matteo Rossi; Alexander Nater; Mi Wang; Ludovic Dutoit; Hans Ellegren
Journal: Mol Biol Evol Date: 2018-10-01 Impact factor: 16.240
