Repeated adaptive introgression at a gene under multiallelic balancing selection.

Vincent Castric, Jesper Bechsgaard, Mikkel H Schierup, Xavier Vekemans.

Abstract

Recently diverged species typically have incomplete reproductive barriers, allowing introgression of genetic material from one species into the genomic background of the other. The role of natural selection in preventing or promoting introgression remains contentious. Because of genomic co-adaptation, some chromosomal fragments are expected to be selected against in the new background and resist introgression. In contrast, natural selection should favor introgression for alleles at genes evolving under multi-allelic balancing selection, such as the MHC in vertebrates, disease resistance, or self-incompatibility genes in plants. Here, we test the prediction that negative, frequency-dependent selection on alleles at the multi-allelic gene controlling pistil self-incompatibility specificity in two closely related species, Arabidopsis halleri and A. lyrata, caused introgression at this locus at a higher rate than the genomic background. Polymorphism at this gene is largely shared, and we have identified 18 pairs of S-alleles that are only slightly divergent between the two species. For these pairs of S-alleles, divergence at four-fold degenerate sites (K = 0.0193) is about four times lower than the genomic background (K = 0.0743). We demonstrate that this difference cannot be explained by differences in effective population size between the two types of loci. Rather, our data are most consistent with a five-fold increase of introgression rates for S-alleles as compared to the genomic background, making this study the first documented example of adaptive introgression facilitated by balancing selection. We suggest that this process plays an important role in the maintenance of high allelic diversity and divergence at the S-locus in flowering plant families. 
Because genes under balancing selection are expected to be among the last to stop introgressing, their comparison in closely related species provides a lower-bound estimate of the time since the species stopped forming fertile hybrids, thereby complementing the average portrait of divergence between species provided by genomic data.

Year:  2008        PMID: 18769722      PMCID: PMC2517234          DOI: 10.1371/journal.pgen.1000168

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Introduction

The genomes of incipient species diverge at heterogeneous rates, and recently diverged model species are key systems to investigate the causes of this heterogeneity [1]–[3]. Hybridization followed by introgression between recently diverged plant and animal species with incomplete reproductive barriers is one of the main processes generating the genomic heterogeneity in species divergence [4]. Indeed, some regions appear to be crossing the species barriers more readily than the genomic background (in Helianthus [5], Anopheles [6], Quercus [7], Mytilus [8], Mus [9] and Drosophila [10]). Although much of this heterogeneity may be accounted for by stochasticity of the genetic drift process, natural selection may also play an important role. In particular, because introgressive hybridization brings genetic material from one species into the co-adapted background of another species, some chromosomal fragments are expected to be selected against and resist introgression [11]. On the other hand, selection can also promote introgression when a transferred chromosome fragment is advantageous in the recipient species. In such a situation, introgression can potentially mediate the transfer of adaptations. Examples of adaptive introgression involving the transfer of transgenes conferring adaptations such as herbicide or insect resistance via hybridization with close relatives of crop species [12] have been documented, but other examples in natural populations are strikingly rare [13]. In the Louisiana Iris species complex for instance, detailed experimental studies provided support for the transfer of adaptations (flood and shade tolerance) between Iris fulva and I. hexagona [14]. In Helianthus, a recent experimental study reported that herbivore resistance traits have introgressed from Helianthus debilis to H. annuus, thereby increasing adaptation of their naturally occurring hybrid H. annuus texanus [15].
All these documented examples are thus associated with strong directional selection for adaptive traits recently evolved in one of the species and then transmitted horizontally. Theory predicts that adaptive introgression should also be a general property of alleles at genes evolving under multi-allelic balancing selection, such as the vertebrate MHC system, plant disease resistance or self-incompatibility (SI) genes [16]. In these systems, rare alleles enjoy a strong selective advantage [17]. Assuming that a given allele is absent from one of two related species, introgression of this allele would then be as strongly favored as a new allele arising by mutation, unless this is impeded by linked genes that are not well adapted to the recipient species. Thus, in multi-allelic systems evolving under balancing selection, repeated exchanges of alleles promoted by adaptive introgression may be expected between closely related species, as long as fertile hybrids can be formed. Therefore, in the course of evolution of strong reproductive isolation between incipient species, such genomic regions should be among the last to stop introgressing. In this study, we test whether multi-allelic balancing selection mediates introgression between closely related species. We do this by contrasting divergence of a portion of the gene controlling self-incompatibility specificity (SRK) with the background level of genomic divergence in two closely related plant species. The study system consists of two closely related Arabidopsis species, A. lyrata and A. halleri, whose genomes diverged approximately 2 million years ago [18]. The two species have overlapping distributions in Northern Europe [19] and relatively recent introgression has been demonstrated for a small fraction of nuclear genes [20]. SI prevents self-fertilization and some matings among relatives through recognition and rejection of pollen expressing identical specificity. 
Molecular and genetic analyses of the self-incompatibility locus (S-locus) in A. lyrata and A. halleri identified many specificities, and the SRK sequences often form monophyletic pairs of high sequence similarity, each of which probably represents the same SI specificity in the two species, derived from one specificity in their common ancestor. We refer to these pairs as trans-specifically shared pairs of S-alleles. We use divergence at fourfold degenerate sites between alleles within trans-specifically shared pairs to estimate the divergence corresponding to the time of the last introgression event for S-alleles between the two species, and we find that introgression has occurred at a higher rate or continued over more extended periods of time at the S-locus than at the rest of the nuclear genome.

Results

Extent of trans-Specific Allele Sharing at SRK

Our species-wide survey of sequence diversity reveals that a large fraction of alleles at the pistil self-incompatibility specificity-determining gene SRK (S-locus receptor kinase) are trans-specifically shared between the two species (Figure 1). Overall, we find 30 sets of SRK sequences in A. halleri and 38 sets of SRK sequences in A. lyrata. As is typical for S-alleles [21], the sequences fall into sets of nearly identical ones (presumably representing the same specificity, [21]–[23]) and ones with many differences from all other sequences (presumably representing functionally distinct specificities), with the most similar pairs within A. halleri and A. lyrata showing 44 and 51 differences, respectively, over a total of about 570 nucleotides. We then compared nucleotide sequences between S-alleles from the two species and find that the mismatch distribution (Figure 2) is clearly bimodal. Most comparisons are in line with intraspecific comparisons and range between 45 and 218 differences over a total of about 570 nucleotides (see also Figure S5), but the distribution shows a distinct set of 18 highly similar interspecific pairs of sequences (indicated by brackets in Figure 1) with at most 12 nucleotide differences. The numbers of non-synonymous differences within the 18 highly similar pairs of S-alleles ranged from 0 to 9 over a total of 380 non-synonymous sites. These sequences are more similar than pairs of alleles known to have retained the same specificity when comparing the closely related Brassica oleracea and B. rapa [24]–[27]. Even if these sequences currently occur in two different (but closely related) species, we therefore hypothesize that these pairs have retained identical specificity. We refer to these 18 pairs of S-alleles as “trans-specifically shared” pairs of alleles and note that they represent 60% and 47% of S-alleles found to date in A. halleri and A. lyrata, respectively. 
Two of these pairs (AlSRK37/AhSRK04 and AlSRK16/AhSRK10) were previously identified and shown additionally to be shared trans-specifically with A. thaliana [28]. Phylogenetic reconstructions show that both synonymous (Figure S2) and non-synonymous (Figure S3) differences are strikingly low within trans-specifically shared pairs and high among pairs. Note that, by definition, SRK alleles we consider as trans-specific pairs are determined based on those S allele pairs that have the fewest differences, so the procedure could potentially lead to ascertainment bias. Yet, close examination of the next best candidates (AhSRK03/AlSRK28, AhSRK28/AlSRK03, AhSRK23/AlSRK06 and AhSRK20/AlSRK04, Figure 1) suggests that none of these pairs is likely to represent pairs of trans-specifically shared alleles (detailed arguments are presented in Text S1).
Figure 1

Phylogeny of the 68 SRK sequences of A. lyrata and A. halleri.

The phylogeny was obtained by the neighbour-joining method on pairwise proportion of nucleotide divergence after Jukes-Cantor's correction. Brackets indicate interspecific pairs of sequences assumed to represent “trans-specifically shared S-alleles”, i.e. alleles assumed to have evolved from a single S-allele in the direct ancestor of A. lyrata and A. halleri.

Figure 2

Distribution of the number of pairwise nucleotide differences for SRK sequences in interspecific comparisons between A. halleri and A. lyrata.

Note the distinct peak of highly similar sequences observed. The vertical arrow represents the chosen threshold to define “trans-specifically shared” pairs of sequences (≤12 nucleotide differences). Note also that the two pairs of sequences with intermediate nucleotide differences (45 between AlSRK03 and AhSRK28, and 50 between AlSRK28 and AhSRK03) cannot represent trans-specifically shared S-alleles because they are not monophyletic (see Figure 1).

Divergence within trans-Specifically Shared Pairs of S-Alleles

Within SRK, several hypervariable (HV) regions have been identified in the domain responsible for binding the pollen protein (S-domain) and shown to be targets of positive selection, suggesting they are involved in determination of specificity [29],[30]. Accordingly, HV regions from different specificities within species typically show an excess of non-synonymous substitutions [29],[31],[32]. In sharp contrast, we find that as compared to synonymous differences, non-synonymous differences are relatively less frequent in HV regions than in non-HV regions (on average 0.7 and 2.3 differences in HV and non-HV regions respectively for non-synonymous differences, versus 1.1 and 1.6 differences respectively for synonymous differences, Table 1). This contrast is significant by Fisher's exact test of independence (odds ratio = 2.5, p = 0.029), suggesting that sequence pairs that putatively encode the same specificity tend to have similar HV region sequences for non-synonymous sites, but might differ at synonymous sites in these regions, whereas other regions may differ at both types of sites.
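The contrast described above can be illustrated with a small, self-contained sketch of Fisher's exact test on a 2×2 table. The per-pair counts below are our reading of the flattened Table 1, and pooling them across the 18 pairs is an assumption about how the test was set up; the function name `fisher_exact_2x2` is hypothetical. The text does not say whether the reported p-value is one- or two-sided, so the sketch returns both.

```python
from math import comb

# Per-pair difference counts as read from Table 1 (18 trans-specific pairs).
# Pooling them into one 2x2 table is an assumption made for illustration.
syn_hv  = [0, 0, 3, 1, 2, 0, 1, 1, 2, 1, 2, 0, 3, 0, 0, 3, 0, 1]
syn_non = [1, 0, 4, 2, 3, 0, 3, 2, 1, 4, 1, 1, 0, 2, 1, 1, 0, 2]
ns_hv   = [1, 0, 1, 2, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 3]
ns_non  = [2, 1, 3, 7, 2, 3, 0, 2, 3, 4, 0, 7, 0, 2, 2, 2, 0, 2]


def fisher_exact_2x2(a, b, c, d):
    """Return (sample odds ratio, one-sided p, two-sided p) for [[a, b], [c, d]].

    Assumes b * c > 0 so that the sample odds ratio is defined.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # One-sided: tables with a top-left count at least as large as observed
    p_greater = sum(prob(x) for x in range(a, hi + 1))
    # Two-sided: all tables no more probable than the observed one
    p_two = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return (a * d) / (b * c), p_greater, p_two


# Rows: synonymous, non-synonymous; columns: in HV, not in HV
table = (sum(syn_hv), sum(syn_non), sum(ns_hv), sum(ns_non))
odds_ratio, p_one_sided, p_two_sided = fisher_exact_2x2(*table)
print(table, odds_ratio, p_one_sided, p_two_sided)
```

With these pooled counts the sample odds ratio agrees with the value of 2.5 quoted in the text; the p-values depend on the sidedness convention and are therefore printed rather than asserted here.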
Table 1

Divergence between Arabidopsis halleri and A. lyrata at trans-specifically shared SRK alleles at synonymous (K S), non synonymous (K A) and fourfold degenerate sites (K 4fold).

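Columns: A. lyrata allele | A. halleri allele | Coding sequence length | Nucleotides in HV | Synonymous differences in HV | Synonymous differences not in HV | Non-synonymous differences in HV | Non-synonymous differences not in HV | K S | K A | K 4fold
AlSRK01 | AhSRK01 | 578 | 124 | 0 | 1 | 1 | 2 | 0.0080 | 0.0045 | 0
AlSRK03 | AhSRK03 | 598 | 124 | 0 | 0 | 0 | 1 | 0 | 0.0021 | 0
AlSRK08 | AhSRK19 | 593 | 124 | 3 | 4 | 1 | 3 | 0.0581 | 0.0086 | 0.0969
AlSRK11 | AhSRK11 | 565 | 122 | 1 | 2 | 2 | 7 | 0.0241 | 0.0185 | 0.0423
AlSRK13 | AhSRK29 | 537 | 91 | 2 | 3 | 1 | 2 | 0.0444 | 0.0072 | 0.0330
AlSRK14 | AhSRK09 | 598 | 124 | 0 | 0 | 0 | 3 | 0 | 0.0064 | 0
AlSRK15 | AhSRK21 | 590 | 124 | 1 | 3 | 1 | 0 | 0.0331 | 0.0022 | 0.0163
AlSRK16 | AhSRK10 | 592 | 124 | 1 | 2 | 0 | 2 | 0.0158 | 0.0022 | 0
AlSRK17 | AhSRK02 | 557 | 118 | 2 | 1 | 1 | 3 | 0.0251 | 0.0093 | 0.0160
AlSRK20 | AhSRK24 | 551 | 106 | 1 | 4 | 0 | 4 | 0.0446 | 0.0093 | 0.0347
AlSRK21 | AhSRK06 | 516 | 124 | 2 | 1 | 0 | 0 | 0.0260 | 0 | 0
AlSRK22 | AhSRK26 | 568 | 122 | 0 | 1 | 0 | 7 | 0.0081 | 0.0137 | 0
AlSRK28 | AhSRK28 | 548 | 124 | 3 | 0 | 0 | 0 | 0.0258 | 0 | 0.0300
AlSRK31 | AhSRK16 | 509 | 91 | 0 | 2 | 1 | 2 | 0.0194 | 0.0075 | 0.0174
AlSRK34 | AhSRK05 | 539 | 111 | 0 | 1 | 0 | 2 | 0.0084 | 0.0048 | 0.0163
AlSRK37 | AhSRK04 | 592 | 124 | 3 | 1 | 1 | 2 | 0.0315 | 0.0087 | 0.0287
AlSRK39 | AhSRK18 | 573 | 124 | 0 | 0 | 0 | 0 | 0 | 0 | 0
AlSRK42 | AhSRK12 | 552 | 124 | 1 | 2 | 3 | 2 | 0.0261 | 0.0116 | 0.0163
Average | | 569 | 118 | 1.1 | 1.6 | 0.7 | 2.3 | 0.0221 | 0.0065 | 0.0193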

All estimates were Jukes & Cantor corrected. HV refers to hypervariable regions as defined by Nishio and Kusaba [60].
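As a reminder of what the Jukes and Cantor correction does to a raw proportion of differing sites, the following minimal sketch applies the standard formula d = -(3/4) ln(1 - 4p/3). The function name `jukes_cantor` is ours, and the worked value (12 differences over about 570 nucleotides, the threshold used for trans-specifically shared pairs) is only an illustration, not an estimate reported in the paper.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        # The correction is undefined once p reaches the saturation value 3/4
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


# Illustrative input: 12 differences over roughly 570 aligned nucleotides
p_obs = 12 / 570
d_corrected = jukes_cantor(p_obs)
print(f"raw p = {p_obs:.4f}, JC-corrected d = {d_corrected:.4f}")
```

At such low divergence the correction is small, which is why raw and corrected values are nearly interchangeable for the shared S-allele pairs, whereas it matters more for the highly divergent comparisons between specificities.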

If introgression occurs, then divergence might also be affected by the dominance of the S-alleles. Indeed, complex patterns of dominance relationships generally occur among alleles in sporophytic SI systems [33] and Billiard et al. [34] reported asymmetric selective pressures for dominant and recessive S-alleles because rare dominant S-alleles will tend to express their specificity more often than rare recessive ones (a process similar to “Haldane's sieve”, the bias against the establishment of recessive beneficial mutations [35],[36]). Hypothesizing that the introgression rate thus differs between dominant and recessive S-alleles, we tested for an effect of dominance on divergence between the two species. The range of variation observed for nucleotide differences across pairs of trans-specifically shared S-alleles cannot be explained fully by the stochasticity of the substitution process (Fisher's dispersion index = 2.03, P = 0.0103), but there was no obvious relationship between number of nucleotide differences and level of dominance of the S-alleles, as inferred from the phylogeny of alleles as suggested by [37]. Thus, we find no evidence that dominance affects S-allele divergence between the two species.

Comparison of Introgression Rate between S-Locus and Genomic Background

To test whether balancing selection resulted in adaptive introgression of S-alleles between the two species, we compared levels of divergence at fourfold degenerate sites between trans-specifically shared S-alleles with that of the genomic background, estimated from twelve unlinked control genes and two S-gene family members. These two sets of control genes give similar mean values of divergence (K 4fold = 0.0743 and K 4fold = 0.0904, respectively, Table 2), which are about four times higher than the average for trans-specifically shared pairs of S-alleles (K 4fold = 0.0193, Table 1).
Table 2

Divergence between Arabidopsis halleri and A. lyrata at reference genes and members of the S-gene family at fourfold degenerate sites (K 4fold).

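Columns: Group | Gene | Coding sequence length | Number of sequences analysed, A. lyrata | Number of sequences analysed, A. halleri | K 4fold | References
Genomic background | CAD | 956 | 8 | 8 | 0.0483 | [20]
Genomic background | CHI | 264 | 10 | 10 | 0.1468 | [20]
Genomic background | CHS | 1177 | 12 | 11 | 0.0881 | [20]
Genomic background | DFR | 346 | 10 | 8 | 0.0610 | [20]
Genomic background | F3H | 450 | 10 | 10 | 0.1223 | [20]
Genomic background | FAH1 | 1054 | 10 | 8 | 0.0739 | [20]
Genomic background | GS | 906 | 14 | 12 | 0.0861 | [20]
Genomic background | MAML | 388 | 12 | 11 | 0.0312 | [20]
Genomic background | CAUL | 246 | 18 | 36 | 0.0184 | [39],[40]
Genomic background | HAT4 | 340 | 19 | 34 | 0.0531 | [39],[40]
Genomic background | ScADH | 515 | 27 | 34 | 0.0323 | [39],[40]
Genomic background | Aly9* | 443 | 12 | 28 | 0.1296 | [39],[40]
Genomic background | Average | | | | 0.0743 |
S-gene family | Aly10.1 | 936 | 1 | 1 | 0.1043 | this study
S-gene family | Aly10.2 | 466 | 1 | 1 | 0.0765 | this study
S-gene family | Average | | | | 0.0904 |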

All estimates were Jukes & Cantor corrected.

*: Note that Aly 9 is a member of the S-domain gene family, but polymorphism data for this gene was used here to increase the genomic background dataset.
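The averages in Table 2 and the "about four times" contrast with the S-allele pairs can be recomputed directly from the per-gene values. The sketch below does only that arithmetic; the variable names are ours, and the values are copied from Tables 1 and 2 as we read them.

```python
from statistics import mean

# K 4fold per reference gene (genomic background), from Table 2
background_k4 = {
    "CAD": 0.0483, "CHI": 0.1468, "CHS": 0.0881, "DFR": 0.0610,
    "F3H": 0.1223, "FAH1": 0.0739, "GS": 0.0861, "MAML": 0.0312,
    "CAUL": 0.0184, "HAT4": 0.0531, "ScADH": 0.0323, "Aly9": 0.1296,
}
# K 4fold for the two S-gene family members, from Table 2
s_family_k4 = {"Aly10.1": 0.1043, "Aly10.2": 0.0765}
# Average K 4fold over the 18 trans-specifically shared S-allele pairs (Table 1)
k4_shared_s_alleles = 0.0193

mean_background = mean(list(background_k4.values()))
mean_s_family = mean(list(s_family_k4.values()))

# Fold difference between each control set and the shared S-allele pairs
fold_background = mean_background / k4_shared_s_alleles
fold_s_family = mean_s_family / k4_shared_s_alleles

print(f"background mean = {mean_background:.4f} ({fold_background:.1f}x)")
print(f"S-gene family mean = {mean_s_family:.4f} ({fold_s_family:.1f}x)")
```

This is a plain ratio of averages and ignores the difference in effective population size between the two classes of loci, which is why the coalescent simulations are needed before the contrast can be interpreted.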

Because a large number of S-alleles are actively maintained within species by balancing selection, each S-allele has individually a small effective population size [21]. Thus, estimates of divergence for S-alleles and reference genes cannot be compared directly because of differences in effective population sizes (Figure 3). To take this into account, we used coalescent simulations to test whether our data are compatible with a null model of speciation (the “isolation with migration” model of Nielsen and Wakeley, [38]) that assumes identical introgression rate for S-alleles and the genomic background. Under this model, we first used previously published species-wide polymorphism data in A. halleri and A. lyrata from [20],[39],[40] to estimate rates of introgression, splitting time t as well as θ_A = 4N_Aμ, where N_A is the effective population size in their common ancestor and μ the substitution rate. The maximum likelihood estimates for the directional rates of introgression and for θ_A are m_hal→lyr = 2.775×10^-7, m_lyr→hal = 2.912×10^-7, and θ_A = 1.7975 (Table 3). The t estimate is 2,533,980 years [1,307,952–5,166,833], which is entirely consistent with the previous 2 Myrs estimate by Koch & Matschinger [18]. All estimates converge satisfactorily based on 10 replicate runs with different random seeds. To single out the N_A estimate, we then used A. thaliana as outgroup to obtain a substitution rate at fourfold degenerate sites of μ = 1.296×10^-8 substitutions per nucleotide per year ([9.218×10^-9–1.781×10^-8] as 95% credible interval). The resulting estimate for N_A is 253,892 with [13,772–663,510] as 95% credible interval. Based on these parameters, we then simulated the evolution of two species exchanging migrants at the rate estimated above.
The simulations were entirely consistent with the data for the genomic background (K = 0.0678 [0.0423–0.0955], Figure 4). In sharp contrast, conservatively assuming a reduction of effective population size for S-alleles by a factor 50 (as expected if 50 different S-alleles segregate in each species) only led to a modest reduction in divergence (K = 0.0465, Figure 4), whose 95% credible interval [0.0305–0.0640] did not comprise the observed value for K (K 4fold = 0.0193). Hence, the data are not consistent with equal introgression for S-alleles and the genomic background. This result is robust to the conservative use of the lower boundary of the 95%CI for either N A or t. Increasing the rates of introgression for S-alleles led to a sharp reduction in divergence between A. halleri and A. lyrata. The simulations best fitted the data when the directional rates of introgression were empirically increased for S-alleles by a factor 5, with divergence value closely approaching the observed data (K = 0.0182, Figure 4). A simpler analysis also confirmed that average net interspecific divergence [41] for S-alleles was lower than that at the genomic background (Text S1).
Figure 3

Divergence process between Arabidopsis lyrata, A. halleri and A. thaliana at unlinked genes (genomic background) and trans-specifically shared pairs of S-alleles.

Divergence times were taken from Koch et al. [18],[58]. θ_lyrata, θ_halleri and θ_A refer to polymorphism (θ = 4Nμ) in A. lyrata, A. halleri and their common ancestor. As compared to unlinked genes, divergence between trans-specifically shared S-alleles is affected by two confounding factors: (1) lower effective population size than the genomic background reducing coalescence time in the ancestral species, and (2) expected higher introgression as represented by thicker dark grey arrows.

Table 3

Estimates of θ = 4Nμ, effective population sizes, splitting time and rates of introgression using the isolation with migration model [38].

Columns: Parameter | ML estimate | 95% CI
Common ancestor
θ_A | 1.7975 | 0.0975–4.6975
N_A | 253,892 | 13,772–663,510
A. lyrata
θ_lyrata | 1.2225 | 0.7635–1.8405
N_lyrata | 172,675 | 107,842–259,966
A. halleri
θ_halleri | 0.7785 | 0.4635–1.2195
N_halleri | 109,961 | 65,468–172,251
Splitting time
t (years) | 2,533,980 | 1,307,952–5,166,833
Rates of introgression
m_hal→lyr | 2.775×10^-7 | 5.186×10^-7–7.510×10^-7
m_lyr→hal | 2.912×10^-7 | 2.035×10^-7–1.059×10^-7

N_A: Effective population size in the common ancestor of A. lyrata and A. halleri.

N_lyrata: Effective population size in A. lyrata.

N_halleri: Effective population size in A. halleri.

m_hal→lyr: Rate at which genes come into A. lyrata from A. halleri as time moves forward.

m_lyr→hal: Rate at which genes come into A. halleri from A. lyrata as time moves forward.
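The θ and N rows of Table 3 are linked by θ = 4Nμ, so each pair of estimates implies a value for the scaled mutation rate. The sketch below back-calculates that value for the three populations and checks that they agree. It assumes that a single mutation rate u (per locus, per generation, in the units used by the IM analysis) converts every θ into its N; this is our reading of the table, and the paper does not spell out the scaling.

```python
# (theta, N) maximum likelihood estimates from Table 3
estimates = {
    "common ancestor": (1.7975, 253_892),
    "A. lyrata": (1.2225, 172_675),
    "A. halleri": (0.7785, 109_961),
}

# Rearranging theta = 4 * N * u gives the mutation rate implied by each row.
# Assumption: one shared u, in whatever per-locus units the IM run used.
implied_u = {name: theta / (4 * n) for name, (theta, n) in estimates.items()}

for name, u in implied_u.items():
    print(f"{name}: implied u = {u:.4e}")

# Relative spread across the three populations (near zero if one u was used)
spread = max(implied_u.values()) / min(implied_u.values()) - 1
print(f"relative spread = {spread:.2e}")
```

The three implied values coincide to within rounding, which is what one would expect if the N estimates were all obtained by dividing θ by the same scaled rate.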

Figure 4

Predicted nucleotide divergence between A. halleri and A. lyrata for the genomic background (grey line), S-alleles with the same rate of introgression as the genomic background (dotted line) and S-alleles with 5-fold increased rate of introgression relative to the genomic background (black line).

10,000 coalescent simulations were performed for each case using maximum likelihood parameter estimates obtained under the “isolation with migration” model, except for the dotted line, where the 2.5% low ancestral population size estimate was used in order to be conservative. Observed nucleotide divergence for the genomic background and S-alleles are represented by grey and black stars on the x-axis, respectively.

For three pairs of S-alleles (AlSRK01/AhSRK01, AlSRK34/AhSRK05, AlSRK37/AhSRK04) we also surveyed intra-allelic variation in at least 10 copies from each species. We found very little diversity among allelic copies within each surveyed allele in each species (average synonymous diversity = 0.0064, data not shown) in accordance with their low expected effective population sizes. We examined the sequences for shared polymorphisms, and found none in any of these S-allele pairs. This suggests old and infrequent, rather than recent, introgression events since the separation of A. lyrata and A. halleri. Moreover, the estimated divergence among pairs of S-alleles was more heterogeneous than expected based on the Poisson distribution, suggesting that the last introgression event occurred at different times for different alleles.

Discussion

Impact of Founder Events at Speciation

The possibility of introgression of S-alleles may have important consequences for the extent of allelic diversity maintained within self-incompatible species. If introgression occurs, hybridizing species effectively share a common pool of S-alleles. If hybridization is restricted, the two species together can maintain more S-alleles than each species individually [42]. Such a process could be especially important in the first stages of the split because reproductive barriers may then be more leaky, and also because allelic diversity at the S-locus within incipient species may be decreased if founding events were associated with speciation. This process could be responsible for maintaining many highly divergent allelic lineages at the S locus within plant families, where trans-generic sharing of allelic lineages seems to be the rule, and loss of ancestral allelic lineages through strong bottlenecks within particular genera the exception, as has been described in the Solanaceae [43]. It can therefore be misleading to use a species' extant number of lineages at a gene under balancing selection to estimate the minimum population size at speciation. For instance, using polymorphism data for MHC in humans, Takahata [44] predicted that the number of breeding individuals in the human lineage could not be as small as 50-100 at any time of its evolutionary history, assuming two extant ancestral allelic lineages at HLA-B. According to our hypothesis of adaptive introgression mediated by balancing selection, variation can be efficiently “rescued”, and stronger founder events at speciation would still be compatible with extant variation at HLA-B, if some interbreeding occurred with the chimpanzee lineage after the split. 
Although identifying the functional types of alleles may not be simple in that case (and recombination may confine the effect of balancing selection to a small region around the selected sites themselves), a detailed analysis of MHC alleles in the great apes would be of great interest to survey whether adaptive introgression mediated by balancing selection has indeed occurred in primates.

Shared Chloroplast Haplotypes: Distinguishing between Introgression and Ancestral Polymorphism

A recent study by Koch and Matschinger [18] reported that, whereas A. lyrata and A. halleri were well separated in phylogenetic trees based on the nuclear encoded ITS region, several cpDNA haplotypes are shared between both species [18]. This was interpreted as ancestral polymorphism segregating for the chloroplast but not the nucleus. However, this interpretation is at odds with the smaller effective population size expected for the chloroplast (approximately 1/2 for hermaphroditic species, [45]) and the consequent low expected variability. Indeed most studies in plants have found low sequence diversity for chloroplast genes, taking into account their low mutation rate [46], and also stronger differentiation among populations for chloroplast than nuclear markers [47]. In line with our results from S-alleles, we suggest the alternative interpretation that introgression occurred more readily for the chloroplast than nuclear genes, as has been reported in several instances (e.g. [48],[49]). The haplotype network of chloroplast sequences reported by Koch and Matschinger [18] also showed greater sharing of more basal haplotypes, suggesting that chloroplast introgression has become less common in recent times.

Evolution of New Specificities of Self-Incompatibility Genes

Our results also shed light on the evolution of self-incompatibility specificities. Indeed, our data strongly suggest that purifying selection prevents the substitution of non-synonymous differences within HV regions, supporting a role for these regions in determining specificity. More specifically, the strength of purifying selection seems higher on the HV regions than on the rest of the sequence, and this could be related to strong selection against mutations altering specificities. Mechanisms selecting against mutant S-alleles with altered pistil specificities have been discussed by Uyenoyama et al. [50]. Inter-species exchanges of S-alleles may, however, be important in the evolution of new specificities. Chookajorn et al. [51] suggested that new specificities could evolve if sufficient variation could be maintained within the pollen (or pistil) S gene for enough time to allow variants of the other gene to co-evolve with them. Due to the small effective population size of individual S-alleles, this hypothesis requires population structure with very limited migration [16]. Speciation with some introgression of S-alleles leads to precisely the strongly subdivided population needed for this mechanism to work. Under this hypothesis, two alleles could slowly evolve to different specificities in two isolated species and then add to the number of S-alleles in each species after reciprocal introgression. Data testing the specificities of sequence pairs in the two species that differ at few amino acids might help determine whether new specificities have indeed arisen in one species or the other since they split.

Material and Methods

We surveyed sequence diversity at SRK in two species-wide samples in A. halleri and A. lyrata over a total of about 570 nucleotides from the 3′ end of the S-domain using the strategy detailed in [31]. We identified and sequenced five and eight new putative S-alleles in A. halleri and A. lyrata, respectively. Overall, we analyzed 30 SRK sequences in A. halleri and 38 sequences in A. lyrata. In each case, the nucleotide sequence was obtained as a consensus over three independently obtained sequence products. All identified sequences in A. halleri and A. lyrata were amino-acid translated and aligned by ClustalW in BioEdit 7.0.5 [52] and adjusted by eye. On the overall set of sequences at SRK, we used MEGA 4 [53] to reconstruct a phylogeny using the Neighbor-Joining method based on the total number of differences per site or on the number of either synonymous or non-synonymous differences. Within each pair of trans-specifically shared sequences at SRK, we estimated the number of synonymous nucleotide differences per synonymous site between the A. halleri and the A. lyrata copy using the method of [54] with MEGA 4. A homogeneous substitution process across all pairs is expected to result in an accumulation of nucleotide differences according to the Poisson distribution. We used Fisher's dispersion index to test whether the distribution of nucleotide differences across trans-specifically shared sequence pairs could be explained by the stochasticity of the substitution process alone. We used Fisher's exact test of independence to test whether synonymous and non-synonymous differences hit HV regions equally frequently.

Inference on Introgression Patterns at the S-Locus

Background genomic divergence was estimated by the species-wide average nucleotide divergence at fourfold degenerate sites (K 4fold) between the two species for 12 reference genes that had been previously sequenced [20],[39],[40] and two genes that are members of the S-domain gene family (Aly10.1, Aly10.2). To determine whether a difference in effective population size, and thus in coalescence time, between S-alleles and the genomic background may suffice to explain the low divergence of S-alleles, we applied the isolation with migration model of Nielsen and Wakeley [38] to polymorphism at fourfold degenerate sites in both species for the eleven reference genes plus Aly9 (12 genes in total, see Table 2), as implemented in the IM program [38]. We chose to focus on fourfold degenerate sites only because differences in substitution rates have been reported among codon positions [55]. The program DnaSP [56] was used to generate a data file containing fourfold degenerate sites only. The procedure was run with 10 different random seeds to ensure proper convergence of the six free parameters, i.e. θ A, θ lyr, θ hal, t, m hallyr and m lyrhal (polymorphism θ = 4Nμ in the common ancestor of A. halleri and A. lyrata, polymorphism in A. lyrata, polymorphism in A. halleri, splitting time, and the rates at which genes introgressed into A. lyrata from A. halleri and into A. halleri from A. lyrata as time moves forward, respectively). The HKY mutation model [57] was used. To single out the N estimate, we estimated the average per fourfold degenerate site mutation rate (μ) as follows. We used A. thaliana as outgroup to estimate the average net nucleotide divergence at fourfold degenerate sites between A. thaliana and A. halleri and between A. thaliana and A. lyrata for each reference gene. Assuming that the lineages leading to A. thaliana and the common ancestor of A. lyrata and A. halleri separated 5 million years ago [58], we obtained a mutation rate estimate per site per year for each reference gene.
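The per-gene rate calculation, together with the averaging and per-generation conversion that the text goes on to describe, can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the formula μ = K / (2T) is the standard conversion for divergence accumulated along two lineages and is assumed here, the divergence values are hypothetical, and the function names are illustrative. Inverting θ = 4Nμ also assumes that θ and μ are expressed on the same scale (both per site, or both per locus).

```python
import math

SPLIT_TIME_YEARS = 5_000_000  # A. thaliana split time assumed in the text
GENERATION_TIME_YEARS = 2     # mean generation time assumed in the text


def rate_per_site_per_year(net_divergence, split_time_years=SPLIT_TIME_YEARS):
    """Assumed formula: divergence builds up on both lineages, so mu = K / (2T)."""
    return net_divergence / (2 * split_time_years)


def geometric_mean(values):
    """Geometric mean, computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def effective_size(theta, mu_per_generation):
    """Invert theta = 4 * N * mu to recover N."""
    return theta / (4 * mu_per_generation)


# Hypothetical net divergences from A. thaliana for three genes (NOT the real data)
hypothetical_divergence = [0.10, 0.15, 0.20]

per_gene_rates = [rate_per_site_per_year(k) for k in hypothetical_divergence]
mu_per_year = geometric_mean(per_gene_rates)
mu_per_generation = mu_per_year * GENERATION_TIME_YEARS

# Hypothetical per-site theta, used only to show the inversion
n_estimate = effective_size(0.01, mu_per_generation)
```

The geometric mean is less sensitive than the arithmetic mean to a single gene with an unusually high rate, which is one plausible reason for using it when averaging rates across loci.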
We computed an average mutation rate per site per year (μ) by taking the geometric mean over genes. A mutation rate per generation was computed assuming a mean generation time of two years. The maximum likelihood estimates were then used to simulate divergence between two species isolated for one million generations but still capable of introgression. Ten thousand replicate simulations of pairs of genes with the same number of nucleotides as the real data were performed using SIMCOAL2 [59]. The genomic background divergence was first used to confirm that the simulation parameters were appropriate. We then determined whether the observed divergence for S-alleles was consistent with the overall genomic rate of introgression by simulating the evolution of S-alleles in this system assuming that 50 S-alleles segregate in the species, and thus that the effective population size of each allelic class is reduced by a factor of 50. To remain conservative in this analysis, S-alleles were simulated under the 2.5% lower boundary of the 95% credible interval for N obtained from IM. Using the maximum likelihood estimate for N, we then aimed to determine the extent to which introgression is increased for S-alleles relative to the genomic background. We did so by gradually increasing m hallyr and m lyrhal for S-alleles by a multiplicative factor from one to ten until the simulated data came close to the observed divergence.

The sequences reported in this paper have been deposited in the GenBank database under accession numbers EU878008-EU878026.

Supporting Information

Phylogeny of SRK sequences from the species A. lyrata (n = 38), A. halleri (n = 30) and Capsella grandiflora (n = 7, shown in bold). The phylogeny was obtained by the neighbour-joining method on the proportion of amino-acid differences. Brackets indicate the position of two trans-specifically shared pairs of S-alleles between A. lyrata and A. halleri that are interrupted by the branching of one S-allele from C. grandiflora (thick lines).

Phylogenies of 68 SRK sequences from A. lyrata and A. halleri. The phylogeny was obtained by the neighbour-joining method on synonymous differences. Bootstrap support was obtained by 1,000 independent replicates.

Phylogenies of 68 SRK sequences from A. lyrata and A. halleri. The phylogeny was obtained by the neighbour-joining method on non-synonymous differences. Bootstrap support was obtained by 1,000 independent replicates.

Bootstrap distribution (10,000 replicates) of net divergence for SRK alleles (average across 18 S-allele pairs, in black) and the genomic background (average across 12 control genes, in grey).

Distribution of the number of pairwise nucleotide differences for SRK sequences in interspecific comparisons between A. halleri and A. lyrata, excluding the 18 pairs of sequences considered as trans-specific pairs.

Net divergence estimation for the 12 control genes.

Supplemental material.
  51 in total

1.  Specificity determinants and diversification of the Brassica self-incompatibility pollen ligand.

Authors:  Thanat Chookajorn; Aardra Kachroo; Daniel R Ripoll; Andrew G Clark; June B Nasrallah
Journal:  Proc Natl Acad Sci U S A       Date:  2003-12-23       Impact factor: 11.205

Review 2.  Pollen recognition and rejection during the sporophytic self-incompatibility response: Brassica and beyond.

Authors:  Simon J Hiscock; Stephanie M McInnis
Journal:  Trends Plant Sci       Date:  2003-12       Impact factor: 18.313

3.  SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history.

Authors:  Guillaume Laval; Laurent Excoffier
Journal:  Bioinformatics       Date:  2004-04-29       Impact factor: 6.937

4.  A general model to explore complex dominance patterns in plant sporophytic self-incompatibility systems.

Authors:  Sylvain Billiard; Vincent Castric; Xavier Vekemans
Journal:  Genetics       Date:  2007-01-21       Impact factor: 4.562

5.  An approach to population and evolutionary genetic theory for genes in mitochondria and chloroplasts, and some results.

Authors:  C W Birky; T Maruyama; P Fuerst
Journal:  Genetics       Date:  1983-03       Impact factor: 4.562

6.  Hitch-hiking to a locus under balancing selection: high sequence diversity and low population subdivision at the S-locus genomic region in Arabidopsis halleri.

Authors:  Maria Valeria Ruggiero; Bertrand Jacquemin; Vincent Castric; Xavier Vekemans
Journal:  Genet Res (Camb)       Date:  2008-02       Impact factor: 1.588

7.  Subdivision and haplotype structure in natural populations of Arabidopsis lyrata.

Authors:  Stephen I Wright; Beatrice Lauga; Deborah Charlesworth
Journal:  Mol Ecol       Date:  2003-05       Impact factor: 6.185

8.  Coevolution of the S-locus genes SRK, SLG and SP11/SCR in Brassica oleracea and B. rapa.

Authors:  Keiichi Sato; Takeshi Nishio; Ryo Kimura; Makoto Kusaba; Tohru Suzuki; Katsunori Hatakeyama; David J Ockendon; Yoko Satta
Journal:  Genetics       Date:  2002-10       Impact factor: 4.562

9.  Multilocus analysis of variation and speciation in the closely related species Arabidopsis halleri and A. lyrata.

Authors:  Sebastián E Ramos-Onsins; Barbara E Stranger; Thomas Mitchell-Olds; Montserrat Aguadé
Journal:  Genetics       Date:  2004-01       Impact factor: 4.562

10.  Evolution under strong balancing selection: how many codons determine specificity at the female self-incompatibility gene SRK in Brassicaceae?

Authors:  Vincent Castric; Xavier Vekemans
Journal:  BMC Evol Biol       Date:  2007-08-06       Impact factor: 3.260

