Darren J Obbard, Francis M Jiggins, Nicholas J Bradshaw, Tom J Little.
Abstract
Antagonistic host-parasite interactions can drive rapid adaptive evolution in genes of the immune system, and such arms races may be an important force shaping polymorphism in the genome. The RNA interference pathway gene Argonaute-2 (AGO2) is a key component of antiviral defense in Drosophila, and we have previously shown that genes in this pathway experience unusually high rates of adaptive substitution. Here we study patterns of genetic variation in a 100-kbp region around AGO2 in three different species of Drosophila. Our data suggest that recent independent selective sweeps in AGO2 have reduced genetic variation across a region of more than 50 kbp in Drosophila melanogaster, D. simulans, and D. yakuba, and we estimate that selection has fixed adaptive substitutions in this gene every 30-100 thousand years. The strongest signal of recent selection is evident in D. simulans, where we estimate that the most recent selective sweep involved an allele with a selective advantage of the order of 0.5-1% and occurred roughly 13-60 Kya. To evaluate the potential consequences of the recent substitutions on the structure and function of AGO2, we used fold-recognition and homology-based modeling to derive a structural model for the Drosophila protein, and this suggests that recent substitutions in D. simulans are overrepresented at the protein surface. In summary, our results show that selection by parasites can consistently target the same genes in multiple species, resulting in areas of the genome that have markedly reduced genetic diversity.
Year: 2010 PMID: 20978039 PMCID: PMC3021790 DOI: 10.1093/molbev/msq280
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure. Genomic positions and gene trees for loci surrounding AGO2. In each tree, the upper clade is Drosophila melanogaster and the lower clade is D. simulans. The size and position of amplified regions are shown by white boxes, and gray boxes show the corresponding genes (note that the amplified fragment from yellow-k partially overlaps locus CG7945). The total length of the surveyed region was approximately 123 kbp, and the total length of amplified sequence per individual was approximately 13.5 kbp. Gene trees were constructed using neighbor joining (MEGA v. 3.1, Kumar et al. 2004), based on coding sites only, and were rooted using D. yakuba. All trees are drawn to the same scale. The shallow within-species genealogy associated with recent selective sweeps in AGO2 is clear in both species.
Figure. Genetic diversity around AGO2. Genetic diversity at all sites (upper row) and synonymous sites (middle row) is considerably reduced around AGO2 (positioned at zero on the x axis) in all three species. This is also reflected in the diversity/divergence ratio at synonymous sites (lower row: synonymous site diversity within species, θs, divided by divergence between species, KS). Loci that are significantly different from AGO2 in individual pairwise HKA tests (Hudson et al. 1987) are marked on the diversity/divergence graphs with asterisks: *P < 0.05, **P < 0.01, ***P < 0.001.
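The diversity/divergence ratio in the lower row can be sketched as follows. This is only an illustration: the caption does not say which estimator of θs was used, so Watterson's estimator is assumed here, and the function names are hypothetical. The example inputs (6 synonymous polymorphisms, 21 alleles, 565 synonymous sites, KS = 0.067) are borrowed from the AGO2 row with n = 21 in the McDonald–Kreitman table and need not match the values plotted in the figure.

```python
def harmonic_number(k: int) -> float:
    """Return 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta(segregating_sites: int, n_alleles: int, n_sites: float) -> float:
    """Per-site Watterson estimate of theta (assumed estimator, see lead-in)."""
    if n_alleles < 2:
        raise ValueError("need at least two sampled alleles")
    return segregating_sites / harmonic_number(n_alleles - 1) / n_sites


def diversity_divergence_ratio(theta_s: float, k_s: float) -> float:
    """Within-species synonymous diversity divided by between-species divergence."""
    return theta_s / k_s


# Illustrative inputs only (see lead-in for where they come from).
theta_s = watterson_theta(segregating_sites=6, n_alleles=21, n_sites=565)
print(f"theta_s = {theta_s:.4f}")
print(f"theta_s / K_S = {diversity_divergence_ratio(theta_s, 0.067):.3f}")
```

A low ratio at a locus relative to its neighbours is the pattern the HKA tests in the caption are designed to detect.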
HKA Likelihood Ratio Tests.
| Model | k | lnL | 2ΔLnL | P |
| No selection | 1 | −49.2 | | |
| Selection on AGO2 | 0.22 | −46.5 | 5.4 | 0.0200 |
| No selection | 1 | −62.6 | | |
| Selection on AGO2 | 0.08 | −56.7 | 11.8 | 0.0006 |
| No selection | 1.00 | −53.3 | | |
| Selection on AGO2 | 0.22 | −50.6 | 5.4 | 0.0206 |
Note.—k is the estimated reduction in diversity due to selection on AGO2 (Wright and Charlesworth 2004). lnL is the log-likelihood of the model, and 2ΔLnL is the log-likelihood test statistic.
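The test statistic in the table can be recomputed from the two log-likelihoods. The sketch below assumes the statistic is compared against a chi-square distribution with one degree of freedom (one extra parameter, k, in the selection model); that assumption is not stated in the note, but it gives P values close to those tabulated. The function name is hypothetical.

```python
import math


def likelihood_ratio_test(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Return (2*DeltaLnL, P) for nested models differing by one parameter.

    Assumes a chi-square reference distribution with 1 degree of freedom,
    whose upper tail is erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Log-likelihood pairs (no selection, selection) taken from the table rows.
for lnl_null, lnl_alt in [(-49.2, -46.5), (-62.6, -56.7), (-53.3, -50.6)]:
    stat, p_value = likelihood_ratio_test(lnl_null, lnl_alt)
    print(f"2*DeltaLnL = {stat:.1f}, P = {p_value:.4f}")
```

Small differences from the tabulated P values are expected, because the log-likelihoods in the table are rounded to one decimal place.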
Haplotype Configuration Tests.
| Locus | K (95%) | M (95%) | H (haplotype configuration) | P |
| CG7275 | 12 (7, 12) | 1 (1, 5) | 0.917 (12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| yellow-k | 11 (7, 12) | 2 (1, 5) | 0.903 (10, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| CrebA | 11 (6, 12) | 2 (1, 5) | 0.903 (10, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| AGO2 | 7 (7, 12) | 3 (1, 4) | 0.820 (4, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0) | 0.025 |
| CG7739 | 9 (5, 10) | 2 (2, 6) | 0.875 (6, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| CG6498 | 7 (6, 11) | 5 (2, 5) | 0.764 (5, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0) | ns |
| CG12301 | 9 (6, 11) | 2 (2, 5) | 0.875 (6, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| RhoGAP71E | 8 (4, 9) | 3 (2, 7) | 0.847 (5, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| CG7372 | 9 (6, 12) | 3 (1, 5) | 0.861 (7, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0) | ns |
| CG7275 | 14 (14, 21) | 4 (1, 4) | 0.903 (10, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…) | ns |
| yellow-k | 16 (14, 20) | 3 (2, 4) | 0.921 (13, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…) | ns |
| CrebA | 20 (13, 20) | 2 (2, 5) | 0.948 (19, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…) | ns |
| AGO2 | 8* (10, 18) | 13** (2, 8) | 0.594** (6, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…) | <0.005 |
| CG7739 | 6** (9, 17) | 14** (1, 8) | 0.530** (3, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,…) | <0.005 |
| CG6498 | 18 (12, 20) | 3 (2, 5) | 0.934 (16, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…) | ns |
| CG12301 | 14 (14, 21) | 6** (1, 4) | 0.875* (11, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…) | 0.011 |
| RhoGAP71E | 17 (12, 19) | 4 (2, 6) | 0.921 (15, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…) | ns |
| CG7372 | 18 (16, 21) | 2 (1, 4) | 0.939 (15, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…) | ns |
Note.—K (95%) is the number of haplotypes (95% bounds under neutrality from simulation), M (95%) the frequency of the most common haplotype (95% bounds under neutrality from simulation), and H the haplotype diversity (as in Depaulis and Veuille 1998). Asterisks denote significance. The haplotype configuration is a vector that records the number of haplotypes occurring n times in the sample, where n = (1, 2, 3, …, x) and x is the sample size (Innan et al. 2005). Note that the fragment from yellow-k partly overlaps locus CG7945.
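The three summary statistics can be read directly off a haplotype configuration vector. The following is a minimal sketch with a hypothetical function name; it assumes H is computed as one minus the sum of squared haplotype frequencies, with no sample-size correction. That form is an assumption, chosen because it reproduces the tabulated H = 0.764 for the configuration (5, 1, 0, 0, 1, 0, …).

```python
def haplotype_summary(config: list[int]) -> tuple[int, int, int, float]:
    """Summarise a haplotype configuration vector.

    config[i] is the number of distinct haplotypes that occur (i + 1) times.
    Returns (sample size, K, M, H), with H = 1 - sum of squared frequencies
    (assumed form, see lead-in).
    """
    sizes = list(enumerate(config, start=1))  # (copies, number of haplotypes)
    n = sum(copies * count for copies, count in sizes)
    if n == 0:
        raise ValueError("empty configuration")
    k = sum(config)                                        # number of haplotypes
    m = max(copies for copies, count in sizes if count)    # most common haplotype
    h = 1.0 - sum(count * (copies / n) ** 2 for copies, count in sizes)
    return n, k, m, h


# Configuration from one of the table rows: five singletons, one doubleton,
# and one haplotype present five times, in a sample of 12.
n, k, m, h = haplotype_summary([5, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0])
print(n, k, m, round(h, 3))
```

The 95% bounds and P values in the table come from neutral simulations, which this sketch does not attempt to reproduce.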
Figure. Composite likelihood profile. The composite likelihood ratio (CLR) between a standard neutral model and a selective sweep model, considering each site in turn (Li and Stephan 2005). The region surrounding AGO2 is shown for Drosophila melanogaster (upper panel) and D. simulans (lower panel). Gray regions are those for which sequence data are available, and the thin horizontal line shows the most stringent 5% significance threshold for this statistic, derived from a range of plausible population-growth scenarios (see Materials and Methods). The maximum likelihood estimate of the focal site for the sweep is given by a vertical dashed line; note that under this model there is no significant evidence of a recent sweep in D. melanogaster but that D. simulans shows strong evidence of a sweep in AGO2.
Figure. Recent amino acid substitutions in D. simulans AGO2. The surface structure of Drosophila AGO2, derived from published archaeal and Drosophila Argonaute structures by fold-recognition and homology modeling (see Materials and Methods). Moving down the figure, the four panels are successive 90° rotations about the vertical axis. The PAZ domain is indicated in green, the PIWI domain is indicated in blue, and the amino acid substitutions that occurred in D. simulans since the split from D. sechellia (ca. 250 Kya) are shown in red. The two remaining substitutions at L106 and S404 are buried within the structure (see also supplementary fig. S2, Supplementary Material online).
Interspecific Divergence and McDonald–Kreitman Analysis.
| Locus | n | Ln | Ls | KA | KS | KA/KS | Ds | Ps | Dn | Pn | a | a/Ln |
| CG7275 | 12 | 941 | 280 | 0.002 | 0.054 | 0.042 | 9 | 21 | 2 | 1 | 0 | 0.000 |
| yellow-k | 12 | 882 | 252 | 0.005 | 0.059 | 0.082 | 9 | 16 | 3 | 6 | −6 | −0.007 |
| CrebA | 12 | 761 | 247 | 0.003 | 0.04 | 0.079 | 8 | 8 | 2 | 3 | −3 | −0.004 |
| AGO2 | 12 | 1884 | 567 | 0.032 | 0.068 | 0.476 | 34 | 5 | 54 | 4 | 48 | 0.025 |
| CG7739 | 12 | 1055 | 322 | 0.009 | 0.082 | 0.11 | 23 | 7 | 9 | 3 | 4 | 0.004 |
| CG6498 | 12 | 1074 | 336 | 0.004 | 0.038 | 0.103 | 11 | 6 | 4 | 2 | 1 | 0.001 |
| CG12301 | 12 | 1064 | 275 | 0.022 | 0.027 | 0.819 | 7 | 5 | 21 | 8 | 8 | 0.008 |
| RhoGAP71E | 12 | 537 | 171 | 0.001 | 0.062 | 0.017 | 8 | 6 | 0 | 2 | −3 | −0.006 |
| CG7372 | 12 | 974 | 265 | 0.017 | 0.065 | 0.263 | 15 | 5 | 14 | 11 | −3 | −0.003 |
| CG7275 | 21 | 941 | 280 | 0.005 | 0.081 | 0.061 | 8 | 56 | 4 | 7 | 2 | 0.002 |
| yellow-k | 21 | 883 | 251 | 0.008 | 0.071 | 0.11 | 7 | 43 | 3 | 17 | −3 | −0.003 |
| CrebA | 21 | 761 | 247 | 0.001 | 0.031 | 0.018 | 6 | 8 | 0 | 2 | −1 | −0.001 |
| AGO2 | 21 | 1889 | 565 | 0.041 | 0.067 | 0.632 | 37 | 6 | 75 | 1 | 75 | 0.040 |
| CG7739 | 21 | 1058 | 328 | 0.011 | 0.04 | 0.27 | 11 | 10 | 11 | 3 | 10 | 0.009 |
| CG6498 | 21 | 1072 | 338 | 0.002 | 0.033 | 0.064 | 6 | 26 | 2 | 6 | 0 | 0.000 |
| CG12301 | 21 | 1121 | 295 | 0.009 | 0.051 | 0.179 | 10 | 36 | 4 | 33 | −7 | −0.006 |
| RhoGAP71E | 21 | 531 | 171 | 0.007 | 0.046 | 0.162 | 2 | 28 | 1 | 15 | −4 | −0.008 |
| CG7372 | 21 | 948 | 258 | 0.028 | 0.077 | 0.359 | 7 | 64 | 6 | 94 | −26 | −0.027 |
| CG7275 | 7 | 939 | 282 | 0.014 | 0.149 | 0.091 | 29 | 37 | 11 | 3 | 4 | 0.005 |
| yellow-k | 8 | 880 | 254 | 0.019 | 0.166 | 0.113 | 34 | 13 | 14 | 9 | −5 | −0.006 |
| CrebA | 7 | 759 | 249 | 0.017 | 0.091 | 0.183 | 21 | 4 | 12 | 2 | 7 | 0.010 |
| AGO2 | 11 | 1882 | 566 | 0.04 | 0.215 | 0.186 | 11 | 8 | 70 | 4 | 63 | 0.033 |
| CG7739 | 8 | 1056 | 327 | 0.015 | 0.19 | 0.079 | 52 | 12 | 15 | 1 | 13 | 0.012 |
| CG6498 | 8 | 1009 | 323 | 0.01 | 0.15 | 0.068 | 37 | 19 | 10 | 1 | 8 | 0.008 |
| RhoGAP71E | 7 | 557 | 178 | 0.015 | 0.184 | 0.079 | 26 | 12 | 8 | 0 | 8 | 0.014 |
| CG7372 | 7 | 1034 | 283 | 0.112 | 0.245 | 0.458 | 59 | 22 | 96 | 17 | 58 | 0.056 |
Note.—n is the number of alleles sampled; Ln and Ls the number of nonsynonymous and synonymous sites, respectively; KA and KS are the nonsynonymous and synonymous divergence; Ds, Ps, Dn, and Pn are counts of fixed differences (D) and polymorphisms (P) at synonymous and nonsynonymous sites; and “a” is the maximum-likelihood estimate of the number of nonsynonymous adaptive substitutions per gene under the model described in Obbard, Welch, et al. (2009) and Materials and Methods (above). Divergence is measured from their common ancestor in the case of D. melanogaster and D. simulans and from D. erecta in the case of D. yakuba.
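For readers who want to explore the counts in this table, the sketch below applies a classical single-locus McDonald–Kreitman analysis: a two-sided Fisher's exact test on the 2×2 table of Dn, Pn, Ds, and Ps, plus the simple estimate Dn − Ds·(Pn/Ps) of the number of adaptive nonsynonymous substitutions. This is a swapped-in textbook approach, not the maximum-likelihood model cited in the note, so its output is not expected to match the tabulated "a"; for the first AGO2 row it gives 26.8 rather than 48. Function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(dn: int, pn: int, ds: int, ps: int) -> float:
    """Two-sided Fisher's exact test on the table [[dn, pn], [ds, ps]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    row1, row2 = dn + pn, ds + ps
    col1 = dn + ds
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(dn)
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def single_locus_adaptive(dn: int, pn: int, ds: int, ps: int) -> float:
    """Simple single-locus estimate: Dn - Ds * (Pn / Ps). Not the paper's model."""
    if ps == 0:
        raise ValueError("estimate undefined when Ps is zero")
    return dn - ds * pn / ps


# Counts from the first AGO2 row of the table: Ds=34, Ps=5, Dn=54, Pn=4.
print(f"MK Fisher P = {fisher_exact_two_sided(54, 4, 34, 5):.3f}")
print(f"single-locus a = {single_locus_adaptive(54, 4, 34, 5):.1f}")
```

With so few polymorphisms per locus, single-locus tests of this kind have little power, which is one reason to pool information across loci in a joint model.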