Jacob E Crawford, Emmanuel Bischoff, Thierry Garnier, Awa Gneme, Karin Eiglmeier, Inge Holm, Michelle M Riehle, Wamdaogo M Guelbeogo, N'Fale Sagnon, Brian P Lazzaro, Kenneth D Vernick.
Abstract
Host-pathogen interactions can be powerful drivers of adaptive evolution, shaping the patterns of molecular variation at the genes involved. In this study, we sequenced alleles from 28 immune-related loci in wild samples of multiple genetic subpopulations of the African malaria mosquito Anopheles gambiae, obtaining unprecedented sample sizes and providing the first opportunity to contrast patterns of molecular evolution at immune-related loci in the recently discovered GOUNDRY population to those of the indoor-resting M and S molecular forms. In contrast to previous studies that focused on immune genes identified in laboratory studies, we centered our analysis on genes that fall within a quantitative trait locus associated with resistance to Plasmodium falciparum in natural populations of A. gambiae. Analyses of haplotypic and genetic diversity at these 28 loci revealed striking differences among populations in levels of genetic diversity and allele frequencies in coding sequence. Putative signals of positive selection were identified at 11 loci, but only one was shared by two subgroups of A. gambiae. Striking patterns of linkage disequilibrium were observed at several loci. We discuss these results with respect to ecological differences among these strata as well as potential implications for disease transmission.
Keywords: Anopheles gambiae; Plasmodium falciparum; host−pathogen interaction; immune gene; positive selection
Year: 2012 PMID: 23275874 PMCID: PMC3516473 DOI: 10.1534/g3.112.004473
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1 Barplot distributions of genetic differentiation among groups of A. gambiae at immune genes. Genetic differentiation was estimated using Weir and Cockerham’s unbiased estimator of FST between groups of A. gambiae at each gene separately. Loci are arranged according to their genomic coordinates with the 2La inversion presented in the inverted arrangement. The vertical bar colors specify genomic region according to the legend. The schematic indicates the physical distribution of the genes on 2L, with C and T representing the centromere and telomere, respectively. The PRI region and 2La are also indicated on the chromosome schematic. Asterisks indicate FST values significantly greater than zero as determined by permutation tests (see Methods). Both the M and S molecular forms were compared to GOUNDRY separately and exhibited qualitatively similar levels and patterns of differentiation, so only the S-form comparison is presented here.
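The permutation logic behind the asterisks in Figure 1 can be illustrated with a minimal sketch. It is not the authors' pipeline: for brevity it substitutes a simpler Hudson-style ratio (1 minus within-group over between-group pairwise differences) for Weir and Cockerham's estimator, and the function names `hudson_fst` and `permutation_p_value` are hypothetical. Inputs are assumed to be aligned, equal-length sequences with at least two sequences per group.

```python
import random
from itertools import combinations


def mean_pairwise_diff(pairs):
    """Mean number of differing sites over an iterable of sequence pairs."""
    diffs = [sum(x != y for x, y in zip(a, b)) for a, b in pairs]
    return sum(diffs) / len(diffs)


def hudson_fst(pop1, pop2):
    """Hudson-style FST: 1 - (mean within-group diff) / (between-group diff).

    Assumption: a stand-in for Weir and Cockerham's estimator, which the
    figure actually uses. Each group needs at least two sequences.
    """
    within = (mean_pairwise_diff(combinations(pop1, 2))
              + mean_pairwise_diff(combinations(pop2, 2))) / 2
    between = mean_pairwise_diff((a, b) for a in pop1 for b in pop2)
    return 0.0 if between == 0 else 1 - within / between


def permutation_p_value(pop1, pop2, n_perm=999, seed=0):
    """One-sided p-value: how often shuffled group labels give FST >= observed."""
    rng = random.Random(seed)
    observed = hudson_fst(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if hudson_fst(pooled[:len(pop1)], pooled[len(pop1):]) >= observed:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

Shuffling the pooled sequences breaks any association between sequence and group, so the permuted values approximate the distribution of the statistic when there is no differentiation.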
Summary of nucleotide diversity and the site-frequency spectrum among populations of A. gambiae
| Population | n | θ (All) | θ (2La) | Tajima’s D (All) | Tajima’s D (2La) |
|---|---|---|---|---|---|
| M form | 94 | 0.0253 | 0.0278 | −1.28 | −1.24 |
| S form | 136 | 0.0366 | 0.0412 | −1.41 | −1.39 |
| GOUNDRY 2La | 56 | 0.0164 | 0.0186 | −0.07 | −0.13 |
| GOUNDRY 2La+/2La+ | 170 | 0.0114 | 0.0125 | 0.22 | 0.17 |
n: average number of chromosomes sequenced per gene fragment.
θ: average θ calculated for each gene fragment using only synonymous sites.
Tajima’s D: average D calculated for each gene fragment using only synonymous sites.
2La: statistic calculated using only genes inside the 2La inversion.
Figure 2 Distribution of Tajima’s D. Tajima’s D was calculated for all genes using synonymous sites and both boxplots and data points are presented. The dotted line indicates the expected value of D under neutral equilibrium population models. GNDRY a/a and GNDRY +/+ refer to GOUNDRY 2Laa/2Laa and 2La+/2La+, respectively.
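For readers unfamiliar with the statistic, the following is a minimal sketch of how Watterson's θ and Tajima's D are computed from a set of aligned sequences, using the standard constants from Tajima (1989). It is illustrative only: the function names are hypothetical, values are per fragment rather than per site, no restriction to synonymous sites is applied, and gaps or missing data are not handled.

```python
from collections import Counter
from math import sqrt


def segregating_sites_and_pi(seqs):
    """Return (S, pi): number of segregating sites and mean pairwise differences."""
    n = len(seqs)
    n_pairs = n * (n - 1) / 2
    s, pi = 0, 0.0
    for column in zip(*seqs):
        counts = Counter(column).values()
        if len(counts) > 1:
            s += 1
            # Number of sequence pairs that differ at this site, over all pairs.
            pi += (n * n - sum(c * c for c in counts)) / 2 / n_pairs
    return s, pi


def watterson_theta(seqs):
    """Watterson's estimator: S divided by the harmonic number a1."""
    n = len(seqs)
    s, _ = segregating_sites_and_pi(seqs)
    return s / sum(1 / i for i in range(1, n))


def tajimas_d(seqs):
    """Tajima's D; returns None when it is undefined (no variation or n < 3)."""
    n = len(seqs)
    s, pi = segregating_sites_and_pi(seqs)
    if s == 0 or n < 3:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants pulls D below zero, while an excess of intermediate-frequency variants pushes it above zero, which is why the dotted line in the figure sits at the neutral-equilibrium expectation.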
Population genetic summary statistics and test results for loci with a significant HEW test corrected p-value (statistics for all loci are presented in Table S4)
| Locus | n | Ssyn | Tajima’s D | Fay and Wu’s H | EW | HEW p-value |
|---|---|---|---|---|---|---|
| M form | | | | | | |
| | 64 | 25 | −0.1096 | −2.4951 | 0.1221 | 0.0336 |
| | 64 | 10 | −2.1817 | −2.0996 | 0.7979 | 0.0233 |
| | 62 | 110 | −1.1500 | −2.5176 | 0.3002 | 0.0140 |
| | 100 | 88 | −0.5362 | −1.7003 | 0.0898 | 0.0294 |
| | 100 | 50 | −1.8131 | −1.6246 | 0.1480 | 0.0302 |
| | 64 | 55 | −1.4194 | −2.4365 | 0.0557 | 0.0140 |
| GOUNDRY 2La | | | | | | |
| | 100 | 70 | 0.2394 | −2.4022 | 0.2408 | 0.0252 |
| | 34 | 22 | −1.3252 | −1.5011 | 0.1211 | 0.0482 |
| | 100 | 38 | −2.0531 | −2.2083 | 0.3992 | 0.0280 |
| | 100 | 15 | −1.2392 | −1.6624 | 0.5080 | 0.0482 |
| | 100 | 72 | −2.3283 | −2.5965 | 0.2512 | 0.0252 |
| GOUNDRY 2La+/2La+ | | | | | | |
| | 100 | 50 | −0.1716 | −3.0625 | 0.0868 | 0.0224 |
n: number of chromosomes in the sample. Loci with n = 100 had more than 100 in the original sample, but were down-sampled to 100 for this analysis.
Ssyn: number of synonymous segregating sites.
Tajima’s D: calculated using only synonymous sites.
Fay and Wu’s H: normalized H calculated using only synonymous sites.
EW: Ewens-Watterson’s haplotype homozygosity statistic calculated using only synonymous sites.
HEW p-value: corrected for multiple tests with the Benjamini and Hochberg procedure. Statistical significance of HEW was evaluated by comparison to 10⁵ neutral coalescent simulations of each sample (see Materials and Methods).
Locus: for simplicity of presentation, the unnamed LRR genes are labeled according to a shortened form of their AGAP identifier.
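The Benjamini and Hochberg correction mentioned in the footnotes can be sketched in a few lines. This is a generic implementation of the step-up procedure, not the authors' code, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A locus is then reported as significant when its adjusted p-value falls below the chosen threshold, 5% in the tables and figures here.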
Figure 3 The distribution of Fay and Wu’s H and the Ewens-Watterson statistic for all loci in A. gambiae populations. Normalized Fay and Wu’s H and Ewens-Watterson statistics were calculated on synonymous variation at each gene and evaluated by the use of coalescent simulations (see Materials and Methods). Red dots indicate that the HEW statistic is significant at a 5% threshold level after correcting for multiple tests. Each panel presents the results for a different group and the gene names are presented next to the corresponding data point.
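As a point of reference for the Ewens-Watterson axis, haplotype homozygosity is the sum of squared haplotype frequencies in the sample. The sketch below shows only that quantity; it does not implement Fay and Wu's H, the joint HEW test, or the coalescent simulations used to assess significance, and the function name is hypothetical.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Sum of squared sample frequencies of the distinct haplotypes."""
    n = len(haplotypes)
    return sum((count / n) ** 2 for count in Counter(haplotypes).values())
```

Values near 1 indicate a sample dominated by a single haplotype, as expected after a recent selective sweep, while a sample of n distinct haplotypes gives the minimum value 1/n.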
Figure 4 LD plot and neighbor-joining tree of FBN32 in GOUNDRY 2Laa/2Laa. (A) Linkage disequilibrium (r²) plotted among variant sites in the sequenced fragment of FBN32. Each pixel represents an r² value according to the shade of gray, as indicated in the scale. Greater r² values indicate increased linkage among those sites. The exon structure of FBN32 is placed on the diagonal of the plot to indicate the physical location of each variant site. The derived allele frequency (DAF) relative to A. merus is plotted in the barplots above and to the side of the LD plot. (B) Neighbor-joining tree of all FBN32 sequences from GOUNDRY 2Laa/2Laa as well as three additional taxa: A. merus, A. arabiensis, and A. quadriannulatus. The scale bar indicates genetic distance. Large clades of genetically similar taxa were collapsed for presentation.
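The r² value shown in each pixel of an LD plot is the squared correlation between alleles at two sites. A minimal sketch for phased, biallelic haplotype data follows; the function name `r_squared` and the string encoding of haplotypes are assumptions made for illustration.

```python
def r_squared(haplotypes, i, j):
    """Squared allelic correlation between biallelic sites i and j.

    haplotypes: equal-length strings, one per phased chromosome.
    Returns None if either site is monomorphic in the sample.
    """
    n = len(haplotypes)
    # Use the alleles carried by the first haplotype as the reference alleles.
    ref_i, ref_j = haplotypes[0][i], haplotypes[0][j]
    p_i = sum(h[i] == ref_i for h in haplotypes) / n
    p_j = sum(h[j] == ref_j for h in haplotypes) / n
    p_ij = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return None
    d = p_ij - p_i * p_j  # coefficient of linkage disequilibrium
    return d * d / denom
```

Because r² is symmetric in the two alleles at each site, the choice of reference allele does not affect the result.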
Population genetic summary statistics for haplogroups of FBN32 in GOUNDRY 2Laa/2Laa and Toll9 in GOUNDRY 2La+/2La+
| Clade | n | h | S | π | Tajima’s D | Fay and Wu’s H | EW | HEW p-value |
|---|---|---|---|---|---|---|---|---|
| FBN32, GOUNDRY 2Laa/2Laa | | | | | | | | |
| All | 110 | 10 | 18 | 0.0046* | −1.2392 | −1.6624 | 0.508 | ** |
| A | 80 | 8 NS | 17 | 0.0021*** | −2.16** | −2.60* | 0.81** | ** |
| B | 30 | 2 NS | 1** | 0.0002* | NA | NA | NA | NA |
| Toll9, GOUNDRY 2La+/2La+ | | | | | | | | |
| All | 122 | 42 | 42 | 0.0183 NS | −0.1716 | −3.0625** | 0.0868* | *** |
| A | 20 | 3 NS | 3** | 0.0013* | NA | NA | NA | NA |
| B | 102 | 39 NS | 39 | 0.0001*** | −1.4 | −1.79* | 0.09** | ** |
NA, not available.
n: number of chromosomes in each clade.
h: number of haplotypes in each clade.
S: number of segregating sites in each clade.
π: per-site nucleotide diversity calculated on all segregating sites.
For h, S, and π, statistical significance was evaluated by comparison to 10⁵ coalescent simulations conditioned on clade structure (see Materials and Methods).
Tajima’s D: calculated on synonymous sites.
Fay and Wu’s H: normalized H calculated on synonymous sites.
EW: Ewens-Watterson’s haplotype homozygosity statistic calculated on synonymous sites.
HEW p-value: p-value of the HEW test corrected for multiple tests.
For the neutrality tests, statistical significance was evaluated by comparison to 10⁵ neutral coalescent simulations of each sample subset or clade (see Materials and Methods). Statistical significance is indicated as * P < 0.05, ** P < 0.005, and *** P < 0.0005; NS, not significant.
Figure 5 LD plot and neighbor-joining tree of Toll9 in GOUNDRY 2La+/2La+. (A) Linkage disequilibrium (r²) plotted among variant sites in the sequenced fragment of Toll9. Each pixel represents an r² value according to the shade of gray, as indicated in the scale. Greater r² values indicate increased linkage among those sites. The exon structure of Toll9 is placed above and beside the plot to indicate the structural location of each variant site. The frequency of the derived allele relative to A. merus is plotted in the barplots above and to the side of the LD plot. The red bars indicate the three nonsynonymous sites and the triangle delineates the block of linked sites. (B) Neighbor-joining tree of all Toll9 sequences from GOUNDRY 2La+/2La+ as well as three additional taxa: A. merus, A. arabiensis, and A. quadriannulatus. The scale bar indicates genetic distance. Large clades of genetically similar taxa were collapsed for presentation.
Figure 6 Clade-specific patterns of divergence at the Toll9 intron. Sliding-window analysis of Jukes-Cantor corrected divergence (KJC, range 0 to 1) at all sites relating the two haplotype groups (A and B) to three sister species. Divergence was calculated for 50-bp physical windows, shifting 10 bp for every consecutive window. The top horizontal dotted line indicates the average maximum divergence per window for Toll9 in the M and S molecular forms as well as GOUNDRY 2Laa/2Laa. The lower horizontal dashed line indicates the average maximum divergence per window for all other sequenced loci that included an intron in the GOUNDRY 2La+/2La+ population. The legend indicates the color and line style for each clade/sister taxa comparison. The schematic under the plot depicts the exon structure in the sequenced region of Toll9.
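The sliding-window calculation described in the caption can be sketched as follows, using the standard Jukes-Cantor correction K = −(3/4) ln(1 − 4p/3), where p is the proportion of differing sites in a window. This is a simplified illustration that compares a single pair of aligned sequences rather than a haplotype group against a sister species, ignores gaps, and uses hypothetical function names; the 50-bp window and 10-bp step follow the caption.

```python
from math import log


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if p >= 0.75:
        return float("inf")  # correction is undefined at saturation
    return -0.75 * log(1 - 4 * p / 3)


def sliding_divergence(seq_a, seq_b, window=50, step=10):
    """Return a list of (window_start, K_JC) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        diffs = sum(x != y for x, y in zip(a, b))
        results.append((start, jukes_cantor(diffs / window)))
    return results
```

Overlapping windows smooth the profile, so a localized block of elevated divergence appears as a peak spanning several consecutive windows.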