Darren J Obbard, John J Welch, Tom J Little.
Abstract
BACKGROUND: Mosquitoes of the Anopheles gambiae species complex are the primary vectors of human malaria in sub-Saharan Africa. Many host genes have been shown to affect Plasmodium development in the mosquito, and so are expected to engage in an evolutionary arms race with the pathogen. However, there is little conclusive evidence that any of these mosquito genes evolve rapidly, or show other signatures of adaptive evolution.
Year: 2009 PMID: 19497100 PMCID: PMC2698913 DOI: 10.1186/1475-2875-8-117
Source DB: PubMed Journal: Malar J ISSN: 1475-2875 Impact factor: 2.979
Locus Details and location
| NAME | Identifier | Putative function | Genomic location |
| SRPN1 | AGAP006909 | Inhibitory Serine Protease inhibitor | 2L:39892128-39893864 |
| SRPN2 | AGAP006911 | Plasmodium-related Inhibitory Serine Protease inhibitor | 2L:39897002-39899744 |
| SRPN3 | AGAP006910 | Inhibitory Serine Protease inhibitor | 2L:39895229-39896338 |
| SRPN4C | AGAP009670 | Inhibitory Serine Protease inhibitor | 3R:38145527-38154288 |
| SRPN5 | AGAP009221 | Inhibitory Serine Protease inhibitor | 3R:28858000-28859778 |
| SRPN6 | AGAP009212 | Plasmodium-related Inhibitory Serine Protease inhibitor | 3R:28811997-28818217 |
| SRPN7 | AGAP007693 | Inhibitory Serine Protease inhibitor | 2L:49090665-49091915 |
| SRPN8 | AGAP003194 | Inhibitory Serine Protease inhibitor | 2R:33744972-33746720 |
| SRPN9 | AGAP003139 | Inhibitory Serine Protease inhibitor | 2R:33148444-33154607 |
| SRPN10 | AGAP005246 | Plasmodium-related Inhibitory Serine Protease inhibitor | 2L:12996143-13001508 |
| SRPN11 | AGAP001377 | Non-inhibitory Serine Protease inhibitor | 2R:4017728-4019706 |
| SRPN12 | AGAP001375 | Non-inhibitory Serine Protease inhibitor | 2R:4010431-4012512 |
| SRPN14 | AGAP007692 | Non-inhibitory Serine Protease inhibitor | 2L:49084812-49086463 |
| SRPN16 | AGAP009213 | Inhibitory Serine Protease inhibitor | 3R:28824548-28826209 |
| SRPN17 | AGAP001376 | Inhibitory Serine Protease inhibitor | 2R:4015617-4016537 |
| SRPN18 | AGAP007691 | Non-inhibitory Serine Protease inhibitor | 2L:49086842-49088278 |
| Control1 | AGAP006906 | Adenosine deaminase-related growth factor | 2L:39852471-39854636 |
| Control2 | AGAP006904 | Matrix metalloproteinase | 2L:39831595-39836700 |
| Control3 | AGAP006918 | Putative NADH:ubiquinone dehydrogenase | 2L:39995907-39997095 |
| Control4 | AGAP009673 | glutaminyl-peptide cyclotransferase | 3R:38248845-38249780 |
| Control5 | ENSANGG8091 | (retrotransposon) | 3R:28965847-28968579 |
| Control6 | AGAP009207 | Mitogen-activated protein kinase ERK | 3R:28697030-28708787 |
| Control7 | AGAP007712 | Putative RHO guanyl-nucleotide exchange factor | 2L:49181235-49190516 |
| Control8 | AGAP003205 | Similar to Drosophila CG8468 | 2R:33825401-33827998 |
| Control9 | AGAP003143 | Similar to Drosophila CG9904 | 2R:33211906-33213476 |
| Control10 | AGAP005247 | no annotation | 2L:13062962-13067750 |
| Control11 | AGAP001384 | cAMP-dependent protein kinase, beta-catalytic subunit | 2R:4098545-4103634 |
| Control12 | AGAP001371 | Similar to Drosophila CG18643 | 2R:3885127-3885956 |
| Control14 | AGAP007713 | Similar to human solute carrier family 39 | 2L:49196817-49198177 |
| Control16 | AGAP900209 | DNA-directed RNA polymerase II subunit J | 3R:28746535-28747425 |
| Control17 | AGAP001388 | Similar to human mab-3-related transcription factor 3 | 2R:4120810-4122586 |
| Control18 | AGAP007717 | Similar to Drosophila CAP CG18408-PE | 2L:49212258-49224889 |
Figure 1. Genetic diversity at synonymous and non-synonymous sites. Genetic diversity (the percentage of sites that differ on average between haplotypes) at synonymous (π_S) and non-synonymous (π_A) sites measured at 32 loci in populations of An. gambiae, An. arabiensis and An. melas. Diversity is shown separately for control loci (dark bars) and serpins (pale bars), and is shown for the species as a whole, and for each population separately. Note that although only two individuals (4 haplotypes) were sampled for S-form An. gambiae in population BK, ~19 kbp of sequence will provide a good estimate of π if mating within the population is random. Diversity was significantly higher in An. gambiae than in An. arabiensis, and significantly lower in An. melas. For non-synonymous sites, serpins had significantly higher diversity than control loci, but this trend was non-significant at synonymous sites. See main text for details, and Additional File 1 for the raw data.
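The definition of diversity used in the caption (the average proportion of sites that differ between pairs of haplotypes) can be illustrated with a minimal sketch. The function name and the toy sequences below are hypothetical, and the sketch treats all sites alike; it does not separate synonymous from non-synonymous sites as the figure does.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean proportion of differing sites over all pairs of aligned haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    pairs = list(combinations(haplotypes, 2))
    # Count mismatched positions for every pair of haplotypes.
    total_differences = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment (made up for illustration): 4 haplotypes, 10 sites each.
toy_haplotypes = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi_percent = 100 * nucleotide_diversity(toy_haplotypes)
print(f"pi = {pi_percent:.1f}%")
```

With four haplotypes there are six pairwise comparisons, which is why a long alignment can still give a usable estimate from few individuals.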
Figure 2. Genetic differentiation between populations. Arrows indicate approximate sampling locations within Africa, and letters identify species (A: An. arabiensis; G: An. gambiae, M-form and S-form). Dashed lines indicate pairs of populations for which genetic differentiation was calculated, and numbers are K_ST statistics, averaged across all 32 loci.
Figure 3. Genetic divergence within the An. gambiae species complex. (a) An un-rooted neighbour-joining tree, calculated from pairwise K_S between species averaged across loci. Branch lengths are to scale. The filled triangles illustrate the relative scale of diversity and divergence within the complex, such that the length of the triangle is half the divergence between haplotypes within species (i.e. π_S/2) and net divergence (K_S - π_S) corresponds to branch lengths that are not part of the triangle. (Note that population samples, and thus π_S, were not available for An. quadriannulatus A and An. merus.) (b)-(d) Neighbour-joining cladograms (i.e. topology only, branch lengths uninformative) showing the unique alleles sequenced from three loci. Note that in all cases An. gambiae and An. arabiensis alleles are intermixed. (b) to (d) are control locus 14, SRPN7 and SRPN11, selected to illustrate a wide range of K_ST values between An. gambiae and An. arabiensis (K_ST = 0.51, 0.10 and 0.09, respectively).
Estimates of the proportion of adaptive substitutions
| Model | Par | ln L | 2Δ ln L | p | AIC | Akaike weight | αa | αb |
| α = 0 | 29 | -336.49 | | | 730.98 | | [0] | [0] |
| α ~ (all loci) | 30 | -335.76 | 1.46 | 0.23 | 731.52 | 0.323 | 0.23 | [0.23] |
| α ~ (control, serpin) | 31 | -335.66 | 0.19 | 0.91 | 733.33 | 0.131 | 0.18 | 0.25 |
| α ~ (other, immune) | 31 | -335.73 | 0.06 | 0.97 | 733.46 | 0.123 | 0.24 | 0.07 |
| α = 0 | 29 | -329.36 | | | 716.73 | | [0] | [0] |
| α ~ (all loci) | 30 | -329.31 | 0.10 | 0.75 | 718.63 | 0.191 | -0.05 | [-0.05] |
| α ~ (control, serpin) | 31 | -328.07 | 2.48 | 0.29 | 718.15 | 0.243 | -0.54 | 0.10 |
| α ~ (other, immune) | 31 | -329.30 | 0.02 | 0.99 | 720.61 | 0.071 | -0.03 | -0.15 |
| α = 0 | 34 | -333.36 | | | 734.72 | | [0] | [0] |
| α ~ (all loci) | 35 | -332.60 | 1.52 | 0.22 | 735.21 | 0.274 | 0.15 | [0.15] |
| α ~ (control, serpin) | 36 | -331.59 | 2.02 | 0.36 | 735.18 | 0.277 | 0.34 | 0.03 |
| α ~ (other, immune) | 36 | -332.60 | 0.00 | 1.00 | 737.21 | 0.101 | 0.15 | 0.17 |
| α = 0 | 34 | -311.04 | | | 690.07 | 0.053 | [0] | [0] |
| α ~ (all loci) | 35 | -307.96 | 6.15 | 0.01 | 685.92 | | 0.34 | [0.34] |
| α ~ (control, serpin) | 36 | -307.22 | 1.48 | 0.48 | 686.44 | 0.325 | 0.47 | 0.25 |
| α ~ (other, immune) | 36 | -307.71 | 0.50 | 0.78 | 687.42 | 0.200 | 0.36 | 0.10 |
αa and αb are estimates of the proportion of adaptive substitutions in each of the two classes of gene (control/serpin or non-immune/immune, respectively). Par is the number of parameters in the model. Where the value of α is constrained by the model, it is marked in square brackets. Negative values arise from an 'excess' of non-synonymous polymorphism, and could represent sampling error or mildly deleterious polymorphisms [7]. AIC is the Akaike Information Criterion. The Akaike weighting can be interpreted as the weight of evidence in favour of the corresponding model, given the relative support for all the available models [52].
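As a rough illustration of how the AIC and Akaike weights relate to the tabulated log-likelihoods and parameter counts, the following sketch applies the standard formulas (AIC = 2k - 2 ln L; weight proportional to exp(-ΔAIC/2)) to the first four model rows of the table. The function names are hypothetical, and the 1-df chi-square comparison of the first two nested models is an assumption about how the p-value was obtained, not something the source states.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative weight of evidence for each model in a candidate set."""
    best = min(aic_values)
    relative = [math.exp(-(a - best) / 2) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def lrt_p_value_1df(statistic):
    """Upper tail of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# (model, Par, ln L) copied from the first four model rows of the table.
models = [
    ("alpha = 0", 29, -336.49),
    ("alpha ~ (all loci)", 30, -335.76),
    ("alpha ~ (control, serpin)", 31, -335.66),
    ("alpha ~ (other, immune)", 31, -335.73),
]

aics = [aic(lnl, k) for _, k, lnl in models]
weights = akaike_weights(aics)

# Likelihood-ratio statistic for "all loci" against "alpha = 0" (assumed 1 df).
lrt_stat = 2 * (models[1][2] - models[0][2])
lrt_p = lrt_p_value_1df(lrt_stat)

for (name, _, _), a, w in zip(models, aics, weights):
    print(f"{name:28s} AIC = {a:.2f}  weight = {w:.3f}")
print(f"2*delta lnL = {lrt_stat:.2f}, p = {lrt_p:.2f}")
```

Because the inputs are rounded to two decimal places, recomputed values can differ from the tabulated ones in the last digit.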
Figure 4. The power to estimate α using An. gambiae and An. arabiensis. The relative log-likelihood of α (the proportion of amino-acid substitutions that are adaptive) estimated using the modified McDonald-Kreitman approach [30]. The grey curve is calculated from all 102 genes for which both An. arabiensis and An. gambiae population samples were available in the dataset of Cohuet et al. [6]. The black curve shows an equivalent dataset of 102 genes from Drosophila melanogaster and D. simulans, with genes selected to be the same average length as those in the Cohuet dataset (D. J. Obbard, J. J. Welch and F. M. Jiggins, unpublished data). Despite both pairs of species having similar levels of diversity (π_S from 1.6% to 2.9%), for the Anopheles dataset the bounds (2 units of log-likelihood) stretch from -0.33 to 0.32 (and include zero), while for Drosophila the bounds only stretch from 0.26 to 0.44, and the maximum-likelihood estimate of α is 35%. The low precision of the Anopheles estimate reflects the very low power available due to the low divergence in An. gambiae-An. arabiensis comparisons.
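To give a feel for what α measures, here is a minimal sketch of the simple McDonald-Kreitman point estimator, α = 1 - (Ds × Pn) / (Dn × Ps). This is not the modified maximum-likelihood approach [30] used for the figure; it is a simpler stand-in, and the function name and all counts below are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Simple McDonald-Kreitman point estimate of alpha.

    dn, ds: non-synonymous and synonymous fixed differences between species.
    pn, ps: non-synonymous and synonymous polymorphisms within species.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1 - (ds * pn) / (dn * ps)


# Hypothetical counts with ample divergence: half of the
# non-synonymous substitutions are inferred to be adaptive.
alpha_high_divergence = mk_alpha(dn=40, ds=80, pn=15, ps=60)

# Same polymorphism but few non-synonymous fixed differences: the relative
# 'excess' of non-synonymous polymorphism drives the estimate negative.
alpha_excess_polymorphism = mk_alpha(dn=10, ds=80, pn=15, ps=60)

print(alpha_high_divergence, alpha_excess_polymorphism)
```

When divergence between species is low, Dn and Ds are small counts, so the ratio is noisy; this is one intuitive way to see why low divergence limits the precision of α.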