Chris S Clarkson, Alistair Miles, Nicholas J Harding, Andrias O O'Reilly, David Weetman, Dominic Kwiatkowski, Martin J Donnelly.
Abstract
Resistance to pyrethroid insecticides is a major concern for malaria vector control. Pyrethroids target the voltage-gated sodium channel (VGSC), an essential component of the mosquito nervous system. Substitutions in the amino acid sequence can induce a resistance phenotype. We use whole-genome sequence data from phase 2 of the Anopheles gambiae 1000 Genomes Project (Ag1000G) to provide a comprehensive account of genetic variation in the Vgsc gene across 13 African countries. In addition to known resistance alleles, we describe 20 other non-synonymous nucleotide substitutions at appreciable population frequency and map these variants onto a protein model to investigate the likelihood of pyrethroid resistance phenotypes. Thirteen of these novel alleles were found to occur almost exclusively on haplotypes carrying the known L995F kdr (knock-down resistance) allele and may enhance or compensate for the L995F resistance genotype. A novel mutation I1527T, adjacent to a predicted pyrethroid-binding site, was found in tight linkage with V402L substitutions, similar to allele combinations associated with resistance in other insect species. We also analysed genetic backgrounds carrying resistance alleles, to determine which alleles have experienced recent positive selection, and describe ten distinct haplotype groups carrying known kdr alleles. Five of these groups are observed in more than one country, in one case separated by over 3000 km, providing new information about the potential for the geographical spread of resistance. Our results demonstrate that the molecular basis of target-site pyrethroid resistance in malaria vectors is more complex than previously appreciated, and provide a foundation for the development of new genetic tools for insecticide resistance management.
Keywords: insecticides; malaria; mosquitoes; resistance; voltage-gated sodium channel; whole genome sequencing
Year: 2021 PMID: 33590926 PMCID: PMC9019111 DOI: 10.1111/mec.15845
Source DB: PubMed Journal: Mol Ecol ISSN: 0962-1083 Impact factor: 6.622
TABLE 1 Non‐synonymous nucleotide variation in the voltage‐gated sodium channel gene. The first four columns describe the variant; the remaining columns give population allele frequencies as proportions.

| Position | Codon (An. gambiae) | Codon (M. domestica) | Domain | AOAc | GHAc | BFAc | CIAc | GNAc | GW | GM | CMAg | GHAg | BFAg | GNAg | GAAg | UGAg | GQAg | FRAg | KE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2,390,177 G > A | R254K | R261 | IL45 | 0.0 | 0.009 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.313 | 0.0 | 0.0 | 0.0 | 0.203 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,391,228 G > C | V402L | V410 | IS6 | 0.0 | 0.127 | 0.073 | 0.085 | 0.125 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,391,228 G > T | V402L | V410 | IS6 | 0.0 | 0.045 | 0.06 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,399,997 G > C | D466H | ‐ | LI/II | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.069 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,400,071 G > A | M490I | M508 | LI/II | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.031 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.188 |
| 2,400,071 G > T | M490I | M508 | LI/II | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.003 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,402,466 G > T | G531V | G549 | LI/II | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.007 | 0.0 | 0.056 | 0.0 | 0.0 |
| 2,407,967 A > C | Q697P | Q724 | LI/II | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.056 | 0.0 | 0.0 |
| 2,416,980 C > T | T791M | T810 | IIS1 | 0.0 | 0.009 | 0.02 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.292 | 0.147 | 0.112 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,422,651 T > C | L995S | L1014 | IIS6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.157 | 0.0 | 0.0 | 0.0 | 0.674 | 1.0 | 0.0 | 0.0 | 0.76 |
| 2,422,652 A > T | L995F | L1014 | IIS6 | 0.84 | 0.818 | 0.853 | 0.915 | 0.875 | 0.0 | 0.0 | 0.525 | 1.0 | 1.0 | 1.0 | 0.326 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,429,556 G > A | V1507I | ‐ | IIIL56 | 0.0 | 0.0 | 0.0 | 0.0 | 0.125 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,429,617 T > C | I1527T | I1532 | IIIS6 | 0.0 | 0.173 | 0.133 | 0.085 | 0.125 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,429,745 A > T | N1570Y | N1575 | LIII/IV | 0.0 | 0.0 | 0.267 | 0.0 | 0.0 | 0.0 | 0.0 | 0.057 | 0.167 | 0.207 | 0.088 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,429,897 A > G | E1597G | E1602 | LIII/IV | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.065 | 0.062 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,429,915 A > C | K1603T | K1608 | IVS1 | 0.0 | 0.055 | 0.047 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,430,424 G > T | A1746S | A1751 | IVS5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.292 | 0.141 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,430,817 G > A | V1853I | V1858 | COOH | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.542 | 0.049 | 0.062 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,430,863 T > C | I1868T | I1873 | COOH | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.261 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,430,880 C > T | P1874S | P1879 | COOH | 0.0 | 0.027 | 0.207 | 0.345 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,430,881 C > T | P1874L | P1879 | COOH | 0.0 | 0.0 | 0.073 | 0.007 | 0.25 | 0.0 | 0.0 | 0.0 | 0.0 | 0.234 | 0.475 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,431,061 C > T | A1934V | A1939 | COOH | 0.0 | 0.018 | 0.107 | 0.465 | 0.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2,431,079 T > C | I1940T | I1945 | COOH | 0.0 | 0.118 | 0.04 | 0.0 | 0.0 | 0.0 | 0.0 | 0.067 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
AO, Angola; GH, Ghana; BF, Burkina Faso; CI, Côte d'Ivoire; GN, Guinea; GW, Guinea‐Bissau; GM, Gambia; CM, Cameroon; GA, Gabon; UG, Uganda; GQ, Bioko; FR, Mayotte; KE, Kenya; Ac, An. coluzzii; Ag, An. gambiae. Species status of specimens from Guinea‐Bissau, Gambia and Kenya is uncertain (The Anopheles gambiae 1000 Genomes Consortium, 2020). All variants are at 5% frequency or above in one or more of the 16 Ag1000G phase 2 populations, with the exception of 2,400,071 G > T, which is found only in the CMAg population at 0.3% frequency. It is included because another mutation at the same position (2,400,071 G > A) occurs at >5% frequency and causes the same amino acid substitution (M490I).
Position relative to the AgamP3 reference sequence, chromosome arm 2L.
Codon numbering according to Anopheles gambiae transcript AGAP004707‐RD in geneset AgamP4.12.
Codon numbering according to Musca domestica EMBL accession X96668 (Williamson et al., 1996).
Location of the variant within the protein structure. Transmembrane segments are named according to domain number (in Roman numerals) followed by ‘S’ then the number of the segment; for example, ‘IIS6’ means domain two, transmembrane segment six. Internal linkers between segments within the same domain are named according to domain (in Roman numerals) followed by ‘L’ then the numbers of the linked segments; for example, ‘IL45’ means domain one, linker between transmembrane segments four and five. Internal linkers between domains are named ‘L’ followed by the linked domains; for example, ‘LI/II’ means the linker between domains one and two. ‘COOH’ means the internal carboxyl tail.
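The naming scheme for the Domain column can be decoded mechanically. The following is a minimal sketch, not part of the original study: the function name `parse_domain_label` and the returned dictionary layout are hypothetical, and the parser assumes only the four label forms described in the footnote above.

```python
import re

# Roman numerals used for the four channel domains.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4}
_DOMAIN = r"(I{1,3}|IV)"


def parse_domain_label(label: str) -> dict:
    """Decode a Domain label such as 'IIS6', 'IL45', 'LI/II' or 'COOH'.

    Hypothetical helper: the output keys are illustrative only.
    """
    if label == "COOH":
        return {"kind": "carboxyl tail"}

    # 'LI/II': linker between two domains.
    m = re.fullmatch(rf"L{_DOMAIN}/{_DOMAIN}", label)
    if m:
        return {"kind": "inter-domain linker",
                "domains": (ROMAN[m.group(1)], ROMAN[m.group(2)])}

    # 'IIS6': domain number, 'S', then the transmembrane segment number.
    m = re.fullmatch(rf"{_DOMAIN}S(\d)", label)
    if m:
        return {"kind": "transmembrane segment",
                "domain": ROMAN[m.group(1)],
                "segment": int(m.group(2))}

    # 'IL45': domain number, 'L', then the two linked segment numbers.
    m = re.fullmatch(rf"{_DOMAIN}L(\d)(\d)", label)
    if m:
        return {"kind": "intra-domain linker",
                "domain": ROMAN[m.group(1)],
                "segments": (int(m.group(2)), int(m.group(3)))}

    raise ValueError(f"unrecognised domain label: {label!r}")


if __name__ == "__main__":
    for example in ["IIS6", "IL45", "LI/II", "IIIL56", "COOH"]:
        print(example, parse_domain_label(example))
```

For instance, `parse_domain_label("IIS6")` yields domain 2, segment 6, which is where the L995F and L995S substitutions are located according to the table.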
FIGURE 1 Voltage‐gated sodium channel protein structure and non‐synonymous variation. The An. gambiae voltage‐gated sodium channel (AGAP004707‐RD AgamP4.12) is shown as a transmembrane topology map (top) and as a homology model (bottom) in cartoon format coloured by domain. Variant positions are shown as red circles in the topology map and as red space‐fill in the 3D model. Purple circles in the map show amino acids absent from the model due to the lack of modelled structure in this region.
FIGURE 2 Linkage disequilibrium (D′) between non‐synonymous variants. A value of 1 indicates that two alleles are in perfect linkage, meaning that one of the alleles is only ever found in combination with the other. Conversely, a value of −1 indicates that two alleles are never found in combination with each other. The bar plot at the top shows the frequency of each allele within the Ag1000G phase 2 cohort. See Table 1 for population allele frequencies.
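The interpretation of D′ given in the Figure 2 caption can be illustrated with a small calculation. This is a minimal sketch that assumes phased haplotypes coded 0 (reference) and 1 (alternate) at two biallelic sites, and uses the standard normalisation of the disequilibrium coefficient D by its maximum attainable magnitude given the allele frequencies. The function name `signed_d_prime` is hypothetical and this is not the authors' implementation.

```python
def signed_d_prime(site_a, site_b):
    """Signed D' between two biallelic sites from phased 0/1 haplotype calls.

    site_a[i] and site_b[i] are the alleles carried by haplotype i.
    Returns NaN when either site is monomorphic (D' is undefined).
    """
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("both sites need the same non-zero number of haplotypes")

    p_a = sum(site_a) / n                       # frequency of alt allele at A
    p_b = sum(site_b) / n                       # frequency of alt allele at B
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                        # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return float("nan")
    return d / d_max


if __name__ == "__main__":
    # Alt allele at A is only ever seen together with the alt allele at B.
    print(signed_d_prime([1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0]))
    # Alt alleles at A and B are never seen on the same haplotype.
    print(signed_d_prime([1, 1, 0, 0], [0, 0, 1, 1]))
```

The first toy input corresponds to the "perfect linkage" case (D′ of 1) and the second to alleles that never co-occur (D′ of −1).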
FIGURE 3 Haplotype networks. Median‐joining network for haplotypes carrying L995F (labelled F1–F5) or L995S variants (S1–S5) with a maximum edge distance of two SNPs. Labelling of network components is via concordance with hierarchical clusters discovered in The Anopheles gambiae 1000 Genomes Consortium (2017). Node size is relative to the number of haplotypes contained, and node colour represents the proportion of haplotypes from mosquito populations/species: AO = Angola; GH = Ghana; BF = Burkina Faso; CI = Côte d'Ivoire; GN = Guinea; CM = Cameroon; GA = Gabon; UG = Uganda; KE = Kenya. Non‐synonymous edges are highlighted in red, and those leading to non‐singleton nodes are labelled with the codon change; the arrowhead indicates the direction of change away from the reference allele. Network components with fewer than three haplotypes are not shown.
FIGURE 4 Map of haplotype frequencies. Each pie shows the frequency of different haplotype groups within one of the populations sampled. The size of the pie is proportional to the number of haplotypes sampled. The size of each wedge within the pie is proportional to the frequency of a haplotype group within the population. Haplotypes in groups F1–F5 carry the L995F kdr allele. Haplotypes in groups S1–S5 carry the L995S kdr allele. Haplotypes in group other resistant (OR) carry either L995F or L995S but did not cluster within any of the haplotype groups. Wild‐type (wt) haplotypes do not carry any known resistance alleles.
FIGURE 5 Evidence for positive selection on haplotypes carrying known or putative resistance alleles. Each panel plots the decay of extended haplotype homozygosity (EHH) for a set of core haplotypes centred on Vgsc codon 995. Core haplotypes F1–F5 carry the L995F allele; S1–S5 carry the L995S allele; L1 carries the I1527T allele; L2 carries the M490I allele. Wild‐type (wt) haplotypes do not carry known or putative resistance alleles. A slower decay of EHH relative to wild‐type haplotypes implies positive selection (each panel plots the same collection of wild‐type haplotypes).
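The decay of EHH described in the Figure 5 caption can be sketched as follows. This assumes the commonly used definition of EHH as the probability that two haplotypes drawn at random from a core set are identical at every site between the core and a given flanking site. The function name `ehh_decay_one_side` and the toy input are hypothetical; the code covers one flank only and is not the authors' pipeline.

```python
from collections import Counter
from math import comb


def ehh_decay_one_side(haplotypes):
    """EHH at each flanking site for haplotypes that share one core allele.

    haplotypes: list of equal-length allele sequences, one per haplotype,
    with sites ordered by increasing distance from the core.
    Returns a list whose k-th entry is the EHH out to the (k+1)-th site.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    total_pairs = comb(n, 2)
    n_sites = len(haplotypes[0])

    decay = []
    for k in range(1, n_sites + 1):
        # Group haplotypes that are still identical over the first k sites.
        classes = Counter(tuple(h[:k]) for h in haplotypes)
        identical_pairs = sum(comb(count, 2) for count in classes.values())
        decay.append(identical_pairs / total_pairs)
    return decay


if __name__ == "__main__":
    toy = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0)]
    print(ehh_decay_one_side(toy))
```

In the toy input all four haplotypes agree at the first site, split into two pairs at the second and are all distinct by the third, so EHH falls from 1 towards 0. A haplotype group under recent positive selection would be expected to retain high values over a longer distance than wild‐type haplotypes.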
FIGURE 6 Informative SNPs for haplotype surveillance. (a) Each data point represents a single SNP. The information gain value for each SNP provides an indication of how informative the SNP is likely to be if used as part of a genetic assay for testing whether a mosquito carries a resistance haplotype, and if so, which haplotype group it belongs to. (b) Number of SNPs required to accurately predict which group a resistance haplotype belongs to. Each data point represents a single decision tree. Decision trees were constructed using either the ID3 (left) or CART (right) algorithm for comparison. Accuracy was evaluated using 10‐fold stratified cross‐validation.
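The information gain of a single SNP, as plotted in panel (a), can be illustrated with the usual entropy-based definition: the entropy of the haplotype group labels minus the expected entropy after splitting the haplotypes by the allele they carry at that SNP. This is a minimal sketch under that assumption. The function names and the toy labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(alleles, groups):
    """Information gain about `groups` from knowing the allele at one SNP.

    alleles[i] is the allele carried by haplotype i at the SNP;
    groups[i] is the haplotype group label of haplotype i.
    """
    if len(alleles) != len(groups) or not groups:
        raise ValueError("alleles and groups must be the same non-zero length")
    n = len(groups)

    # Partition the group labels by the allele observed at the SNP.
    partitions = {}
    for allele, group in zip(alleles, groups):
        partitions.setdefault(allele, []).append(group)

    conditional = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(groups) - conditional


if __name__ == "__main__":
    groups = ["F1", "F1", "S1", "S1"]          # toy haplotype group labels
    print(information_gain([1, 1, 0, 0], groups))  # SNP separates the groups
    print(information_gain([1, 0, 1, 0], groups))  # SNP carries no information
```

A SNP that separates the two toy groups perfectly recovers the full one bit of label entropy, whereas a SNP whose alleles are spread evenly across both groups has a gain of zero.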