
Identification and analysis of single nucleotide polymorphisms (SNPs) in the mosquito Anopheles funestus, malaria vector.

Charles S Wondji, Janet Hemingway, Hilary Ranson.

Abstract

BACKGROUND: Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in eukaryotic species and have become an important marker for genetic studies. The mosquito Anopheles funestus is one of the major malaria vectors in Africa and yet, prior to this study, no SNPs have been described for this species. Here we report a genome-wide set of SNP markers for use in genetic studies on this important human disease vector.
RESULTS: DNA fragments from 50 genes were amplified and sequenced from 21 specimens of An. funestus. A third of the specimens were field collected in Malawi, a third came from a colony of Mozambican origin and a third from a colony of Angolan origin. A total of 494 SNPs, including 303 within the coding regions of genes, and 5 indels were identified. The physical positions of these SNPs in the genome are known. There were on average 7 SNPs per kilobase, similar to that observed in An. gambiae and Drosophila melanogaster. Transitions outnumbered transversions at a ratio of 2:1. The increased frequency of transition substitutions in coding regions is likely due to the structure of the genetic code and selective constraints. Synonymous sites within coding regions showed a higher polymorphism rate than non-coding introns or 3' and 5' flanking DNA, with most of the substitutions in coding regions being observed at the 3rd codon position. A positive correlation in the level of polymorphism was observed between coding and non-coding regions within a gene. By genotyping a subset of 30 SNPs, we confirmed the validity of the SNPs identified during this study.
CONCLUSION: This set of SNP markers represents a useful tool for genetic studies in An. funestus, and will help to identify candidate genes that affect diverse phenotypes relevant to vector control, such as insecticide resistance, mosquito behavior and vector competence.

Year:  2007        PMID: 17204152      PMCID: PMC1781065          DOI: 10.1186/1471-2164-8-5

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Anopheles funestus and Anopheles gambiae are the major malaria vectors in Africa. Due to the difficulty of laboratory colonization, An. funestus has not received the same attention as An. gambiae and as a consequence there are few molecular markers for this species. However, the recent successful colonization of two strains of An. funestus [1] and the identification of a number of microsatellite markers [2,3] have facilitated more detailed studies of this species. Microsatellite markers particularly have been used to study population structure and gene flow between An. funestus populations [4-6] and a subset of these microsatellite markers were used to build the first linkage map of this species [7]. However, microsatellite markers are not evenly distributed across the genome, and their low number so far is an obstacle to the development of high resolution linkage maps needed for QTL mapping or association studies in An. funestus. Therefore, this study was initiated to increase the availability of characterized and mapped markers for An. funestus. Physically mapped ESTs were used to identify SNPs. Such ESTs have been used to study the genetic variability in a number of species such as Aedes aegypti, Drosophila melanogaster or Homo sapiens [8-10] and should be a source of DNA polymorphisms for An. funestus as well. Single nucleotide polymorphisms (SNPs) are by far the most common type of molecular variation in all organisms. They are extremely abundant with an occurrence of about one SNP per kb in human [11] and about one SNP every 125 bp in An. gambiae [10]. Significant progress has been made in the development of tools for detection and genotyping of SNPs and they are now becoming the markers of choice for association studies, high-resolution linkage mapping and population genomics studies [12]. 
SNPs located in non-coding regions of the genome and synonymous SNPs (sSNPs) in coding regions, which have no impact on the phenotype, may provide useful markers for population genetics studies. Non-synonymous SNPs (nsSNPs), which alter the structure (change of amino acid sequence) and potentially the function of encoded proteins, are useful markers for association studies to detect genetic variations linked with phenotypic traits. Patterns of genetic diversity in An. funestus have not been studied to the same extent as in An. gambiae or Drosophila species. Nucleotide diversity in these species has been used to compare patterns of nucleotide variation, such as the relative occurrence of transitions/transversions in different regions of the genome [8,13]. These surveys have established codon usage and usage bias patterns in many species, with bias hypothesized to occur as a result of selection for efficient translation [14,15]. The sequencing of the 278 million base pairs (Mbp) constituting the An. gambiae genome has revealed more than 400,000 SNPs, indicating a high level of polymorphism in mosquito species [16]. We hypothesized that sequencing DNA fragments of different genes of An. funestus would reveal a similar level of polymorphism and allow the identification of a significant set of SNPs. Here, we describe the detection and characterization of a set of genome-wide SNP markers from 50 nuclear genes using two laboratory strains and field samples of An. funestus. We also examined patterns of polymorphism and nucleotide diversity in coding and non-coding regions of the genome and defined the pattern of codon usage in An. funestus. The utility of the SNPs was assessed by genotyping a subset of these SNPs during a linkage mapping study.

Results and discussion

Gene amplification

In total, 70 primer pairs were tested by PCR, 55 of which gave reliable amplification with PCR products ranging from 194 to 1342 bp. Sequence data from a total of 21 specimens of An. funestus, from laboratory and field samples, were obtained for 50 of these genes (see Table 1). Overall, we sequenced a total of 20,547 bp consisting of 14,671 bp of coding region and 5,876 bp of non-coding region. We identified 494 SNPs consisting of 303 coding SNPs (cSNPs) and 191 non-coding SNPs. Each gene contained at least one polymorphism with BU73 having one and BU88 having 29. The distribution of SNPs among the 50 genes is presented in Table 1. All information concerning the location and the nature of each individual SNP has been submitted to dbSNP, the SNP database of GenBank. These SNPs with their respective reference SNP number (rs) are publicly available in dbSNP Build 127. The NCBI ss (submitted SNP) numbers of these SNPs are ss65917063 to ss65917416.
Table 1

Nucleotide polymorphism in An. funestus genes

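Columns L (bp) to Ka describe the coding region; columns prefixed NC describe the non-coding region. In the coding region, transitions (Ts) and transversions (Tv) are given by codon position (1st, 2nd, 3rd) with their total (Σ).

Gene | nHap | L (bp) | Ts 1st | Ts 2nd | Ts 3rd | Ts Σ | Tv 1st | Tv 2nd | Tv 3rd | Tv Σ | Syn | Rep | Σ | π | πn | Ks | Ka | NC L (bp) | NC Ts | NC Tv | NC Σ | NC π
4G17 | 3 | 190 | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0.0052 | 0.0036 | 0.0200 | 0.0072 | 0 | 0 | 0 | 0 | 0.0000
9K1 | 23 | 199 | 3 | 1 | 3 | 7 | 1 | 0 | 1 | 2 | 5 | 4 | 9 | 0.0193 | 0.0107 | 0.1010 | 0.0270 | 0 | 0 | 0 | 0 | 0.0000
4J10 | 14 | 141 | 1 | 0 | 3 | 4 | 0 | 0 | 1 | 1 | 5 | 0 | 5 | 0.0116 | 0.0000 | 0.1300 | 0.0000 | 73 | 2 | 2 | 4 | 0.0157
6P3 | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 179 | 6 | 2 | 8 | 0.0188
6P4 | 17 | 201 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0.0041 | 0.0000 | 0.0450 | 0.0000 | 189 | 8 | 3 | 11 | 0.0233
6P5 | 5 | 189 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0.0016 | 0.0000 | 0.0200 | 0.0000 | 112 | 2 | 0 | 2 | 0.0063
6Z1 | 14 | 444 | 1 | 0 | 5 | 6 | 0 | 0 | 5 | 5 | 8 | 3 | 11 | 0.0079 | 0.0022 | 0.0750 | 0.0088 | 29 | 1 | 0 | 1 | 0.0159
6Z3 | 14 | 453 | 1 | 0 | 13 | 14 | 1 | 0 | 5 | 6 | 18 | 2 | 20 | 0.0169 | 0.0011 | 0.1860 | 0.0057 | 0 | 0 | 0 | 0 | 0.0000
9J12 | 9 | 156 | 2 | 1 | 0 | 3 | 1 | 0 | 1 | 2 | 0 | 5 | 5 | 0.0117 | 0.0156 | 0.0000 | 0.0510 | 0 | 0 | 0 | 0 | 0.0000
9J14 | 14 | 174 | 0 | 0 | 3 | 3 | 2 | 0 | 0 | 2 | 3 | 2 | 5 | 0.0140 | 0.0077 | 0.0760 | 0.0150 | 0 | 0 | 0 | 0 | 0.0000
BU01 | 2 | 162 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0.0011 | 0.0000 | 0.0230 | 0.0000 | 26 | 0 | 0 | 0 | 0.0000
BU08 | 12 | 394 | 0 | 0 | 2 | 2 | 2 | 1 | 2 | 5 | 4 | 3 | 7 | 0.0042 | 0.0026 | 0.0470 | 0.0098 | 16 | 0 | 0 | 0 | 0.0000
Ache | 18 | 249 | 0 | 1 | 7 | 8 | 0 | 1 | 2 | 3 | 8 | 3 | 11 | 0.0149 | 0.0034 | 0.1460 | 0.0160 | 85 | 2 | 2 | 4 | 0.0180
BU10 | 16 | 294 | 1 | 0 | 3 | 4 | 0 | 0 | 2 | 2 | 5 | 1 | 6 | 0.0084 | 0.0024 | 0.0690 | 0.0045 | 245 | 3 | 1 | 4 | 0.0056
BU11 | 4 | 177 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0.0029 | 0.0039 | 0.0000 | 0.0074 | 53 | 0 | 1 | 1 | 0.0099
BU12 | 7 | 504 | 1 | 1 | 1 | 3 | 0 | 1 | 1 | 2 | 3 | 2 | 5 | 0.0032 | 0.0011 | 0.0240 | 0.0053 | 38 | 0 | 0 | 0 | 0.0000
BU13 | 19 | 363 | 2 | 0 | 1 | 3 | 0 | 0 | 0 | 0 | 1 | 2 | 3 | 0.0026 | 0.0029 | 0.0110 | 0.0073 | 148 | 2 | 5 | 7 | 0.0133
BU19 | 22 | 526 | 3 | 1 | 4 | 8 | 0 | 1 | 1 | 2 | 6 | 4 | 10 | 0.0072 | 0.0035 | 0.0490 | 0.0100 | 0 | 0 | 0 | 0 | 0.0000
BU021 | 17 | 267 | 0 | 0 | 2 | 2 | 0 | 0 | 2 | 2 | 4 | 0 | 4 | 0.0044 | 0.0000 | 0.0610 | 0.0000 | 67 | 2 | 2 | 4 | 0.0164
BU21 | 16 | 417 | 0 | 1 | 2 | 3 | 1 | 0 | 2 | 3 | 4 | 2 | 6 | 0.0073 | 0.0018 | 0.0510 | 0.0077 | 81 | 4 | 1 | 5 | 0.0256
BU25 | 2 | 234 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 111 | 2 | 0 | 2 | 0.0036
BU29 | 11 | 258 | 0 | 0 | 2 | 2 | 0 | 0 | 2 | 2 | 3 | 1 | 4 | 0.0038 | 0.0008 | 0.0540 | 0.0049 | 152 | 0 | 2 | 2 | 0.0029
BU34 | 11 | 231 | 0 | 0 | 2 | 2 | 0 | 2 | 2 | 4 | 4 | 2 | 6 | 0.0101 | 0.0051 | 0.0740 | 0.0113 | 28 | 0 | 0 | 0 | 0.0000
BU35 | 12 | 264 | 0 | 0 | 7 | 7 | 0 | 0 | 1 | 1 | 8 | 0 | 8 | 0.0071 | 0.0000 | 0.1280 | 0.0000 | 108 | 1 | 1 | 1 | 0.0039
BU40 | 14 | 255 | 2 | 1 | 1 | 4 | 1 | 0 | 0 | 1 | 1 | 4 | 5 | 0.0074 | 0.0076 | 0.0170 | 0.0200 | 415 | 3 | 2 | 5 | 0.0016
BU56 | 4 | 156 | 0 | 1 | 2 | 3 | 1 | 0 | 2 | 3 | 3 | 3 | 6 | 0.0214 | 0.0132 | 0.0710 | 0.0263 | 126 | 5 | 2 | 7 | 0.0331
BU58 | 7 | 261 | 1 | 0 | 2 | 3 | 0 | 1 | 1 | 2 | 3 | 2 | 5 | 0.0053 | 0.0019 | 0.0446 | 0.0103 | 20 | 0 | 0 | 0 | 0.0000
BU62 | 22 | 213 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0.0013 | 0.0000 | 0.0207 | 0.0000 | 260 | 7 | 5 | 12 | 0.0143
BU66 | 25 | 282 | 9 | 2 | 0 | 11 | 1 | 2 | 0 | 3 | 3 | 11 | 14 | 0.0174 | 0.0194 | 0.0503 | 0.0494 | 179 | 4 | 4 | 8 | 0.0099
BU70 | 5 | 228 | 0 | 0 | 2 | 2 | 0 | 0 | 2 | 2 | 2 | 2 | 4 | 0.0066 | 0.0015 | 0.0506 | 0.0059 | 62 | 0 | 1 | 1 | 0.0069
BU71 | 9 | 573 | 0 | 0 | 6 | 6 | 1 | 1 | 2 | 4 | 8 | 2 | 10 | 0.0044 | 0.0005 | 0.0615 | 0.0045 | 98 | 1 | 0 | 1 | 0.0051
BU72 | 20 | 354 | 2 | 1 | 2 | 5 | 2 | 1 | 1 | 4 | 3 | 6 | 9 | 0.0082 | 0.0062 | 0.0368 | 0.0220 | 195 | 1 | 1 | 2 | 0.0052
BU73 | 5 | 277 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 3 | 0 | 3 | 3 | 0.0039 | 0.0036 | 0.0000 | 0.0135 | 0 | 0 | 0 | 0 | 0.0000
BU76 | 15 | 330 | 0 | 0 | 7 | 7 | 1 | 1 | 1 | 3 | 9 | 1 | 10 | 0.0087 | 0.0008 | 0.1104 | 0.0041 | 154 | 3 | 3 | 6 | 0.0095
BU77 | 18 | 261 | 3 | 0 | 1 | 4 | 1 | 2 | 0 | 3 | 6 | 1 | 7 | 0.0125 | 0.0136 | 0.0166 | 0.0298 | 198 | 9 | 7 | 16 | 0.0345
BU82 | 13 | 228 | 0 | 0 | 2 | 2 | 0 | 2 | 0 | 2 | 2 | 2 | 4 | 0.0059 | 0.0025 | 0.0402 | 0.0112 | 88 | 2 | 2 | 4 | 0.0156
BU85 | 14 | 299 | 2 | 1 | 8 | 11 | 0 | 0 | 1 | 1 | 11 | 1 | 12 | 0.0108 | 0.0363 | 0.1155 | 0.0033 | 121 | 2 | 0 | 2 | 0.0087
BU88 | 16 | 441 | 2 | 0 | 12 | 14 | 0 | 0 | 2 | 2 | 16 | 0 | 16 | 0.0135 | 0.0000 | 0.1601 | 0.0000 | 212 | 9 | 4 | 13 | 0.0179
BU90 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 261 | 1 | 2 | 3 | 0.0021
BU92 | 11 | 345 | 3 | 1 | 3 | 7 | 1 | 1 | 1 | 3 | 5 | 5 | 10 | 0.0075 | 0.0031 | 0.0650 | 0.0186 | 110 | 5 | 3 | 8 | 0.0226
BU93 | 8 | 510 | 0 | 0 | 5 | 5 | 0 | 0 | 3 | 3 | 7 | 1 | 8 | 0.0035 | 0.0004 | 0.0626 | 0.0025 | 137 | 0 | 2 | 2 | 0.0023
BU98 | 4 | 369 | 0 | 0 | 5 | 5 | 0 | 0 | 2 | 2 | 7 | 0 | 7 | 0.0101 | 0.0000 | 0.0843 | 0.0000 | 158 | 2 | 6 | 8 | 0.0257
BU883 | 21 | 354 | 1 | 0 | 5 | 6 | 2 | 1 | 2 | 5 | 7 | 4 | 11 | 0.0120 | 0.0052 | 0.0913 | 0.0145 | 188 | 8 | 3 | 11 | 0.0189
BU897 | 19 | 303 | 0 | 0 | 2 | 2 | 3 | 1 | 2 | 6 | 3 | 5 | 8 | 0.0080 | 0.0078 | 0.0458 | 0.0210 | 167 | 4 | 3 | 7 | 0.0180
BU901 | 12 | 231 | 0 | 0 | 3 | 3 | 0 | 0 | 0 | 0 | 3 | 0 | 3 | 0.0035 | 0.0000 | 0.0552 | 0.0000 | 229 | 2 | 2 | 4 | 0.0065
BU973 | 5 | 471 | 1 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0.0015 | 0.0019 | 0.0000 | 0.0054 | 67 | 1 | 0 | 1 | 0.0028
BU974 | 3 | 459 | 0 | 0 | 2 | 2 | 0 | 0 | 1 | 1 | 3 | 0 | 3 | 0.0036 | 0.0000 | 0.0265 | 0.0000 | 52 | 1 | 0 | 1 | 0.0096
BU982 | 4 | 240 | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0.0030 | 0.0016 | 0.0190 | 0.0053 | 51 | 1 | 0 | 1 | 0.0107
BU996 | 18 | 543 | 4 | 0 | 2 | 6 | 1 | 1 | 1 | 3 | 6 | 3 | 9 | 0.0073 | 0.0019 | 0.0518 | 0.0070 | 87 | 6 | 6 | 12 | 0.0312
Kdr | 13 | 201 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 0 | 2 | 2 | 0.0020 | 0.0026 | 0.0000 | 0.0127 | 501 | 2 | 5 | 7 | 0.0039
Total |  | 14671 | 45 | 17 | 139 | 201 | 25 | 23 | 54 | 102 | 206 | 97 | 303 |  |  |  |  | 5876 | 106 | 85 | 191 | 
Average |  |  |  |  |  |  |  |  |  |  |  |  |  | 0.0072 | 0.0041 | 0.0537 | 0.0097 |  |  |  |  | 0.0123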

nHap, number of haplotypes; L, length of the nucleotide sequence; ∑, total; Syn, synonymous substitutions; Rep, replacement substitutions; π, average number of nucleotide substitution per site; πn, average number of non-synonymous nucleotide substitution per site; Ks, average number of nucleotide substitution per synonymous site; Ka, per non-synonymous site; Ts, transitions; Tv, transversions.

Type of polymorphism

For all sequenced DNA fragments, transitions were more frequent than transversions (62% vs 38%). The transitions C↔T and A↔G were over-represented, with 35.4% and 27.2% of the total substitutions respectively, while the four transversion classes occurred at similar levels (Figure 1). The higher frequency of C↔T and A↔G SNPs is probably partly related to 5-methylcytosine deamination reactions that occur frequently, particularly at CpG dinucleotides [17]. The preponderance of transitions is more obvious for coding regions where, out of the 303 SNPs identified, 201 were transitions (66.3%) and 102 were transversions (33.7%). The ratio of transitions/transversions observed here is close to the 2:1 ratio observed for Drosophila and humans [13,18]. For polymorphism in non-coding regions, transitions accounted for 55% (106) and transversions for 45% (85). The frequencies of transitions in coding and non-coding regions were significantly different (66.3% vs 55% respectively; χ² = 5.86, P < 0.01). This confirms that SNPs occur more frequently as transitions in coding regions than in non-coding regions. There is also a higher frequency of SNPs occurring at the third codon position (63.7%) than at the 1st or 2nd position (Table 1). Similar results have been observed for Aedes aegypti [10] and in three species of Drosophila [13]. The degeneracy of the genetic code and the selective pressure for gene conservation have been suggested as the main reasons for the preponderance of transitions over transversions [13]. Synonymous or silent substitutions are more often transitions than transversions and there is a stronger selection against replacement substitutions than against synonymous ones, leading to an increase of the relative frequency of transitions [13]. For fourfold degenerate codons, selection should be neutral, since no amino acid change is induced by a nucleotide substitution at the third position, and each of the 4 codons will produce the same amino acid.
We tested this hypothesis by comparing the proportions of transversions at fourfold degenerate codon positions and at non-coding positions for all the 50 genes (Table 2). The result shows that there is no significant difference between the frequency of transversions at fourfold degenerate codon positions (36.8%) and in non-coding regions (44.5%) (χ² = 2.55, P = 0.11), while this difference is significant between coding and non-coding regions (χ² = 5.33, P = 0.021). The fact that fourfold degenerate sites have a ratio of transitions/transversions similar to that of non-coding regions is consistent with the hypothesis that the structure of the genetic code and selection against replacement polymorphisms account for the preponderance of transition substitutions in coding regions.
Figure 1

Distribution of transitions and transversions among SNPs.
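The split shown in Figure 1 rests on a simple rule: a change within the purines (A↔G) or within the pyrimidines (C↔T) is a transition, and any other change is a transversion. The following is a minimal sketch of that rule, not the authors' own pipeline; the function name `substitution_class` and the example allele pairs are illustrative assumptions.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def substitution_class(allele_1: str, allele_2: str) -> str:
    """Classify a biallelic SNP as 'transition' or 'transversion'."""
    a, b = allele_1.upper(), allele_2.upper()
    if a not in BASES or b not in BASES or a == b:
        raise ValueError(f"not a valid SNP: {allele_1!r}/{allele_2!r}")
    # Both alleles in the same chemical class -> transition.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Hypothetical allele pairs, for illustration only (not data from this study).
example_snps = [("C", "T"), ("A", "G"), ("A", "C"), ("G", "T"), ("T", "C")]
tally = Counter(substitution_class(x, y) for x, y in example_snps)
print(tally)
```

Applied to every SNP in an alignment, a tally of this kind gives the transition and transversion counts that the comparisons between coding and non-coding regions rely on.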

Table 2

Transition (Ts) and transversion (Tv) polymorphisms for different classes of DNA

Class of DNA | Ts | Tv | %Tv | P vs coding region | P vs 3rd coding position | P vs fourfold degenerate sites
Non-coding regions | 106 | 85 | 44.5 | P = 0.021 | P = 0.0 | P = 0.11
Coding regions (Cd-R) | 201 | 102 | 33.6 |  | P = 0.204 | P = 0.507
Third coding position | 139 | 54 | 27.9 |  |  | P = 0.047
Fourfold degenerate sites | 60 | 35 | 36.8 |  |  | 
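The comparisons in Table 2 are 2 × 2 contingency tests on transition and transversion counts. As a hedged illustration, the sketch below computes an uncorrected Pearson chi-square for the coding versus non-coding comparison using the Ts and Tv counts from Table 2. Whether the authors applied a continuity correction is not stated, so the helper `chi_square_2x2` is an assumption about the procedure, not a reproduction of it.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its upper-tail probability for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 d.f., P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: coding, non-coding; columns: transitions, transversions (Table 2).
chi2, p_value = chi_square_2x2(201, 102, 106, 85)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

With these counts the uncorrected statistic comes out near 5.85; a continuity correction, which this sketch omits, would lower it somewhat.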
Five insertion/deletion polymorphisms (indels), ranging from 1 to 4 bp, were observed in four genes, in coding, intronic and 5' UTR regions (Table 3). Two indels of 2 and 4 bp were observed in the BU10 intron. The frequency of indels (8%, or 4 of 50 genes) is lower than the 24% reported in Ae. aegypti [10] or the 25% reported in An. gambiae [19]. Only one indel, in the BU93 gene, was located in a coding region. This indel was a triplet that did not cause a frame shift. The five indels identified can serve as molecular markers for mapping studies.
Table 3

Indel polymorphism

Gene | Indel size(s) | Region
6P5 | 4 bp | non-coding
BU10 | 2 bp, 4 bp | intron
BU66 | 1 bp | non-coding
BU93 | 3 bp | coding region
Approximately 2/3 (206) of the 303 cSNPs were synonymous substitutions (no modification of the amino acid) while around 1/3 (97) were non-synonymous or replacement SNPs leading to a change of amino acid. As approximately two-thirds of random coding substitutions change an amino acid, the fact that only 1/3 of cSNPs are non-synonymous implies strong selection against changes that alter the amino acid sequence. This ratio of synonymous and replacement cSNPs is similar to that observed in An. gambiae [19] and Ae. aegypti [10].
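The expectation for random coding substitutions can be checked by enumerating every single-base change in every sense codon of the standard genetic code. The sketch below does this under two simplifying assumptions that are not taken from the study: all codons are weighted equally, and all nine possible changes per codon are equally likely (no transition bias, no codon usage bias).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitution_types() -> dict[str, int]:
    """Count single-base changes from sense codons by their effect on the protein."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # start only from the 61 sense codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


counts = count_substitution_types()
total = sum(counts.values())
changing = counts["missense"] + counts["nonsense"]
print(counts, f"fraction altering the protein = {changing / total:.2f}")
```

Under this uniform weighting roughly three-quarters of single-base changes alter the encoded residue (counting changes to a stop codon), which is of the same order as the two-thirds figure used above. The exact fraction depends on codon usage and on the transition bias, neither of which the sketch models.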

Genetic diversity

We estimated the nucleotide diversity for each of the 50 genes in coding and non-coding regions (Table 1). The average nucleotide diversity per gene in coding regions was 7.2 × 10⁻³, or around 1 SNP every 138 bp, similar to that observed in An. gambiae (1 SNP every 125 bp) [19] but much higher than the frequency of 1 SNP/kb observed in humans [20]. SNPs were observed in non-coding regions at a frequency of 1 SNP per 100 bp, corresponding to π = 10 × 10⁻³. Figure 2 shows that there is a positive correlation in the level of polymorphism between the coding and non-coding regions of a gene in the An. funestus genome (r = 0.48, P < 0.01). This positive correlation may be the consequence of many factors, notably the correlated genealogies existing between coding regions and their surrounding non-coding regions. This correlation may also be strengthened by the presence of indirect selection (hitchhiking or background selection) and probably by variable recombination rate, as is the case in Drosophila [21]. A mutational effect of recombination or biased gene conversion may also operate, but this needs to be confirmed since, even in Drosophila, the effect of biased gene conversion is only suspected and not yet established [22,23]. The average nucleotide diversity in non-coding DNA (0.010) was lower than in synonymous sites of the coding regions (0.0207), P < 0.01. This pattern was also observed in An. gambiae, Ae. aegypti and Drosophila species [10,13,19]. This is an indication that non-coding regions are under greater purifying selection than synonymous sites within coding regions. This is not surprising, given that non-coding regions may be involved in gene regulation. The non-coding 5'-flanking sequence of a gene may contain regulatory elements such as the promoter that control the expression of that gene, and single-base mutations can affect essential structures for splicing and processing [24].
Figure 2

Correlation of nucleotide diversity in coding (πc) and non-coding (πnc) regions. πc: nucleotide diversity of coding region; πnc: nucleotide diversity of non-coding region.
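The text converts between nucleotide diversity and an approximate SNP spacing by treating π as a per-site frequency. A small sketch of that arithmetic follows; the helper names are illustrative, and reading π as a simple density is the simplification used in the text rather than a general identity.

```python
def snps_per_kb(pi: float) -> float:
    """Express a per-site diversity as an approximate count per 1,000 bp."""
    return pi * 1000.0


def mean_spacing_bp(pi: float) -> float:
    """Approximate number of base pairs per SNP for a per-site diversity."""
    if pi <= 0:
        raise ValueError("pi must be positive")
    return 1.0 / pi


for label, pi in (("coding", 7.2e-3), ("non-coding", 10e-3)):
    print(f"{label}: {snps_per_kb(pi):.1f} per kb, one every {mean_spacing_bp(pi):.1f} bp")
```

For the coding-region average this gives a spacing just under 139 bp, which the text quotes as 138 bp, and about 7 per kilobase as stated in the abstract.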

Nucleotide diversity varies greatly from one gene to another (Table 1) and this is likely related to individual gene function and potentially to differences in selective constraints. However, non-synonymous diversities need to be compared in order to estimate definitively the influence of differences in selective constraints. Among the most polymorphic genes sequenced were cytochrome P450 genes, lysozyme, translation initiation factor and ubiquitin conjugating genes. The non-synonymous nucleotide diversity of these genes varied from 14 × 10⁻³ to 36.3 × 10⁻³. Most of these genes are involved in specific mechanisms that evolve very rapidly, such as detoxification of xenobiotics for cytochrome P450s or defense mechanisms against bacteria for lysozyme. For example, P450s present a high level of redundancy with fewer genetic constraints and therefore more polymorphism. In contrast, some genes showed very low levels of variation, particularly those involved in transcriptional or translational regulation (BU973, BU25 and BU93) or in signaling processes (BU01, BU08 and BU13). Examples of selective constraints have been observed as well in Drosophila spp., where the substitution rates of conservative genes and fast-evolving genes differ by around 10-fold [25]. Nucleotide diversity was not statistically different between laboratory strains and field collected mosquitoes (7.4 × 10⁻³ and 6.9 × 10⁻³; P = 0.21 by Student's t-test), despite an apparently lower level of heterozygosity (fewer heterozygote SNPs) observed in the two laboratory strains compared to the field sample. This result could be due to the fact that FUMOZ and FANG (the two laboratory strains used in this study) were only recently colonized in the laboratory and therefore still largely retain the polymorphism of natural populations of An. funestus. The ratio of non-synonymous to synonymous changes (Ka/Ks) gives an indication of the magnitude of the purifying selection against deleterious mutations in a species.
The rate of non-synonymous nucleotide substitution per non-synonymous site (Ka) is generally expected to be much lower than the rate of synonymous substitution per synonymous site (Ks), because random amino acid changes are usually deleterious, whereas synonymous changes are likely to be neutral or nearly so [26]. Thus, the expectation is Ka << Ks, except when positive selection favours particular amino acid replacements, in which case Ka will increase. For An. funestus the Ka/Ks ratio was 0.181, similar to the ratio of 0.192 observed in An. gambiae [19] or 0.204 in Ae. aegypti [10] but higher than the ratio of 0.115 reported in D. melanogaster [13]. This result indicates that purifying selection against deleterious mutations is acting in An. funestus. Indeed, species with large effective population sizes such as Drosophila or Anopheles species are generally more effective at purging deleterious mutations [26].
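As a worked check, dividing the average Ka by the average Ks from the last row of Table 1 reproduces the ratio quoted above. That the authors derived their figure in exactly this way is an assumption; the helper `ka_ks_ratio` is illustrative.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


AVERAGE_KA = 0.0097  # Table 1, average row
AVERAGE_KS = 0.0537  # Table 1, average row
an_funestus = ka_ks_ratio(AVERAGE_KA, AVERAGE_KS)

# Ratios for other species as quoted in the text.
reported = {"An. gambiae": 0.192, "Ae. aegypti": 0.204, "D. melanogaster": 0.115}

print(f"An. funestus: {an_funestus:.3f}")
for species, ratio in reported.items():
    relation = "higher" if an_funestus > ratio else "lower"
    print(f"  {relation} than {species} ({ratio:.3f})")
```

A ratio well below 1 is the signature of purifying selection described in the paragraph above.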

Clustering pattern of the SNPs

We analyzed the distribution of SNPs identified in this study. We found 16 clusters of two directly neighboring SNPs, one cluster of 3 consecutive SNPs and 13 clusters of two SNPs separated by just 1 bp. Some SNP genotyping methods, based on allele-specific amplification, ligation or single base extension, require primers to be designed immediately adjacent to the SNP, so it is important that SNPs are not too close together to allow primer design. The presence of a polymorphism within approximately 20 bp will limit the possibilities for designing a robust primer. Most of the 494 SNPs identified in this study have no other SNP within 20 bp on at least one side and should therefore be easily genotyped by one of these methods.
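The 20 bp criterion can be expressed as a simple filter over SNP coordinates within one sequenced fragment. The sketch below is illustrative only: the function name, the example positions and the choice to ignore fragment ends are assumptions, not part of the study.

```python
def snps_with_clear_flank(positions: list[int], flank: int = 20) -> list[int]:
    """Return SNP positions with no other SNP within `flank` bp on at least one side."""
    ordered = sorted(positions)
    keep = []
    for i, pos in enumerate(ordered):
        left_clear = i == 0 or pos - ordered[i - 1] > flank
        right_clear = i == len(ordered) - 1 or ordered[i + 1] - pos > flank
        if left_clear or right_clear:
            keep.append(pos)
    return keep


# Hypothetical SNP coordinates (bp) within one amplified fragment.
example_positions = [10, 15, 22, 100, 200, 221]
print(snps_with_clear_flank(example_positions))
```

In this example only the SNP at position 15 is excluded, because it has neighbours within 20 bp on both sides. A real primer design step would also need to account for the distance to the ends of the fragment.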

Genomic position of the SNPs

Among the 50 genes amplified for SNP detection in this study, 45 are already physically mapped to the An. funestus genome by in situ hybridization [27], and the remaining 5 genes were genetically located to their respective chromosome by linkage mapping [7]. Overall, 29 SNPs were located on the X chromosome, 334 on chromosome 2 and 131 on chromosome 3. The higher number of SNPs observed on chromosome 2 is also a consequence of the fact that most of the studied genes are located on that chromosome. Table 4 gives the chromosomal location of the 50 genes across the genome of An. funestus.
Table 4

Characteristics of genes amplified for SNP detection

Gene | Chromosomal location | Accession no. | Function | Forward and reverse primers (unseparated) | Product length (bp) | No. of SNPs
4G21 | X | AY648704 | Cytochrome P450 | GGCGATAGCAAACGTAAAGCCGCGGTAAACGGAATATAGC | 303 | 2
9K1 | X | AY987362 | Cytochrome P450 | GTACGAGCTGGCCGTTAATCCCTTTCTGTAGCTGCACCTTG | 243 | 9
4J12 | 3R | AY648706 | Cytochrome P450 | CCAACAAATCAGTTCATCAGCTTGTAAAAGTGCTTAAAATG | 270 | 9
6P9 | 2R: 9A-12C | AY729661 | Cytochrome P450 | GCGCCTTAGACAAGAGATCAAAGGGATGTCGCTTCTTCTC | 350 | 8
6P4 | 2R: 9A-12C | AY987359 | Cytochrome P450 | GTACGAGACTGGCAAAGAATAAGGAAGACGTATGGATGG | 430 | 13
6P5 | 2R: 9A-12C | AY987360 | Cytochrome P450 | CTGGCTTTGAAACTTCCTCAGATACACGTAGGGATGTCG | 550 | 3
6Z1 | 2L: 25A-27D |  | Cytochrome P450 | ACGATCCGTTCCGGGTAGGCTAGCGCAGGATACATTCG | 550 | 12
6Z3 | 2L: 25A-27D |  | Cytochrome P450 | GACGATCCGTTCCTGAAGACATCGGTAAGCCCGGATATTT | 550 | 20
9J12 | 3L | AY729663 | Cytochrome P450 | TACCGGTGTGCAGCTTGACTTTGGCGCGAAGGTAAA | 194 | 5
9J14 | 3L | AY729665 | Cytochrome P450 | CGGACAACGTATGATCGATTTTTTGGCTTGCATTAAAAGGTG | 214 | 5
BU01 | X:2B | BU039001 | type II transforming growth factor-beta receptor | GTGTGTTTGCTTGGGTGTTGGGCATCGGTAATCAGGATGT | 525 | 1
BU08 | 2R:7C-10B | BU038908 | rhodopsin | CATTTGTGGAACCCCATTTCGGTCATTGGTTTACCCGAGA | 500 | 7
Ache | 2R:9C-12C | DQ534435 | Acetylcholinesterase | GGGTACGGGACAACATTCACCGTTAACGTACGGGTCGAGT | 1050 | 15
BU10 | 2L:28A | BU039010 | Cyt-c-p-P1 | AAGCACAGTTAAACCTTTCGACCTAGCCCAATCTCTGTCT | 650 | 10
BU11 | 3L:43B | BU038911 | protein transporter | ATCTGCTTGCGCTAGATCGTATCGCCAAATTTCATCTTCG |  | 2
BU12 | 2R:7B | BU038912 | Alpha tubulin | AAGCTCGAGTTCGCCATCTACTCCAATCCTTTCCGACGTA | 800 | 5
BU13 | 2R:15C | BU038913 | signal sequence receptor | ACCCTGAGAAATCGTAACAACCGATAGTTGAGAGCAATGT | 630 | 10
BU19 | 2R:12B | BU038919 | Chitinase | CTGTTGCTGCTGCTACATACCCGGTCACGTACAAATAGTC | 670 | 10
BU021 | 3L:38C-40B | BU039021 | Tubulin beta-3 chain | GAGTTGGTTGATGCCGTGTTCGTCCGGAAACAAATATCGT | 400 | 8
BU21 | X:3A | BU038921 | Phosphoribosylaminoimidazole carboxylase | TTTCAAGGTGAACGGTGTGACCATCAAGATGACGACCAGA | 475 | 11
BU25 | 2R:12B | BU038925 | ferritin heavy chain-like protein | GCGTAAAGCTGTCGTCCTTCATTCCCCCGTCAGGTAGTCT | 1200 | 2
BU29 | 2L:27B | BU038929 | sensory appendage protein | CACCAAGTACGATGGTGTCGAGGCACTTGGTTTTGCAGTT | 410 | 6
BU34 | X:1C | BU038934 | NADH dehydrogenase | GGCAGGTAGCAGCAGTTTTCCAGTACCAACCGCAACACAC | 400 | 6
BU35 | 2R:12B | BU038935 | CG6846 gene product | TTCAGCAAACACGTTTCGTCACTTGCCCTTGTCCTTGTTG | 400 | 9
BU40 | 2R:14B | BU038940 | Glutathione peroxidase | AGGCAAAATCAATTTTTGAACGTAACAATTTCTCGACCAT | 1150 | 10
BU56 | 2R:7B | BU038956 | novel An. gambiae salivary protein | AATCTAGAAGCTGCGCCAGAAATTCTAGGACGGCGATTCC |  | 13
BU58 | 2R:12D | BU038958 | translation initiation factor | ACTTCCACGCCCAGTGTATCCGTGCAGAGTTCGAAAACAA | 650 | 5
BU62 | 2L:23A | BU038962 | cAMP responsive element binding protein | CAATCGGAGCGTAAGGAAAGCGTTCTCCCGCAAAAACTAA | 475 | 13
BU66 | 3R:30A | BU038966 | Lysozyme | TAGCTCATAGTGGCGGTTATACTACAACATGTCGTGCAAA | 650 | 22
BU70 | 2R:7C | BU038970 | Ubiquitin fusion 80 | GTGGACTCCGTACCTGGTCACTGTAGAATTACAGGAGGGCGTA |  | 5
BU71 | 3L:39A | BU038971 | structural protein of peritrophic membrane | GGGAAGTCGGTGTAGGGAATACGTTTGGGTCAGGTAGTCG | 750 | 11
BU72 | 2R:12B | BU038972 | RHO small monomeric GTPase | GATGAAGCTGCCAAAGATCCTGCCTCGTCGAAAACTTCTT | 900 | 11
BU73 | 2R:7A | BU038873 | actin binding | AGTAAGAAACGAACGCAAAGCGGAAAAGTTGGAATGTAAC | 430 | 3
BU76 | 2R:10B | BU038976 | translation initiation factor | TGCCTACGAACGACGTAATGGGCTCGTAGCTGGTCACTTC | 500 | 16
BU77 | 2R:10C | BU038877 | ubiquitin conjugating enzyme | CAACACACTAGCCAGCAAGGTTTGGTTCGGCCAACATACT | 408 | 23
BU82 | 2R:14D | BU038882 | Unknown | AGGGCGGTACAACAAAATCTGCATCGGAGCGTTTCCTA | 400 | 8
BU85 | 2R:12E | BU038885 | phosphoglycerate mutase | AAAAAGAATGGCCGGAAAGTCTCATCGCCCAGAATTTCAT | 800 | 14
BU88 | 2R:11B | BU038988 | translation initiation factor | GTGGCCTCCCACTTTGTTAGTACCGGATACGGTTGACGAT | 800 | 29
BU90 | 3R:35B | BU038990 | gustatory receptor | GGGACATCATCATCATCGACTTTCGCTTCTCGCGTTAAAT | 300 | 3
BU92 | 3L:39A | BU038892 | Microtubule binding | CATGCGACCGAAGAGAAGTTATCCTGATTCTGGCTCATGG | 550 | 18
BU93 | 2R:7C-10C | BU038893 | prefoldin subunit 2 | CACCGGAAACTCGGCTATTATATCGGTTCCATCCGAAAAG | 550 | 10
BU98 | 3L:46B | BU038898 | CG7630 gene product | TGCGTCACCCGTTACAAATAACGTGTACGCTTTCCACCTC | 550 | 15
BU883 | 3R:32B | BU038883 | peritrophin | TTCGTGACACAGTTATACGCGCACACTTCAGACTTCCTGT | 650 | 22
BU897 | 3R:36C | BU038897 | NADH dehydrogenase (ubiquinone) | GGGAATTCCGTGATTTTTGGCAGAAATATCCATAATCG | 700 | 15
BU901 | 2L:20C | BU038901 | CG18397 gene product | AAAGACACTCCCGCATTACGCTCGTGTCTGTTTGGCTTGA | 480 | 7
BU973 | 3R:36F | BU038973 | polyA-binding protein II | AGTAAGAAACGAACGCAAAGCGGAAAAGTTGGAATGTAAC | 630 | 3
BU974 | 3L:40A | BU038974 | serine-type peptidase | ACTGGCGGAGAACGTACAACTGCTGCACATTAATCAAAGGTT |  | 4
BU982 | 2R:12B | BU038982 | ferritin 2 light chain homologue | CTAGTTTCCTGTCGCGTTCCCATCGTCTCCTCCATTACCG | 400 | 3
BU996 | 2R:8D | BU038996 | vacuolar hydrogen-transporting ATPase | GTTCGCCTACATGTGCTTCAACAAAGGGTGTGCAAAAAGG | 800 | 21
Kdr | 3R: 36A-37E | DQ534436 | Sodium channel gene | TGCAAAATAGAGTCATTGGTGAAATCATCTTCATCTTTGC | 1342 | 9

Polymorphism reliability

To assess the validity of the SNPs identified in this study, 30 SNPs were tested for segregation in isofemale lines. These SNPs were tested using different methods (pyrosequencing, HOLA, SBE and AS-PCR) [7,28]. The Mendelian segregation ratio of each of these SNP loci at F0, F1 and F2 generations was examined in four families from reciprocal crosses between a pyrethroid resistant strain (FUMOZ-R) and a susceptible strain (FANG). Homozygous and heterozygous genotypes for each of these SNPs were observed. Importantly, the expected Mendelian ratio of 1:2:1 was respected in 27 of these 30 SNPs [7], confirming the polymorphism observed at these different positions. We can conclude from this result that the SNPs described in this study are then likely to be true polymorphisms rather than sequence artifacts and our scoring results indicate that they are suitable for use as genetic markers.
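Agreement with a 1:2:1 Mendelian ratio is typically assessed with a chi-square goodness-of-fit test on F2 genotype counts. The sketch below shows one way to do this; the study does not state which test was used, and the function name and genotype counts are hypothetical.

```python
import math


def mendelian_f2_test(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit of F2 genotype counts to a 1:2:1 ratio.

    Returns the statistic and its P value for 2 degrees of freedom.
    """
    total = n_aa + n_ab + n_bb
    observed = (n_aa, n_ab, n_bb)
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For exactly 2 d.f. the chi-square upper-tail probability is exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# Hypothetical F2 genotype counts (AA, AB, BB) for two SNP loci.
print(mendelian_f2_test(24, 52, 24))   # close to 1:2:1
print(mendelian_f2_test(40, 40, 20))   # distorted segregation
```

A locus would be flagged as deviating from Mendelian expectations when the P value falls below the chosen threshold, for example 0.05.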

Relevance of the SNPs

The set of SNPs identified in this study provides a very useful tool for future genetic studies in An. funestus. These markers are of immediate use for association and QTL mapping studies. Some of these SNPs have been used for linkage mapping and identification of QTL involved in pyrethroid resistance in An. funestus [7]. This set of SNPs can also be used for population genetic studies in An. funestus. Genotyping a large number of SNP markers will facilitate the study of the genetic structure of natural populations and provide independent estimates of gene flow. It may provide additional markers to study the speciation process observed between the Folonzo and Kiribina chromosomal forms of An. funestus [29]. These markers may also be invaluable in monitoring insecticide resistance genes or genes involved in vector competence.

Conclusion

Through the sequencing of DNA fragments from 50 genes of An. funestus, we identified a set of 494 SNP markers and studied the pattern of genetic variability in this species. The distribution of SNPs in An. funestus was not neutral but under the influence of regional factors such as recombination, the degeneracy of the genetic code and selective constraints for gene conservation. The SNP markers described constitute an important resource for further genetic studies in this important malaria vector.

Methods

Mosquito samples used for polymorphism discovery

We used adult female specimens of An. funestus from two laboratory strains, FANG and FUMOZ-R (seven specimens for each strain) as well as seven field specimens. FANG is a pyrethroid susceptible strain from Calueque, southern Angola and FUMOZ-R is a pyrethroid resistant strain from southern Mozambique [1]. Field specimens of An. funestus were collected from Kela village in Chikwawa district in southern Malawi.

Selection of gene sequences for SNP identification

Target genes were selected among cytochrome P450 genes for their putative involvement in insecticide resistance [30] or among genes of a broad range of functions that had been physically mapped to An. funestus polytene chromosomes [27] (see Table 4; Figure 3). They were also chosen to be distributed across the genome of An. funestus. The sequences of the physically mapped cDNAs were retrieved from GenBank. Coding sequences, UTRs and intronic regions were determined using BLAST through NCBI.
Figure 3

Relative location of studied genes on the An. funestus genome. For definitions of genes, see Table 4. This figure was adapted from [37].

Gene amplification and sequencing

Genomic DNA was extracted using the Livak method as described previously [31]. Primers were designed using Primer3 software [32] to flank putative intron sites to maximize the chance of SNP identification. Genomic DNA from 21 individuals (7 from FUMOZ-R, 7 from FANG and 7 from Kela) was amplified for each gene. PCR was performed with 10 ng of genomic DNA in a final volume of 25 μl containing 2.5 μl of Taq buffer, 0.2 mM of dNTPs, 10 pmol of each primer, 2.5 mM of MgCl2 and 0.2 unit of Taq polymerase (Qiagen). Amplification was performed with the following conditions: 1 cycle at 94°C for 3 min; 35 cycles of 94°C for 30 s, 57°C for 30 s and elongation at 72°C for 30 s; followed by 1 cycle at 72°C for 10 min. The annealing temperature was optimized for each primer pair and varied between 53°C and 62°C. PCR products were purified using the QIAquick PCR purification kit (Qiagen) and directly sequenced on both strands using a Beckman CEQ 8000 automatic sequencer.
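The cycling conditions above can be written down as a small data structure, which makes it easy to check the programmed run time. This is only an illustrative sketch: the step labels and the `Step` class are assumptions, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: int
    seconds: int


# Values taken from the protocol text; labels are conventional names, not from the source.
INITIAL_DENATURATION = Step("initial denaturation", 94, 3 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 57, 30),   # optimized per primer pair, 53-62 degrees C
    Step("elongation", 72, 30),
)
N_CYCLES = 35
FINAL_EXTENSION = Step("final extension", 72, 10 * 60)


def hold_time_seconds() -> int:
    """Total programmed hold time, excluding temperature ramping."""
    cycling = N_CYCLES * sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + cycling + FINAL_EXTENSION.seconds


total = hold_time_seconds()
print(f"{total} s = {total / 60:.1f} min of programmed hold time")
```

The programmed holds add up to 3,930 s, or 65.5 min, before any ramping time is added by the thermocycler.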

Sequence analysis and SNP detection

SNPs were detected as sequence differences in multiple alignments using ClustalW [33]. Electrophoregrams were visually inspected using BioEdit and heterozygotes were identified [12]. SNPs were identified as transitions or transversions in coding and non-coding regions. SNPs located within coding regions were classified as synonymous or non-synonymous and their codon position determined. Nucleotide diversity analyses were performed using DnaSP 4.0 [34]. The average number of nucleotide substitutions per site between two sequences, π, was calculated for each gene, as was the haplotype diversity. The average numbers of synonymous substitutions per synonymous site (Ks) and non-synonymous substitutions per non-synonymous site (Ka) were computed according to [35].
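The definition of π given above, the average number of differences per site between two sequences, can be illustrated with a few lines of code. This is a minimal sketch and not DnaSP's implementation: it assumes equal-length, gap-free, already phased sequences, and the toy alignment is invented for illustration.

```python
from itertools import combinations


def nucleotide_diversity(sequences: list[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(seq_1, seq_2)) for seq_1, seq_2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment of three 10 bp haplotypes (not data from this study).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

In the toy alignment the three pairs differ at 1, 1 and 2 sites, giving 4 differences over 3 pairs and 10 sites.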

SNP validation

Many of the SNPs discovered in this study were validated by different methods. As a part of an effort to construct a genetic map and to identify QTL involved in pyrethroid resistance, 30 SNP loci were genotyped in several families generated from a cross between FANG and FUMOZ-R strains of An. funestus. These SNPs were scored using a HOLA (Hot Oligonucleotides Ligation Assay) method [36], single base extension (SBE) using Beckman CEQ8000 and a pyrosequencing method [7].

Authors' contributions

CSW (corresponding author) carried out the experiments, analyzed the data and wrote the manuscript. JH is the PI of the program that funded the work and contributed to the critical review of the draft manuscript. HR contributed to the design of the study and critical review of the draft manuscript. All authors read and approved the final manuscript.
  36 in total

1.  Characterization of single-nucleotide polymorphisms in coding regions of human genes.

Authors:  M Cargill; D Altshuler; J Ireland; P Sklar; K Ardlie; N Patil; N Shaw; C R Lane; E P Lim; N Kalyanaraman; J Nemesh; L Ziaugra; L Friedland; A Rolfe; J Warrington; R Lipshutz; G Q Daley; E S Lander
Journal:  Nat Genet       Date:  1999-07       Impact factor: 38.330

2.  Primer3 on the WWW for general users and for biologist programmers.

Authors:  S Rozen; H Skaletsky
Journal:  Methods Mol Biol       Date:  2000

3.  Population genomics: genome-wide sampling of insect populations.

Authors:  W C Black; C F Baer; M F Antolin; N M DuTeau
Journal:  Annu Rev Entomol       Date:  2001       Impact factor: 19.686

4.  DnaSP, DNA polymorphism analyses by the coalescent and other methods.

Authors:  Julio Rozas; Juan C Sánchez-DelBarrio; Xavier Messeguer; Ricardo Rozas
Journal:  Bioinformatics       Date:  2003-12-12       Impact factor: 6.937

5.  Chromosomal and bionomic heterogeneities suggest incipient speciation in Anopheles funestus from Burkina Faso.

Authors:  C Costantini; N Sagnon; E Ilboudo-Sanogo; M Coluzzi; D Boccolini
Journal:  Parassitologia       Date:  1999-12

6.  Genome-wide variation in the human and fruitfly: a comparison.

Authors:  C F Aquadro; V Bauer DuMont; F A Reed
Journal:  Curr Opin Genet Dev       Date:  2001-12       Impact factor: 5.578

7.  Single nucleotide polymorphism markers for genetic mapping in Drosophila melanogaster.

Authors:  R A Hoskins; A C Phan; M Naeemuddin; F A Mapa; D A Ruddy; J J Ryan; L M Young; T Wells; C Kopczynski; M C Ellis
Journal:  Genome Res       Date:  2001-06       Impact factor: 9.043

8.  The genome sequence of the malaria mosquito Anopheles gambiae.

Authors:  Robert A Holt; G Mani Subramanian; Aaron Halpern; Granger G Sutton; Rosane Charlab; Deborah R Nusskern; Patrick Wincker; Andrew G Clark; José M C Ribeiro; Ron Wides; Steven L Salzberg; Brendan Loftus; Mark Yandell; William H Majoros; Douglas B Rusch; Zhongwu Lai; Cheryl L Kraft; Josep F Abril; Veronique Anthouard; Peter Arensburger; Peter W Atkinson; Holly Baden; Veronique de Berardinis; Danita Baldwin; Vladimir Benes; Jim Biedler; Claudia Blass; Randall Bolanos; Didier Boscus; Mary Barnstead; Shuang Cai; Angela Center; Kabir Chaturverdi; George K Christophides; Mathew A Chrystal; Michele Clamp; Anibal Cravchik; Val Curwen; Ali Dana; Art Delcher; Ian Dew; Cheryl A Evans; Michael Flanigan; Anne Grundschober-Freimoser; Lisa Friedli; Zhiping Gu; Ping Guan; Roderic Guigo; Maureen E Hillenmeyer; Susanne L Hladun; James R Hogan; Young S Hong; Jeffrey Hoover; Olivier Jaillon; Zhaoxi Ke; Chinnappa Kodira; Elena Kokoza; Anastasios Koutsos; Ivica Letunic; Alex Levitsky; Yong Liang; Jhy-Jhu Lin; Neil F Lobo; John R Lopez; Joel A Malek; Tina C McIntosh; Stephan Meister; Jason Miller; Clark Mobarry; Emmanuel Mongin; Sean D Murphy; David A O'Brochta; Cynthia Pfannkoch; Rong Qi; Megan A Regier; Karin Remington; Hongguang Shao; Maria V Sharakhova; Cynthia D Sitter; Jyoti Shetty; Thomas J Smith; Renee Strong; Jingtao Sun; Dana Thomasova; Lucas Q Ton; Pantelis Topalis; Zhijian Tu; Maria F Unger; Brian Walenz; Aihui Wang; Jian Wang; Mei Wang; Xuelan Wang; Kerry J Woodford; Jennifer R Wortman; Martin Wu; Alison Yao; Evgeny M Zdobnov; Hongyu Zhang; Qi Zhao; Shaying Zhao; Shiaoping C Zhu; Igor Zhimulev; Mario Coluzzi; Alessandra della Torre; Charles W Roth; Christos Louis; Francis Kalush; Richard J Mural; Eugene W Myers; Mark D Adams; Hamilton O Smith; Samuel Broder; Malcolm J Gardner; Claire M Fraser; Ewan Birney; Peer Bork; Paul T Brey; J Craig Venter; Jean Weissenbach; Fotis C Kafatos; Frank H Collins; Stephen L Hoffman
Journal:  Science       Date:  2002-10-04       Impact factor: 47.728

9.  A microsatellite map of the African human malaria vector Anopheles funestus.

Authors:  I Sharakhov; O Braginets; O Grushko; A Cohuet; W M Guelbeogo; D Boccolini; M Weill; C Costantini; N'F Sagnon; D Fontenille; G Yan; N J Besansky
Journal:  J Hered       Date:  2004 Jan-Feb       Impact factor: 2.645

10.  Inversions and gene order shuffling in Anopheles gambiae and A. funestus.

Authors:  Igor V Sharakhov; Andrew C Serazin; Olga G Grushko; Ali Dana; Neil Lobo; Maureen E Hillenmeyer; Richard Westerman; Jeanne Romero-Severson; Carlo Costantini; N'Fale Sagnon; Frank H Collins; Nora J Besansky
Journal:  Science       Date:  2002-10-04       Impact factor: 47.728

