Jonathon C Marshall, João Pinto, Jacques Derek Charlwood, Gabriele Gentile, Federica Santolamazza, Frédéric Simard, Alessandra Della Torre, Martin J Donnelly, Adalgisa Caccone.
Abstract
The evolutionary processes at play between island and mainland populations of the malaria mosquito vector Anopheles gambiae sensu stricto are of great interest as islands may be suitable sites for preliminary application of transgenic-based vector control strategies. São Tomé and Príncipe, located off the West African coast, have received such attention in recent years. This study investigates the degree of isolation of An. gambiae s.s. populations between these islands and the mainland based on mitochondrial and ribosomal DNA molecular data. We identify possible continental localities from which these island populations derived. For these purposes, we used FST values, haplotype networks, and nested clade analysis to estimate migration rates and patterns. Haplotypes from both markers are geographically widespread across the African continent. Results indicate that the populations from São Tomé and Príncipe are relatively isolated from continental African populations, suggesting they are promising sites for test releases of transgenic individuals. These island populations are possibly derived from two separate continental migrations. This result is discussed in the context of the history of the African slave trade with respect to São Tomé and Príncipe.
Keywords: Anopheles gambiae; island colonization; malaria; mitochondrial DNA; phylogeography; ribosomal DNA
Year: 2008 PMID: 25567803 PMCID: PMC3352388 DOI: 10.1111/j.1752-4571.2008.00048.x
Source DB: PubMed Journal: Evol Appl ISSN: 1752-4571 Impact factor: 5.183
Table 1. Localities sampled in this study by country. Locality numbers correspond to those illustrated in Fig. 1. Samples found in close proximity were pooled into localities, in keeping with the large-scale sampling scheme of this study. Sample sizes (N) for the ND5 and ITS samples are given separately and, where possible, molecular forms are given. GPS coordinates are given in degrees and minutes. Sources of the original samples and/or sequence data are given in the last column.
| Country and population | GPS coordinates | Locality no.-molecular form | N (ND5) | N (ITS) | Sample source |
|---|---|---|---|---|---|
| Angola | |||||
| Namibe | 15°10′S, 12°09′E | Ang01-M | 15 | 19 | |
| Luanda area | 08°50′S, 13°14′E | Ang02-M | 41 | 36 | |
| Lunda sud | 09°39′S, 20°26′E | Ang02-S | 2 | 2 | |
| Prov. of Zaire | 06°07′S, 12°22′E | Ang03-S | 14 | 22 | |
| Benin | |||||
| Bohicon, Dassa, Lema, etc. | 07°01′N, 02°17′E | Ben04-M/S | 59 | 58 | |
| Cameroon | |||||
| Tiko, Buea, Doula | 04°04′N, 09°36′E | Cam05-M | 56 | 60 | F. Simard |
| Ebebda, Obala | 04°10′N, 11°25′E | Cam06-S | 31 | 30 | F. Simard |
| Simbock, Mbanjock | 03°45′N, 10°95′E | Cam07-M | 32 | 29 | F. Simard |
| CAR | |||||
| Bayanga | 02°55′N, 16°15′E | CAR08-M/S | 8 | 9 | V. Medjibe |
| DRC | |||||
| Kinshasa | 04°19′S, 15°17′E | CG09-M/S | 8 | 6 | |
| EG | |||||
| Malabo | 03°45′N, 08°46′E | EG10-M | 30 | 30 | F. Simard |
| Bata | 01°51′N, 09°45′E | EG11-M | 29 | 13 | F. Simard |
| Gabon | |||||
| Dienga | 01°52′S, 12°41′E | Gab12-S | 36 | 38 | |
| Libreville | 00°23′N, 09°27′E | Gab13-S | 22 | 26 | |
| Ghana | |||||
| Navrongo | 10°53′N, 01°05′W | Gh14-M | 6 | 8 | |
| Mampong area | 05°24′N, 00°37′W | Gh15-S | 30 | 20 | |
| Ivory Coast | |||||
| Danta, M'be, Ziglo, etc. | 07°07′N, 07°10′W | IC16-M/S | 24 | 25 | |
| Mozambique | |||||
| Furvela | 23°43′S, 35°18′E | Mz17-S | 21 | 21 | J. D. Charlwood |
| Nigeria | |||||
| Kobape | 07°07′N, 03°18′E | Nig18-M/S | 36 | 28 | |
| Gwamlar | 08°50′N, 07°52′E | Nig19-S | 9 | – | |
| São Tomé | |||||
| Puerto Alegre | 00°02′N, 06°32′E | St20-M | 57 | 56 | |
| Riboque | 00°20′N, 06°44′E | St21-M | 60 | 58 | |
| Príncipe | |||||
| Trabajodores | 01°39′N, 07°25′E | Pr22-M | 62 | 48 | |
| Kenya | |||||
| Asembo, Kisian, Muhroni, Nyakoch, Wathrego | 00°11′S, 34°54′E | Ken23-S | 80 | 21 | |
| Jego | 03°13′S, 40°07′E | Ken24-S | 5 | 9 | |
| Malawi | |||||
| Mkali, Mangochi | 14°28′S, 35°00′E | Mal25-S | 2 | 9 | |
| Thyolo, Seseo | 16°04′S, 35°08′E | Mal26-S | 17 | – | |
| Senegal | |||||
| Barkedji | 15°10′N, 14°52′W | Sen27-M | 13 | – | |
| Dielmo, Ndiop | 13°43′N, 16°25′W | Sen28-M | 19 | – | |
| Ndialakar | 16°08′N, 16°27′W | Sen29-M | – | 5 | |
| Tanzania | |||||
| Kyela | 09°34′S, 33°51′E | Tz30-S | 9 | 11 | |
| Nyakariro | 03°40′S, 33°26′E | Tz31-S | – | 8 | |
| The Gambia | |||||
| Kaur, Bassé, Farafenni | 13°28′N, 15°50′W | Gam32-M | – | 17 | |
| GC | |||||
| Sombili,Timbi Madina | 11°24′N, 12°16′W | Gc33-S | – | 10 | |
| Mali | |||||
| Banambani, Moribabougou | 12°47′N, 08°02′W | Mali34-M/S | – | 33 | |
| Pimperena | 11°28′N, 05°42′W | Mali35-S | – | 5 | |
| Burkina Faso | |||||
| Goundry | 12°30′N, 01°20′W | Bf36-M | – | 5 | |
| | | Bf36-S | – | 1 | |
| Madagascar | |||||
| Beforona | 18°58′S, 46°34′E | Mad37-S | – | 5 | |
| Totals | | | 833 | 781 | |
GPS coordinates were taken by hand in the field, taken from previous publications, or obtained via Global Gazetteer Version 2.1 (http://www.fallingrain.com/world/).
Côte d'Ivoire is labeled by the English name Ivory Coast. CAR, Central African Republic; DRC, Democratic Republic of Congo; EG, Equatorial Guinea; GC, Guinea Conakry. Chromosome forms are identified as follows: FOR, forest; MOP, Mopti; SAV, Savanna. Molecular forms are either not known, not scored, or uncertain.
DNA sequences that have been added from previously published papers.
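Because the table gives coordinates in degrees and minutes, mapping or distance work usually needs them as signed decimal degrees first. The following is a minimal sketch of that conversion, not part of the original study; the function names `dm_to_decimal` and `parse_pair` are hypothetical, and the code assumes cells formatted exactly as in the table (degree sign, prime mark for minutes, one hemisphere letter).

```python
import re

# Hypothetical helpers (not from the paper): convert the table's
# degrees-and-minutes cells into signed decimal degrees.
_TOKEN = re.compile(r"\s*(\d+)°(\d+)′([NSEW])\s*")

def dm_to_decimal(text):
    """Convert one token such as '15°10′S' to signed decimal degrees."""
    match = _TOKEN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees = int(match.group(1))
    minutes = int(match.group(2))
    hemisphere = match.group(3)
    if minutes >= 60:
        # A minutes field of 60 or more cannot be a valid reading.
        raise ValueError(f"minutes out of range in {text!r}")
    value = degrees + minutes / 60.0
    # South and west are negative by the usual convention.
    return -value if hemisphere in "SW" else value

def parse_pair(cell):
    """Split a 'lat, lon' table cell and convert both halves."""
    lat_text, lon_text = cell.split(",")
    return dm_to_decimal(lat_text), dm_to_decimal(lon_text)

lat, lon = parse_pair("15°10′S, 12°09′E")  # Namibe row (Ang01)
print(round(lat, 4), round(lon, 4))
```

The range check means that an entry whose minutes field is 60 or more, such as the 10°95′E listed for Cam07, is rejected rather than silently converted.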
Figure 1. Map of Africa showing the localities where Anopheles gambiae sensu stricto were sampled. Population numbers correspond to population numbers given in Table 1 and the shade of the dots (black or white) indicates whether ND5 and/or ITS sequences were generated from individuals from said locality. Dashed arrows show general historical slave route to STP.
Figure 2. (A) Neighbor-joining tree based on pair-wise FST values between populations of Anopheles gambiae sensu stricto estimated using the ND5 mitochondrial gene. To better distinguish the M and S populations, M-form populations are written in italics. Triangles indicate populations that have been split into an S-form population and an M-form population. (B) Neighbor-joining tree based on pair-wise FST values between populations of Anopheles gambiae estimated using the ITS sequence data. NJ trees are not intended to infer phylogenetic relationships but rather to show clusters of FST values. Taxa are labeled as in Table 1.
Statistical permutation tests of population subdivision between STP populations and the continental populations that showed the smallest FST values in STP versus continental comparisons
| | St21(60) | Pr22(62) | GH14M(6) | IC16M(3) | Sen27M(13) | CAR08(7) | Ben04M(41) |
|---|---|---|---|---|---|---|---|
| St20(57) | ns/*/ns | ***/***/*** | ***/*/** | ***/*/ns | ***/**/** | ***/***/*** | ***/***/*** |
| St21 | – | ***/***/*** | ***/*/** | ***/ns/ns | ***/***/*** | ***/**/*** | ***/***/*** |
| Pr22 | – | – | ***/***/*** | ***/***/*** | ***/***/*** | ***/***/*** | ***/***/*** |
Three statistics are given: χ2, a haplotype frequency-based statistic; KST*, a sequence-based statistic; and Snn. Levels of significance are coded as follows: ns, not significant; *0.01 < P < 0.05; **0.001 < P < 0.01; ***P < 0.001. For each population comparison, the results are given as χ2/KST*/Snn. Populations are coded as in Table 1 and Fig. 2A. Numbers in parentheses are sample sizes.
One thousand permutations were made per test. Here, we perform 18 pair-wise comparisons; with 18 tests at an alpha of 0.05, there is a 60.28% probability of finding at least one significant result by chance. A strict Bonferroni adjustment without correlation, something we view as overly conservative in our situation, would at an alpha of 0.05 require a P-value of 0.0027.
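The two figures in this note follow from standard multiple-testing arithmetic, sketched below for illustration. The function names are hypothetical, and the family-wise calculation assumes the 18 tests are independent, which is the "without correlation" case the note describes.

```python
# Illustrative sketch (function names are hypothetical, not from the paper).

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value cutoff under a strict Bonferroni adjustment."""
    return alpha / n_tests

fwer = familywise_error_rate(0.05, 18)
threshold = bonferroni_threshold(0.05, 18)

print(f"{fwer:.4f}")       # 0.6028, i.e. the 60.28% quoted in the note
print(f"{threshold:.5f}")  # 0.00278
```

The Bonferroni cutoff works out to 0.05/18, roughly 0.00278; the note's 0.0027 appears to be this value truncated rather than rounded.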
Figure 3. TCS network for the ITS sequence data labeled following Gentile et al. 2002. Number of taxa = 781, number of haplotypes = 24. Haplotypes n1–n5 indicate new haplotypes. The black square and the ovals represent M molecular types; grey ovals are S molecular types. Haplotypes labeled II and III are M forms. Haplotypes labeled I are S forms. Small empty ovals represent missing haplotypes. The square haplotype (IIIB), representing the inferred ancestral haplotype, was found in populations St20 (frequency = 0.21), St21 (0.34), Pr22 (0.21), Ang01 (1.00), and Ang02 (0.97). The size of each haplotype symbol roughly corresponds to its frequency. Lines separating symbols represent mutational steps or single base-pair substitutions. Dotted lines show where loops were broken. Haplotypes are nested into hierarchical clades in preparation for the NCA analysis. Thin lines group haplotypes clustered into one-step clades, dashed lines into two-step clades, and thick lines into three-step clades. Clades are labeled by clade level and haplotype number; for example, 1-2 is the second haplotype in the one-step clade.
Figure 4. Geographical distribution of ITS haplotypes that were found in multiple localities. Where the distributions of two haplotypes overlap, one haplotype distribution is represented by a shaded area. Haplotypes are widespread across large distances. Haplotype numbers correspond to those in Fig. 3. A single haplotype, IIIB, is found in both island (São Tomé and Príncipe) and continental populations in Angola.
NCA results for all significant clades
| Nested clade permutation analysis | |||
|---|---|---|---|
| Clade 3-1 (observed χ2-statistic = 466.0855, | |||
| Type of distance | Distance | PSS | PSL |
| Clade 2-1 (Interior) | |||
| Within clade | 753.3448 | 0.0000 | 1.0000 |
| Nested clade | 1099.3743 | 0.1480 | 0.8520 |
| Clade 2-2 (Tip) | |||
| Within clade | 1135.2346 | 0.3770 | 0.6230 |
| Nested clade | 1186.7130 | 0.7290 | 0.2710 |
| Interior versus tip clades | I-T distance | PSS | PSL |
| Within clade | −381.8898 | 0.0010 | 0.9990 |
| Nested clade | −87.3387 | 0.2030 | 0.7970 |
| Inference = range expansion for clades 2-1 and 2-2 | |||
| Clade 2-1 (observed χ2-statistic = 91.6938, | |||
| Type of distance | Distance | PSS | PSL |
| Clade 1-1 (Interior) | |||
| Within clade | 692.0376 | 0.1230 | 0.8770 |
| Nested clade | 865.0450 | 1.0000 | 0.0000 |
| Clade 1-2 (Tip) | |||
| Within clade | 84.2306 | 0.0000 | 1.0000 |
| Nested clade | 613.7558 | 0.0000 | 1.0000 |
| Clade 1-3 (Tip) | |||
| Within clade | 0.0000 | 1.0000 | 1.0000 |
| Nested clade | 691.1035 | 0.9220 | 0.3240 |
| Interior versus tip clades | I-T distance | PSS | PSL |
| Within clade | 608.5208 | 1.0000 | 0.0000 |
| Nested clade | 250.6338 | 1.0000 | 0.0000 |
| Inference = isolation by distance | |||
| Clade 2-3 (observed χ2-statistic = 102.4387, | |||
| Type of distance | Distance | PSS | PSL |
| Clade 1-5 (Interior) | |||
| Within clade | 2071.4351 | 0.9480 | 0.0520 |
| Nested clade | 2062.5834 | 0.9060 | 0.0940 |
| Clade 1-6 (Tip) | |||
| Within clade | 1546.9352 | 0.0440 | 0.9560 |
| Nested clade | 1711.8136 | 0.0760 | 0.9240 |
| Interior versus tip clades | I-T distance | PSS | PSL |
| Within clade | 524.5000 | 0.9580 | 0.0420 |
| Nested clade | 350.7698 | 0.9230 | 0.0770 |
| Inference = isolation by distance | |||
| Clade 1-5 (observed χ2-statistic = 762.4985, | |||
| Type of distance | Distance | PSS | PSL |
| Clade IA (Interior) | |||
| Within clade | 1903.6805 | 0.0000 | 1.0000 |
| Nested clade | 1974.1171 | 0.0000 | 1.0000 |
| Clade In1 (Tip) | |||
| Within clade | 0.0000 | 0.0730 | 1.0000 |
| Nested clade | 602.0199 | 0.0320 | 0.9880 |
| Clade IE (Tip) | |||
| Within clade | 0.0000 | 1.0000 | 1.0000 |
| Nested clade | 1875.2362 | 0.4620 | 0.5970 |
| Clade IF (Tip) | |||
| Within clade | 0.0000 | 1.0000 | 1.0000 |
| Nested clade | 2015.3519 | 0.6390 | 0.4420 |
| Clade In3 (Tip) | |||
| Within clade | 0.0000 | 1.0000 | 1.0000 |
| Nested clade | 602.0199 | 0.1680 | 0.9670 |
| Clade In2 (Tip) | |||
| Within clade | 0.0000 | 0.0010 | 1.0000 |
| Nested clade | 1923.7367 | 0.4870 | 0.5130 |
| Clade IG (Tip) | |||
| Within clade | 0.0000 | 1.0000 | 1.0000 |
| Nested clade | 3020.2976 | 0.8230 | 0.1940 |
| Clade IB (Tip) | |||
| Within clade | 0.0000 | 0.0000 | 1.0000 |
| Nested clade | 1711.7679 | 0.2560 | 0.7440 |
| Clade IC (Tip) | |||
| Within clade | 0.0000 | 0.0010 | 1.0000 |
| Nested clade | 3629.0913 | 1.0000 | 0.0000 |
| Clade IH (Tip) | |||
| Within clade | 0.0000 | 1.0000 | 1.0000 |
| Nested clade | 2975.8615 | 0.8290 | 0.2050 |
| Interior versus tip clades | I-T distance | PSS | PSL |
| Within clade | 1903.6805 | 1.0000 | 0.0000 |
| Nested clade | −160.3560 | 0.1260 | 0.8740 |
| Inference = restricted gene flow/dispersal with some long distance dispersal for clade IC. | |||
Within and nested clade distances are given, in miles, for both interior (I) and tip (T) clades. PSS, P-value for a significantly small value; PSL, P-value for a significantly large value. I-T distance is the interior clade distance minus the tip clade distance. Inferences based on Templeton key are given for each clade.
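As a quick consistency check on the definition above, the I-T rows can be recomputed from the interior and tip distances for the clades that contain exactly one interior and one tip clade. This is a minimal sketch using values copied from the table; the function and variable names are hypothetical. Clades with several tip clades need a pooled tip distance, which this sketch does not attempt.

```python
# Illustrative check (names are hypothetical): recompute I-T distances for
# clades with exactly one interior and one tip clade.

def interior_minus_tip(interior, tip):
    """I-T distance: interior clade distance minus tip clade distance."""
    return interior - tip

# (interior distance, tip distance, I-T value reported in the table)
rows = {
    "Clade 3-1 within": (753.3448, 1135.2346, -381.8898),
    "Clade 3-1 nested": (1099.3743, 1186.7130, -87.3387),
    "Clade 2-3 within": (2071.4351, 1546.9352, 524.5000),
    "Clade 2-3 nested": (2062.5834, 1711.8136, 350.7698),
}

for label, (interior, tip, reported) in rows.items():
    computed = interior_minus_tip(interior, tip)
    # Allow for rounding in the fourth decimal place of the printed values.
    agrees = abs(computed - reported) < 1e-3
    print(f"{label}: computed {computed:.4f}, reported {reported:.4f}, agrees={agrees}")
```

All four recomputed values agree with the table to within rounding of the last printed digit.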