Alex Makunin, Petra Korlević, Naomi Park, Scott Goodwin, Robert M Waterhouse, Katharina von Wyschetzki, Christopher G Jacob, Robert Davies, Dominic Kwiatkowski, Brandyce St Laurent, Diego Ayala, Mara K N Lawniczak.
Abstract
Anopheles is a diverse genus of mosquitoes comprising over 500 described species, including all known human malaria vectors. While a limited number of key vector species have been studied in detail, the goal of malaria elimination calls for surveillance of all potential vector species. Here, we develop a multilocus amplicon sequencing approach that targets 62 highly variable loci in the Anopheles genome and two conserved loci in the Plasmodium mitochondrion, simultaneously revealing both the mosquito species and whether that mosquito carries malaria parasites. We also develop a cheap, nondestructive, and high-throughput DNA extraction workflow that provides template DNA from single mosquitoes for the multiplex PCR, which means that specimens producing unexpected results remain available for morphological examination. Over 1000 individual mosquitoes can be sequenced in a single MiSeq run, and we demonstrate the panel's power to assign species identity using sequencing data for 40 species from Africa, Southeast Asia, and South America. We also show that the approach can be used to resolve geographic population structure within An. gambiae and An. coluzzii populations, as the population structure determined based on these 62 loci from over 1000 mosquitoes closely mirrors that revealed through whole genome sequencing. The end-to-end approach is quick, inexpensive, robust, and accurate, which makes it a promising technique for very large-scale mosquito genetic surveillance and vector control.
Keywords: high-throughput sequencing; malaria; population genetics; species identification; vector surveillance
Year: 2021 PMID: 34053186 PMCID: PMC7612955 DOI: 10.1111/1755-0998.13436
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 8.678
Figure 1. Positions of 62 amplicons in three Anopheles genomes: An. albimanus (top), An. gambiae (center), and An. funestus (bottom). Colours indicate marker types based on the AgamP3 gene set: exonic (red), intronic (yellow), or intergenic (blue). The An. albimanus X chromosome is not represented due to a lack of amplicon homologues. No translocations between chromosome arms were observed.
Figure 2. Amplicon sequence recall and variation.
(a) Amplicon recovery across Anopheles species in 135 sequenced samples and 28 reference genomes; dots correspond to individual samples, and colours indicate lineages within the genus Anopheles. (b) Allele counts per sample for 135 sequenced samples across 62 amplicons. (c) Number of unique sequences in the alignment of sequenced data and reference genomes across 62 amplicons. (d) Number of gaps in the alignment of sequenced data and reference genomes across 62 amplicons; values averaged across aligned unique sequences. Vertical lines in b, c, and d denote chromosome arms (An. gambiae: 2L, 2R, 3L, 3R, X). Colour in c and d indicates amplicon position relative to AgamP3 genes.
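The quantities summarised in panels b and c (alleles per sample and unique sequences per amplicon) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `(sample, amplicon, sequence)` record layout, the function name `allele_counts`, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def allele_counts(records):
    """Count distinct sequences per (sample, amplicon) and per amplicon.

    `records` is an iterable of (sample, amplicon, sequence) tuples.
    This layout is a hypothetical stand-in for real haplotype calls.
    """
    per_sample = defaultdict(set)    # (sample, amplicon) -> set of sequences
    per_amplicon = defaultdict(set)  # amplicon -> set of sequences
    for sample, amplicon, seq in records:
        per_sample[(sample, amplicon)].add(seq)
        per_amplicon[amplicon].add(seq)
    sample_counts = {key: len(seqs) for key, seqs in per_sample.items()}
    amplicon_counts = {key: len(seqs) for key, seqs in per_amplicon.items()}
    return sample_counts, amplicon_counts


# Toy data (invented): sample s1 is heterozygous at amp0, s2 is homozygous.
records = [
    ("s1", "amp0", "ACGT"),
    ("s1", "amp0", "ACGA"),
    ("s2", "amp0", "ACGT"),
    ("s2", "amp1", "TTGA"),
]
sample_counts, amplicon_counts = allele_counts(records)
```

In this toy input, `sample_counts` plays the role of the per-sample allele counts in panel b, and `amplicon_counts` the number of unique sequences per amplicon in panel c.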
Figure 3. Species identification using the amplicon panel. Species tree cladogram based on 62 amplicons reconstructed in ASTRAL. Support values are given above branches. Groups of closely related species frequently sharing sequence similarity clusters are indicated by colours: blue – unambiguously resolved, red – unresolved. To the right of each species name are the numbers of sequenced samples and reference genomes, respectively, that contributed to this tree (e.g., 2 + 0 indicates two sequenced samples, 0 reference genomes). Series and subgenera are labelled with the same colours as in Figure 2a.
Figure 4. Population structure determined using the amplicon panel.
(a) UMAP dimensionality reduction on biallelic sites of the Ag1000g Phase 2 data set of 1142 An. gambiae and An. coluzzii samples overlapping with the positions of the 62 amplicons, with added amplicon sequencing samples (ANO_SPP) and reference genomes. Colours indicate populations and species, shapes indicate species. (b) Variant counts per amplicon for An. gambiae and An. coluzzii samples (seven amplicon-sequenced samples, three reference genomes) compared to Ag1000g Phase 2 biallelic sites (1142 samples).
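The restriction to biallelic sites mentioned in this caption can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: the dictionary layout (site position mapped to a list of allele calls, with `None` for missing data), the function name `biallelic_sites`, and the toy calls are hypothetical.

```python
def biallelic_sites(calls_by_site):
    """Return the sorted sites at which exactly two distinct alleles are observed.

    `calls_by_site` maps a site position to a list of allele calls
    (one-letter strings); None marks a missing call and is ignored.
    """
    keep = []
    for site, calls in calls_by_site.items():
        alleles = {allele for allele in calls if allele is not None}
        if len(alleles) == 2:
            keep.append(site)
    return sorted(keep)


# Toy calls (invented): site 10 is invariant, 25 biallelic,
# 40 triallelic, 57 biallelic once the missing call is ignored.
calls_by_site = {
    10: ["A", "A", "A", "A"],
    25: ["C", "T", "C", "C"],
    40: ["G", "A", "T", "G"],
    57: ["T", None, "G", "T"],
}
kept = biallelic_sites(calls_by_site)
```

Sites retained this way could then be encoded as alternate-allele counts per sample before being passed to a dimensionality reduction method such as UMAP.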