Johanna Helena Kattenberg1, Hong Van Nguyen2, Hieu Luong Nguyen2, Erin Sauve1, Ngoc Thi Hong Nguyen3, Ana Chopo-Pizarro1, Hidayat Trimarsanto4, Pieter Monsieurs1, Pieter Guetens1, Xa Xuan Nguyen5, Marjan Van Esbroeck6, Sarah Auburn4,7,8, Binh Thi Huong Nguyen2, Anna Rosanas-Urgell1.
Abstract
Although the power of genetic surveillance tools has been widely acknowledged, there is an urgent need in malaria-endemic countries for feasible and cost-effective tools to implement in national malaria control programs (NMCPs) that can generate evidence to guide malaria control and elimination strategies, especially in the case of Plasmodium vivax. Several genetic surveillance applications ('use cases') have been identified to align research, technology development, and public health efforts, requiring different types of molecular markers. Here we present a new highly multiplexed deep sequencing assay (Pv AmpliSeq). The assay targets the 33-SNP vivaxGEN-geo panel for country-level classification, a newly designed 42-SNP within-country barcode for analysis of parasite dynamics in Vietnam, and 11 putative drug resistance genes in a highly multiplexed NGS protocol with an easy workflow, applicable to many different genetic surveillance use cases. The Pv AmpliSeq assay was validated using: 1) isolates from travelers and migrants in Belgium, and 2) routine collections of the national malaria control program at sentinel sites in Vietnam. The assay targets 229 amplicons and achieved a high depth of coverage (mean 595.7 ± 481) and high accuracy (mean error rate of 0.013 ± 0.007). P. vivax parasites could be characterized from dried blood spots with a minimum of 5 parasites/µL and 10% of minority clones. The assay achieved good spatial specificity for between-country prediction of origin using the 33-SNP vivaxGEN-geo panel, which targets rare alleles specific to certain countries and regions. A high resolution for within-country diversity in Vietnam was achieved using the designed 42-SNP within-country barcode, which targets common alleles (median MAF 0.34, range 0.01-0.49). Many variants were detected in (putative) drug resistance genes, with different predominant haplotypes in the pvmdr1 and pvcrt genes in different provinces in Vietnam.
The capacity of the assay for high-resolution identity-by-descent (IBD) analysis was demonstrated, and the analysis identified a high rate of shared ancestry within Gia Lai Province in the Central Highlands of Vietnam, as well as between the coastal province of Binh Thuan and Lam Dong. Our approach performed well in geographically differentiating isolates at multiple spatial scales and in detecting variants in putative resistance genes, and it can be easily adjusted to suit the needs of other settings in a country or region. We prioritize making this tool available to researchers and NMCPs in endemic countries to increase ownership and ensure data usage for decision-making and malaria policy.
Keywords: Plasmodium vivax; drug resistance; genetic surveillance; malaria; molecular epidemiology and population genetics; next generation sequencing (NGS); use case
Year: 2022 PMID: 36034708 PMCID: PMC9403277 DOI: 10.3389/fcimb.2022.953187
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 6.073
Figure 1. Flow-chart of sample selection for Pv AmpliSeq and WGS analysis. A database with details of samples selected for WGS can be found in and for samples and controls included in the Pv AmpliSeq meta data can be downloaded from https://microreact.org/project/k86kAAWw9Z8PNeUYBj9bvh-plasmodium-vivax-ampliseq-vietnam-and-global.
Figure 6. Geographical separation of selected haplotypes in drug resistance associated genes genotyped by the Pv AmpliSeq assay. (A) Global spread of pvdhps and pvdhfr validated mutations for SP resistance (dhfr: F57L, S58R/K, T61M, N117T/S; dhps: A383G, A553G). Map background created in https://www.mapchart.net/. (B) Spread of pvmdr1 haplotypes (V221L, D500N, S513R, S698G, L845F, L908M, T958M, Y976F, F1076L) and (C) pvcrt haplotypes (intron 357 + 83G>A, R121K, E207Q, I319M, D328D(A>G), intron 1003-46G>A) at a provincial level in Central Vietnam. Background map created in QGIS v3.16 with spatial data from https://www.diva-gis.org/.
List of potential drug resistance genes targeted in the Pv AmpliSeq assay.
| Chromosome | Gene name | Gene ID | Drug resistance | References |
|---|---|---|---|---|
| PvP01_01_v1 | | PVP01_0109300/PVX_087980 | CQ (putative) | |
| PvP01_02_v1 | | PVP01_0203000/PVX_097025 | CQ, PIP (both putative) | |
| PvP01_03_v1 | | PVP01_0312700/PVX_000585 | CQ (putative) | |
| PvP01_05_v1 | | PVP01_0526600/PVX_089950 | PYR | |
| PvP01_10_v1 | | PVP01_1010900/PVX_080100 | CQ (putative) | |
| PvP01_10_v1 | | PVP01_1018600/PVX_080480 | ART (putative) | |
| PvP01_11_v1 | | PVP01_1103800/PVX_115370 | In IBD region recurrent infections after CQ | |
| PvP01_12_v1 | | PVP01_1211100/PVX_083080 | ART (putative) | |
| PvP01_12_v1 | | PVP01_1259100/PVX_118100 | CQ (putative) | |
| PvP01_14_v1 | | PVP01_1429500/PVX_123230 | SULF; CQ (putative) | |
| PvP01_14_v1 | | PVP01_1447300/PVX_124085 | CQ (putative) | |
The assay targets the full-length genes listed above with several amplicons per gene. CQ, chloroquine; PIP, piperaquine; PYR, pyrimethamine; ART, artemisinins; SULF, sulfadoxine.
Figure 2. Chromosomal position of amplicons in the Pv AmpliSeq design for Vietnam. Amplicons are depicted on the 14 nuclear chromosomes of the PvP01 reference genome. Amplicons targeting drug resistance associated genes are colored in yellow, amplicons targeting the 42-SNP barcode positions in blue, the 33-SNP global barcode in red, and ama1 in green.
Overview of good quality (coverage >15 and ≤50% genotypes missing) samples included in the Pv AmpliSeq analysis.
| Region | Country | N | % | Year(s) of collection |
|---|---|---|---|---|
| | unknown | 2 | 0.51 | 2020 (1) |
| AFR (n=29; 7.4%) | Burundi | 1 | 0.25 | 2019 |
| | DRC | 1 | 0.25 | 2019 |
| | Eritrea | 12 | 3.05 | 2014, 2015, 2016 |
| | Ethiopia | 13 | 3.30 | 2012, 2013, 2014, 2015, 2016, 2019 |
| | Mauritania | 1 | 0.25 | 2013 |
| | Senegal | 1 | 0.25 | 2012 |
| AMR (n=24; 6.1%) | Brazil | 2 | 0.51 | 2016, 2019 |
| | Colombia | 4 | 1.02 | 2014, 2015, 2019 |
| | Guyana | 3 | 0.76 | 2013, 2014, 2015 |
| | Panama | 2 | 0.51 | 2013 |
| | Peru | 13 | 3.30 | 2008 |
| EMR (n=37; 9.4%) | Afghanistan | 16 | 4.06 | 2012, 2013, 2015, 2016, 2017, 2018, 2019 |
| | Pakistan | 13 | 3.30 | 2014, 2015, 2016, 2017, 2018, 2019 |
| | Somalia | 1 | 0.25 | 2016 |
| | Sudan | 7 | 1.78 | 2015, 2019 |
| EUR (n=1; 0.3%) | Spain | 1 | 0.25 | 2016 |
| SEAR (n=27; 6.9%) | Bangladesh | 3 | 0.76 | 2014, 2015 |
| | Bhutan | 2 | 0.51 | 2014 |
| | India | 15 | 3.81 | 2011, 2012, 2013, 2014, 2015, 2017, 2018 |
| | Indonesia | 5 | 1.27 | 2010, 2012, 2013, 2016, 2017 |
| | Thailand | 2 | 0.51 | 2006, 2007 |
| WPRO (n=274; 69.5%; without VTN n=13; 3.3%) | Cambodia | 1 | 0.25 | 2018 |
| | China | 2 | 0.51 | 2009, 2011 |
| | Malaysia | 3 | 0.76 | 2014 |
| | Papua New Guinea | 5 | 1.27 | 2010, 2012, 2013, 2014, 2019 |
| | Philippines | 1 | 0.25 | 2015 |
| | Singapore | 1 | 0.25 | 2015 |
| | Vietnam | 261 | 66.24 | 2015, 2016, 2017, 2018, 2019 |
| ALL | 28 countries | 394 | 100% | 2006 - 2020 |
Vietnam sample characteristics and epidemiological data from 2018 and 2019 (from NIMPE annual malaria report).
| Province | N per province | Year | N per year | % of N total | % of N province | Recorded Pv cases NMCP | % of annual cases genetically analyzed | Pv incidence rate per 1000 pers. yr at risk | % male | median age | % with fever |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Binh Phuoc | 44 | 2016 | 5 | 1.9 | 11.4 | | | | | 20 | |
| | | 2018 | 28 | 10.7 | 63.6 | 460 | 6.1% | 0.56 | 89.3% | 21 | 100% |
| | | 2019 | 11 | 4.2 | 25.0 | 209 | 5.3% | 0.28 | 100.0% | 25 | 91% |
| Binh Thuan | 28 | 2018 | 5 | 1.9 | 17.9 | 73 | 6.8% | 0.11 | 80.0% | 17 | 80% |
| | | 2019 | 23 | 8.8 | 82.1 | 118 | 19.5% | 0.21 | 95.7% | 26 | 91% |
| Dak Lak | 6 | 2015 | 6 | 2.3 | 100.0 | | | | | | |
| Dak Nong | 12 | 2016 | 1 | 0.4 | 8.3 | | | | | | |
| | | 2019 | 11 | 4.2 | 91.7 | 60 | 18.3% | 0.11 | 90.9% | 30 | 82% |
| Gia Lai | 102 | 2016 | 10 | 3.8 | 9.8 | | | | 90.0% | 19 | |
| | | 2018 | 47 | 18.0 | 46.1 | 336 | 14.0% | 0.35 | 93.6% | 24 | 85% |
| | | 2019 | 45 | 17.2 | 44.1 | 448 | 10.0% | 0.44 | 84.4% | 26 | 89% |
| Khanh Hoa | 21 | 2017 | 6 | 2.3 | 28.6 | | | | | | |
| | | 2018 | 4 | 1.5 | 19.0 | 34 | 11.8% | 0.18 | 75.0% | 32 | 100% |
| | | 2019 | 11 | 4.2 | 52.4 | 61 | 18.0% | 0.54 | 81.8% | 24 | 91% |
| Kon Tum | 3 | 2018 | 3 | 1.1 | 100.0 | | | | 100.0% | 33 | 100% |
| Lam Dong | 37 | 2018 | 15 | 5.7 | 40.5 | 144 | 10.4% | 0.22 | 93.3% | 22 | 80% |
| | | 2019 | 22 | 8.4 | 59.5 | 102 | 21.6% | 0.47 | 95.5% | 25.5 | 86% |
| Quang Tri | 8 | 2019 | 8 | 3.1 | 100.0 | 59 | 13.6% | 0.45 | 62.5% | 22.5 | 100% |
| ALL | 261 | | | | | | | | | | |
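The column "% of annual cases genetically analyzed" follows from dividing the number of genotyped samples in a given year by the P. vivax cases recorded by the NMCP for that province and year. A minimal Python sketch of that arithmetic, using three rows from the table above; the function name `percent_analyzed` is illustrative and does not come from the source:

```python
def percent_analyzed(n_genotyped: int, recorded_cases: int) -> float:
    """Percentage of NMCP-recorded P. vivax cases that were genotyped."""
    if recorded_cases <= 0:
        raise ValueError("recorded_cases must be positive")
    return round(100.0 * n_genotyped / recorded_cases, 1)


# (province, year) -> (N per year, recorded Pv cases), copied from the table
rows = {
    ("Binh Phuoc", 2018): (28, 460),
    ("Binh Thuan", 2019): (23, 118),
    ("Gia Lai", 2018): (47, 336),
}

for (province, year), (n, cases) in rows.items():
    print(f"{province} {year}: {percent_analyzed(n, cases)}% of recorded cases genotyped")
```

Rows without a recorded case count (for example the 2015 to 2017 collections) cannot be given a sampling fraction this way.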
Detection of an artificially created mixed clone infection of two previously genotyped samples from the same period and area in Vietnam.
| Ratio of mixed isolates | Mean depth | Heterozygotes detected (n detected of total variant loci) | % variant loci detected |
|---|---|---|---|
| 50:50 | 212.0 | 56/58 | 96.6% |
| 80:20 | 192.5 | 56/58 | 96.6% |
| 90:10 | 182.2 | 54/58 | 93.1% |
| 95:5 | 197.3 | 44/58 | 75.9% |
| 98:2 | 174.4 | 46/58 | 79.3% |
The minority clone could be detected down to 2%, although at ≤5% some genotypes of the minority clone could no longer be detected.
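One way to see why detection drops at low mixing ratios is to estimate how many reads per locus the minority clone is expected to contribute. The sketch below assumes that reads are sampled in proportion to the mixing ratio, which is a simplification and not a statement about how the assay's variant caller works; the function names are illustrative.

```python
def expected_minority_reads(mean_depth: float, minority_fraction: float) -> float:
    """Expected reads per locus from the minority clone, assuming proportional sampling."""
    if not 0.0 <= minority_fraction <= 0.5:
        raise ValueError("minority_fraction must be between 0 and 0.5")
    return mean_depth * minority_fraction


def percent_detected(n_detected: int, n_total: int) -> float:
    """Percentage of variant loci at which the heterozygous call was recovered."""
    return round(100.0 * n_detected / n_total, 1)


TOTAL_LOCI = 58
# (ratio label, minority fraction, mean depth, loci detected), copied from the table
mixtures = [
    ("50:50", 0.50, 212.0, 56),
    ("80:20", 0.20, 192.5, 56),
    ("90:10", 0.10, 182.2, 54),
    ("95:5", 0.05, 197.3, 44),
    ("98:2", 0.02, 174.4, 46),
]

for label, fraction, depth, detected in mixtures:
    reads = expected_minority_reads(depth, fraction)
    pct = percent_detected(detected, TOTAL_LOCI)
    print(f"{label}: ~{reads:.1f} minority reads per locus, {pct}% of loci detected")
```

Under this assumption the 95:5 and 98:2 mixtures leave roughly ten or fewer minority reads per locus on average, which is consistent with the lower detection rates in the table.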
Figure 3. Density plot of minor allele frequencies of the 33-SNP vivaxGEN-geo barcode (yellow) and the 42-SNP within-country Vietnam barcode (purple) in samples tested with the Pv AmpliSeq assay. Minor allele frequencies were calculated in study samples in all regions (n=394, left) or in Vietnam only (n=261, right).
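For a biallelic SNP, the minor allele frequency (MAF) is the frequency of the less common allele among the called genotypes. The following Python sketch shows one simple way to compute it from per-sample allele calls; it assumes one call per sample (for example the major allele in each infection) and ignores missing calls, which may differ from the procedure used in the study.

```python
from collections import Counter
from typing import Iterable, Optional


def minor_allele_frequency(calls: Iterable[Optional[str]]) -> float:
    """MAF of a biallelic SNP; missing calls (None) are skipped."""
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotype calls available")
    if len(counts) > 2:
        raise ValueError("expected a biallelic SNP")
    major = max(counts.values())
    # With a single observed allele the SNP is monomorphic and the MAF is 0.
    return 1.0 - major / total


# Hypothetical calls at one barcode position: 66 reference, 34 alternate, 5 missing
calls = ["A"] * 66 + ["G"] * 34 + [None] * 5
print(round(minor_allele_frequency(calls), 2))
```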
Figure 4. Discriminant analysis of principal components (DAPC) of global isolates (n = 148), incl. 15 samples randomly selected from Vietnam. Scatter plot of discriminant analysis (DA) eigenvalues 1 and 2 (A) using all biallelic SNPs detected by the Pv AmpliSeq showed a differentiation from east to west from the Pacific into Asia along the x-axis and a separation between Africa and the Americas along the y-axis. Scatter plot of DA eigenvalues 3 and 4 (B) groups African and American samples together, close to South East Asian and Vietnam isolates, and separately clusters the WPR islands and Central Asia and India. SNPs contributing most to the DAPC are listed in , . DAPC was performed with 20 principal components and 20 discriminants as determined through cross-validation.
Figure 5. Heatmap of predicted origin in the likelihood model vs. expected origin of samples based on collection site and travel history using the 72-SNP barcode. Predicted origin of the samples was calculated with the vivaxGEN-geo likelihood classifier (Trimarsanto et al., 2019). Countries of origin that were not represented in the reference dataset and hence could not be directly predicted are indicated in red on the origin axis; these countries can be included in future iterations of the classifier to improve its accuracy. Only samples from patients who presented in malaria-endemic countries are illustrated (i.e., excluding samples from Spain, Singapore and Unknown origin samples that were present in the larger dataset). The strong diagonal trend illustrates the regional accuracy of the predictor, with most infections mapping directly to the country of origin or to neighboring countries, e.g., with Vietnamese samples mapping to either Vietnam or Cambodia. In border regions with extensive parasite gene flow, many infections may not be classifiable by national (political) boundaries, and regional boundaries may prove more useful administrative units for classification.
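The general idea behind a likelihood classifier of this kind can be illustrated with a small Python sketch: each candidate country has reference allele frequencies at the barcode positions, and a sample is assigned to the country under which its barcode is most probable. This is a generic Bernoulli likelihood model written for illustration only; it is not the vivaxGEN-geo implementation, and the country labels, frequencies, and the `eps` smoothing value are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def log_likelihood(barcode: Sequence[Optional[int]],
                   alt_freqs: Sequence[float],
                   eps: float = 0.01) -> float:
    """Log-likelihood of a barcode (1 = alternate, 0 = reference, None = missing)."""
    if len(barcode) != len(alt_freqs):
        raise ValueError("barcode and frequency vector must have equal length")
    total = 0.0
    for call, freq in zip(barcode, alt_freqs):
        if call is None:
            continue  # missing genotypes carry no information
        p = min(max(freq, eps), 1.0 - eps)  # avoid log(0) for fixed alleles
        total += math.log(p if call == 1 else 1.0 - p)
    return total


def predict_origin(barcode: Sequence[Optional[int]],
                   reference: Dict[str, Sequence[float]]) -> str:
    """Return the reference population with the highest log-likelihood."""
    return max(reference, key=lambda country: log_likelihood(barcode, reference[country]))


# Hypothetical alternate-allele frequencies at three barcode SNPs
reference = {
    "Country A": [0.9, 0.1, 0.8],
    "Country B": [0.1, 0.9, 0.2],
}
print(predict_origin([1, 0, None], reference))
```

A sample from a country that is absent from `reference` can only be assigned to one of the listed populations, which mirrors the caption's remark about origins not represented in the reference dataset.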
Figure 7. Discriminant analysis of principal components (DAPC) of Vietnam isolates (n = 261) grouped by province. Scatter plot of DA eigenvalues 1 and 2 (A) using all biallelic SNPs detected by the Pv AmpliSeq showed a differentiation of Lam Dong and Binh Thuan along the x-axis and a separation of Khanh Hoa province along the y-axis. Scatter plot of DA eigenvalues 3 and 4 (B) separately clusters isolates from the northernmost province of Quang Tri. SNPs contributing most to the DAPC are listed in , . DAPC was performed with 60 principal components and 8 discriminants as determined through cross-validation.
Figure 8. Genetic diversity and complexity of infection in nine provinces in Vietnam 2015-2019. Expected heterozygosity of 42-SNP Vietnam barcode positions (A) and proportion of multiple clone infections (B) in samples (n = 261) in Vietnam. He = Expected heterozygosity; COI = complexity of infection.
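Expected heterozygosity at a locus is commonly estimated as He = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of allele i and n is the number of sampled alleles. The Python sketch below implements this standard estimator; the source does not state which estimator or software was used for this figure, so the formula and the function name are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def expected_heterozygosity(alleles: Sequence[str]) -> float:
    """Sample-size-corrected expected heterozygosity at one locus."""
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two allele observations are required")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return (n / (n - 1)) * (1.0 - sum_sq)


# Hypothetical allele calls at one barcode SNP in one province
print(round(expected_heterozygosity(["A", "A", "G", "G"]), 4))
```

Averaging this value over the barcode positions gives a per-province summary comparable in spirit to panel A.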
Genetic differentiation among P. vivax population in different provinces in Vietnam.
| | Kon Tum | Gia Lai | Dak Lak | Dak Nong | Binh Phuoc | Khanh Hoa | Lam Dong | Binh Thuan |
|---|---|---|---|---|---|---|---|---|
| Quang Tri | 0.1224 | 0.1791 | 0.0293 | 0.0687 | 0.0344 | 0.0598 | 0.1805 | 0.1963 |
| Kon Tum | | 0.1623 | 0.0887 | 0.0519 | 0.0390 | 0.1029 | 0.1294 | 0.1709 |
| Gia Lai | | | 0.1473 | 0.1576 | 0.0931 | 0.1413 | 0.2425 | 0.2318 |
| Dak Lak | | | | 0.0475 | 0.0253 | 0.1132 | 0.2222 | 0.2498 |
| Dak Nong | | | | | 0.0230 | 0.0630 | 0.1023 | 0.1419 |
| Binh Phuoc | | | | | | 0.0591 | 0.1420 | 0.2100 |
| Khanh Hoa | | | | | | | 0.1884 | 0.1542 |
| Lam Dong | | | | | | | | 0.0281 |
Pairwise FST values (Weir and Cockerham) were estimated with 1,000 bootstraps using the diveRsity package in R.
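The upper-triangular FST matrix is easier to query once it is expanded into a symmetric lookup. The Python sketch below does this with the values from the table and reports the most and least differentiated province pairs; it only reorganizes the published values and does not re-estimate FST (the estimation itself was done with the diveRsity package in R). The helper names are illustrative.

```python
from typing import Dict, FrozenSet, List

PROVINCES = ["Quang Tri", "Kon Tum", "Gia Lai", "Dak Lak", "Dak Nong",
             "Binh Phuoc", "Khanh Hoa", "Lam Dong", "Binh Thuan"]

# Row i holds FST between PROVINCES[i] and every province listed after it.
UPPER: List[List[float]] = [
    [0.1224, 0.1791, 0.0293, 0.0687, 0.0344, 0.0598, 0.1805, 0.1963],
    [0.1623, 0.0887, 0.0519, 0.0390, 0.1029, 0.1294, 0.1709],
    [0.1473, 0.1576, 0.0931, 0.1413, 0.2425, 0.2318],
    [0.0475, 0.0253, 0.1132, 0.2222, 0.2498],
    [0.0230, 0.0630, 0.1023, 0.1419],
    [0.0591, 0.1420, 0.2100],
    [0.1884, 0.1542],
    [0.0281],
]


def pairwise_fst(provinces: List[str],
                 upper: List[List[float]]) -> Dict[FrozenSet[str], float]:
    """Map each unordered province pair to its FST value."""
    fst: Dict[FrozenSet[str], float] = {}
    for i, row in enumerate(upper):
        for offset, value in enumerate(row, start=1):
            fst[frozenset((provinces[i], provinces[i + offset]))] = value
    return fst


FST = pairwise_fst(PROVINCES, UPPER)


def lookup(a: str, b: str) -> float:
    """FST between two provinces, in either order."""
    return FST[frozenset((a, b))]


most = max(FST, key=FST.get)
least = min(FST, key=FST.get)
print("Most differentiated:", sorted(most), FST[most])
print("Least differentiated:", sorted(least), FST[least])
print("Lam Dong vs Binh Thuan:", lookup("Binh Thuan", "Lam Dong"))
```

The low value for Lam Dong and Binh Thuan (0.0281) agrees with the shared ancestry between these two provinces reported in the abstract.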
Figure 9. P. vivax parasite connectivity in Vietnam. (A) Summed pairwise IBD-sharing of isolates between and within provinces. (B) Connectivity networks inferred by IBD between P. vivax isolates from Gia Lai province. Edges connecting parasite pairs indicate that >60% of their genomes descended from a common ancestor without intervening recombination. Node colors indicate the year of collection.
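A connectivity network like the one in panel B can be derived from pairwise IBD estimates by drawing an edge between two isolates whenever their shared fraction exceeds the threshold (here >0.6, as in the caption) and then grouping connected isolates. The Python sketch below uses a small union-find structure for the grouping; the isolate names and IBD values are hypothetical, and the IBD estimation step itself is not shown.

```python
from typing import Dict, Iterable, List, Set, Tuple


def ibd_clusters(pairs: Iterable[Tuple[str, str, float]],
                 threshold: float = 0.6) -> List[Set[str]]:
    """Group isolates connected by pairwise IBD sharing above the threshold."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b, shared in pairs:
        root_a, root_b = find(a), find(b)  # registers both isolates as nodes
        if shared > threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters: Dict[str, Set[str]] = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    # Largest clusters first; ties broken alphabetically for stable output
    return sorted(clusters.values(), key=lambda c: (-len(c), sorted(c)))


# Hypothetical pairwise IBD fractions between five isolates
pairs = [
    ("GL01", "GL02", 0.95),
    ("GL02", "GL03", 0.72),
    ("GL03", "GL04", 0.10),
    ("GL04", "GL05", 0.61),
]
for cluster in ibd_clusters(pairs):
    print(sorted(cluster))
```

Raising or lowering `threshold` changes how closely related two parasites must be before they are linked, so cluster sizes should be read together with the threshold used.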