Ernest Diez Benavente, Emilia Manko, Jody Phelan, Monica Campos, Debbie Nolder, Diana Fernandez, Gabriel Velez-Tobon, Alberto Tobón Castaño, Jamille G Dombrowski, Claudio R F Marinho, Anna Caroline C Aguiar, Dhelio Batista Pereira, Kanlaya Sriprawat, Francois Nosten, Robert Moon, Colin J Sutherland, Susana Campino, Taane G Clark.
Abstract
Despite the high burden of Plasmodium vivax malaria in South Asian countries, the genetic diversity of circulating parasite populations is not well described. Determinants of antimalarial drug susceptibility for P. vivax in the region have not been characterised. Our genomic analysis of global P. vivax (n = 558) establishes South Asian isolates (n = 92) as a distinct subpopulation, which shares ancestry with some East African and South East Asian parasites. Signals of positive selection are linked to drug resistance-associated loci including pvkelch10, pvmrp1, pvdhfr and pvdhps, and two loci linked to P. vivax invasion of reticulocytes, pvrbp1a and pvrbp1b. Significant identity-by-descent was found in extended chromosome regions common to P. vivax from India and Ethiopia, including the pvdbp gene associated with Duffy blood group binding. Our investigation provides new understanding of global P. vivax population structure and genomic diversity, and genetic evidence of recent directional selection in this important human pathogen.
Year: 2021 PMID: 34039976 PMCID: PMC8154914 DOI: 10.1038/s41467-021-23422-3
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Population structure analysis using the whole set of 558 P. vivax isolates.
Analysis of the whole set supports one unique non-differentiated subpopulation for the South Asian isolates. a Neighbour-Joining tree for the 558 isolates, constructed using a genetic distance based on 388,933 high-quality SNPs, and branches coloured based on (b); b ADMIXTURE prediction of a subpopulation (K = 9) visualised using a bar plot. c Principal Components Analysis (PCA) plot of the 558 isolates, with colours based on (b). d Neighbour-Joining tree for South Asia (India 36, Pakistan 32, Afghanistan 22), and East Africa (Ethiopia 29, Eritrea 12, Sudan 5, Uganda 3, Madagascar 1). Thailand isolates (128) were used as a representative population from South East Asia. Tree branches are coloured based on the assignment of the subpopulation from (e). e An ADMIXTURE bar plot illustrating the population structure (K = 4), highlighting a distinction between South Asian countries and Thailand, and East Africa and Thailand. The plot also highlights a degree of mixing between East African and South Asian populations. f PCA plot using the same data as in (d).
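The Neighbour-Joining trees in panels (a) and (d) start from a matrix of pairwise genetic distances between isolates. The caption does not say which distance was used, so the sketch below assumes a simple one: the fraction of SNPs with different alleles, counted only over sites called in both isolates. The function names, the 0/1/None genotype encoding and the toy genotypes are all hypothetical and are not data from the study.

```python
from typing import List, Optional, Sequence

# Assumed encoding for a haploid call: 0 = reference, 1 = alternate, None = missing.
Genotype = Optional[int]


def pairwise_distance(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Fraction of SNPs, among those called in both isolates, where alleles differ."""
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip sites with a missing call in either isolate
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        return float("nan")  # no shared sites, distance undefined
    return mismatches / compared


def distance_matrix(isolates: Sequence[Sequence[Genotype]]) -> List[List[float]]:
    """Symmetric matrix of pairwise distances with zeros on the diagonal."""
    n = len(isolates)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = pairwise_distance(isolates[i], isolates[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Toy genotypes for three hypothetical isolates over five SNPs.
toy = [
    [0, 0, 1, 1, None],
    [0, 1, 1, 1, 0],
    [1, 1, 0, 0, 0],
]
dm = distance_matrix(toy)
```

A matrix like `dm` is the kind of input a Neighbour-Joining implementation takes. Real pipelines usually also filter on missingness and mixed calls first.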
SNPs with high FST (> 0.8) that differentiate P. vivax isolates from South East Asia (SEA), East Africa and South Asia regions.
| Region | Chr. | Position | Ref. | Alt | Effect* | AA change | FST** | Gene name |
|---|---|---|---|---|---|---|---|---|
| SEA | 2 | 155305 | G | T | Non Syn | L1207I | 0.83 | |
| SEA | 2 | 156287 | C | A | Syn | V879 | 0.82 | |
| SEA | 2 | 158223 | G | A | Non Syn | T234M | 0.81 | |
| SEA | 14 | 1070933 | A | T | Non Syn | S310R | 0.86 | |
| SEA | 14 | 1071214 | C | T | Non Syn | V217I | 0.90 | |
| South Asia | 5 | 286241 | A | T | Non Syn | I83F | 0.83 | |
| South Asia | 5 | 618291 | T | G | Non Syn | F165V | 0.95 | |
| South Asia | 6 | 668360 | G | T | Non Syn | T65K | 0.83 | |
| South Asia | 12 | 323703 | G | C | Non Syn | S57T | 0.81 | |
| South Asia | 13 | 336960 | C | A | Non Syn | T1455K | 0.81 | |
| East Africa | 4 | 308246 | A | G | Intron | – | 0.83 | |
| East Africa | 4 | 401595 | C | T | Syn | Y670 | 0.81 | |
| East Africa | 4 | 401608 | T | C | Syn | L675 | 0.85 | |
| East Africa | 4 | 401622 | C | T | Syn | N679 | 0.90 | |
| East Africa | 5 | 722792 | T | G | Non Syn | I172L | 0.86 | |
| East Africa | 5 | 1067476 | C | T | Non Syn | E158K | 0.87 | |
| East Africa | 5 | 1071079 | T | G | Intron | – | 0.92 | |
| East Africa | 6 | 646355 | C | T | Non Syn | M236I | 0.96 | |
| East Africa | 6 | 646864 | C | T | Non Syn | D67N | 0.88 | |
| East Africa | 6 | 661527 | C | G | Non Syn | S273C | 0.85 | |
| East Africa | 7 | 499047 | T | G | Non Syn | I1234S | 0.93 | |
| East Africa | 7 | 507890 | T | A | Syn | P582 | 0.90 | |
| East Africa | 7 | 509284 | C | T | Non Syn | D118N | 0.96 | |
| East Africa | 7 | 509311 | A | G | Non Syn | Y109H | 0.95 | |
| East Africa | 7 | 509312 | G | A | Syn | L108 | 0.95 | |
| East Africa | 7 | 514538 | A | G | Syn | L14 | 0.91 | |
| East Africa | 7 | 525785 | T | C | Syn | A1270 | 0.80 | |
| East Africa | 7 | 556907 | C | G | Non Syn | M709I | 0.89 | |
| East Africa | 11 | 917226 | A | T | Non Syn | F1138L | 0.87 | |
| East Africa | 11 | 1075544 | C | G | Non Syn | S455R | 0.88 | |
| East Africa | 12 | 1869662 | T | C | Non Syn | V414A | 0.84 | |
| East Africa | 13 | 334457 | A | G | Non Syn | K621E | 0.86 | |
| East Africa | 13 | 334917 | T | G | Non Syn | M774R | 0.90 | |
| East Africa | 13 | 610100 | C | A | Non Syn | K841N | 0.89 | |
| East Africa | 14 | 1296340 | T | C | Syn | S205 | 0.81 | |
| East Africa | 14 | 1923223 | G | A | Non Syn | P504S | 0.81 | |
| East Africa | 14 | 1923997 | T | G | Non Syn | M246L | 0.91 | |
| East Africa | 14 | 1923998 | C | G | Syn | T245 | 0.91 | |
| East Africa | 14 | 1924051 | T | C | Non Syn | T228A | 0.91 | |
| East Africa | 14 | 1924658 | C | T | Syn | E25 | 0.85 |
AA amino acid, *Non Syn non-synonymous mutation, Syn synonymous mutation, Inter intergenic region, Start lost start codon lost.
**Within Region vs. other isolates.
See Supplementary Data 3 for all regional comparisons.
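The table reports, for each SNP, an FST value comparing one region with all other isolates. The estimator is not stated in this section, so the sketch below assumes Wright's heterozygosity-based definition, FST = (H_T − H_S) / H_T, for a biallelic SNP and two groups. The function name and the allele frequencies are hypothetical; only the group sizes (92 South Asian isolates out of 558) come from the text.

```python
def fst_two_groups(p1: float, n1: int, p2: float, n2: int) -> float:
    """Wright's FST for one biallelic SNP from alternate-allele frequencies.

    p1, p2: alternate-allele frequencies in the two groups (0..1).
    n1, n2: number of isolates in each group, used as weights.
    """
    n = n1 + n2
    p_bar = (n1 * p1 + n2 * p2) / n           # pooled allele frequency
    h_total = 2.0 * p_bar * (1.0 - p_bar)     # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0                            # monomorphic site, no differentiation
    # Size-weighted mean of within-group expected heterozygosity.
    h_within = (n1 * 2.0 * p1 * (1.0 - p1) + n2 * 2.0 * p2 * (1.0 - p2)) / n
    return (h_total - h_within) / h_total


# Hypothetical frequencies: allele nearly fixed in one region, rare elsewhere.
example_fst = fst_two_groups(p1=0.95, n1=92, p2=0.05, n2=466)
```

FST approaches 1 when the allele is fixed in one group and absent in the other, and 0 when both groups share the same frequency. That is why the > 0.8 filter picks out region-specific variants.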
Common mutations (allele frequency %) in putative drug-resistance genes.
| Position | Amino acid change | South Asia | East Africa | South America | SEA | Southern SEA | Gene name |
|---|---|---|---|---|---|---|---|
| 1077510 | N50I | 7.1 | |||||
| 1077530/2 | F57I | ||||||
| 1077530/2 | F57L | 4.3 | |||||
| 1077534 | K58R | 8.8 | |||||
| 1077535 | S58R | 34.5 | |||||
| 1077543 | T61M | 36.5 | |||||
| 1077711 | N117T | 0.7 | |||||
| 1077711 | N117S | 6.3 | 20.3 | 9.5 | |||
| 1078090 | D243E | 3.6 | |||||
| 1078180 | N273K | 10.7 | |||||
| 1270256 | M601I | 6.6 | |||||
| 1270401 | A553G | 11.0 | 0.6 | ||||
| 1270524 | K512M | 3.8 | |||||
| 1270683 | D459A | 3.3 | |||||
| 1270793 | C422W | 15.3 | |||||
| 1270911 | G383A | 47.2 | 3.8 | 32.1 | |||
| 1270914 | S382C | 10.6 | |||||
| 1270915 | S382A | 21.1 | |||||
| 1270966 | F365L | 4.4 | |||||
| 1271444 | M205I* | 8.2 | |||||
| 1271634 | E142G* | ||||||
| 1271664 | E132G* | 38.8 | |||||
| 478789 | L1449I | 7.8 | |||||
| 478955 | K1393N | 11.2 | 2.0 | 10.8 | 1.2 | ||
| 479329 | T1269S | 10.1 | |||||
| 479908 | L1076F | 2.2 | 33.5 | 1.2 | |||
| 480207 | F976Y | 46.0 | 11.9 | ||||
| 480412 | L908M | 13.5 | |||||
| 480552 | A861E | 4.4 | 3.8 | ||||
| 480601 | L845F | 14.3 | 8.7 | 2.4 | |||
| 480846 | A763V | 3.8 | |||||
| 481042 | S698G | 38.0 | 0.5 | ||||
| 481595 | S513R | 35.6 | 0.6 | 23.2 | |||
| 481636 | D500N | 17.3 | |||||
| 481908 | T409M | 11.1 | |||||
| 482473 | V221L | 18.9 | |||||
| 153954 | N1657S | 30.1 | |||||
| 154065 | I1620T | 4.7 | |||||
| 154107 | A1606D | 7.5 | |||||
| 154108 | A1606T | 7.6 | |||||
| 154168 | H1586Y | 9.3 | 28.9 | 3.8 | |||
| 154216 | D1570Y | 3.5 | |||||
| 154294 | V1544I | 8.5 | |||||
| 154350 | T1525I | 6.3 | |||||
| 154492 | I1478V | 20.3 | 23.5 | 15.4 | |||
| 154668 | G1419A | 23.4 | 16.7 | 33.8 | |||
| 154747 | D1393Y | 18.3 | 4.9 | 0.5 | |||
| 154831 | L1365F | 3.5 | |||||
| 154843 | L1361F | 3.0 | 0.5 | ||||
| 155080 | L1282I | 9.6 | |||||
| 155305 | L1207I | ||||||
| 155871 | C1018Y | 6.1 | |||||
| 156208 | E906Q | 42.6 | 48.6 | 24.8 | 4.3 | 9.4 | |
| 157300 | K542E | 14.1 | 2.5 | 1.4 | |||
| 158148 | R259T* | 39.3 | 13.3 | 0.5 | |||
| 158223 | T234M | 1.2 | |||||
| 1802108 | I165V | 7.7 | 24.5 | 22.6 |
Allele frequency bolded if >50%; South Asia (Afghanistan, Bangladesh, India, Sri Lanka, Pakistan); East Africa (Eritrea, Ethiopia, Madagascar); South America (Brazil, Colombia, Guyana, Peru, Mexico); SEA (Cambodia, China, Laos, Myanmar, Thailand, Vietnam); Southern SEA (Malaysia, Papua New Guinea, Indonesia, The Philippines); * in the HPPK-coding part of the bifunctional gene.
Fig. 2 Evidence of selective sweeps.
Manhattan plots showing the genome-wide results of the iHS analysis on P. vivax from South Asia and East Africa regions (a, b); Rsb analysis for P. vivax between geographical regions (c, d) and countries (e, f). Loci in critical regions (above red lines: orange points iHS P < 1 × 10⁻⁵; purple points Rsb P < 1 × 10⁻⁵; two-sided tests) are reported in Supplementary Data 6 and Supplementary Data 7.
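The red lines in the Manhattan plots mark a two-sided P-value threshold of 1 × 10⁻⁵. As a minimal sketch, assuming the standardised iHS and Rsb scores are approximately standard normal under neutrality (a common convention, not something this section confirms), a score z converts to a two-sided P-value as P = erfc(|z| / √2). The function names below are hypothetical.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P-value for a score assumed to be standard normal under neutrality."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_candidate(z: float, threshold: float = 1e-5) -> bool:
    """True if the score falls in the critical region used in the figure."""
    return two_sided_p(z) < threshold


# Toy scores: only the most extreme ones in either direction pass the cut-off.
scores = [0.3, -2.1, 4.0, 4.5, -4.6]
flagged = [z for z in scores if is_candidate(z)]
```

Under this assumption the threshold corresponds to |z| of roughly 4.4. Because the test is two-sided, strongly negative and strongly positive scores are both flagged.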