Daniel E Neafsey, Kevin Galinsky, Rays H Y Jiang, Lauren Young, Sean M Sykes, Sakina Saif, Sharvari Gujja, Jonathan M Goldberg, Sarah Young, Qiandong Zeng, Sinéad B Chapman, Aditya P Dash, Anupkumar R Anvikar, Patrick L Sutton, Bruce W Birren, Ananias A Escalante, John W Barnwell, Jane M Carlton.
Abstract
We sequenced and annotated the genomes of four P. vivax strains collected from disparate geographic locations, tripling the number of genome sequences available for this understudied parasite and providing the first genome-wide perspective of global variability in this species. We observe approximately twice as much SNP diversity among these isolates as we do among a comparable collection of isolates of P. falciparum, a malaria-causing parasite that results in higher mortality. This indicates a distinct history of global colonization and/or a more stable demographic history for P. vivax relative to P. falciparum, which is thought to have undergone a recent population bottleneck. The SNP diversity, as well as additional microsatellite and gene family variability, suggests a capacity for greater functional variation in the global population of P. vivax. These findings warrant a deeper survey of variation in P. vivax to equip disease interventions targeting the distinctive biology of this neglected but major pathogen.
Year: 2012 PMID: 22863733 PMCID: PMC3432710 DOI: 10.1038/ng.2373
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Strains and isolates of P. vivax and P. falciparum used in this study.
| Geographic Origin | P. vivax | P. falciparum |
|---|---|---|
| Latin America | Salvador I (El Salvador) | HB3 (Honduras) |
| South Asia (India) | India VII | ML-14 |
| East Asia | North Korean | Dd2 (Indochina) |
| Africa | Mauritania I | 3D7 |
Parenthetical entries give the more specific geographic origin, where known. Citations for the two reference sequences, P. vivax Salvador I and P. falciparum 3D7, are their respective genome papers.
Assembly statistics of four P. vivax reference strains sequenced using Illumina technology.
| Strain | Assembly size (Mb) | Fold coverage | Contig N50 (kb) | No. contigs | Scaffold N50 (kb) | No. scaffolds | % coverage |
|---|---|---|---|---|---|---|---|
| | 28.87 | 68.5 | 28.2 | 1,999 | 885.6 | 260 | 98.0 |
| | 29.25 | 35.0 | 21.2 | 3,358 | 594.6 | 568 | 98.1 |
| | 28.43 | 91.1 | 39.4 | 1,510 | 945.1 | 205 | 97.9 |
| | 29.65 | 87.6 | 22.1 | 2,499 | 317.6 | 541 | 98.8 |
Figure 1. Disparity in SNP and microsatellite diversity between P. vivax and P. falciparum. (a) Quality score vs. pairwise whole genome SNP rates against reference assemblies. Blue lines indicate P. vivax isolates and red lines indicate P. falciparum isolates. (b) P. falciparum vs. P. vivax Q30 call rates for: coding sequence (CDS), whole genome, 5’ flanking sequence (1 kb), 3’ flanking sequence (1 kb), all intergenic sequence, introns, and fourfold degenerate (4D) synonymous coding sites. (c) Density distribution of P. falciparum/P. vivax diversity log ratios for genes with 1:1 orthologs, compared to null expected distribution centered on 1. (d) Histogram of microsatellite diversity in microsatellite loci with a repeat unit size of two bp. Error bars indicate standard errors. Asterisks indicate size bins for which P. vivax is significantly more diverse than P. falciparum (bootstrapping, P < 0.05).
Figure 2. Neighbor-joining phylograms of P. vivax and P. falciparum, constructed from presumably neutral SNPs occurring in fourfold degenerate coding sites. Lineages are colored according to geographic origin: Red = Central/South America, Purple = Africa, Green = India, Teal = Southeast Asia. Branch lengths indicate considerable diversity in New World P. vivax strains as well as no clear affiliation between New World and African P. vivax strains. Phylograms were constructed from 471,543 sites in P. vivax and 359,901 sites in P. falciparum. Numbers at nodes indicate % bootstrap support.
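The phylograms above rest on fourfold degenerate (4D) coding sites, where any third-position nucleotide encodes the same amino acid. The sketch below shows one way such sites could be located in an in-frame coding sequence. It assumes the standard nuclear genetic code; the function name `fourfold_sites` and the example sequence are hypothetical, and this is not the authors' pipeline.

```python
# Codon prefixes (first two bases) whose third position is fourfold
# degenerate under the standard genetic code:
# Leu (CT), Val (GT), Ser (TC), Pro (CC), Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(cds):
    """Return 0-based indices of fourfold degenerate third positions.

    `cds` is assumed to be an in-frame coding sequence (hypothetical input);
    any trailing partial codon is ignored.
    """
    cds = cds.upper()
    sites = []
    for start in range(0, len(cds) - 2, 3):
        codon = cds[start:start + 3]
        # Skip codons with an ambiguous third base (e.g. N).
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            sites.append(start + 2)
    return sites


if __name__ == "__main__":
    # Codons: ATG, GCT, AAA, CTG. Only GCT (Ala) and CTG (Leu) qualify.
    print(fourfold_sites("ATGGCTAAACTG"))
```

Restricting SNP calls to these positions is a common way to approximate neutral variation, since substitutions there do not change the encoded protein.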
Figure 3. Diversity of P. vivax gene families. (a) Mean pairwise SNP diversity (π) in P. vivax gene families. Gene families associated with merozoite invasion or immune response modulation (red text) exhibit the highest diversity. Red bars on the box plots represent the 25th–75th percentile range, and circles indicate outlier genes. (b) Limited overlap among the vir repertoires of P. vivax isolates. Vir genes exhibiting at least 70% sequence identity between isolates were included in the Venn diagram. A set of 15 ‘ultra-conserved’ vir genes with more than 95% similarity in all comparisons is included in the central red circle. (c) A neighbor-joining phylogenetic tree of ultra-conserved vir genes and related paralogs from the vir12 and vir14 subfamilies. The most highly conserved vir, PVX_113230, has clear orthologs in other Plasmodium species.
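Mean pairwise SNP diversity (π), as reported in panel (a), is conventionally the average proportion of differing sites over all pairs of sequences. The following is a minimal sketch of that calculation for a small set of pre-aligned sequences. The function name `pairwise_diversity` and the toy alignment are hypothetical, and the handling of gaps and ambiguous bases (skipped per pair) is an assumption rather than the authors' stated method.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean per-site pairwise difference (pi) across aligned sequences.

    Sites where either sequence in a pair has a non-ACGT character are
    skipped for that pair (an assumption of this sketch).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    per_pair = []
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "ACGT" and y in "ACGT":
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            per_pair.append(diffs / compared)
    # Average over all pairs that shared at least one comparable site.
    return sum(per_pair) / len(per_pair) if per_pair else 0.0


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # hypothetical data
    print(pairwise_diversity(toy_alignment))
```

With only a handful of isolates per species, as in this study, each pair contributes heavily to the mean, so a single divergent isolate can shift π noticeably.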