Angela M Early, Marc Lievens, Bronwyn L MacInnis, Christian F Ockenhouse, Sarah K Volkman, Samuel Adjei, Tsiri Agbenyega, Daniel Ansong, Stacey Gondi, Brian Greenwood, Mary Hamel, Chris Odero, Kephas Otieno, Walter Otieno, Seth Owusu-Agyei, Kwaku Poku Asante, Hermann Sorgho, Lucas Tina, Halidou Tinto, Innocent Valea, Dyann F Wirth, Daniel E Neafsey.
Abstract
Host immunity exerts strong selective pressure on pathogens. Population-level genetic analysis can identify signatures of this selection, but these signatures reflect the net selective effect of all hosts and vectors in a population. In contrast, analysis of pathogen diversity within hosts provides information on individual, host-specific selection pressures. Here, we combine these complementary approaches in an analysis of the malaria parasite Plasmodium falciparum using haplotype sequences from thousands of natural infections in sub-Saharan Africa. We find that parasite genotypes show preferential clustering within multi-strain infections in young children, and identify individual amino acid positions that may contribute to strain-specific immunity. Our results demonstrate that natural host defenses to P. falciparum act in an allele-specific manner to block specific parasite haplotypes from establishing blood-stage infections. This selection partially explains the extreme amino acid diversity of many parasite antigens and suggests that vaccines targeting such proteins should account for allele-specific immunity.
Year: 2018 PMID: 29643376 PMCID: PMC5895824 DOI: 10.1038/s41467-018-03807-7
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Schematic representation of CSP, TRAP, and SERA2 expression profiles. The three genes have contrasting expression patterns that affect the source and duration of the selection pressures they experience. CSP and TRAP are among the most abundantly expressed proteins in the sporozoite stage[10]. SERA2 expression is highest during the blood stage and is in the 50th percentile of genes expressed in schizonts[65]
Population genetic statistics for CSP, TRAP, and SERA2 calculated using genome-wide sequencing data
| Statistic and population | CSP, full gene | CSP, amplicon region | TRAP, full gene | TRAP, amplicon region | SERA2, full gene | SERA2, amplicon region |
|---|---|---|---|---|---|---|
| Nucleotide diversity | | | | | | |
| Senegal | 0.00601 (99) | 0.0204 (99) | 0.0102 (99) | 0.0122 (99) | 0.00191 (96) | 0.00888 (99) |
| Malawi | 0.00806 (99) | 0.0211 (99) | 0.0102 (99) | 0.0132 (99) | 0.00198 (96) | 0.00894 (99) |
| Non-synonymous nucleotide diversity | | | | | | |
| Senegal | 0.00674 (99) | 0.0254 (99) | 0.0120 (99) | 0.0137 (99) | 0.00181 (96) | 0.0120 (99) |
| Malawi | 0.00713 (99) | 0.0263 (99) | 0.0120 (99) | 0.0144 (99) | 0.00197 (97) | 0.0119 (99) |
| Tajima’s D | | | | | | |
| Senegal | −1.39 (78) | −0.132 (96) | 0.536 (98) | 0.298 (98) | −1.21 (84) | −0.494 (94) |
| Malawi | −0.765 (94) | 0.519 (99) | 0.712 (99) | 0.257 (99) | −1.24 (89) | −0.868 (94) |
| FST | | | | | | |
| Senegal–Malawi | 0.0512 (89) | 0.0163 (57) | 0.0576 (91) | 0.0322 (80) | 0.0236 (71) | 0.00716 (22) |
Genome-wide percentiles are given in parentheses
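The Tajima’s D values in the table can be illustrated with a minimal sketch of the standard Tajima (1989) estimator. This record does not state how the authors computed the statistic, so the function name `tajimas_d` and the input format (a list of aligned, equal-length sequence strings) are assumptions made only for illustration.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (hypothetical helper)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # k: mean number of pairwise differences across all sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from the standard formula.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)
```

A negative value indicates an excess of rare variants relative to neutral expectation, whereas a positive value indicates an excess of intermediate-frequency variants, the pattern expected under balancing selection.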
Fig. 2 Genome-wide signatures of P. falciparum polymorphism in Senegal and Malawi. a CSP, TRAP, and SERA2 show unusually high levels of non-synonymous nucleotide diversity. b TRAP, the TRAP amplicon region, and the CSP amplicon region have unusually high Tajima’s D values. c None of the genes or amplicon regions show strong population differentiation (FST). Each point represents the coding sequence from a single protein-coding gene. Solid and dashed lines mark the 95th and 99th percentiles, respectively. Focal genes are marked with filled symbols: CSP (circle), TRAP (square), SERA2 (triangle). Full genes are colored blue, whereas the regions selected for amplicon sequencing are colored red
Summary of nucleotide variants described with targeted amplicon sequencing
| Amplicon region (amplicon length) | Sampled infections | Sequenced amplicons | Total variant sites | Nonsyn. variant sites | Singleton variants | Common variants (a) | Highly variant sites (b) |
|---|---|---|---|---|---|---|---|
| Total | 1687 | 3821 | 55 | 50 | 15 | 23 | 12 |
| West Africa | 935 | 2147 | 46 | 43 | 8 | 24 | 8 |
| East Africa | 752 | 1674 | 41 | 39 | 8 | 23 | 8 |
| Total | 4209 | 8774 | 52 | 52 | 13 | 15 | 5 |
| West Africa | 2334 | 4925 | 38 | 38 | 7 | 16 | 5 |
| East Africa | 1875 | 3849 | 38 | 38 | 10 | 15 | 3 |
| Total | 4209 | 9007 | 67 | 67 | 19 | 12 | 16 |
| West Africa | 2334 | 5048 | 53 | 53 | 17 | 13 | 12 |
| East Africa | 1875 | 3959 | 49 | 49 | 11 | 8 | 11 |
(a) Minor allele frequency >0.02
(b) >2 segregating alleles
Fig. 3 Nucleotide diversity and population differentiation within CSP, TRAP, and SERA2 as measured with amplicon sequencing. a Graphs of pairwise nucleotide diversity (π) present the mean ± 1 S.D. across all five study sites. b Graphs of population differentiation show the mean ± 1 S.D. of pairwise FST values between western and eastern study sites. Statistics were calculated using SNPs within overlapping 10-nt windows with a step size of 5 nt. The colored bars in the CSP graph mark three previously identified regions of high diversity: DV10 (blue), Th2R (green), and Th3R (yellow)
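The windowing scheme in this caption (10-nt windows, 5-nt step) can be sketched as follows. This is an illustrative assumption rather than the authors' pipeline: the helper names and the input format (aligned haplotype strings of equal length) are hypothetical, and π is computed here as the mean proportion of differing sites over all sequence pairs.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean per-site pairwise difference (pi) among aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total_diffs / (len(pairs) * len(seqs[0]))


def windowed_diversity(seqs, window=10, step=5):
    """Return (window_start, pi) for overlapping windows along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        # Slice the same window out of every haplotype.
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_diversity(chunk)))
    return results
```

Because the step is half the window length, each nucleotide away from the alignment ends contributes to two adjacent windows, which smooths the resulting profile.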
Fig. 4 Linkage disequilibrium (LD) within the CSP, TRAP, and SERA2 amplicon regions. LD between polymorphic nucleotides was calculated as Q*, an extension of r² that allows for multiple alleles per site. Corresponding amino acid positions are in parentheses. Positions marked in red showed evidence for altered intra-host diversity at the amino acid level. For CSP, shading marks the positions within the DV10 (blue), Th2R (green), and Th3R (yellow) epitope regions. Only nucleotide positions with a major allele frequency <0.98 were included in the analysis. Data shown are from Nanoro, Burkina Faso. Observed LD trends were consistent across all five study sites (Supplementary Fig. 1)
Fig. 5 Infection simulations. For each study site, we created 10,000 simulated datasets to test whether haplotypes are distributed across infections according to simple random sampling from the population. Haplotype counts within the n infections were held constant while the identity of each haplotype was resampled from the complete pool of sequenced haplotypes. To determine whether selection affects infection composition, we compared the diversity within the observed and simulated infections at each amino acid position and across the entire amplicon
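The resampling scheme described in this caption can be sketched for a single amino acid position. Several details are assumptions not stated here: sampling from the pool is done with replacement, within-host diversity is measured as heterozygosity (1 minus the sum of squared allele frequencies), and the comparison is summarized as a lower-tail empirical P value. All function names are hypothetical.

```python
import random
from collections import Counter


def heterozygosity(alleles):
    """1 - sum of squared allele frequencies within one infection."""
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in Counter(alleles).values())


def mean_within_host_diversity(infections):
    """Average heterozygosity over all infections."""
    return sum(heterozygosity(inf) for inf in infections) / len(infections)


def resample_infections(infections, rng):
    """Keep each infection's haplotype count; redraw identities from the pool."""
    pool = [allele for inf in infections for allele in inf]
    # Assumption: draws are made with replacement from the pooled alleles.
    return [rng.choices(pool, k=len(inf)) for inf in infections]


def lower_tail_p(infections, n_sims=10_000, seed=0):
    """Observed diversity and the share of simulations at or below it."""
    rng = random.Random(seed)
    observed = mean_within_host_diversity(infections)
    hits = sum(
        mean_within_host_diversity(resample_infections(infections, rng)) <= observed
        for _ in range(n_sims)
    )
    # Add-one correction keeps the empirical P value strictly above zero.
    return observed, (hits + 1) / (n_sims + 1)
```

A small lower-tail P value would indicate that observed infections are less diverse than random sampling predicts, which is the clustering pattern the abstract describes.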
Fig. 6 Deviation from expected within-host diversity at individual amino acid positions within the CSP, TRAP, and SERA2 amplicon regions. Points above the dotted line mark amino acid positions with significant heterozygosity differences between observed and simulated infections after Bonferroni correction. Filled circles show a reduction in observed within-infection diversity whereas open circles show an increase in observed within-infection diversity. The size of the point corresponds to the effect size estimated with a quasi-Poisson regression. Points in red show a marginally significant effect of age on diversity (P < 0.05) before multiple-testing correction, but none of these interactions remain significant after Bonferroni correction. Only variant positions with a major allele frequency <0.98 were included in the analysis
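The Bonferroni step mentioned in this caption can be illustrated with a short sketch. The family-wise level of 0.05 is an assumption, and the position labels and P values in the usage example are invented purely for illustration; they are not taken from the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and a per-test significance flag.

    p_values maps a test label (e.g. an amino acid position) to its raw P value.
    """
    threshold = alpha / len(p_values)
    flags = {label: p < threshold for label, p in p_values.items()}
    return threshold, flags


# Hypothetical P values for four positions (illustrative only).
example = {"pos_1": 0.0004, "pos_2": 0.03, "pos_3": 0.2, "pos_4": 0.01}
threshold, significant = bonferroni(example)
```

In this made-up example, `pos_2` would pass an uncorrected 0.05 cutoff but not the corrected threshold, mirroring how the marginal age effects in the caption lose significance after correction.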
Parameter estimates from quasi-Poisson regression model of within-host infection diversity
| Variable | CSP effect estimate | CSP P | TRAP effect estimate | TRAP P | SERA2 effect estimate | SERA2 P |
|---|---|---|---|---|---|---|
| Patient age (in years) | 0.0094 | n.s. | 0.015 | n.s. | −0.032 | 0.028 |
| Sample type (a) | −0.033 | n.s. | 0.016 | n.s. | 0.072 | 0.0015 |
| Haplotype number | 0.011 | n.s. | −0.012 | 0.035 | −0.0020 | n.s. |
(a) Longitudinal, asymptomatic sampling vs. clinical infection sample