Lynette Isabella Ochola, Kevin K A Tetteh, Lindsay B Stewart, Victor Riitho, Kevin Marsh, David J Conway.
Abstract
Signatures of balancing selection operating on specific gene loci in endemic pathogens can identify candidate targets of naturally acquired immunity. In malaria parasites, several leading vaccine candidates convincingly show such signatures when subjected to several tests of neutrality, but the discovery of new targets affected by selection to a similar extent has been slow. A small minority of all genes are under such selection, as indicated by a recent study of 26 Plasmodium falciparum merozoite-stage genes that were not previously prioritized as vaccine candidates, of which only one (locus PF10_0348) showed a strong signature. Therefore, to focus discovery efforts on genes that are polymorphic, we scanned all available shotgun genome sequence data from laboratory lines of P. falciparum and chose six loci with more than five single nucleotide polymorphisms per kilobase (including PF10_0348) for in-depth frequency-based analyses in a Kenyan population (allele sample sizes >50 for each locus) and comparison of Hudson-Kreitman-Aguade (HKA) ratios of population diversity (π) to interspecific divergence (K) from the chimpanzee parasite Plasmodium reichenowi. Three of these (the msp3/6-like genes PF10_0348 and PF10_0355 and the surf4.1 gene PFD1160w) showed exceptionally high positive values of Tajima's D and Fu and Li's F indices and have the highest HKA ratios, indicating that they are under balancing selection and should be prioritized for studies of their protein products as candidate targets of immunity. Combined with earlier results, there is now strong evidence that high HKA ratio (as well as the frequency-independent ratio of Watterson's θ/K) is predictive of high values of Tajima's D. Thus, the former offers value for use in genome-wide screening when numbers of genome sequences within a species are low or in combination with Tajima's D as a 2D test on large population genomic samples.
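The frequency-based indices named in the abstract can be illustrated with a minimal sketch of Tajima's D and Watterson's θ, using the standard textbook formulas. It is not the authors' pipeline (the table notes indicate DnaSP was used); the function names and the toy alignment are hypothetical, and the alignment is assumed to be gap-free with equal-length sequences.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns carrying more than one allele (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def wattersons_theta(n, S):
    """Watterson's estimator for the whole locus: S / a1."""
    a1 = sum(1.0 / i for i in range(1, n))
    return S / a1


def tajimas_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S, and
    mean pairwise differences k (all for the whole locus)."""
    if S == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Hypothetical toy alignment, for illustration only.
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
S = segregating_sites(toy)
k = mean_pairwise_differences(toy)
print(S, round(k, 3), round(tajimas_d(len(toy), S, k), 3))
```

Dividing k and Watterson's θ by the number of aligned sites gives the per-site π and θ used in the diversity-to-divergence ratios. A positive D reflects an excess of intermediate-frequency alleles, which is the pattern the abstract interprets as balancing selection.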
Year: 2010 PMID: 20457586 PMCID: PMC2944029 DOI: 10.1093/molbev/msq119
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Tests of Neutrality on Polymorphisms in Six Plasmodium falciparum Genes in a Kenyan Population.
| Locus | Number of isolates (n) | nt | π | K | HKAr (π/K) | S fixed | S poly | NS fixed | NS poly | MK | Tajima's D | Fu and Li’s F |
| PFD0100c | 51 | 2,061 | 29.6 | 162.4 | 0.18 | 40 | 49 | 142 | 242 | 0.18 | −0.80 | −1.37 |
| PFD1160w | 69 | 2,199 | 43.0 | 153.8 | 0.28 | 39 | 25 | 178 | 216 | | | |
| PF07_0004 | 59 | 1,152 | 16.6 | 157.8 | 0.11 | 27 | 23 | 90 | 64 | 0.62 | −0.48 | −0.04 |
| PF10_0342 | 79 | 1,680 | 6.5 | 40.7 | 0.16 | 10 | 13 | 39 | 44 | 0.82 | −0.16 | −0.24 |
| PF10_0348 | 56 | 1,896 | 36.3 | 70.4 | 0.52 | 18 | 62 | 57 | 133 | 0.24 | | |
| PF10_0355 | 53 | 2,073 | 33.8 | 99.2 | 0.34 | 30 | 46 | 101 | 116 | 0.35 | | |
NOTE.—Nt, number of aligned nucleotide positions analyzed. Full alignments shown in supplementary figure S2 (Supplementary Material online), with P. reichenowi orthologues and the reference 3D7 sequence for comparison.
Fewer positions were aligned once the Plasmodium reichenowi sequence was added to the analysis (PF10_0348 n = 1818, PF10_0355 n = 2061, PFD0100c n = 1893, PFD1160w n = 2031, and PF07_0004 n = 996).
Specific region generated (PFD1160w exon 1, PFD0100c exon 1, and PF07_0004 exon 2).
Complex codons not analyzed by DNAsp software (PF10_0348 n = 3, PFD0100c n = 17, PFD1160w n = 17, and PF07_0004 n = 3).
Repeats removed from gene sequences for analysis.
Divergent allele in a minority of samples removed from alignment-based analysis (PF10_0348 n = 7 and PF10_0355 n = 10, included in supplementary fig. S2, Supplementary Material online).
Stop codon in eight alleles of PF10_0348, and eight stop codons in the P. reichenowi orthologue of PF10_0355 (codons removed from analysis).
Sliding window analysis shows significant regions (windows of 100 nucleotide sites, step size 50 sites).
*P < 0.05, **P < 0.02, ***P < 0.001.
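The ratio and contingency columns of the table can be re-derived from its other columns. The sketch below reads the two columns before HKAr as π and K, and assumes the MK column is a two-sided Fisher's exact P value on the synonymous/nonsynonymous by fixed/polymorphic counts; neither reading is stated explicitly in the text here. The function names are hypothetical, and the numbers are copied from the table.

```python
from math import comb

# (pi, K, S fixed, S poly, NS fixed, NS poly), copied from the table.
TABLE = {
    "PFD0100c": (29.6, 162.4, 40, 49, 142, 242),
    "PFD1160w": (43.0, 153.8, 39, 25, 178, 216),
    "PF07_0004": (16.6, 157.8, 27, 23, 90, 64),
    "PF10_0342": (6.5, 40.7, 10, 13, 39, 44),
    "PF10_0348": (36.3, 70.4, 18, 62, 57, 133),
    "PF10_0355": (33.8, 99.2, 30, 46, 101, 116),
}


def hka_ratio(pi, k):
    """Diversity within species divided by divergence between species."""
    return pi / k


def neutrality_index(s_fixed, s_poly, ns_fixed, ns_poly):
    """(Pn / Ps) / (Dn / Ds); values above 1 mean relatively more
    nonsynonymous polymorphism than nonsynonymous fixed difference."""
    return (ns_poly * s_fixed) / (s_poly * ns_fixed)


def mk_fisher_p(s_fixed, s_poly, ns_fixed, ns_poly):
    """Two-sided Fisher's exact P value for the 2x2 MK table
    (assumed here to be what the MK column reports)."""
    row1, row2 = s_fixed + s_poly, ns_fixed + ns_poly
    col1 = s_fixed + ns_fixed
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x synonymous fixed differences.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(s_fixed)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    tail = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, tail)


for locus, (pi, k, sf, sp, nf, npoly) in TABLE.items():
    print(locus,
          round(hka_ratio(pi, k), 2),
          round(neutrality_index(sf, sp, nf, npoly), 2),
          round(mk_fisher_p(sf, sp, nf, npoly), 2))
```

The recomputed π/K values reproduce the HKAr column to two decimal places, which supports the column reading used here. The Fisher P values should be treated as an approximate check only, since the notes state that complex codons were excluded by the original software.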
Figure. Schematic diagram of each of the six gene loci studied. The positions of nucleotide polymorphisms (within Plasmodium falciparum, marked above each gene scheme) and fixed differences (between P. falciparum and Plasmodium reichenowi, marked below each gene scheme) in nonrepeat regions are indicated. Repeat regions (in three of the genes) are excluded from alignment-based analysis (their translated sequences for representative alleles are shown in supplementary fig. S1, Supplementary Material online). Alignments of the nonrepeat sequences of all alleles and P. reichenowi are shown for each of the loci in supplementary figure S2 (Supplementary Material online).
Figure. Linkage disequilibrium (LD) in each of the six gene loci studied in the Kenyan population. The r² values for all pairwise tests between polymorphic sites are shown. Red and black symbols indicate values that are statistically significant (red points remain significant after Bonferroni correction for the multiple tests), whereas open gray symbols indicate nonsignificant values.
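The pairwise r² statistic and the Bonferroni threshold mentioned in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: sites are biallelic, haplotypes are phased (as expected for haploid blood-stage parasites), and the nominal significance level of 0.05 is assumed rather than taken from the text. The function names and toy data are hypothetical.

```python
def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites.

    site_a and site_b hold one allele per haplotype, in the same order.
    """
    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both sites must be polymorphic")
    # Frequency of the haplotype carrying both reference alleles.
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def bonferroni_alpha(n_sites, alpha=0.05):
    """Per-test threshold when every pair of polymorphic sites is tested."""
    n_tests = n_sites * (n_sites - 1) // 2
    return alpha / n_tests


# Hypothetical toy data: four haplotypes typed at two sites.
print(r_squared("AATT", "CCGG"))  # complete association
print(r_squared("AATT", "CGCG"))  # no association
print(bonferroni_alpha(10))       # threshold for 45 pairwise tests
```

With many polymorphic sites per locus the number of pairwise tests grows quadratically, which is why only a subset of nominally significant points would be expected to survive the correction.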
Figure. Sliding window analysis of Tajima's D index along the aligned sequences of all alleles for each of the six genes studied in the Kenyan population. The x axis shows the midpoint of contiguous windows of 100 bp with step size of 50 bp for the portions of the genes sequenced (nucleotide positions in each gene are given as those in the allele of the reference genome strain 3D7): PFD0100c exon 1 positions 61–2124; PFD1160w exon 1 positions 1–2205; PF07_0004 exon 2 positions 124–2781 (positions on x axis condensed as repeat sequences were deleted); PF10_0342 positions 4–1683; PF10_0348 positions 97–2028; and PF10_0355 positions 70–2241. Symbols for points indicating windows with significant departures from zero: P < 0.05, open square; P < 0.01, open circle; P < 0.001, open triangle.
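The windowing scheme in this caption (100-site windows advanced in steps of 50, plotted at the window midpoint) can be sketched as below. The helper names are hypothetical, positions are 1-based within the alignment rather than 3D7 coordinates, and a trailing partial window is simply dropped, which may differ from how the original software handles the sequence end. Any per-window statistic, such as Tajima's D, can be passed in as a function; a count of variable sites is used here to keep the example short.

```python
def sliding_windows(length, width=100, step=50):
    """Return (start, end, midpoint) for each full window, 1-based inclusive."""
    windows = []
    start = 1
    while start + width - 1 <= length:
        end = start + width - 1
        windows.append((start, end, (start + end) / 2))
        start += step
    return windows


def count_variable_sites(seqs):
    """Number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def windowed(seqs, statistic, width=100, step=50):
    """Apply `statistic` to each window of an equal-length alignment."""
    length = len(seqs[0])
    return [
        (mid, statistic([s[start - 1:end] for s in seqs]))
        for start, end, mid in sliding_windows(length, width, step)
    ]


# Hypothetical toy alignment: 200 sites with one variable site at position 61.
toy = ["A" * 200, "A" * 60 + "T" + "A" * 139]
print(windowed(toy, count_variable_sites))
```

Because consecutive windows overlap by half their width, a single polymorphic site contributes to two adjacent windows, so neighbouring points in such a plot are not independent.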
Figure. Scatterplot of Tajima's D index and two diversity-versus-divergence ratios for six genes studied here in the Kenyan population (black square points) and nine genes studied previously in a Gambian population (open circle points). (A) Tajima's D correlated with the π/K ratio (the conventional HKA ratio) and (B) Tajima's D correlated with the θ/K ratio (a form of HKA ratio that is independent of allele frequencies). One of the genes (PF10_0348) had been studied in both populations (with similar results as indicated by nearly overlapping points furthest right on each scatterplot), so data for this gene were only included from one of the populations (it made no difference which) in the correlation analyses.