Jesse N Gitaka, Mika Takeda, Masatsugu Kimura, Zulkarnain Md Idris, Chim W Chan, James Kongere, Kazuhide Yahata, Francis W Muregi, Yoshio Ichinose, Akira Kaneko, Osamu Kaneko.
Abstract
BACKGROUND: Plasmodium falciparum SURFIN4.1 is a putative ligand expressed on the merozoite and likely on the infected red blood cell. Its gene was suggested to be under directional selection in the eastern Kenyan population, but under balancing selection in the Thai population. To understand this difference, surf 4.1 sequences of western Kenyan P. falciparum isolates were analysed. Frameshift mutations and copy number variation (CNV) were also examined for the parasites from western Kenya and Thailand.
Keywords: Copy number variation; Frameshift; Malaria; Plasmodium falciparum; SURFIN4.1; Selection
Year: 2017 PMID: 28253868 PMCID: PMC5335827 DOI: 10.1186/s12936-017-1743-x
Source DB: PubMed Journal: Malar J ISSN: 1475-2875 Impact factor: 2.979
Fig. 1 Schematic structure of SURFIN4.1. The extracellular region is divided into four parts: the N-terminal segment (Nter), the Cys-rich domain (CRD), and variable regions 1 and 2 (Var1 and Var2). The scale bar indicates 100 amino acids (aa). The length of the cytoplasmic region varies among parasite lines because of frameshift mutations: the 3D7 line SURFIN4.1 has a stop codon (first arrowhead, at nt 2498–2503) just after the transmembrane region (tm); another potential frameshift would generate a stop codon (second arrowhead, at nt 3894–3903) before the 2nd Trp-rich (WR) domain; the FCR3 line has a stop codon (third arrowhead, at nt 4529–4536) after the 2nd WR domain [4]; and the IT line sequence has no frameshift mutations and contains three WR domains
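The effect described in this caption, where a frameshift brings a stop codon into frame earlier than in the intact sequence, can be illustrated with a minimal sketch. The toy open reading frame below is hypothetical and is not SURFIN4.1 sequence; the helper names `first_stop` and `delete_base` are likewise assumptions made for illustration.

```python
# Standard stop codons in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq, frame_start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(frame_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def delete_base(seq, index):
    """Return seq with the single base at `index` removed (a -1 frameshift)."""
    return seq[:index] + seq[index + 1:]


# Hypothetical toy ORF: ATG GCA TTG AAG GCT TAA
orf = "ATGGCATTGAAGGCTTAA"
# Deleting one base shifts the frame: ATG GCA TGA ...
mutant = delete_base(orf, 6)

print("stop in intact ORF at index:", first_stop(orf))
print("stop after frameshift at index:", first_stop(mutant))
```

In this toy case the single-base deletion moves the first in-frame stop codon upstream, which truncates the encoded product in the same way the caption describes for the cytoplasmic region.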
Test of neutrality for Plasmodium falciparum surf 4.1 for the western Kenyan isolates (n = 24)
| Region | Nucleotide position | Number of sites (base) | η | S | Two variants, singleton | Two variants, not singleton | More than two variants | π | θ | Tajima’s D | Fu and Li’s D* | Fu and Li’s F* |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Extracellular | 4–2289 | 2286 | 439 | 410 | 115 | 266 | 29 | 0.046 | 0.048 | −0.19 | −0.29 | −0.30 |
| Nter | 4–150 | 147 | 2 | 2 | 1 | 1 | 0 | 0.002 | 0.004 | −1.20 | −0.66 | −0.93 |
| CRD | 151–585 | 435 | 11 | 10 | 3 | 6 | 1 | 0.007 | 0.006 | 0.39 | −0.61 | −0.36 |
| Var1 | 586–1506 | 918 | 109 | 107 | 44 | 61 | 2 | 0.022 | 0.031 | −1.17 | −0.93 | −1.18 |
| Var2 | 1507–2289 | 783 | 317 | 291 | 67 | 198 | 26 | 0.104 | 0.100 | 0.17 | −0.02 | 0.05 |
Nucleotide positions are numbered after the 3D7 line sequence
Extracellular, extracellular region; Nter, N-terminal segment; CRD, cysteine-rich domain; Var1, variable region 1; Var2, variable region 2; sites, nucleotide sites analyzed; η, the total number of mutations; S, number of segregating sites; π, observed nucleotide diversity; θ, the expected nucleotide diversity under neutrality derived from S
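As a rough cross-check of how θ and Tajima's D relate to S, the sketch below applies the standard Watterson and Tajima (1989) formulas to the tabulated values. It is an illustration under stated assumptions: the source does not say which software or exact formulas produced the table, the function names are hypothetical, and the average number of pairwise differences k (104.64 for the extracellular region, 81.19 for Var2) is taken from the nucleotide diversity table for the same 24 isolates.

```python
import math


def harmonic_terms(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    return a1, a2


def watterson_theta_per_site(n, S, sites):
    """Watterson's theta per nucleotide site from S segregating sites."""
    a1, _ = harmonic_terms(n)
    return S / a1 / sites


def tajima_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences k."""
    a1, a2 = harmonic_terms(n)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Extracellular region: n = 24 isolates, S = 410, 2286 sites, k = 104.64
print(round(watterson_theta_per_site(24, 410, 2286), 3))
print(round(tajima_d(24, 410, 104.64), 2))
```

With these inputs the sketch gives values close to the tabulated θ and Tajima's D for the extracellular region. Small differences for other regions are expected because the inputs here are rounded table entries.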
Fig. 2 Sliding window plot of nucleotide diversity and amino acid polymorphism of the SURFIN4.1 extracellular region in the western Kenyan Plasmodium falciparum population. A window length of 90 bp and a step size of 3 bp were used for the sliding window plot (top). The number of amino acid types at each amino acid position (middle) and a scheme of the SURFIN4.1 extracellular region (bottom) are shown to scale to visualize the location of the polymorphic sites. The SURFIN4.1 extracellular region was divided into four parts: N-terminal (N), Cys-rich domain (CRD), and variable regions 1 and 2 (Var1 and Var2). A total of 24 sequences from western Kenyan isolates were used. Nucleotide and amino acid positions are numbered after the 3D7 line sequences
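A sliding-window scan of the kind described in this caption (window length 90 bp, step size 3 bp) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes an ungapped alignment of equal-length sequences, uses the uncorrected proportion of pairwise differences as π, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_pi(seqs):
    """Average proportion of differing sites over all sequence pairs (uncorrected pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in pairs
    )
    return total_diffs / (len(pairs) * length)


def sliding_pi(seqs, window=90, step=3):
    """Return (window start, pi) for each window along an ungapped alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_pi(chunk)))
    return results


# Hypothetical toy alignment; a small window keeps the example readable.
toy = ["AAAAAA", "AAAAAT", "AAAAAA"]
for start, pi in sliding_pi(toy, window=3, step=3):
    print(start, round(pi, 3))
```

A step of 3 bp moves the window one codon at a time, so each point on the plot stays aligned with the reading frame; gap handling and the Jukes-Cantor correction are left out of this sketch.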
Nucleotide diversity of Plasmodium falciparum surf 4.1 in western Kenyan isolates (n = 24)
| Region (position) | Number of sites (base) | Indels | k (SE) | Nd (SE) | N (SE) | Sd (SE) | S (SE) | π (SE) | dN (SE) | dS (SE) | dN/dS | p value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Extracellular (4–2289) | 2286 | 9 | 104.64 (5.50) | 87.22 (6.03) | 1791.60 (8.40) | 17.43 (2.25) | 488.40 (8.40) | 0.048 (0.003) | 0.051 (0.003) | 0.037 (0.005) | 1.38 | 0.004 |
| Nter (4–150) | 147 | 0 | 0.24 (0.19) | 0.24 (0.17) | 118.30 (2.01) | 0.00 (0.00) | 28.70 (2.10) | 0.002 (0.001) | 0.002 (0.002) | 0.000 (0.000) | ∞ | ns* |
| CRD (151–585) | 435 | 0 | 2.99 (1.11) | 1.52 (0.70) | 351.61 (3.68) | 1.47 (0.79) | 80.39 (3.64) | 0.007 (0.003) | 0.004 (0.002) | 0.019 (0.011) | 0.21 | ns* |
| Var1 (586–1506) | 918 | 3 | 20.23 (2.15) | 17.86 (2.16) | 721.42 (5.52) | 2.37 (0.64) | 196.58 (5.57) | 0.023 (0.002) | 0.026 (0.003) | 0.012 (0.003) | 2.17 | ns* |
| Var2 (1507–2289) | 783 | 6 | 81.19 (4.73) | 67.60 (4.62) | 600.28 (4.74) | 13.59 (1.97) | 182.72 (4.57) | 0.115 (0.007) | 0.126 (0.009) | 0.080 (0.011) | 1.58 | 0.0003 |
Extracellular, extracellular region; Nter, N-terminal segment; CRD, cysteine-rich domain; Var1, variable region 1; Var2, variable region 2; sites, nucleotide sites analyzed; indels, insertion/deletion polymorphisms; k, the average number of nucleotide differences; N and S, average numbers of nonsynonymous and synonymous sites; π, pairwise nucleotide diversity (Jukes-Cantor model); dN, mean number of nonsynonymous substitutions per nonsynonymous site; dS, mean number of synonymous substitutions per synonymous site, both computed using the Nei-Gojobori method with the Jukes-Cantor correction; SE, standard error, estimated using the bootstrap method with 500 replications
The numbers of synonymous (Sd) and nonsynonymous (Nd) differences were calculated by the Nei-Gojobori method. The p value indicates the significance of the difference between dN and dS, tested using a one-tailed Z-test with 500 bootstrap pseudo-samples implemented in MEGA. ns indicates not significant, with an asterisk (*) marking results of two-tailed Fisher’s exact tests. Nucleotide positions are numbered after the 3D7 line sequence
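The relation between the tabulated differences (Nd, Sd), sites (N, S), and the Jukes-Cantor-corrected dN and dS can be sketched as below. This is an approximation under stated assumptions: it applies the correction once to the pooled averages, whereas the tabulated values are means over sequence pairs, so the results are close to, but not identical with, the table entries. The function names are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds_from_totals(nd, n_sites, sd, s_sites):
    """Approximate dN, dS, and dN/dS from average differences and sites."""
    dn = jukes_cantor(nd / n_sites)
    ds = jukes_cantor(sd / s_sites)
    ratio = dn / ds if ds > 0 else math.inf  # no synonymous differences gives an infinite ratio
    return dn, ds, ratio


# Extracellular region: Nd = 87.22, N = 1791.60, Sd = 17.43, S = 488.40
dn, ds, ratio = dn_ds_from_totals(87.22, 1791.60, 17.43, 488.40)
print(round(dn, 3), round(ds, 3), round(ratio, 2))
```

For the extracellular region this reproduces a dN/dS ratio near the tabulated 1.38, and the Nter row shows why the ratio is reported as ∞ when no synonymous differences are observed. For highly diverse regions such as Var2 the pooled approximation falls somewhat below the tabulated values.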
Fig. 3 Sliding window plots of the dN/dS ratio for the Plasmodium falciparum surf sequence encoding the extracellular region. Data obtained in this study for western Kenyan isolates (n = 24) are plotted with a solid red line. For comparison, previously published surf sequences from eastern Kenya [15] and Thailand [17] are plotted with a solid cyan line and a dotted black line, respectively. Asterisks indicate a part of the Var1 region and the Var2 region where dN was significantly higher than dS in the two Kenyan parasite populations (p < 0.05). Nucleotide positions are numbered after the 3D7 line sequence. The window length is 90 bp and the step size is 3 bp. A scheme of the SURFIN4.1 extracellular region (bottom) is shown to visualize the location of the peaks. N-terminal (N), Cys-rich domain (CRD), and variable regions 1 and 2 (Var1 and Var2)
Fig. 4 Sliding window plots of Tajima’s D and Fu and Li’s D* and F* for the Plasmodium falciparum surf sequence encoding the extracellular region. Data obtained in this study for western Kenyan isolates (n = 24) are plotted with a solid red line. For comparison, previously published surf sequences from eastern Kenya [15] and Thailand [17] are plotted with a solid cyan line and a dotted black line, respectively. Sites that departed significantly from neutrality (p < 0.05, two-tailed) are indicated with circle, square, and diamond symbols on each line. Asterisk and hash symbols indicate the regions where positive and negative deviations from neutrality, respectively, were detected in the western Kenyan parasite population in this study. The dagger symbol indicates the region where negative deviation from neutrality was detected in the eastern, but not the western, Kenyan population [15]. Nucleotide positions are numbered after the 3D7 line sequence. The window length is 90 bp and the step size is 3 bp. A scheme of the SURFIN4.1 extracellular region (bottom) is shown to visualize the location of the peaks. N-terminal (N), Cys-rich domain (CRD), and variable regions 1 and 2 (Var1 and Var2)
Fig. 5 Frequency distribution of amino acids in the Plasmodium falciparum SURFIN4.1 extracellular region. Data obtained in this study for western Kenyan isolates in 2014 (n = 24), eastern Kenyan isolates in 1998 (n = 51) [15], and Thai isolates in 1988/1989 (n = 37) [17] are plotted. The alleles at each polymorphic site are shown in white, black, red, yellow, and blue. Positions where Tajima’s D showed significant positive or negative deviation from neutrality (p < 0.05) are indicated with red or blue bars, respectively. Asterisk symbols indicate the region where positive deviation from neutrality was detected in the western Kenyan parasite population in this study. Hash and dagger symbols indicate the regions where negative deviation from neutrality was detected. A scheme of the SURFIN4.1 extracellular region (bottom) is shown to visualize the location of the polymorphic sites. N-terminal (N), Cys-rich domain (CRD), and variable regions 1 and 2 (Var1 and Var2)
Existence of frameshift mutations in the surf region encoding the cytoplasmic region in the western Kenyan and Thai Plasmodium falciparum isolates
| Pattern | Frameshift at nt 2498–2503 (a) | Frameshift at nt 3894–3903 (a) | Frameshift at nt 4529–4536 (a) | Western Kenya, number (%) of parasite lines/clones (b) | Thai, number (%) of parasite lines/clones (b) |
|---|---|---|---|---|---|
| Pattern 1 | Yes | Yes | No | 8 (35) | 17 (47) |
| Pattern 2 | Yes | No | No | 14 (61) | 5 (14) |
| Pattern 3 | No | Yes | No | 0 (0) | 1 (3) |
| Pattern 4 | No | No | No | 1 (4) | 13 (36) |
| Total | | | | 23 (100) | 36 (100) |
| 3D7 (c) | Yes | Yes | No | | |
| FCR3 (c) | No | No | Yes | | |
| IT (d) | No | No | No | | |
| PrCDC (d) | No | No | No | | |
a Nucleotide positions are after 3D7 line sequence
b Pattern 1 contains KK14-92-B7, KN14-098-D1, KT14-111-H2, KU14-044-D2, KU14-050-B3, KU14-071-D4, KU14-226-G5, KU14-257-A7, MS804, MS807-H3, MS836, MS808, MS812, MS828, MS827, MS838, MS830, MS814-F1, MS814-H1, MS815-A2, MS818-C2, MS833-B4, MS840, MS841, and MS829-B5; pattern 2 contains KK14-02-B9, KK14-53-F6, KN14-076-B6, KT14-158-G3, KU14-042-D12, KU14-061-B1, KU14-062-A8, KU14-110-C4, KU14-119-D4, KU14-127-B12, KU14-170-A10, KU14-175-D6, KU14-211-D1, KU14-217-B5, MS811, MS813, MS835A1, MS844-A4, and MS947; pattern 3 contains MS843; and pattern 4 contains KU14-086-B3, MS805, MS806, MS809, MS810, MS816, MS817, MS819-A3, MS822-G8, MS825, MS837, MS842, MS946, and MS948
c GenBank accession numbers for 3D7 and FCR3 lines are AL844503 and AB759920, respectively
d PFIT_0400900 and PRCDC_0005300 from GeneDB, Wellcome Trust Sanger Institute, UK (both are 2015-06-18 version)
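The percentages in the frameshift table follow directly from the pattern counts. The short sketch below recomputes them from the tabulated counts; the dictionary layout and the function name `percentages` are assumptions made for illustration.

```python
# Pattern counts copied from the frameshift table.
counts = {
    "Western Kenya": {"Pattern 1": 8, "Pattern 2": 14, "Pattern 3": 0, "Pattern 4": 1},
    "Thai": {"Pattern 1": 17, "Pattern 2": 5, "Pattern 3": 1, "Pattern 4": 13},
}


def percentages(tally):
    """Convert pattern counts to whole-number percentages of the population total."""
    total = sum(tally.values())
    return {pattern: round(100 * n / total) for pattern, n in tally.items()}


for population, tally in counts.items():
    print(population, sum(tally.values()), percentages(tally))
```

The recomputed shares make the contrast between populations easy to see: Pattern 2 dominates in western Kenya, while Patterns 1 and 4 account for most Thai lines.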
Fig. 6 Proposed model for the evolution of the surf gene locus. Nucleotide (nt) positions are numbered after the 3D7 line sequence. CNV, copy number variation; iRBC, infected red blood cell; WR, Trp-rich domain