Chaturong Putaporntip, Napaporn Kuamsab, Rattanaporn Rojrung, Sunee Seethamchai, Somchai Jongwutiwes.
Abstract
The merozoite surface protein-1 (MSP1) is a prime candidate for an asexual blood-stage vaccine against malaria. However, polymorphism in this antigen could compromise vaccine efficacy. Although the extent of sequence variation in MSP1 has been analyzed in various Plasmodium species, little is known about the structural organization and diversity of this locus in Plasmodium malariae (PmMSP1). Herein, we show that PmMSP1 contains five conserved and four variable blocks, based on analysis of the complete coding sequences. Variable blocks were characterized by short insertion and deletion variants (block II), polymorphic nonrepeat sequences (block IV), a complex repeat structure with size variation (block VI) and degenerate octapeptide repeats (block VIII). As in other malarial MSP1s, evidence of intragenic recombination was found in PmMSP1. The rate of nonsynonymous nucleotide substitutions significantly exceeded that of synonymous nucleotide substitutions in block IV, suggesting positive selection in this region. Codon-based analysis of deviation from neutrality identified a codon under purifying selection located in close proximity to the region homologous to the 38 kDa/42 kDa cleavage site of P. falciparum MSP1. A number of predicted linear B-cell epitopes were identified across both conserved and variable blocks of the protein. However, polymorphism in repeat-containing blocks altered the predicted linear B-cell epitope scores across variants. Although a number of predicted HLA class II-binding peptides were identified in PmMSP1, none of the block IV variants appeared to be recognized by HLA class II alleles common in the Thai population, suggesting that diversity in this positively selected region could affect host immune recognition. The data on structural diversity in PmMSP1 could be useful for further studies such as vaccine development and strain characterization of this neglected malaria parasite.
Year: 2022 PMID: 36114242 PMCID: PMC9481586 DOI: 10.1038/s41598-022-19049-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Map of Thailand showing the provinces and period of sample collection. The distribution of PmMSP1 alleles is shown as correspondingly colored circles. The map is modified from GADM maps and data (https://gadm.org/index.html) under the GADM license version 6.0.
Figure 2. (A) Nucleotide diversity across the complete coding region of PmMSP1, computed with a window length of 100 nucleotides and a step size of 25 nucleotides. (B) Corresponding scheme of PmMSP1 showing the organization of the protein regions. The potential cleavage sites for the 42-kDa and 19-kDa fragments are shown as open and filled downward arrowheads, respectively.
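The sliding-window scan described for panel A can be approximated with a short script. The sketch below is illustrative only: it assumes a gap-aligned set of equal-length nucleotide strings, estimates π as the mean pairwise proportion of differing sites, and skips gapped positions pair by pair, which may differ from how the authors' software treats alignment gaps. The function names are hypothetical; only the window length (100) and step size (25) come from the caption.

```python
from itertools import combinations


def pairwise_difference(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped
    (an assumption of this sketch, not a documented choice of the study).
    """
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    return differences / compared if compared else 0.0


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise difference per site over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


def sliding_window_pi(seqs: list[str], window: int = 100, step: int = 25):
    """Return (window midpoint, pi) tuples along the alignment."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results
```

Plotting the second element of each tuple against the first gives a profile of the same kind as panel A; peaks would be expected over the variable blocks.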
Figure 3. Variation in block VI of PmMSP1 spanning amino acid residues 683 to 827 (positions after GenBank accession no. FJ824669). Repeats are shown in brackets, bold, italics or underlined residues. Numbers of repeat units are indicated by subscripts. Mutations in nonrepeat regions are highlighted. Representative isolates are shown in parentheses after the alleles. Alleles in bold are found in P. brasilianum. Allele A1.10 belongs to both P. malariae (KR072216) and P. brasilianum (JX045641). Asterisks denote alleles with two or more isolates: KR072218 and KR072215 in allele A1.11; PM17-PM19 in allele B1.1; PM3-PM7 in allele B1.2; KX672047 and KX672048 in allele B1.4; PM22-PM26 in allele B1.5; PM12 in allele B2.2; and PM29 and PM30 in allele B2.4.
Distribution of alleles in block II of PmMSP1 and PbrMSP1.
| Allele | Sequence | Total | P.m | P.br | Country* (n) |
|---|---|---|---|---|---|
| I | NKDGNT--TTN---ANN | 21 | 21 | – | Thailand (21) |
| II | NKDGNTSTTTN---ANN | 8 | 8 | – | Thailand (8) |
| III | NKDGNT--TTNANNANN | 1 | 1 | – | Thailand (1) |
| IV | NKDGNTS-TTNANNANN | 1 | 1 | – | Thailand (1) |
| V | NKDGNT--TTD---ANN | 9 | 7 | | Myanmar (3), Thailand (2), Cameroon (1), French Guiana (1) |
| VI | ---------TN---ANN | 7 | 3 | | Thailand (2), Saudi Arabia (1) |
| VII | ---------TNANN--N | 2 | – | | |
| VIII | ---------TDANN--N | 1 | – | | |
| IX | NKDGST--TTD---ANN | 1 | – | | |
P.m and P.br denote Plasmodium malariae and P. brasilianum, respectively. Thai isolates and GenBank accession numbers of isolates elsewhere are (I) PM2-PM8, PM10, PM14, PM16-PM19, PM22-PM26 and PM31-PM33; (II) PM9, PM11, PM12, PM21, PM28-PM30 and PM34; (III) PM1, (IV) PM35, (V) PM20, PM27, FJ824669, KX672046-KX672048, AF138879, AF138881 and AF138882; (VI) PM13, PM15, AF138878, FLQW01000468, KC906711, KC906714, KC906715, (VII) KC906713 and KC906716; (VIII) KC906712 and (IX) AF138880. *Bolds are countries for P. brasilianum.
Distribution of alleles in block IV of PmMSP1 and PbrMSP1.
| Allele | Sequence | Total | P.m | P.br | Country* (n) |
|---|---|---|---|---|---|
| I | YDNIATTNKELEAPSGSGSDDEDIKNCDKKQK | 14 | 14 | | Thailand (11), Brazil (3) |
| II | ......K......H.............A...E | 8 | 8 | | Thailand (8) |
| III | .N...DE..K.....E...........NE... | 6 | 6 | | Myanmar (3), Thailand (2), Cameroon (1) |
| IV | .............S.................. | 5 | 5 | | Thailand (5) |
| V | ....T...N..VTSNVPRP.....T....... | 5 | 5 | | Thailand (4), Saudi Arabia (1) |
| VI | ........N..K.L.E.RP.....T....... | 4 | 4 | | Brazil (4) |
| VII | ........N..K...EPRP.....T....... | 3 | 3 | | Brazil (3) |
| VIII | ........N..K...E.RP.....R....... | 1 | 1 | | Brazil (1) |
| IX | ......K....................A...E | 1 | 1 | | Thailand (1) |
| X | H.......N....S............Y..... | 1 | 1 | | Thailand (1) |
| XI | ........N..K...E.RP.....T....... | 1 | 1 | | Thailand (1) |
| XII | .N...DE..K.....E................ | 1 | 1 | | Thailand (1) |
| XIII | .N...DE..K...S.E...........GI... | 1 | 1 | | Thailand (1) |
| XIV | ........N..K.H.E.RT.....T....... | 2 | | | |
| XV | ........N..K.L.E.RP....KT....... | 1 | | | |
| XVI | ........N..K.H.E.RT.....T.Y..... | 1 | | | |
P.m and P.br denote Plasmodium malariae and P. brasilianum, respectively. Dots represent corresponding identical amino acids per allele I. Thai isolates and GenBank accession numbers of isolates elsewhere are (I) PM3-PM8, PM16-PM19, PM33, KR072273, KR072274 and KR072276; (II) PM13 and PM20-PM26; (III) PM1, PM27, KX672046–KX672048 and FJ824669; (IV) PM11, PM12 and PM28–PM30; (V) PM14, PM31, PM32, PM34 and FLQW01000468; (VI) KR072269, KR072272, KR072278 and KR072279; (VII) KR072271, KR072275 and KR072277; (VIII) KR072270; (IX) PM9; (X) PM10; (XI) PM15; (XII) PM2; (XIII) PM35; (XIV) KR072281 and KR072283; (XV) KR072284 and (XVI) KR072282. *Bolds are countries for P. brasilianum.
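Because the block IV alleles are written in dot notation relative to allele I, it can help to expand a row back into its full sequence or to list its substitutions. The following is a minimal sketch; the helper names are hypothetical, and the positions it reports are 1-based offsets within the aligned block, not residue numbers in the full-length protein.

```python
def expand_dots(reference: str, variant: str) -> str:
    """Replace each '.' in a variant row with the reference residue."""
    if len(reference) != len(variant):
        raise ValueError("reference and variant rows must have equal length")
    return "".join(r if v == "." else v for r, v in zip(reference, variant))


def substitutions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """List (position in block, reference residue, variant residue)."""
    if len(reference) != len(variant):
        raise ValueError("reference and variant rows must have equal length")
    return [
        (pos, r, v)
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if v != "." and v != r
    ]


# First 12 columns of allele I and allele III, copied from the table.
allele_i_prefix = "YDNIATTNKELE"
allele_iii_prefix = ".N...DE..K.."
print(expand_dots(allele_i_prefix, allele_iii_prefix))
print(substitutions(allele_i_prefix, allele_iii_prefix))
```

Applied to whole rows, the same two calls give the full sequence of any allele and the substitutions that distinguish it from allele I.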
Distribution of alleles in block VIII of PmMSP1 and PbrMSP1.
| Allele | Sequence | P.m (n) | P.br (n) | Country* |
|---|---|---|---|---|
| I | PQPQAALPAQPQAALPAQPQAALPAQPQAAVPAQSQATVPAQSQAAVPATTQSSSVSAPT | 15 | | Thailand (12), Myanmar (1), Brazil (2) |
| II | PQPQAALPAQPQAALPAQPQAAV----------------PAQSQAAVPATTQSSSVSAPT | 6 | | Thailand (6) |
| III | PQQQSSS-AQPQAALPAQPQAAV--------PAQSQATVPAQSQAAVPATTQSSSVSAPT | 5 | | Thailand (5) |
| IV | PQPQAALPAQPQAALPAQPQAALPAQPQAALPAQPQAAVPAQSQAAVPATT--------- | 4 | | Thailand (4) |
| V | PQQQSSS-AQPQAAVPAQSQATV----------------PAQSQAAVPATTQSSSVSAPT | 4 | | Thailand (3), Saudi Arabia (1) |
| VI | PQPQAALPAQPQAALPAQPQAAVPAQSQATV--------PAQSQAAVPATTQSSSVSAPT | 2 | | Thailand (2) |
| VII | PQPQAALPAQPQAAVPAQSQATV----------------PAQSQAAVPATTQSSSVSAPT | 1 | | Cameroon (1) |
| VIII | PQPQAALPAQPQAALPAQPQAALPAQPQAALPAQPQAAVPAQSQAAVPATTQSSSVSAPT | 1 | | Thailand (1) |
| IX | PQQQSSS-AQPQAALPAQPQAAVPAQSQATVPAQSQAAVPAQSQAAVPATTQSSSVSAPT | 1 | | Thailand (1) |
| X | PQPQAALPAQPQAAV--------------------------------PATTQSSSVSAPT | 1 | | Thailand (1) |
| XI | PQPQAALPAQPQAALPAQPQAALPAQPQAAVPAQSQATVPAQSQAAVPATTQSSSVSAPT | 1 | | Myanmar (1) |
| XII | PQPQAALPAQPQAALPAQPQAAVPAQSQAAL--------PAQSQAAVPATTQSSSVSAPT | 1 | | Brazil (1) |
| XIII | PQPQAALPAQPQAALPAQPQAALPAQPQAAVPAQSQAALPAQSQAAVPATTQSSSVSAPT | 8 | | Brazil (8) |
P.m and P.br denote Plasmodium malariae and P. brasilianum, respectively. Dash indicates a deletion. Thai isolates and GenBank accession numbers of isolates elsewhere are (I) PM1, PM2, PM10, PM14, PM20–27; KX672048, KR072259 and KR072258; (II) PM3–PM7; (III) PM11, PM12 and PM28–PM30; (IV) PM16–PM19; (V) PM31, PM34, PM35 and FLQW01000468, (VI) PM32 and PM33; (VII) FJ824669; (VIII) PM9; (IX) PM13; (X) PM15; (XI) KX672047; (XII) KR072262; (XIII) KR072254–KR072257, KR072260–KR072263, KR072265–KR072268 and KY189272. *Bolds are countries for P. brasilianum.
Haplotype and nucleotide diversity in the complete PmMSP1 sequences.
| Block | No. codons | M | S | H | h | π | dS | dN |
|---|---|---|---|---|---|---|---|---|
| I (conserved) | 56 | 2 | 2 | 3 | 0.160 ± 0.080 | 0.00098 ± 0.00071 | 0.01980 ± 0.02021 | 0.00501 ± 0.00503 |
| II (indels) | 5–15 | – | – | 6 | 0.157 ± 0.077 | – | 0.00000 ± 0.00000 | 0.03810 ± 0.03041 |
| III (conserved) | 140 | 2 | 2 | 3 | 0.398 ± 0.081 | 0.00103 ± 0.00090 | 0.00000 ± 0.00000 | 0.00392 ± 0.00271 |
| IV (variable) | 32 | 30 | 27 | 10 | 0.838 ± 0.036 | 0.07763 ± 0.01894 | 0.01882 ± 0.01386 | 0.12613 ± 0.02543** |
| V (conserved) | 440 | 10 | 10 | 9 | 0.740 ± 0.048 | 0.00144 ± 0.00061 | 0.00158 ± 0.00149 | 0.00397 ± 0.00130 |
| VI (repeats) | 97–229 | – | – | 20 | 0.863 ± 0.040 | 0.12162 ± 0.01382 | – | – |
| VII (conserved) | 156 | 10 | 10 | 4 | 0.340 ± 0.093 | 0.00232 ± 0.00088 | 0.01346 ± 0.00949 | 0.01282 ± 0.00454 |
| VIII (repeats) | 28–60 | – | – | 10 | 0.567 ± 0.071 | 0.02852 ± 0.00954 | – | – |
| IX (conserved) | 719 | 13 | 12 | 13 | 0.890 ± 0.027 | 0.00113 ± 0.00043 | 0.00115 ± 0.00110 | 0.00161 ± 0.00051 |
| All | 1696–1831 | 72 | 68 | 21 | 0.944 ± 0.021 | 0.01204 ± 0.00162 | 0.00208 ± 0.00077 | 0.00447 ± 0.00065* |
M, number of mutations; S, number of segregating sites; H, number of haplotypes; h, haplotype diversity; π, nucleotide diversity; dS, number of synonymous substitutions per synonymous site; dN, number of nonsynonymous substitutions per nonsynonymous site; S.D., standard deviation; S.E., standard error. The analysis includes 35 Thai isolates and the FJ824669 sequence. Z-tests of the hypothesis that mean dS equals mean dN: * p < 0.05; ** p < 0.0005.
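The Z-test in the footnote compares the synonymous rate (dS) with the nonsynonymous rate (dN). As a rough check on the tabulated values, the sketch below computes a Z statistic from the two means and their standard errors. It assumes the two errors are independent and uses a two-sided normal p-value; the published analysis may instead have estimated the variance of the difference directly (for example by bootstrap), so the numbers are approximate. The function name is hypothetical.

```python
import math


def z_test_dn_vs_ds(dn: float, se_dn: float, ds: float, se_ds: float):
    """Approximate Z-test of dN = dS from means and standard errors.

    Assumes independent errors, so Var(dN - dS) = SE_dN^2 + SE_dS^2.
    Returns (z, two-sided p-value under a standard normal).
    """
    se_diff = math.sqrt(se_dn ** 2 + se_ds ** 2)
    z = (dn - ds) / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Block IV row of the table: dS = 0.01882 ± 0.01386, dN = 0.12613 ± 0.02543
z_iv, p_iv = z_test_dn_vs_ds(0.12613, 0.02543, 0.01882, 0.01386)
print(f"block IV: z = {z_iv:.2f}, p = {p_iv:.5f}")

# Whole gene ("All" row): dS = 0.00208 ± 0.00077, dN = 0.00447 ± 0.00065
z_all, p_all = z_test_dn_vs_ds(0.00447, 0.00065, 0.00208, 0.00077)
print(f"all blocks: z = {z_all:.2f}, p = {p_all:.5f}")
```

Under these assumptions block IV gives z of about 3.7 with p of about 2 × 10^-4, and the whole gene gives p between 0.01 and 0.05, both consistent with the significance marks in the table.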
Intragenic recombination in PmMSP1 inferred from 35 Thai isolates.
| Event no. | Start position* | End position* | Start block | End block | RDP (p) | GENECONV (p) | Bootscan (p) | Maxchi (p) | Chimaera (p) | SiSscan (p) | 3Seq (p) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2187 | 2728 | VI | VII | 4.69 × 10^-12 | 1.07 × 10^-9 | 1.72 × 10^-10 | 7.26 × 10^-14 | 1.93 × 10^-14 | 1.79 × 10^-19 | 2.03 × 10^-21 |
| 2 | 2192 | 2841 | VI | VII | 3.02 × 10^-4 | 4.76 × 10^-3 | NS | 2.26 × 10^-5 | 1.66 × 10^-6 | 4.72 × 10^-11 | 5.47 × 10^-4 |
| 3 | 706 | 4684 | IV | IX | NS | 4.47 × 10^-5 | 9.81 × 10^-6 | 5.22 × 10^-5 | 2.39 × 10^-5 | 6.55 × 10^-5 | 1.90 × 10^-9 |
| 4 | 1556 | 4190 | V | IX | NS | 1.51 × 10^-3 | 1.15 × 10^-4 | 9.31 × 10^-5 | NS | 9.78 × 10^-5 | 1.19 × 10^-8 |
| 5 | 2192 | 2353 | VI | VI | 4.47 × 10^-4 | NS | NS | 3.65 × 10^-4 | 5.06 × 10^-5 | 8.60 × 10^-5 | 8.78 × 10^-8 |
| 6 | 1361 | 2979 | V | VIII | 1.87 × 10^-7 | 6.47 × 10^-6 | NS | NS | NS | NS | NS |
| 7 | 2146 | 2192 | VI | VI | NS | 0.02319 | NS | 6.99 × 10^-7 | 9.12 × 10^-3 | NS | NS |
| 8 | 2467 | 3057 | VI | VIII | 6.45 × 10^-5 | 2.38 × 10^-5 | 4.07 × 10^-6 | 5.40 × 10^-3 | 4.42 × 10^-3 | NS | 2.97 × 10^-7 |
| 9 | 2377 | 2462 | VI | VI | NS | 3.89 × 10^-5 | 7.90 × 10^-6 | NS | NS | NS | 9.50 × 10^-5 |
| 10 | 2303 | 2369 | VI | VI | NS | NS | NS | 9.83 × 10^-4 | 2.78 × 10^-3 | NS | 1.08 × 10^-5 |
| 11 | 644 | 1599 | IV | V | NS | NS | NS | 4.60 × 10^-3 | NS | NS | 1.44 × 10^-5 |
| 12 | 2482 | 2869 | VII | VII | NS | 6.08 × 10^-3 | 2.19 × 10^-3 | NS | NS | NS | 8.02 × 10^-5 |
| 13 | 797 | 1844 | V | V | NS | 1.98 × 10^-2 | 5.39 × 10^-3 | NS | NS | NS | 8.63 × 10^-4 |
| 14 | 2294 | 2466 | VI | VI | NS | NS | NS | NS | NS | 1.82 × 10^-13 | 1.59 × 10^-3 |
| 15 | 644 | 2147 | IV | VI | NS | NS | NS | 1.48 × 10^-2 | NS | NS | 2.18 × 10^-3 |
| 16 | 2197 | 2302 | VI | VI | NS | NS | NS | NS | NS | NS | 2.62 × 10^-3 |
| 17 | 2212 | 2328 | VI | VI | NS | NS | NS | NS | NS | NS | 5.35 × 10^-3 |
| 18 | 1847 | 2185 | V | VI | 8.05 × 10^-3 | NS | NS | NS | NS | NS | 1.35 × 10^-2 |
| 19 | 2357 | 2840 | VI | VII | NS | NS | NS | NS | NS | NS | 1.35 × 10^-2 |
| 20 | 2172 | 2213 | VI | VI | NS | NS | NS | 2.18 × 10^-3 | NS | NS | 2.63 × 10^-2 |
| 21 | 644 | 1599 | IV | V | NS | NS | NS | NS | NS | NS | 2.73 × 10^-2 |
*Nucleotide positions are numbered after the FJ824669 sequence. NS, not significant.
Figure 4. Maximum likelihood tree inferred from 21 distinct complete coding sequences of PmMSP1, with PocMSP1 and PowMSP1 as outgroup sequences. Numbers on the branches are the percentages of 1000 bootstrap samples supporting the branch; only values greater than 50 are shown. Types are based on the repeat classification in Fig. 3. The scale bar indicates nucleotide substitutions per site.
Figure 5. Predicted linear B-cell epitopes in PmMSP1 based on the BepiPred 2.0 method. (A) Epitope scores across the entire protein (GenBank accession no. FJ824669). Variable blocks are shown as broken boxes. (B–D) Epitope scores for representative alleles of blocks IV, VI and VIII, respectively. The cutoff value is indicated by a broken line.
Predicted HLA class-II binding peptides in block IV of PmMSP1.
| Amino acid residue† | Allele | Peptides and their variants | Prevalence (%, n = 35) | HLA§ | Allele Frequency* | IC50# | Peptide rank# |
|---|---|---|---|---|---|---|---|
| 207–221 | I | KKEYNNIADENKKLE | 11.43 | None | – | – | – |
| | II | KKEY | 45.71 | DQA1*04:01/DQB1*04:02 | 0.0021/0.0032 | 747 | 9.9 |
| | III | KKEY | 25.71 | None | – | – | – |
| | IV | KKEY | 11.43 | DRB1*13:02 | 0.0138 | 43.7 | 5.5 |
| | | | | DRB1*04:05 | 0.0489 | 235.4 | 10 |
| | V | KKE | 2.86 | DQA1*04:01/DQB1*04:02 | 0.0021/0.0032 | 478 | 6.4 |
| | | | | DQA1*03:01/DQB1*03:02 | 0.0457/0.0426 | 878 | 9.9 |
| | VI | KKEY | 2.86 | None | – | – | – |
| 211–225 | I | NNIADENKKLEAPSE | 8.57 | None | – | – | – |
| | II | | 31.43 | None | – | – | – |
| | III | | 22.86 | None | – | – | – |
| | IV | | 14.29 | None | – | – | – |
| | V | | 11.43 | DRB1*13:02 | 0.0138 | 55.9 | 6.6 |
| | VI | | 2.86 | None | – | – | – |
| | VII | | 2.86 | None | – | – | – |
| | VIII | | 2.86 | None | – | – | – |
| | IX | NNIADENKKLEA | 2.86 | None | – | – | – |
†Positions and amino acid substitutions are based on the FJ824669 sequence.
§Analysis based on the HLA alleles available in the IEDB analysis resource (accessed February 18, 2022).
*Allele frequency among Thai population[44].
#Based on NN-align and the IEDB recommended 2.22 method[43].