Adam Kim1, Jean Popovici2, Amélie Vantaux2, Reingsey Samreth2, Sophalai Bin2, Saorin Kim2, Camille Roesch2, Li Liang3, Huw Davies3, Philip Felgner3, Sócrates Herrera4, Myriam Arévalo-Herrera4,5, Didier Ménard6,7,8, David Serre9.
Abstract
Our understanding of the structure and regulation of Plasmodium vivax genes is limited by our inability to grow the parasites in long-term in vitro cultures. Most P. vivax studies must therefore rely on patient samples, which typically display a low proportion of parasites and asynchronous parasites. Here, we present stranded RNA-seq data generated directly from a small volume of blood from three Cambodian vivax malaria patients collected before treatment. Our analyses show surprising similarities of the parasite gene expression patterns across infections, despite extensive variations in parasite stage proportion. These similarities contrast with the unique gene expression patterns observed in sporozoites isolated from salivary glands of infected Colombian mosquitoes. Our analyses also indicate that more than 10% of P. vivax genes encode multiple, often undescribed, protein-coding sequences, potentially increasing the diversity of proteins synthesized by blood stage parasites. These data also greatly improve the annotations of P. vivax gene untranslated regions, providing an important resource for future studies of specific genes.
Year: 2017 PMID: 28798400 PMCID: PMC5552866 DOI: 10.1038/s41598-017-07275-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary statistics of the infections and genomic analyses for the three blood stage samples (V_DJK_8, V_DJK_10, V_DJK_16) and the sporozoite sample (SP_1).
| | Blood sample 1 | Blood sample 2 | Blood sample 3 | Salivary gland sample |
|---|---|---|---|---|
| | V_DJK_8 | V_DJK_10 | V_DJK_16 | SP_1 |
| Parasite density (parasites/µL) | 750 | 6,970 | 11,000 | — |
| Parasite-stages proportion (thick/thin blood films) | ||||
| Ring | 22% | 79% | 35% | — |
| Trophozoite | 45% | 14% | 65% | — |
| Schizont | 0% | 0% | 0% | — |
| Gametocyte | 33% | 7% | 0% | — |
| | | | | |
| #Read pairs generated | 65,219,377 | 55,945,558 | 69,641,135 | 437,202,828 |
| #Read pairs mapped to human (%) | 50,647,696 (77.66%) | 35,653,630 (63.73%) | 53,537,602 (76.88%) | — |
| #Read pairs mapped to human, duplicates removed (%) | 24,607,961 (48.59%) | 21,835,168 (61.24%) | 40,479,261 (74.61%) | — |
| Read pairs mapped to rRNAs (%) | 9,878 (0.04%) | 7,017 (0.03%) | 9,106 (0.02%) | — |
| Read pairs mapped to globin mRNAs (%) | 59,327 (0.24%) | 19,624 (0.09%) | 22,117 (0.05%) | — |
| Reads mapped to other annotated protein-coding genes (%) | 16,291,255 (66.20%) | 16,463,840 (75.40%) | 23,090,137 (57.04%) | — |
| #Read pairs mapped to P. vivax (%) | 10,436,776 (16.00%) | 16,988,674 (30.37%) | 11,208,385 (16.09%) | 17,833,896 (4.08%) |
| #Read pairs mapped to P. vivax, duplicates removed (%) | 3,778,226 (36.20%) | 8,828,288 (51.97%) | 7,249,998 (64.68%) | 1,594,798 (8.94%) |
| Read pairs mapped to rRNAs (%) | 1,880 (0.05%) | 2,717 (0.03%) | 2,167 (0.03%) | 230 (0.01%) |
| Reads mapped to annotated protein-coding genes (%) | 1,668,417 (44.16%) | 3,983,570 (45.12%) | 3,729,411 (51.44%) | 984,997 (61.8%) |
| | | | | |
| #Read pairs used for Trinity | 4,080,296 | 9,211,092 | 7,532,645 | 2,650,203 |
| #Transcripts assembled (% reads) | 15,746 (75.30%) | 21,477 (96.25%) | 20,631 (96.99%) | 7,359 (57.74%) |
| #Transcripts expressed >10X (% reads) | 4,298 (68.82%) | 9,516 (92.47%) | 8,654 (92.87%) | 6,221 (57.52%) |
| noncoding transcripts (% reads) | 1,471 (31.95%) | 2,990 (44.68%) | 2,708 (38.99%) | 4,146 (18.17%) |
| partial protein-coding transcripts (% reads) | 1,642 (11.88%) | 3,848 (16.28%) | 3,454 (17.36%) | 1,866 (25.35%) |
| complete protein-coding transcripts (% reads) | 1,185 (24.98%) | 2,678 (31.51%) | 2,492 (36.51%) | 209 (14.00%) |
| encoding unique AA sequences | 1,017 | 2,235 | 2,029 | 187 |
| assembled in combined Trinity | 893 (87.8%) | 1,875 (83.9%) | 1,697 (83.6%) | — |
| #Transcripts single position (% reads) | 15,421 | 20,781 | 20,049 | 7,311 |
| | | | | |
| #Read pairs used for Trinity | 20,824,238 | |||
| #Transcripts assembled (% reads) | 29,510 (95.28%) | |||
| #Transcripts expressed >10X (% reads) | 15,951 (93.68%) | |||
| noncoding transcripts (% reads) | 6,348 (41.02%) | |||
| partial protein-coding transcripts (% reads) | 5,762 (17.45%) | |||
| complete protein-coding transcripts (% reads) | 3,841 (35.21%) | |||
| encoding unique AA sequences | 3,044 | |||
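The percentages in this table appear to be nested: each mapped fraction is relative to the read pairs generated, while each duplicates-removed fraction is relative to the mapped count on the row above it. The short Python sketch below checks that reading against the V_DJK_8 counts copied from the table. The helper name `pct` is illustrative and does not come from the paper.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)

# V_DJK_8 counts copied from the table above.
generated = 65_219_377      # read pairs generated
human_mapped = 50_647_696   # read pairs mapped to human
vivax_mapped = 10_436_776   # read pairs mapped to P. vivax
vivax_dedup = 3_778_226     # P. vivax read pairs after duplicate removal

human_rate = pct(human_mapped, generated)   # fraction of all read pairs
vivax_rate = pct(vivax_mapped, generated)   # fraction of all read pairs
vivax_kept = pct(vivax_dedup, vivax_mapped) # fraction of mapped pairs retained
```

Under this reading, the three computed values match the 77.66%, 16.00%, and 36.20% reported for V_DJK_8.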
List of the 25 most expressed genes in each sample (ranked by their relative coverage in read counts per bp).
| V_DJK_8_0 | | | V_DJK_10_0 | | | V_DJK_16_0 | | | Sp_1 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gene ID | Name | Cov. | Gene ID | Name | Cov. | Gene ID | Name | Cov. | Gene ID | Name | Cov. |
| PVX_092995 | tryptophan-rich antigen (Pv-fam-a) | 34.39 | PVX_117322 | glyceraldehyde-3-phosphate dehydrogenase putative | 63.06 | PVX_117322 | glyceraldehyde-3-phosphate dehydrogenase putative | 55.51 | PVX_001715 | early transcribed membrane protein (ETRAMP) | 17.86 |
| PVX_003565 | early transcribed membrane protein (ETRAMP) | 31.09 | PVX_003565 | early transcribed membrane protein (ETRAMP) | 45.54 | PVX_003565 | early transcribed membrane protein (ETRAMP) | 42.13 | PVX_123510 | cell traversal protein for ookinetes and sporozoites | 7.83 |
| PVX_117322 | glyceraldehyde-3-phosphate dehydrogenase putative | 21.06 | PVX_000010 | Plasmodium exported protein unknown function | 42.45 | PVX_000010 | Plasmodium exported protein unknown function | 40.06 | PVX_089425 | heat shock 70 kDa protein putative | 3.53 |
| PVX_000010 | Plasmodium exported protein unknown function | 20.98 | PVX_114830 | elongation factor 1-alpha putative | 30.20 | PVX_090930 | histone H4 putative | 31.70 | PVX_088870 | early transcribed membrane protein (ETRAMP) | 2.48 |
| PVX_097583 | skeleton-binding protein 1 putative | 19.89 | PVX_114832 | elongation factor 1-alpha putative | 28.31 | PVX_114015 | histone H2A putative | 30.97 | PVX_122910 | hypothetical protein conserved | 2.29 |
| PVX_096020 | Plasmodium exported protein unknown function | 19.34 | PVX_090930 | histone H4 putative | 27.93 | PVX_083045 | phosphoethanolamine N-methyltransferase | 25.61 | PVX_091975 | hypothetical protein conserved | 2.09 |
| PVX_093680 | Phist protein (Pf-fam-b) | 19.02 | PVX_095015 | enolase putative | 27.18 | PVX_095015 | enolase putative | 24.54 | PVX_099035 | inhibitor of cysteine proteases putative | 1.68 |
| PVX_112670 | unspecified product | 16.62 | PVX_114015 | histone H2A putative | 26.75 | PVX_114830 | elongation factor 1-alpha putative | 24.47 | PVX_119355 | circumsporozoite (CS) protein | 1.65 |
| PVX_090930 | histone H4 putative | 16.20 | PVX_093680 | Phist protein (Pf-fam-b) | 22.81 | PVX_090935 | histone 2B | 24.37 | PVX_096315 | hypothetical protein conserved | 1.45 |
| PVX_114830 | elongation factor 1-alpha putative | 14.67 | PVX_119470 | 40S ribosomal protein S23 putative | 21.53 | PVX_097583 | skeleton-binding protein 1 putative | 23.49 | PVX_080040 | hypothetical protein conserved | 1.44 |
| PVX_113235 | Pv-fam-d protein | 13.87 | PVX_090935 | histone 2B | 20.88 | PVX_114832 | elongation factor 1-alpha putative | 23.24 | PVX_093500 | gamete release protein putative | 1.39 |
| PVX_114832 | elongation factor 1-alpha putative | 13.54 | PVX_097583 | skeleton-binding protein 1 putative | 20.66 | PVX_093680 | Phist protein (Pf-fam-b) | 20.55 | PVX_118040 | gamete egress and sporozoite traversal protein putative | 1.28 |
| PVX_101520 | Pv-fam-d protein | 13.42 | PVX_092820 | 60S ribosomal protein L41 putative | 19.58 | PVX_113235 | Pv-fam-d protein | 19.28 | PVX_087935 | DNA-directed RNA polymerase II 8.2 kDa polypeptide putative | 1.25 |
| PVX_112665 | unspecified product | 13.10 | PVX_113235 | Pv-fam-d protein | 18.88 | PVX_119470 | 40S ribosomal protein S23 putative | 19.12 | PVX_117605 | thioredoxin 1 putative | 1.24 |
| PVX_114015 | histone H2A putative | 12.08 | PVX_123060 | DNA/RNA-binding protein Alba 1 putative | 18.74 | PVX_092820 | 60S ribosomal protein L41 putative | 17.83 | PVX_100695 | CHCH domain containing protein | 1.22 |
| PVX_090935 | histone 2B | 11.65 | PVX_087860 | 60S ribosomal protein L37 putative | 17.76 | PVX_003955 | 60S ribosomal protein L37a putative | 17.55 | PVX_099860 | hypothetical protein | 1.18 |
| PVX_101595 | Plasmodium exported protein unknown function | 11.61 | PVX_080245 | 40S ribosomal protein S9 putative | 17.36 | PVX_087860 | 60S ribosomal protein L37 putative | 17.51 | PVX_082735 | thrombospondin-related anonymous protein | 1.18 |
| PVX_119470 | 40S ribosomal protein S23 putative | 10.64 | PVX_003955 | 60S ribosomal protein L37a putative | 17.11 | PVX_087825 | 40S ribosomal protein S29 putative | 17.30 | PVX_122540 | hypothetical protein conserved | 1.14 |
| PVX_096035 | hypothetical protein | 9.78 | PVX_113725a | 60S ribosomal protein L39 putative | 16.75 | PVX_092995 | tryptophan-rich antigen (Pv-fam-a) | 17.04 | PVX_000810 | perforin-like protein 1 | 1.10 |
| PVX_123060 | DNA/RNA-binding protein Alba 1 putative | 9.32 | PVX_089425 | heat shock 70 kDa protein putative | 16.54 | PVX_113665 | histone H3 putative | 16.87 | PVX_122458 | conserved Plasmodium protein unknown function | 1.09 |
| PVX_092820 | 60S ribosomal protein L41 putative | 9.27 | PVX_119587 | 60S acidic ribosomal protein P2 putative | 16.48 | PVX_101520 | Pv-fam-d protein | 16.77 | PVX_117755 | nifU protein putative | 1.09 |
| PVX_115470 | Pv-fam-d protein | 8.17 | PVX_087825 | 40S ribosomal protein S29 putative | 16.43 | PVX_113725a | 60S ribosomal protein L39 putative | 16.10 | PVX_001015 | 6-cysteine protein putative | 1.06 |
| PVX_113725a | 60S ribosomal protein L39 putative | 8.14 | PVX_099535 | phosphoglycerate kinase putative | 16.17 | PVX_080245 | 40S ribosomal protein S9 putative | 15.94 | PVX_098795 | hypothetical protein | 1.05 |
| PVX_087825 | 40S ribosomal protein S29 putative | 8.13 | PVX_116630 | lactate dehydrogenase | 15.55 | PVX_119587 | 60S acidic ribosomal protein P2 putative | 15.01 | PVX_118255 | fructose 1,6-bisphosphate aldolase putative | 1.03 |
| PVX_087860 | 60S ribosomal protein L37 putative | 8.08 | PVX_091640 | phosphoglycerate mutase putative | 15.33 | PVX_112670 | unspecified product | 14.78 | PVX_002900 | secreted protein with altered thrombospondin repeat domain putative | 0.99 |
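The ranking criterion stated in the table caption, read counts per bp, can be sketched as the number of reads mapped to a gene divided by the gene length, followed by a descending sort. The snippet below is a minimal illustration under that assumption and may omit normalization steps used in the study. The gene IDs, counts, and lengths are made-up placeholders rather than values from the paper, and `top_genes` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

def top_genes(read_counts: Dict[str, int],
              gene_lengths_bp: Dict[str, int],
              n: int = 25) -> List[Tuple[str, float]]:
    """Rank genes by reads per bp (count / length), highest first."""
    coverage = {
        gene: read_counts[gene] / gene_lengths_bp[gene]
        for gene in read_counts
        if gene_lengths_bp.get(gene, 0) > 0  # skip genes without a usable length
    }
    return sorted(coverage.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Placeholder data (not from the paper).
counts = {"gene_A": 3000, "gene_B": 500, "gene_C": 1200}
lengths = {"gene_A": 1500, "gene_B": 100, "gene_C": 2400}
ranked = top_genes(counts, lengths, n=2)
```

Dividing by length keeps long genes from dominating the ranking simply because they offer more positions for reads to map to.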
Figure 1. Correlation between the parasite gene expression patterns in two infections. The figure shows the number of reads mapped to each annotated gene (black dots) in the RNA-seq data generated from (A) the infection of patient V_DJK_8 (x-axis) and patient V_DJK_16 (y-axis) and (B) the infection of patient V_DJK_10 (x-axis) and patient V_DJK_16 (y-axis).
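One way to quantify the kind of per-gene agreement plotted in this figure is a Pearson correlation on log-transformed read counts. The caption does not state which statistic was used, so the log10(count + 1) transform and the function `log_pearson` below are assumptions for illustration, and the counts are placeholders rather than data from the study.

```python
import math
from typing import Sequence

def log_pearson(x: Sequence[int], y: Sequence[int]) -> float:
    """Pearson correlation of log10(count + 1) values for paired gene counts."""
    lx = [math.log10(v + 1) for v in x]
    ly = [math.log10(v + 1) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var_x = sum((a - mx) ** 2 for a in lx)
    var_y = sum((b - my) ** 2 for b in ly)
    return cov / math.sqrt(var_x * var_y)

# Placeholder per-gene read counts for two samples (same gene order in both).
sample_a = [9, 99, 999, 9999]
sample_b = [99, 999, 9999, 99999]
r = log_pearson(sample_a, sample_b)
```

The log transform keeps a handful of very highly expressed genes from dominating the coefficient, and adding 1 keeps genes with zero reads defined.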
Figure 2. Examples of transcripts differing from the current P. vivax gene annotations. Data from each patient are displayed in successive rows (labeled 1–3). For each data set, the grey track shows the read coverage, the bridges display evidence of intron splicing, and the blue and red bars show the actual reads generated. (A) Read coverage across PvCRT. Note that intron 9 is retained in some of the transcripts from infections 2 and 3 and in all transcripts from infection 1 (red box). (B) Read coverage across PvMDR1. Note that in two infections, some PvMDR1 transcripts contain an unannotated intron in the 3′-UTR, resulting in a longer transcript (red box).
Figure 3. Distribution of the length of untranslated regions for full-length protein-coding transcripts. (A) The histogram shows the number of protein-coding transcripts (y-axis) with a given 5′- and 3′-UTR length (x-axis, in blue and green respectively). The dashed line represents the currently annotated UTR length for all P. vivax protein-coding genes. (B) Pair-wise comparison of 5′-UTR lengths between samples V_DJK_10 and V_DJK_8. (C) Pair-wise comparison of 3′-UTR lengths between samples V_DJK_10 and V_DJK_8. Additional comparisons are shown in Supplemental Figure 4.