Lei Zhu1, Sachel Mok1, Mallika Imwong2,3, Anchalee Jaidee4, Bruce Russell5, Francois Nosten3,4, Nicholas P Day2,3, Nicholas J White2,3, Peter R Preiser1, Zbynek Bozdech1.
Abstract
Historically seen as the cause of a benign disease, Plasmodium vivax is now increasingly recognized as a source of significant morbidity. Effective control strategies targeting P. vivax malaria are hindered by our limited understanding of vivax biology. Here we established the P. vivax transcriptome of the Intraerythrocytic Developmental Cycle (IDC) of two clinical isolates at high resolution using the Illumina HiSeq platform. The detailed map of the transcriptome generates new insights into regulatory mechanisms of individual genes and reveals their intimate relationship with specific biological functions. A transcriptional hotspot of vir genes observed on chromosome 2 suggests a potential active site modulating immune evasion of the Plasmodium parasite across patients. Compared to other eukaryotes, P. vivax genes tend to have unusually long 5' untranslated regions and also present multiple transcription start sites. In contrast, alternative splicing is rare in P. vivax, but its association with the late schizont stage suggests that it has some significance for gene function. The newly identified transcripts, including up to 179 vir-like genes and 3018 noncoding RNAs, suggest an important role of these gene/transcript classes in strain-specific transcriptional regulation.
Year: 2016 PMID: 26858037 PMCID: PMC4746618 DOI: 10.1038/srep20498
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The P. vivax IDC transcriptome and its functional aspects.
(a) The heat map shows an overview of the IDC transcriptome of the SMRU2 P. vivax isolate (see Supplementary Fig. S3 online for the SMRU1 transcriptome). Briefly, mRNA abundance (log2FPKM) of 5226 annotated protein-coding genes is depicted across the IDC in five groups stratified (20th percentile) based on their maximum level of transcription. Genes were sorted by their expression timing within each group. The estimation of timing and of parasite age in hours post invasion (HPI) is described in Methods. The bar plots on the left represent the fold enrichment of time-point-specific genes (genes whose transcription peaks at the same time point during the IDC) for each group. The expected frequency used here is the proportion of genes in the whole genome that are maximally expressed at that time point; * indicates over-representation by binomial test at P < 0.01. (b) The “wizard’s hat”-like distribution was generated by plotting the transcriptional dynamics (max-to-min rank change of mRNA abundance through the IDC) against the maximum mRNA abundance for 5226 genes (grey dots). Blue dots represent medoids of functional groups (see Supplementary Data S2 online) whose member genes are significantly (P < 0.01) clustered by location on the “wizard’s hat” compared to a random dataset. (c) Scatter plots for selected pathways/functional groups that are significantly clustered at the top (merozoite proteins), bottom left (vir family), bottom right (ribosome proteins) and middle (proteolysis and translation) of the distribution depicted in panel (b). In addition, the scatter plot of the Ap2 family members is shown.
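The fold enrichment and binomial test described for panel (a) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy counts, and the choice of a one-sided (upper-tail) test are assumptions made for illustration. The expected frequency is taken, as in the caption, to be the genome-wide proportion of genes peaking at each time point.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if p <= 0.0:
        return 1.0 if k <= 0 else 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(max(k, 0), n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def timepoint_enrichment(group_peaks, genome_peaks, alpha=0.01):
    """Fold enrichment and one-sided binomial P value per time point.

    group_peaks / genome_peaks map a time point to the number of genes
    whose transcription peaks there (hypothetical input format).
    """
    n_group = sum(group_peaks.values())
    n_genome = sum(genome_peaks.values())
    results = {}
    for tp, genome_count in genome_peaks.items():
        if genome_count == 0:
            continue  # expected frequency is zero, fold is undefined
        expected = genome_count / n_genome
        observed = group_peaks.get(tp, 0)
        fold = (observed / n_group) / expected
        p_value = binomial_upper_tail(observed, n_group, expected)
        results[tp] = (fold, p_value, p_value < alpha)
    return results


# Toy counts (invented): 20 genes in one expression group, 100 in the genome.
toy = timepoint_enrichment({"tp1": 18, "tp2": 2}, {"tp1": 50, "tp2": 50})
```

Working in log space keeps the binomial coefficients from overflowing when a group holds on the order of a thousand genes, as the five groups of 5226 genes would.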
Figure 2. Untranslated regions (UTRs) of the P. vivax genes.
(a) An example of detecting UTR boundaries. UTR boundaries are detected within the region covered by the de novo transcript. Red arrows mark the positions after/before which the mapped reads rise significantly (P < 0.05). The testing window size is 20 bp. For details see Methods. The potential transcription start site (TSS) or transcription termination site (TTS) is marked at the position with the highest frequency of starting/ending tags. The position along the chromosome, L:R (left:right starting/ending tag) and P value at each boundary are listed in grey boxes. (b) Scatter plot and histogram distributions of 5′ and 3′UTR lengths, with the median and mean shown inside the histograms. The density plot represents the size difference of the paired 5′ and 3′UTR of each gene (top corner). The black dashed line represents the boundary of equal 5′UTR and 3′UTR length. The 5′UTR in this diagram represents the most frequent isoform delineated by the main TSS (see Results). (c) Histogram of the overall distribution of the Pearson correlation coefficient (PCC) for transcriptional profiles of the 5′UTR (red) and ORF (blue) for 3609 genes. (d) The distribution of the maximum change of 5′UTR size in categories of putative TSS number per gene.
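The PCC in panel (c) can be illustrated by computing a Pearson correlation between two time-course profiles. This is a minimal sketch, not the authors' pipeline. The function name and the toy values are hypothetical, and the pairing of a gene's 5′UTR profile with its ORF profile is an assumption made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a flat profile has no defined correlation
    return cov / sqrt(var_x * var_y)


# Toy profiles across IDC time points (invented values, log2 scale).
utr_profile = [1.0, 2.0, 3.0, 2.5]
orf_profile = [2.0, 4.0, 6.0, 5.0]
pcc = pearson(utr_profile, orf_profile)
```

Repeating the calculation per gene and plotting the resulting values as a histogram would give a distribution of the kind the caption describes.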
Figure 3. Examples of alternative splicing events in P. vivax.
(a) Exon skipping. (b) Alternative 5′ splice site. (c) Alternative 3′ splice site. (d) Exon skipping mixed with 5′(3′) AS. (e) Alternative transcription start site with both ends of the first exon altered. (f) Intron retention. Blue bars represent exons, the black line represents introns, and the arrow depicts the direction of translation. The histograms above each gene representation depict read coverage (grey bars) as mapped to the chromosomal location in the reference genome of the P. vivax SalI strain (above).
Figure 4. Novel transcripts in the Southeast Asian P. vivax strain.
(a) Length distribution of 3049 type-I novel transcripts that map to the genome but fall outside current gene models. The grey line marks the median size (1647 bp) of currently annotated protein-coding genes. (b) Length distribution of predicted ORFs for 3018 type-I transcripts without protein homologies. (c) Transcriptional profiles in log2 ratios for 2794 type-I transcripts (left), paired with the transcriptional profiles of their nearest downstream gene (right). The histogram at the bottom represents the PCC distribution for the paired transcriptional profiles for isolate SMRU2 (see Supplementary Fig. S10 online for SMRU1 data). The grey line represents the PCC distribution of random data. (d) Length distribution of 2178 type-II novel transcripts that do not map to the current P. vivax genome. (e) Pie chart of homologous sequences in categories of gene product descriptions for 2178 type-II transcripts. (f) Transcriptional profiles in log2FPKM for 179 vir-like type-II transcripts, with SMRU1 data (left) shown in the same order as SMRU2 data (right).