Xuewu Liu, Yuanyuan Wang, Jiao Liang, Luojun Wang, Na Qin, Ya Zhao, Gang Zhao.
Abstract
BACKGROUND: Plasmodium falciparum is the most virulent malaria parasite capable of parasitizing human erythrocytes. The identification of genes related to this capability can enhance our understanding of the molecular mechanisms underlying human malaria and lead to the development of new therapeutic strategies for malaria control. With the availability of several malaria parasite genome sequences, performing computational analysis is now a practical strategy to identify genes contributing to this disease.
Keywords: Cerebral malaria; Parasite-infected erythrocyte surface protein 2 (PIESP2); Plasmodium falciparum; Virtual genome
Year: 2018 PMID: 29716542 PMCID: PMC5930813 DOI: 10.1186/s12864-018-4654-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Identification of group-enriched genes by the virtual genome method. (a) Workflow of our comparative analysis. Protein sequence alignment was performed using phmmer to construct a protein similarity network in which each edge represents a significant hit between query and target. Then, a modified BGLL algorithm was applied to find clusters within this network. Each cluster was considered a virtual gene. Genes within these clusters were allocated to the species from which they originated, subsequently generating enriched values of all clusters in six species. Group-enriched genes can be identified by comparing cluster values in ingroup species with those in outgroup species. (b) The number of edges and the number of components in the protein similarity networks obtained under different thresholds. (c) The number of clusters identified by the modified BGLL algorithm using different cut-off values of modularity. The arrow indicates the cut-off value used in this study. (d) Principal component analysis (PCA) of the enriched values of all clusters in six Plasmodium species. Components 1 (PC1) and 2 (PC2) represent 79% and 9% of total variance, respectively.
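The last two steps of the workflow in panel a (allocating cluster members to their species of origin, then comparing ingroup with outgroup values) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gene identifiers, species labels, and toy clusters are hypothetical, and the rule "present in every ingroup species, absent from every outgroup species" is an assumption, since the caption does not state the exact comparison criterion.

```python
from collections import Counter

def cluster_values(clusters, gene_species, species):
    """Count, for each cluster (virtual gene), how many member genes come from each species."""
    table = {}
    for cluster_id, genes in clusters.items():
        counts = Counter(gene_species[g] for g in genes)
        table[cluster_id] = {s: counts.get(s, 0) for s in species}
    return table

def group_enriched(table, ingroup, outgroup):
    """Assumed rule: keep clusters found in all ingroup species and in no outgroup species."""
    return sorted(
        cluster_id
        for cluster_id, row in table.items()
        if all(row[s] > 0 for s in ingroup) and all(row[s] == 0 for s in outgroup)
    )

# Hypothetical toy input: gene -> species, and cluster -> member genes.
gene_species = {"g1": "Pf", "g2": "Pf", "g3": "Pv", "g4": "Pb", "g5": "Pf", "g6": "Pb"}
clusters = {"c1": ["g1", "g2", "g3"], "c2": ["g4"], "c3": ["g5", "g6"]}
species = ["Pf", "Pv", "Pb"]

table = cluster_values(clusters, gene_species, species)
enriched = group_enriched(table, ingroup=["Pf", "Pv"], outgroup=["Pb"])
print(enriched)
```

In this toy input only cluster `c1` has members in both ingroup species and none in the outgroup species, so it is the only cluster reported.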
Fig. 2 Identification of P. falciparum genes responsible for parasitizing human erythrocytes. (a) Heat map showing the clusters enriched in human and malaria parasites. Green, black, and red indicate cluster values equal to zero, one, and higher than one, respectively. (b) Bar plot displaying the genomic location of 115 P. falciparum genes and 267 P. berghei genes. Proximity to telomeres and proximity to centromeres refer to the genome regions within 40 kb of telomeres and within 10 kb of centromeres, respectively. The rest of the genome is referred to as the chromosome internal region. The numbers in parentheses represent the number of genes and the percentage of human or malaria enriched genes. (c) Venn diagram showing the number of P. falciparum genes (upper panel) or P. berghei genes (lower panel) whose proteins contain a signal peptide, a transmembrane domain, or a PEXEL motif. (d) Domain models of SURFIN family members. Domains were identified through CD-search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) with a cut-off value of 0.01. TM domain stands for transmembrane domain.
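The location classes used in panel b can be illustrated with a short sketch. The 40 kb and 10 kb windows come from the caption; everything else is an assumption made for illustration: coordinates are 1-based base pairs, telomeres are taken to be the chromosome ends, the centromere is given as an interval, and the telomere test is applied before the centromere test.

```python
TELOMERE_WINDOW = 40_000    # 40 kb, from the caption
CENTROMERE_WINDOW = 10_000  # 10 kb, from the caption

def classify_location(gene_start, gene_end, chrom_length, cen_start, cen_end):
    """Assign a gene to one of the three genomic location classes (hypothetical helper)."""
    # Distance from the gene to the nearest chromosome end.
    dist_telomere = min(gene_start - 1, chrom_length - gene_end)
    if dist_telomere <= TELOMERE_WINDOW:
        return "proximity to telomeres"
    # Gap between the gene and the centromere interval (0 if they overlap).
    dist_centromere = max(0, cen_start - gene_end, gene_start - cen_end)
    if dist_centromere <= CENTROMERE_WINDOW:
        return "proximity to centromeres"
    return "chromosome internal region"

# Hypothetical 1 Mb chromosome with a centromere at 450,000-452,000.
print(classify_location(25_000, 27_000, 1_000_000, 450_000, 452_000))
print(classify_location(455_000, 458_000, 1_000_000, 450_000, 452_000))
print(classify_location(200_000, 203_000, 1_000_000, 450_000, 452_000))
```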
Fig. 3 Candidate genes related to virulence of the P. falciparum parasite. (a) Heat map showing the clusters enriched in P. falciparum. Green, black, and red indicate cluster values equal to zero, one, and higher than one, respectively. (b) Pie chart displaying the enrichment of each cluster in candidate genes. The numbers in each box represent the cluster size and the percentage of the total number of P. falciparum enriched genes. (c) Heat map showing the number of members detected in Plasmodium species or other species using hidden Markov models of seven families. Deep purple indicates that no member was found, black indicates that one member was detected, and gold indicates that more than one member was discovered.
Cellular component analysis of proteins produced by virulence-related candidate genes. Enriched terms were ranked according to their percentage of background. The top 20 terms are listed
| ID | Name | Background count | Result count | Percent of background (%) | Bonferroni-adjusted P-value |
|---|---|---|---|---|---|
| GO:0020030 | infected host cell surface knob | 54 | 51 | 94.4 | 1.71E-26 |
| GO:0020002 | host cell plasma membrane | 217 | 201 | 92.6 | 5.58E-118 |
| GO:0033644 | host cell membrane | 230 | 205 | 89.1 | 1.39E-118 |
| GO:0044218 | other organism cell membrane | 230 | 205 | 89.1 | 1.39E-118 |
| GO:0044279 | other organism membrane | 230 | 205 | 89.1 | 1.39E-118 |
| GO:0020036 | Maurer’s cleft | 229 | 182 | 79.5 | 8.18E-97 |
| GO:0033655 | host cell cytoplasm part | 367 | 243 | 66.2 | 4.01E-125 |
| GO:0020003 | symbiont-containing vacuole | 184 | 117 | 63.6 | 6.25E-51 |
| GO:0065010 | extracellular membrane-bounded organelle | 184 | 117 | 63.6 | 6.25E-51 |
| GO:0033643 | host cell part | 472 | 298 | 63.1 | 2.25E-161 |
| GO:0043230 | extracellular organelle | 186 | 117 | 62.9 | 1.45E-50 |
| GO:0030430 | host cell cytoplasm | 398 | 250 | 62.8 | 7.23E-126 |
| GO:0033646 | host intracellular part | 398 | 250 | 62.8 | 7.23E-126 |
| GO:0043656 | intracellular region of host | 399 | 250 | 62.7 | 1.13E-125 |
| GO:0043245 | extraorganismal space | 479 | 298 | 62.2 | 5.47E-160 |
| GO:0018995 | host | 479 | 298 | 62.2 | 5.47E-160 |
| GO:0043657 | host cell | 479 | 298 | 62.2 | 5.47E-160 |
| GO:0044217 | other organism part | 479 | 298 | 62.2 | 5.47E-160 |
| GO:0044216 | other organism cell | 479 | 298 | 62.2 | 5.47E-160 |
| GO:0044215 | other organism | 479 | 298 | 62.2 | 5.47E-160 |
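The "Percent of background" column appears to be the result count divided by the background count, expressed as a percentage, which is also the ranking key named in the table caption. The sketch below recomputes it for three rows copied from the table; the function name and the rounding to one decimal place are assumptions made to match the printed values.

```python
# (GO ID, name, background count, result count), copied from the table above.
rows = [
    ("GO:0020036", "Maurer's cleft", 229, 182),
    ("GO:0020030", "infected host cell surface knob", 54, 51),
    ("GO:0020002", "host cell plasma membrane", 217, 201),
]

def percent_of_background(background, result):
    """Share of the background annotations recovered in the result set, in percent."""
    return round(100.0 * result / background, 1)

# Rank terms from highest to lowest percentage, as in the table.
ranked = sorted(rows, key=lambda r: percent_of_background(r[2], r[3]), reverse=True)
for go_id, name, background, result in ranked:
    print(go_id, name, percent_of_background(background, result))
```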
Biological process analysis of proteins produced by virulence-related candidate genes. Enriched terms were ranked according to their percentage of background. The top 20 terms are listed
| ID | Name | Background count | Result count | Percent of background (%) | Bonferroni-adjusted P-value |
|---|---|---|---|---|---|
| GO:0034110 | regulation of homotypic cell-cell adhesion | 190 | 190 | 100 | 3.94E-114 |
| GO:0022407 | regulation of cell-cell adhesion | 190 | 190 | 100 | 3.94E-114 |
| GO:0034118 | regulation of erythrocyte aggregation | 190 | 190 | 100 | 3.94E-114 |
| GO:0020013 | modulation by symbiont of host erythrocyte aggregation | 190 | 190 | 100 | 3.94E-114 |
| GO:0030155 | regulation of cell adhesion | 190 | 190 | 100 | 3.94E-114 |
| GO:0044068 | modulation by symbiont of host cellular process | 192 | 191 | 99.5 | 1.44E-114 |
| GO:0044003 | modification by symbiont of host morphology or physiology | 194 | 191 | 98.5 | 5.12E-114 |
| GO:0051817 | modification of morphology or physiology of other organism involved in symbiotic interaction | 194 | 191 | 98.5 | 5.12E-114 |
| GO:0051809 | passive evasion of immune response of other organism involved in symbiotic interaction | 204 | 200 | 98 | 2.96E-120 |
| GO:0020033 | antigenic variation | 204 | 200 | 98 | 2.96E-120 |
| GO:0035821 | modification of morphology or physiology of other organism | 195 | 191 | 97.9 | 9.62E-114 |
| GO:0020035 | cytoadherence to microvasculature, mediated by symbiont protein | 164 | 160 | 97.6 | 2.98E-92 |
| GO:0044406 | adhesion of symbiont to host | 164 | 160 | 97.6 | 2.98E-92 |
| GO:0051834 | evasion or tolerance of defenses of other organism involved in symbiotic interaction | 206 | 200 | 97.1 | 1.04E-119 |
| GO:0052173 | response to defenses of other organism involved in symbiotic interaction | 206 | 200 | 97.1 | 1.04E-119 |
| GO:0051832 | avoidance of defenses of other organism involved in symbiotic interaction | 206 | 200 | 97.1 | 1.04E-119 |
| GO:0051707 | response to other organism | 206 | 200 | 97.1 | 1.04E-119 |
| GO:0043207 | response to external biotic stimulus | 206 | 200 | 97.1 | 1.04E-119 |
| GO:0051805 | evasion or tolerance of immune response of other organism involved in symbiotic interaction | 206 | 200 | 97.1 | 1.04E-119 |
| GO:0051807 | evasion or tolerance of defense response of other organism involved in symbiotic interaction | 206 | 200 | 97.1 | 1.04E-119 |
Fig. 4 Identification of P. falciparum genes contributing to cerebral malaria. (a) Principal component analysis performed on eight RNA-seq datasets. Components 1 (PC1) and 2 (PC2) represent 71% and 21% of total variance, respectively. Datasets of two adjacent time points tend to be located close together within the plot. (b) The periodic genes identified by FFT, ordered by the time points of their peak expression. Expression values of each transcript were log2-scaled and centered by subtracting their mean value. (c) Venn diagram of the number of genes transcribed at the trophozoite stage and that of candidate genes whose proteins contain transmembrane domains. (d) Domain model of PIESP2 protein (upper panel) and expression signals of PIESP2 in the intraerythrocytic cycle (lower panel). TM represents transmembrane domain. The blue line represents the observed expression level of PIESP2 and the red line is the curve fitted using FFT.
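The preprocessing and curve fitting described in panels b and d can be sketched in a few lines. This is a hedged illustration only: the caption does not say how periodicity was scored or how the fitted curve was constructed, so the sketch assumes a fit built from the single strongest non-zero frequency. It uses a plain discrete Fourier transform (an FFT yields the same coefficients), a synthetic eight-point expression profile, and hypothetical function names.

```python
import cmath
import math

def log2_center(values):
    """Log2-scale the expression values and subtract their mean, as the caption describes."""
    logged = [math.log2(v) for v in values]
    mean = sum(logged) / len(logged)
    return [x - mean for x in logged]

def dft(signal):
    """Plain discrete Fourier transform of a real-valued signal."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def dominant_fit(signal):
    """Fit the signal with its strongest non-zero frequency (assumed fitting rule)."""
    n = len(signal)
    spectrum = dft(signal)
    k = max(range(1, n // 2 + 1), key=lambda i: abs(spectrum[i]))
    # A real signal has a mirrored coefficient at n - k, except at the Nyquist frequency.
    weight = 1.0 if 2 * k == n else 2.0
    fit = [
        weight * (spectrum[k] * cmath.exp(2j * math.pi * k * t / n)).real / n
        for t in range(n)
    ]
    peak_index = max(range(n), key=lambda t: fit[t])
    return k, fit, peak_index

# Synthetic expression profile over eight time points, peaking at index 2 (not real data).
expression = [2 ** math.cos(2 * math.pi * (t - 2) / 8) for t in range(8)]
centered = log2_center(expression)
k, fit, peak_index = dominant_fit(centered)
print(k, peak_index)
```

Sorting transcripts by `peak_index` would give an ordering by time of peak expression analogous to the one shown in panel b.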