
R Script Approach to Infer Toxoplasma Infection Mechanisms From Microarrays and Domain-Domain Protein Interactions.

Ailan F Arenas1, Gladys E Salcedo2, Jorge E Gomez-Marin1.   

Abstract

Pathogen-host protein-protein interaction systems examine the interactions between the protein repertoires of 2 distinct organisms. Some pathogen proteins interact with the host protein system and may manipulate it to their own advantage. In this work, we designed an R script that concatenates 2 functions, rowDM and rowCVmed, to infer pathogen-host interactions from previously reported microarray data, including host gene enrichment analysis and the crossing of interspecific domain-domain interactions. We applied this script to the Toxoplasma-host system to describe pathogen survival mechanisms from human, mouse, and Toxoplasma Gene Expression Omnibus series. Our results agreed with previously reported microarray analyses, but we also found other important proteins that could contribute to Toxoplasma pathogenesis. We observed that Toxoplasma ROP38 is the most differentially expressed protein among Toxoplasma strains. Enrichment analysis and KEGG mapping indicated that the human retinal genes most affected by Toxoplasma infection are those related to antiapoptotic mechanisms. We suggest that the proteins PIK3R1, PRKCA, PRKCG, PRKCB, HRAS, and c-JUN are possible substrates of the differentially expressed Toxoplasma kinase ROP38. Likewise, we propose that Toxoplasma causes overexpression of human apoptosis-suppressing genes.


Keywords:  R; Toxoplasma; domain-domain protein interaction; host-pathogen interaction; microarrays

Year:  2017        PMID: 29317802      PMCID: PMC5753922          DOI: 10.1177/1177932217747256

Source DB:  PubMed          Journal:  Bioinform Biol Insights        ISSN: 1177-9322


Introduction

Because infectious diseases are a major health problem worldwide, understanding the molecular mechanisms among pathogen-host interactions (PHIs) is key in addressing this concern. The design of an appropriate strategy to combat a specific pathogen depends on our understanding of the specific PHI. Most studies have focused on identifying protein-protein interactions (PPIs) within a single organism (intraspecies PPI prediction). It is difficult to infer new PPIs between 2 different species (interspecies PPI prediction) because the development of an interaction database depends on experimentally verified PHI data that are costly in time, equipment, and budget to produce.[1] Therefore, the design of computational strategies is worthwhile to elucidate infection mechanisms when experimental PHI data are scarce. The interactions between pathogen proteins and their hosts allow the pathogens to manipulate host cellular mechanisms for their own advantage, such as escaping from host immune responses. For instance, the Toxoplasma gondii pathogen can manipulate and control a variety of host processes due to secreted factors that interact with the host cell proteins.[2-5] For example, rhoptry proteins are vital for the Toxoplasma infection process and its survival. Most of the virulence of T gondii strains relies on their polymorphic rhoptry kinases, secreted protein effectors that target host transcription factors and other proteins with antimicrobial functions. 
ROP18 and ROP5 cooperatively interact with the murine IFN-γ–induced immunity-related GTPases (IRGs).[6-8] ROP16, another polymorphic kinase, is correlated with virulence because it is involved in constitutive activation of the host STAT transcription factors.[3,4,9] Likewise, expression-level differences in ROP38, another secreted rhoptry kinase, mediate differences in gene activation along the MAP kinase pathway in the host cell.[10,11] After the genome sequence of T gondii became available, gene expression profiles at different developmental stages were investigated by microarray expression analyses.[12-15] This technology allows genome-wide expression changes to be examined in tissues under different conditions. This information was useful in identifying differential expression patterns in human and mouse cell cultures infected by different Toxoplasma strains, revealing that polymorphic or overexpressed effector proteins from rhoptry or dense granule organelles are the main elements responsible for modulating host gene expression.[2,10,11] Although microarray analysis has yielded significant progress on Toxoplasma infection mechanisms, additional research is necessary to decipher the interspecific PPIs between Toxoplasma and its hosts. Expression profiling indicates whether a particular gene is expressed in a particular condition (by measuring messenger RNA levels), but to determine whether it is involved in a particular cell process, the protein (the product of the gene) must also be examined. Protein domains are the basic building blocks that determine the structure and function of proteins, and interactions between domains mediate PPIs. Domain-domain interaction (DDI)-based approaches are often used to predict both intraspecies and interspecies PPIs. 
Several different databases store lists regarding experimentally confirmed and predicted DDIs, such as iPfam[16] and DOMINE[17]; this information would be useful if integrated with expression data to infer pathogen-host PPIs. Therefore, we have developed an R script that integrates differential gene expression calculations, enrichment analysis, and the crossing of interspecific DDI to predict interactions between host and pathogen proteins. We applied this script to examine the Toxoplasma-host system and identify human proteins that can potentially interact with a specific protein domain of T gondii. We focused on this parasite because it is a ubiquitous obligate intracellular protozoan that can invade and replicate in almost all cells of a broad range of warm-blooded animals and is estimated to infect approximately one-third of the world population.[18,19]

Materials and Methods

Data sets

All microarray data used in this work for Toxoplasma and its hosts were downloaded from the NCBI (National Center for Biotechnology Information) Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/gds; Toxoplasma series GSE44189, GSE16115, GSE24905, GSE20145, and GSE22315; human series GSE44191, GSE32104, GSE25468, and GSE81016; mouse series GSE55298 and GSE27972). Each series is interpreted as a matrix in which columns are conditions and rows hold the expression values of one gene across those conditions (all the series are already normalized). All GEO series are included as .txt files in Additional file 1. To predict DDIs, we downloaded the collection of known and predicted DDIs from the database DOMINE v2.0, release 2010.[17] We converted the “INTERACTION” list into a comma-separated value file, which is included in Additional file 1.

Because each Toxoplasma strain exhibits unique characteristics and gene expression signatures in the host cell, an appropriate way to exploit this information is to identify the genes that are most variable between strains across microarrays. Moreover, in microarray data, it is common to observe asymmetric gene expression distributions with extreme values. Generally, microarray expression data exhibit similar means (and medians) but heterogeneous dispersion. This fact suggests that dispersion measurements are appropriate to describe gene expression profiles.

For a set of observations $x = \{x_1, x_2, \ldots, x_n\}$, where $n$ is the number of observations, the standard deviation $S$, the mean deviation $DM$, and the coefficient of variation $CV = S/\bar{x}$ are 3 useful dispersion measurements when the chosen central tendency measurement is the mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. However, because the median is robust in asymmetric distributions with extreme values, 2 more appropriate dispersion measurements are the Meda and the coefficient of median variation. For the set of observations $x$, the corresponding order statistics are given by $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$, where $x_{(1)}$ and $x_{(n)}$ are the minimum and maximum of $x$, respectively. The middle value of the order statistics is the median of $x$, denoted by $\mathrm{Med}(x)$; it is the value that separates the upper 50% of values from the lower 50%. Considering now the set of deviations $d_i = |x_i - \mathrm{Med}(x)|$, $i = 1, \ldots, n$, the Meda is given by the median of $d_1, \ldots, d_n$, ie, $\mathrm{Meda}(x) = \mathrm{Med}(|x_i - \mathrm{Med}(x)|)$; remember that the mean deviation of $x$ is given by $DM(x) = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$. Analogous to the coefficient of variation, the coefficient of median variation can be defined by the quotient $CV_{\mathrm{med}}(x) = \mathrm{Meda}(x)/\mathrm{Med}(x)$ as a dispersion index based on the median.
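The dispersion measurements described above can be illustrated with a short sketch. The Python code below is not the R script provided in the Supplementary Information; it is a minimal analogue that assumes the mean deviation is taken about the mean and that the coefficient of median variation is the Meda divided by the median, each computed row by row. The function names and the toy matrix are hypothetical.

```python
from statistics import fmean, median


def mean_deviation(values):
    """Mean absolute deviation about the mean: (1/n) * sum(|x_i - mean|)."""
    center = fmean(values)
    return fmean(abs(x - center) for x in values)


def meda(values):
    """Median of the absolute deviations about the median."""
    center = median(values)
    return median(abs(x - center) for x in values)


def cv_med(values):
    """Coefficient of median variation: Meda divided by the median."""
    center = median(values)
    if center == 0:
        raise ValueError("median is zero; the coefficient is undefined")
    return meda(values) / center


def row_dm(matrix):
    """Mean deviation of every row (one row per gene, one column per condition)."""
    return [mean_deviation(row) for row in matrix]


def row_cvmed(matrix):
    """Coefficient of median variation of every row."""
    return [cv_med(row) for row in matrix]


# Toy expression matrix (assumed values): a stable gene and a variable gene.
toy_matrix = [
    [8.0, 8.0, 9.0, 7.0],
    [2.0, 4.0, 6.0, 12.0],
]
print(row_dm(toy_matrix))     # [0.5, 3.0]
print(row_cvmed(toy_matrix))  # [0.0625, 0.4]
```

In the second row the single extreme value (12.0) pulls the mean-based deviation up to 3.0, whereas the median-based Meda is 2.0, which illustrates why the median-based index is less sensitive to extreme values.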

The algorithm

We implemented 2 new functions in R: “rowDM,” which estimates the mean deviation for each row in a matrix, and “rowCVmed,” which estimates the coefficient of median variation for each row in a matrix. The output of both functions is a numerical vector ordered from the highest variability to 0; a percentile threshold is then defined for each vector to choose a set of genes with extreme variability, ie, a desired number of differentially expressed genes whose variability is greater than the chosen percentile. The procedure used to identify PHIs based on microarray and DDI data is explained in the following steps:

Step 1. Functions rowDM and rowCVmed are applied to the GEO data set matrix, and the set of genes with extreme variability is selected according to appropriate percentile thresholds to obtain submatrices ranked from the most to the least variable rows (ID probes) in both the pathogen and host microarray experiments.
Step 2. The pathogen (Toxoplasma) ID probes obtained from Step 1 are mapped to the database “ToxoDB release 26” (www.toxodb.org).
Step 3. The host ID probe sets obtained from both functions (in Step 1) are mapped to gene symbols and Pfam entries using the hgu133plus2.db and mouse4302.db packages.[20,21] We also included a collection of human Illumina IDs to map to gene symbols. This list is included in Additional file 1.
Step 4. Functional enrichment analysis (FEA) is performed with the different gene sets obtained from Step 3, using the FGNet enrichment package.[22]
Step 5. We included the sqldf package in our R script, as well as the DDI list “INTERACTIONS,” to map all domains obtained from the previous host enrichment gene sets (in Step 4) that could interact with the representative domains in Toxoplasma proteins. The “INTERACTIONS” list is included in Additional file 1.
Step 6. The result is a vector of gene symbols whose products can interact with a specific domain (in our case, the PF00069 kinase domain from Toxoplasma).
Step 7. Finally, we remapped the gene symbols to KEGG signaling pathways (www.genome.jp/kegg) to place the genes that could interact with PF00069 within specific cell-signaling pathways.

The R script is provided in the Supplementary Information S11.
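Steps 1, 5, and 6 can be sketched as follows. This Python fragment is an illustration only, not the authors' R script: it assumes a nearest-rank percentile for the threshold, and it replaces the sqldf join with plain set lookups. The probe identifiers, the DDI pairs, and the placeholder gene GENE_X are hypothetical; only PF00069 and the SH2 and ankyrin domains of SOCS2 and NFKB1 are taken from the text.

```python
import math


def select_most_variable(scores, percentile):
    """Return (cutoff, rows above cutoff sorted from most to least variable).

    scores: dict mapping a probe ID to its dispersion value (rowDM or rowCVmed).
    percentile: value in [0, 100); the cutoff is the nearest-rank percentile.
    """
    if not scores:
        raise ValueError("scores is empty")
    if not 0 <= percentile < 100:
        raise ValueError("percentile must be in [0, 100)")
    ordered = sorted(scores.values())
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    cutoff = ordered[rank - 1]
    selected = [(probe, s) for probe, s in scores.items() if s > cutoff]
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return cutoff, selected


def cross_domains(host_domains, ddi_pairs, pathogen_domain):
    """Return host genes that carry a domain paired with the pathogen domain.

    host_domains: dict mapping a gene symbol to a set of Pfam accessions.
    ddi_pairs: iterable of unordered (domain_a, domain_b) pairs.
    """
    partners = set()
    for domain_a, domain_b in ddi_pairs:
        if domain_a == pathogen_domain:
            partners.add(domain_b)
        if domain_b == pathogen_domain:
            partners.add(domain_a)
    return sorted(gene for gene, doms in host_domains.items() if doms & partners)


# Hypothetical dispersion values for 4 probes.
toy_scores = {"probe_1": 0.10, "probe_2": 0.50, "probe_3": 0.90, "probe_4": 0.30}
cutoff, top_rows = select_most_variable(toy_scores, 50)

# Illustrative data: these pairs are made up for the example, not DOMINE records.
toy_ddi = [("PF00069", "PF00017"), ("PF00023", "PF00069"), ("PF00017", "PF00023")]
toy_host = {
    "SOCS2": {"PF00017"},   # SH2 domain
    "NFKB1": {"PF00023"},   # ankyrin repeat
    "GENE_X": {"PF99999"},  # placeholder gene with an unrelated, made-up accession
}
print(cutoff, top_rows)
print(cross_domains(toy_host, toy_ddi, "PF00069"))
```

Treating each DDI pair as unordered matters here, because a list of interactions may record the kinase domain in either column.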

Results and Discussion

Application of rowDM and rowCVmed functions

Toxoplasma GEO series analysis

On implementing the program, we sought to evaluate its performance in a descriptive manner, comparing it against the GEO2R tool (www.ncbi.nlm.nih.gov/geo/geo2r), which was used by other authors. The GSE44191 and GSE16115 series include RNA expression values for 3 type 1 toxoplasma strains (RH-ERP, RH-JSR, and GT1) from human foreskin fibroblasts (HFFs) infected for 24 hours with each of these strains. To evaluate the performance of the R program, we applied this code script to identify the Toxoplasma genes most differentially expressed in both microarrays. Once we applied the rowDM and rowCVmed functions to the GSE44191 series with critical values of 1.0 and 0.142, respectively, we obtained 2 sets of 29 and 30 genes for each function with the most variable Toxoplasma genes, according to their RNA expression levels and observed that ROP8 and ROP38 were differentially expressed between the toxoplasma strains RH-ERP versus GT1. Similar results were observed by the original authors of this microarray data under GEO2R[23] in which ROP8 and ROP38 were found to be differentially expressed between RH-ERP and GT1 strains; likewise, 30% of the differentially expressed genes identified from our R script overlapped with the original results.[23] Furthermore, no remarkable differences were observed between the output data from rowDM and rowCVmed functions (Table 1 and Supplementary Information S1A and S1B). For the GSE16115 series, we applied the critical values 0.83 and 0.65 for both functions to obtain 104 and 113 Toxoplasma genes with the highest variability in RNA levels, respectively. We found similar results as the original authors, who proposed those hypothetical proteins and 3 members of ABC transporters as the most variable genes in their RNA expression among Toxoplasma type 1 strains by GEO2R.[24] By applying our functions, we also found the ROP38 with high variability in its expression for this microarray analysis (Table 1 and Supplementary Information S2A and S2B). 
Now, we analyzed the GSE24905 series that contain the RNA expression values from 49 recombinant progenies and their parental Toxoplasma type I (GT1) and type II (ME49) parasites. Using our R script, we created a submatrix with the parental type I and type II RNA expression data only. By applying critical values of 2.0 and 0.56 for rowDM and rowCVmed, 2 groups of 42 and 49 genes, respectively, were obtained. Interestingly, the ROP5 Toxoplasma protein was observed in both groups (Table 1 and Supplementary Information S3A and S3B). Although ROP5 has no kinase activity, it is known as one of the most important virulence factors in Toxoplasma, but it is less expressed in type I strains due to lack of some copies of this gene in type I strains. The authors of this series also highlight ROP5 as one of the most important genes differentially expressed between these 2 strains by means of GEO2R.[25] Likewise, we observed more ROP kinases, such as ROP8, ROP1, ROP29, ROP39, ROP21, and ROP16, as important representative groups with the most variable gene expression and possibly related to virulence strain dependence (Table 1 and Supplementary Information S3A and S3B). The ROP16 kinase has been associated with virulence in type I and III strains due to its ability to phosphorylate STAT3/6 host transcription factors.[3,4,9]
Table 1.

The most relevant differentially expressed gene candidates obtained with both functions from the Toxoplasma GEO set series.

| GEO set series | Organism | rowDM critical value | rowCVmed critical value | No. of protein candidates for both functions, respectively | Enrichment pathway (FEA) | More relevant candidates in expression variability obtained with both functions | Journal support (in references) | Supplementary information |
|---|---|---|---|---|---|---|---|---|
| GSE44191 | Toxoplasma | 1 | 0.142 | 29 and 28 | NA | ROP8, ROP38, hypothetical proteins, and ABC transporters | Yang et al[23] | S1A and S1B |
| GSE16115 | Toxoplasma | 0.83 | 0.65 | 104 and 113 | NA | ROP38, hypothetical proteins, and ABC transporters | Khan et al[24] | S2A and S2B |
| GSE24905 | Toxoplasma | 2 | 0.56 | 42 and 49 | NA | ROP5, ROP8, ROP1, ROP29, ROP39, ROP21, and ROP16 | Behnke et al[25] | S3A and S3B |
| GSE22315 | Toxoplasma | 0.92 | 0.24 | 97 and 95 | NA | ROP8, ROP18, ROP46, ROP38, ROP20, ROP19A | | S4A and S4B |
| GSE20145 | Toxoplasma | 1.6 | 0.25 | 97 and 105 | NA | ROP18, ROP14, ROP15, ROP38, ROP1, ROP7, ROP40, ROP31, ROP20, ROP6, ROP11, and ROP29 | | S5A and S5B |
| GSE44189 | Human | 0.51 | 0.093 | 80 and 81 | Type I interferon | IRF7, ISG15, ISG20, MX1, MX2, OAS1, OASL, and RSAD2 | Yang et al[23] | S6A and S6B |
| GSE25468 | Human | 1.877 | 1.115 | 120 and 120 | Immunity related (NF-κB) | ILB1, IRF1, and NFKB1 | Rosowski et al[26] | S7A and S7B |
| GSE32104 | Human | 2.455 | 1.125 | 5 and 5 | NA | MEOX1, MMP10, SERPINB3, SERPINB4, IL1RN | Behnke et al[27] | |
| GSE55298 | Mouse | 1 | 0.25 | 50 and 50 | Immunity related | MARCKS, HBEGF, SLC7A2, SOCS2, EGR3, c-MYC, SOCS2, SERPINB9, ITGAX, CISH, C3 | Franco et al[28] | S8A and S8B |
| GSE27972 | Mouse | 1.52 | 0.308 | 72 and 77 | JAK-STAT | CISH, SOCS1, SOCS2, and SOCS3 | Blader and Saeij[29] | S9A, S9B, and S9C |
| GSE81016 | Human | 2.718 (2 h) | 1.221 (2 h) | 250 and 200 | Regulation of MAPK cascade | CDK5RAP3, CSK, FOXM1, NDRG2, PRKCA, SORL1, SPRY2 | | S10B |
| GSE81016 | Human | 2.178 (6 h) | 1.219 (6 h) | 220 and 210 | Regulation of macroautophagy | CTTN, MAP1LC3B, RAB33B, ULK1, ZDHHC8 | | S10B |
| GSE81016 | Human | 2.718 (24 h) | 1.222 (24 h) | 150 and 170 | Regulation of apoptotic process | ARNT2, BIRC5, DFFA, F2R, FNIP1, JUN, MTDH, NME2, RNF34, SLC39A10, SOCS2, SQSTM1, THOC6 | | S10B |

Abbreviations: FEA, functional enrichment analysis; GEO, Gene Expression Omnibus; NA, not applicable; NF-κB, nuclear factor κB.

Thereafter, we examined the GSE22315 series, which contains RNA values for 6 more representative Toxoplasma strains (type I: GT1 and RH; type II: ME49 and Prugniaud; and type III: CTG and VEG), taken 12 hours after infection in HFF cells. For this series, we set the critical values in both functions to obtain a list with the 100 most variable genes in their RNA values (Table 1). Among them, we found ROP18, which is considered, along with ROP5, one of the main virulence factors in the Toxoplasma type I genetic background. The ROP18 gene has low expression in the type III strains, which are considered less virulent, at least in mouse hosts.[30] As in the other Toxoplasma series, ROP38, along with ROP8, ROP46, ROP20, and ROP19A, was also found to have high variability in expression, which was also observed with GEO2R (Table 1 and Supplementary Information S4A and S4B). Finally, we analyzed the GSE20145 series, which compares the RNA values for the 3 canonical Toxoplasma strains (type I [RH], type II [Prugniaud], and type III [VEG]) after infecting HFF cells. As with the series examined above, ROP18 and ROP38 appear as the most differentially expressed genes among the 3 Toxoplasma strains during the infection process in HFF cells. We also observed other differentially expressed ROPs, such as ROP14, ROP15, ROP1, ROP7, ROP40, ROP31, ROP20, ROP6, ROP11, and ROP29, which were confirmed with GEO2R (Table 1 and Supplementary Information S5A and S5B). 
In summary, agreeing with other authors, the most representative groups of proteins with the highest variability of RNA expression among canonical Toxoplasma strains are the ABC transporters, hypothetical proteins, and rhoptry kinase protein (ROP) family, especially ROP38, which was observed as a differentially expressed gene in most of the Toxoplasma strains compared (Table 1 and Supplementary Information).

Host GEO series (human and mouse) and FEA

Continuing with the descriptive performance analysis of our script, we examined the host gene response during Toxoplasma infection. We chose the GSE44189 series, which includes RNA expression values for HFFs infected with 3 type I Toxoplasma strains. The GSE44189 series was also structured as a matrix in which columns were treatments (Toxoplasma infections with 3 strains) and rows were the RNA expression values of each human gene for each treatment. After applying both functions with critical values rowDM > 0.51 and rowCVmed > 0.093, we obtained 2 HFF gene sets (ID probes) with the greatest variability in the microarray experiment, with 80 and 81 genes, respectively. We observed human genes that were differentially expressed because of Toxoplasma type I infections, namely IRF7, ISG15, ISG20, MX1, MX2, OAS1, and OASL. After applying FEA to the 2 gene sets, we confirmed that these genes are jointly activated by type I interferon, as proposed by Yang et al[23] (Table 1 and Supplementary Information S6A). Differential expression of those genes was also observed with GEO2R (Supplementary Information S6B). We also analyzed the GSE25468 series, which comes from HFF cells infected with the type II (Prugniaud) and type III (CEP) canonical Toxoplasma strains. We applied both functions with critical values chosen to obtain 2 subsets of 120 genes each. In this array, we found variably expressed human genes such as ILB1, IRF1, and NFKB1, which are important molecules in the host inflammatory response against pathogens. Their expression seems to be modulated by genes that are differentially expressed among Toxoplasma strains (Table 1 and Supplementary Information S7A). The differential expression of these 3 genes was also confirmed with GEO2R (Supplementary Information S7B). 
The original authors of the GSE25468 series reported that Toxoplasma type II strains interfere with the nuclear factor κB (NF-κB) pathway.[26] Likewise, it was shown that the activity of this transcription factor is modulated during Toxoplasma infection.[29] Next, we examined the GSE32104 series, which contains HFF RNA levels for 2 infections: one with a wild-type RH type I strain and the other with the same strain knocked out for the ROP5 gene. We did not find notable variably expressed human genes with our script for this series. The authors of the GSE32104 series reported only SERPINB3 as the most differentially expressed gene related to the ROP5 knockout condition.[27] This gene was also observed in our set among the first 5 genes with the highest variability in their RNA expression (MEOX1, MMP10, SERPINB3, SERPINB4, and IL1RN) (Table 1). Although ROP5 alleles are significantly related to infection in the host, specifically because of their interaction with IRGs,[6-8] the ROP5 gene does not seem to modulate host gene expression. We then explored Toxoplasma infection in mouse macrophages: the GSE55298 series contains RNA values from RAW264.7 cells infected with the Toxoplasma RH strain versus uninfected cells. We looked at the 5 most variable genes in this array and found the MARCKS and HBEGF genes, which were also proposed by the original authors of this microarray as the most differentially expressed genes because of Toxoplasma RH infection in mouse cells.[28] By applying our functions, we expanded the search to the 50 genes with the greatest variability in RNA expression and found the c-Myc transcription factor in this gene set, which was reported as a gene regulated by Toxoplasma RH infection, producing differential expression of the following genes: MARCKS, HBEGF, SLC7A2, SOCS2, EGR3, and others (Table 1 and Supplementary Information S8A). The differential expression of these 5 genes was also corroborated via GEO2R (Supplementary Information S8B).[28] After that, we examined the GSE27972 series, which compares the RNA levels of mouse bone marrow–derived macrophages (BMdMs) infected with the T gondii type I RH strain for 6 hours versus uninfected BMdMs. By taking the 70 most variably expressed genes in this series, we found suppressors of cytokine signaling, such as CISH, SOCS1, SOCS2, and SOCS3, which are involved in inhibiting the JAK-STAT signaling pathway (Table 1, Supplementary Material S9A and S9B); the highest differential expression of those 4 genes was also observed via GEO2R (Supplementary Material S9C). There is evidence that Toxoplasma mediates the induction of suppressor of cytokine signaling protein 1 (SOCS1), which contributes to the inhibition of the IFN-β immune response, proven to be critical to control parasite replication in the host.[29] Finally, we ran our script on the GSE81016 series, which contains RNA values for WERI-Rb-1 human retinal cells infected with Toxoplasma for 2, 6, and 24 hours compared with uninfected controls. Thus far, no analysis of this series has been reported. We found variability in the RNA levels after 2 hours of Toxoplasma infection for genes related to the regulation of both the MAPK cascade and kinase activity. In addition, we observed genes involved in macroautophagy regulation after 6 hours of infection; interestingly, autophagy has been demonstrated to be an anti-Toxoplasma cell process.[31] We also observed that after 24 hours of infection, apoptotic and cell death processes were altered. Apoptosis has also been shown to be a cell immune mechanism that controls Toxoplasma growth in the host cell[31] (Table 1 and Supplementary Information S10B).

Mapping toxoplasma Gene Ontology terms to Pfam entries

It was observed that ROP kinases in the Toxoplasma genome were the most differentially expressed genes among the Toxoplasma strains when they infect and grow inside the host cell (Table 1). This means that Toxoplasma strains have different molecular mechanisms to survive, which is correlated with the infectiveness of the strain. The outstanding protein domain for Toxoplasma was the protein kinase domain “Pkinase” (Pfam entry: PF00069), present in active ROP kinases. The ROP38 seems to be the most interesting gene found with high expression variability among strains in 4 of 5 toxoplasma arrays analyzed (Table 1). It has been demonstrated that ROP38 can activate host genes associated with the MAPK signaling pathway and NF-κB,[10,23] indicating that ROP38 may interact with some host proteins.
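The recurrence of ROP38 across the Toxoplasma series can be checked with a simple tally. The sketch below is written in Python rather than R and is not part of the authors' script; it only counts in how many of the 5 Toxoplasma series each ROP candidate listed in Table 1 appears. The variable and function names are hypothetical.

```python
from collections import Counter

# ROP candidates per Toxoplasma series, copied from Table 1.
rop_candidates = {
    "GSE44191": ["ROP8", "ROP38"],
    "GSE16115": ["ROP38"],
    "GSE24905": ["ROP5", "ROP8", "ROP1", "ROP29", "ROP39", "ROP21", "ROP16"],
    "GSE22315": ["ROP8", "ROP18", "ROP46", "ROP38", "ROP20", "ROP19A"],
    "GSE20145": ["ROP18", "ROP14", "ROP15", "ROP38", "ROP1", "ROP7", "ROP40",
                 "ROP31", "ROP20", "ROP6", "ROP11", "ROP29"],
}


def series_per_gene(candidates_by_series):
    """Count the number of series in which each gene is listed."""
    counts = Counter()
    for genes in candidates_by_series.values():
        counts.update(set(genes))  # set() so a gene counts once per series
    return counts


counts = series_per_gene(rop_candidates)
# Genes listed in at least 3 series, most recurrent first.
recurrent = sorted(
    (gene for gene, n in counts.items() if n >= 3),
    key=lambda gene: (-counts[gene], gene),
)
print(recurrent)        # ['ROP38', 'ROP8']
print(counts["ROP38"])  # 4
```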

DDIs: “Pkinase PF00069 versus gene set domains” and mapping to the KEGG signaling pathways 04630 (JAK-STAT), 04064 (NF-κB), and 04010 (MAPK)

The human and mouse gene sets obtained with our functions from the GSE44189, GSE25468, GSE27972, and GSE81016 series were also mapped to Pfam domain entries to identify functional domains that could interact with the Pkinase domain PF00069 found in Toxoplasma ROP38. In addition, because the differential expression of ROP38 influences HFF gene expression associated with the MAPK, JAK-STAT, and NF-κB signaling pathways,[10,23] we remapped these gene sets to the KEGG pathway IDs 04630, 04064, and 04010 to identify possible targets involved in these signaling pathways. We observed interesting transcription factors such as the NFKB1 p105 subunit in the GSE25468 series and the NFKB inhibitor zeta (NFKBIZ) in the human retinal GSE81016 series. These 2 proteins contain inhibitory ankyrin repeat domains, which have been shown to interact with proteins that have kinase activity[32] (Table 2). Likewise, we found the suppressors of cytokine signaling SOCS2 and SOCS5 in the GSE81016 series; these proteins are regulators of the JAK/STAT signaling pathway. Both SOCS2 and SOCS5 contain SH2 domains that can be phosphorylated by JAK proteins[33,34] (Table 2). We also found important kinase proteins, such as PIK3R1, PRKCA, PRKCG, and PRKCB, and the GTPase HRAS, which have been reported as key molecules in MAPK signaling that activate antiapoptotic genes.[35] Our list also included the inflammasome activator MAP2K3 and the proapoptotic proteins NFKBIZ and MAP3K5, as well as the c-JUN transcription factor (Table 2).
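The remapping of a candidate gene set to the 3 KEGG pathway IDs amounts to a set intersection per pathway. The Python sketch below illustrates the idea only; it is not the authors' R code, and the pathway membership is a toy excerpt built from genes named in this section, not the full KEGG gene lists. The function name and the candidate list are hypothetical.

```python
# Toy pathway membership (assumption): only genes named in the text, grouped
# the way the text groups them. These are NOT the complete KEGG gene lists.
PATHWAY_MEMBERS = {
    "04630": {"SOCS2", "SOCS5"},    # JAK-STAT
    "04064": {"NFKB1", "NFKBIZ"},   # NF-kappa B
    "04010": {"PRKCA", "PRKCG", "PRKCB", "HRAS", "MAP2K3", "MAP3K5", "JUN"},  # MAPK
}


def map_to_pathways(candidates, pathway_members):
    """Return {pathway_id: sorted candidate genes found in that pathway}."""
    candidate_set = set(candidates)
    hits = {}
    for pathway_id, members in pathway_members.items():
        shared = sorted(candidate_set & members)
        if shared:
            hits[pathway_id] = shared
    return hits


# A few genes from the GSE81016 (2 h) gene lists in Table 2.
candidates = ["PRKCA", "ULK1", "SOCS5", "NFKBIZ", "CSK"]
print(map_to_pathways(candidates, PATHWAY_MEMBERS))
# {'04630': ['SOCS5'], '04064': ['NFKBIZ'], '04010': ['PRKCA']}
```

Genes that belong to none of the listed pathways (here ULK1 and CSK) simply drop out of the result.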
Table 2.

Domain-domain interaction.

GEO set seriesType cell and gene setFunctionProteins that could interact with Pkinase domain PF00069JAK/STAT (KEGG 04630)NF-κB (KEGG 04064)MAPK (KEGG 04010)
GSE44189HFFrowCVmedPABPN1, KRT19, MX1, CDH3, VCAM1, RGS4, GAP43, MX2, CENPE, EIF2S3, SMAD3, ISG15, OASL, SLC18A2, CACNB2HIST1H3C, MCAM, HGF, MAP3K7, CIT, PCLO, CASP4, ITGBL1, PPFIA4, COL7A1, PSAT1, GDF15, NTRK2, SCLYMAP3K7CACNB2, MAP3K7, NTRK2
GSE25468HFFrowCVmedHSPA1A, HSPA1B, HSPA1L, GSTP1, RAD23A, PSMC3, SFRP1, CSNK1E, CFB, TRIP10, UBE2S, FKBP2, TRMT1, RGS4USP13, DCLK1, HSD11B1, ABLIM3, IL7R, SLC18A2, UBD, GABBR1, EPHA3, SELE, SERPINB7, NR0B1, RAB2A, SART3NFKB1, SERPINB3, SERPINB4, SLC43A3, CASP1, MSH6, STIP1, SLC16A3, CSNK1D, PCLO, HIST2H2AA3, HIST2H2AA4CDH6, FUS, ANKRD11, MRPS18A, HNRNPUL2IL7RNFKB1, CASP1HSPA1A, HSPA1B, HSPA1L, NFKB1
GSE27972BMdMrowCVmedDUSP6, SATB1, VDR, HBEGF, SOCS2, VCAN, AHR, PLEKHF1, SERPINA3G, PTPN2, EDIL3, GEM, TRIB3, EGR2, CDC42EP2ABTB2, MCF2L, RASGRP1, DIXDC1, ITGB8, NR4A3, NR4A2, CDH1, CISH, F10, SOCS1, DUSP2, BATF3, SOCS3SOCS2, CISH, SOCS1, SOCS3DUSP6, RASGRP1, DUSP2
ERI-Rb-1 (2 h) GSE81016rowCVmedrowDMZNF337, ROBO1, NFKBIZ, MIXL1, DDR2, SPAG7, HIPK2, KLK1, CC2D1A, STK24, SLC22A7, DGKK, CNTNAP5, SLC25A24KIF19, ZNF483, ULK1, WDSUB1, PRSS33, UTP3, SUPV3L1, IDH2, KLHDC7A, AHCYL1, CNTN4, PRKCA, ARL13B, ACADMIL1RL1, PAX6, NHLH1, HMHA1, SETD1B, MCM5, KLK9, STK17B, SEZ6L2, SDCBP, FERMT3, LTB, HNRNPA1L2, SORDPLEKHA1, SMARCD3, CASP2, ARL17B, ZDBF2, ZNF583TIAM1, DNAJC10, WWP2, GABBR2, ZNF337, KHSRP, PRKAR1B, SPAG7, HIPK2, OGT, CC2D1A, PTGR1, H2AFV, KPNA3CCNY, SF1, ZNF786, ERAL1, SHC1, FGFR3, TRIM36, FOSL2, USP7, SRC, ZNF483, ULK1, WDSUB1, ZCCHC7, UTP3, SGK3SUPV3L1, IDH2, CSK, R3HDM1, SORL1, PIK3R1, ARL5A, PRKCA, PSPH, UBE2Q1, ACADM, ARHGEF9, HSPD1, SOCS5GTF3C3, NHLH1, HMHA1, RAF1, SETD1B, MICALL1, PI4K2B, MCM5, IRS2, REL, RGL2, ZNF14, RBM7, DHX37, PTPN12SLC25A42, LGALS8, ELK4, PTGS1, FOXM1, DDX59, TNFSF14, SDCBP, CASP6, UBE2V1, DYRK2, SYT7, HNRNPA1L2PLEKHA1, MARK2, ARL17B, KAT2B, ZNF583, ZSWIM7PIK3R1, SOCS5NFKBIZFGFR3PRKCA, NFKBIZFGFR3, PRKCA, RAF1, ELK4
GSE81016WERI-Rb-1 (6 h)rowCVmedrowDMLRIG2, DMD, DEF8, NBEAL2, SLC25A43, DCC, TRIM21, TMPRSS3, SLC25A4, LRSAM1, NFATC3, BRCC3, AARSD1, PSKH1MASP2, PDZD2, ZNF71, PRKCG, MAP2K3, GP9, TDRD3, RXRG, STXBP6, CABP2, CRB1, FBXW2, PICALM, EPHB2, ZDHHC8STK39, MXI1, ABCA6, SNX7, RAB5A, ZFHX3, TRIM25, SHANK2, PCDHB10, GIPC2, ZNF14, SAR1B, FOXR1, USP21, CLK3PUM1, SNRNP40, PPM1KLRIG2, OAT, DEF8, NBEAL2, HNRNPA1, FKBP14, SNX5, ME2, MPP6, PDIA6, ADCY6, CHD4, NFATC3, RALY, NCL, BRCC3ANKRD28, TRAP1, AARSD1, ULK1, PCNA, EED, SASH1, ZNF420, RGS12, PLCB2, WDR53, PPM1A, RAB33B, PDZD2TRIM41, PLCL2, SLTM, CCNH, ZNF85, BANF1, KLHL20, ZNF526, MAP4K5, DDX49, SUPT4H1, HRAS, FBXW2, PICALMZDHHC8, STK39, ZNF787, CRK, ZNF721, ZFHX3, TRIM25, CORO2A, NOL3, BMPR2, TSSC1, DNAJB4, PPWD1, ARNTUSP21, ZNF436, EIF4G1, SYT7, CLK3, LRFN1, KLHL28MAP2K3PRKCGTRIM25, TRAP1PRKCG, MAP2K3PPM1A, HRAS, CRK, MAP4K5
GSE81016WERI-Rb-1 (24 h)rowCVmedrowDMSHH, ZNF580, FMNL3, SQSTM1, BDH2, ZNF785, THOC6, AFAP1L1, PRPF4B, HOXC11, RASA2, PARD3, ARNT2, DUSP19CD4, RORC, CLTCL1, UBE2Z, PTPRT, UBQLNL, TMTC4, ARNT, FOXP3, ZNF468, PRRX2, SPAST, SLC16A5, SFN, UBE2E3PHLPP1, HPCA, CLK2, BMPR2, FOXJ3, EIF2AK1, USP48, CCNG1, CACNB2, CD79A, PRKCB, ZNF69, PACSIN2GNB2, SLC2A1, SQSTM1, GAB2, ZNF785, THOC6, MPP6, BAZ1B, PPA2, ZNF2, DTX1, ATP1A3, RALY, WBSCR16, PTPN9AARSD1, BRD8, HIATL1, PFKL, ERMAP, PICALM, ARNT2, AKAP10, MAP3K5, L3MBTL3, BRD1, ZNF680, OSR2, MKLN1IRF2BP1, SYNE2, SP3, RAPGEFL1, LRFN2, TMTC4, ANKHD1, ZNF43, NUDT2, ZNF468, CDH23, PRSS8, ZNF91, BIRC5SEMA4F, JUN, LHX2, RXRG, PHLPP1, HPCA, BMPR2, DHX29, LMNB1, SOCS2, SERPINF1, BRSK1, USP48, ZNF565, PTPROCORO2A, PKIA, TRAF5, NUDT6, SNRNP70, SENP7, FAM13A, ZNF69, PACSIN2SOCS2SOCS2RASA2, CACNB2, PRKCBMAP3K5, JUN

Abbreviations: BMdM, bone marrow–derived macrophage; GEO, Gene Expression Omnibus; HFF, human foreskin fibroblast; NF-κB, nuclear factor κB; TF, transcription factor.

The most differentially expressed gene sets obtained from the functions rowDM and rowCVmed which have domains that can interact with the Pkinase domain PF00069. Those genes were also mapped for 3 signal pathways JAK/STAT, NF-κB, and MAPK. Red: survival factors HRAS, PIK3R1, PRKCA, PRKCG, and PRKCB that activate antiapoptotic genes. Green: MAP2K3 is related to inflammasome activation. Blue: MAP3K5, NFKBIZ, and JUN (TF) are apoptosis activators.

Taking these results together, our hypothesis is that the lower expression of ROP38 in type I Toxoplasma is correlated with the high expression of these survival genes, which could suppress the activation of the c-JUN transcription factor and prevent cell death via the activation of apoptosis suppressors.[36] Likewise, proapoptotic genes such as NFKBIZ, MAP3K5, and c-JUN were downregulated compared with the uninfected controls in the human retinal cells (Figure 1). In contrast, overexpression of ROP38 in Toxoplasma type III upregulates proapoptotic factors, leading to Toxoplasma clearance in the host.[10] ROP38 may interact transiently with some members of this group of MAPK kinases, or mimic these proteins, and consequently alter the MAPK signaling pathway in human retinal cells. In summary, according to these results, we propose that T gondii can control immune mechanisms such as apoptosis by maintaining the overexpression of apoptotic repressors through the secretion of ROP kinases such as ROP38 (Figure 1).
Figure 1.

Suggested proteins belonging to the MAPK signaling pathway (KEGG 04010) that could potentially interact with the Pkinase domain (PF00069) found in ROP38. Left: heat map of Toxoplasma ROP38, the most differentially expressed gene among type I (RH), type II (Prugniaud), and type III (VEG) strains. Right: upregulated and downregulated genes in human WERI retinal cells (GSE81016 series); the encoded proteins contain domains that can interact with PF00069 and map to the MAPK signaling pathway.

Because there is no gold standard for identifying differentially expressed genes in array experiments,[37] most methodologies rely on a variety of statistical tests, such as analysis of variance[38-40] or false discovery rate.[41-43] We evaluated the performance of our R script differently: first, we proposed critical values that yield a freely chosen output of differentially expressed genes; second, we compared the outputs obtained with our functions against those reported by other authors. With our functions, we reproduced outcomes similar to those previously reported, even with a smaller gene set than that proposed by the original authors (Table 1). Similarly, all the differentially expressed gene sets obtained with our functions were also detected by the GEO2R tool. These results indicate that our R script can be used by other researchers, even those with no knowledge of R programming, and can be applied to other pathogen-host coexpression experiments to predict more PHIs for specific domains. Our R script does not seek to compete with other approaches, such as GSEA,[44] or with R packages such as limma or GSEABase[45,46]; rather, we wrote it to identify plausible pathogen genes involved in pathogenesis, particularly those with high expression variability. Furthermore, using the functional domains of these pathogen proteins, we aim to identify differentially expressed host proteins whose domains make them potential interactors, and to map them to the different cellular pathways.
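The two-step evaluation described above (a critical value that sets the size of the output, then a comparison with a previously reported gene set) can be sketched in Python as follows. This is not the authors' R script: the definitions of rowDM and rowCVmed are not given in this section, so the median-scaled dispersion below is an assumed stand-in, and the function names, expression values, critical value, and reported set are all hypothetical.

```python
import statistics


def row_cv_median(values):
    """Spread around the median divided by the median (assumed definition)."""
    med = statistics.median(values)
    if med == 0:
        return float("nan")  # undefined when the median is zero
    spread = (sum((v - med) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return spread / abs(med)


def select_variable_genes(expr, critical_value):
    """Keep genes whose variability score exceeds the chosen critical value."""
    return {g for g, vals in expr.items() if row_cv_median(vals) > critical_value}


# Toy expression matrix: gene -> values across samples (hypothetical numbers).
expr = {
    "geneA": [10.0, 10.5, 9.5, 10.0],
    "geneB": [2.0, 8.0, 2.0, 9.0],
    "geneC": [5.0, 5.0, 5.1, 4.9],
}
reported = {"geneB", "geneD"}  # hypothetical set reported by other authors

selected = select_variable_genes(expr, critical_value=0.5)
overlap = selected & reported
print(selected)  # {'geneB'}
print(len(overlap) / len(reported))  # 0.5
```

Raising the critical value shrinks the selected set and lowering it enlarges the set, so the overlap with the reported genes can be inspected at several thresholds.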

Conclusions

Predicting interactions between host and pathogen proteins remains an open problem with important implications for public health. We have presented an R script that integrates differential gene expression calculations, enrichment analyses, and the crossing of interspecific DDIs to predict pathogen-host protein interactions (PHIs). When applied to the Toxoplasma-host interaction system, the R script yields results similar to those of previously reported microarray analyses, thus validating our approach. Applied to human retinal cells, the R script suggested that apoptosis inhibitors such as PIK3R1, PRKCA, PRKCG, PRKCB, and HRAS, or the c-JUN transcription factor directly, could be substrates of the differentially expressed Toxoplasma kinase ROP38. This approach will help researchers determine the number of interspecific candidates for each interaction and reduce the number of experiments required for confirmation; in addition, mapping the candidates to the different pathways and cell processes could help to infer the parasite's survival mechanisms in the host cell.