Jacqueline U McDonald, Myrsini Kaforou, Simon Clare, Christine Hale, Maria Ivanova, Derek Huntley, Marcus Dorner, Victoria J Wright, Michael Levin, Federico Martinon-Torres, Jethro A Herberg, John S Tregoning.
Abstract
Greater understanding of the functions of host gene products in response to infection is required. While many of these genes enable pathogen clearance, some enhance pathogen growth or contribute to disease symptoms. Many studies have profiled transcriptomic and proteomic responses to infection, generating large data sets, but selecting targets for further study is challenging. Here we propose a novel data-mining approach combining multiple heterogeneous data sets to prioritize genes for further study by using respiratory syncytial virus (RSV) infection as a model pathogen with a significant health care impact. The assumption was that the more frequently a gene is detected across multiple studies, the more important its role is. A literature search was performed to find data sets of genes and proteins that change after RSV infection. The data sets were standardized, collated into a single database, and then panned to determine which genes occurred in multiple data sets, generating a candidate gene list. This candidate gene list was validated by using both a clinical cohort and in vitro screening. We identified several genes that were frequently expressed following RSV infection with no assigned function in RSV control, including IFI27, IFIT3, IFI44L, GBP1, OAS3, IFI44, and IRF7. Drilling down into the function of these genes, we demonstrate a role in disease for the gene for interferon regulatory factor 7, which was highly ranked on the list, but not for IRF1, which was not. Thus, we have developed and validated an approach for collating published data sets into a manageable list of candidates, identifying novel targets for future analysis.
IMPORTANCE Making the most of "big data" is one of the core challenges of current biology. There is a large array of heterogeneous data sets of host gene responses to infection, but these data sets do not inform us about gene function and require specialized skill sets and training for their utilization. Here we describe an approach that combines and simplifies these data sets, distilling this information into a single list of genes commonly upregulated in response to infection with RSV as a model pathogen. Many of the genes on the list have unknown functions in RSV disease. We validated the gene list with new clinical, in vitro, and in vivo data. This approach allows the rapid selection of genes of interest for further, more-detailed studies, thus reducing time and costs. Furthermore, the approach is simple to use and widely applicable to a range of diseases.
Keywords: host response; immunity; respiratory syncytial virus; viral immunity
Year: 2016 PMID: 27822537 PMCID: PMC5069771 DOI: 10.1128/mSystems.00051-16
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 Flowchart of the gene selection method used in this study.
Genes most frequently upregulated following RSV infection
| Weighted score | Gene(s) |
|---|---|
| 24 | |
| 23 | |
| 18 | |
| 16 | |
| 15 | |
| 14 | |
| 12 | |
| 11 | |
| 10 | |
| 9 | |
| 8 | |
| 7 | |
| 6 | |
| 5 | |
| 4 | |
Genes were collated from multiple studies of RSV. A cutoff of a 2-fold increase in expression, compared to the reference group in the study from which the data were collated, was used when available. Genes were weighted as follows on the basis of the studies from which they were collated: human genetic studies, 4; human in vivo microarray studies, 3; human in vitro microarray studies, 2; mouse studies, 1. After weighting, genes were analyzed for multiple hits by a custom Perl script.
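The weighting scheme in this footnote can be illustrated with a short Python sketch. This is not the authors' Perl script, which is not reproduced here; it is a minimal reconstruction under stated assumptions. The function names, the input layout, the upper-casing of gene symbols as a crude standardization step, and the placeholder gene names (`GENE_A`, etc.) are all hypothetical. Only the four weights and the 2-fold cutoff "when available" come from the text.

```python
from collections import defaultdict

# Weights per study type, taken from the table footnote.
STUDY_WEIGHTS = {
    "human_genetic": 4,
    "human_in_vivo_microarray": 3,
    "human_in_vitro_microarray": 2,
    "mouse": 1,
}


def weighted_gene_scores(datasets, min_fold_change=2.0):
    """Sum study-type weights for each gene across data sets.

    `datasets` is a list of (study_type, {gene: fold_change_or_None}) pairs.
    A fold change of None means the study reported no value, so the gene is
    kept (the cutoff is applied only when a value is available).
    """
    scores = defaultdict(int)
    for study_type, genes in datasets:
        weight = STUDY_WEIGHTS[study_type]
        seen = set()  # count each gene at most once per data set
        for gene, fold_change in genes.items():
            symbol = gene.upper()  # assumption: crude symbol standardization
            if symbol in seen:
                continue
            if fold_change is not None and fold_change < min_fold_change:
                continue  # below the 2-fold cutoff
            seen.add(symbol)
            scores[symbol] += weight
    return dict(scores)


def rank_genes(scores):
    """Order genes by descending score, then alphabetically."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy input; gene names and fold changes are placeholders.
example = [
    ("human_genetic", {"GENE_A": None}),
    ("human_in_vivo_microarray", {"GENE_A": 3.1, "GENE_B": 2.4}),
    ("human_in_vitro_microarray", {"gene_b": 5.0, "GENE_C": 1.2}),
    ("mouse", {"GENE_A": 2.0, "GENE_C": 4.0}),
]
ranked = rank_genes(weighted_gene_scores(example))
```

In this toy input, a gene seen in a genetic study, an in vivo microarray study, and a mouse study accumulates 4 + 3 + 1 = 8, so it ranks above a gene seen only in two microarray studies (3 + 2 = 5).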
Genes most frequently downregulated following RSV infection
| Weighted score | Gene(s) |
|---|---|
| 9 | |
| 6 | |
| 5 | |
| 4 | |
Genes were collated from multiple studies of RSV. A cutoff of a 2-fold decrease in expression was used when available. Genes were weighted as follows on the basis of the studies from which they were collated: human genetic studies, 4; human in vivo microarray studies, 3; human in vitro microarray studies, 2; mouse studies, 1. After weighting, genes were analyzed for multiple hits by a custom Perl script.
FIG 2 Pathway analysis of genes upregulated after RSV infection. The top 130 genes identified by a literature search are organized on the basis of their predicted functions and subcellular locations. Top candidates are indicated by bold outlining, upregulated genes are red, and downregulated genes are green. Interactions between the top 16 upregulated genes and the top 5 downregulated genes are based on known interactions in the IPA knowledge base.
Canonical pathways
| Ingenuity canonical pathway | Molecules |
|---|---|
| IFN signaling | OAS1, IFIT1, IFN-γ, IFITM1, STAT1, IFN-A1/IFN-A13, IFIT3, STAT2, MX1, IFI35, IFITM3, PSMB8 |
| Activation of IRF by cytosolic pattern recognition receptors | Jun, DHX58, STAT2, IFIT2, IL-6, NFKBIA, IRF7, STAT1, TNF, IFN-A1/IFN-A13, ISG15, IFIH1, IL-10 |
| Communication between innate and adaptive immune cells | IL-15, TNFSF13B, IFN-γ, CCL5, CXCL10, HLA-G, IL-6, CXCL8, IL-1RN, TNF, IFN-A1/IFN-A13, IL-10, HLA-B, CCL4 |
| Role of hypercytokinemia/hyperchemokinemia in pathogenesis of influenza | IL-15, IL-1RN, IFN-γ, CCL5, CXCL10, TNF, IFN-A1/IFN-A13, CCL2, CCL4, IL-6, CXCL8 |
| Role of pattern recognition receptors in recognition of bacteria and viruses | EIF2AK2, OAS1, IFN-γ, C3, MYD88, CCL5, OAS2, IL-6, CXCL8, IRF7, TNF, IFN-A1/IFN-A13, OAS3, IFIH1, IL-10 |
| Granulocyte adhesion and diapedesis | CXCL9, CCL8, CCL7, CCL5, MMP9, CXCL10, CXCL2, CXCL8, CXCL11, IL-1RN, TNF, CCL2, MMP8, CCL4 |
| Agranulocyte adhesion and diapedesis | CXCL9, CCL8, CCL7, CCL5, MMP9, CXCL10, CXCL2, CXCL8, CXCL11, IL-1RN, TNF, CCL2, MMP8, CCL4 |
| Role of cytokines in mediating communication between immune cells | IL-15, IL-1RN, IFN-γ, TNF, IFN-A1/IFN-A13, IL-20, IL-10, IL-6, CXCL8 |
| Differential regulation of cytokine production in intestinal epithelial cells by IL-17A and IL-17F | IFN-γ, CCL5, TNF, CCL2, IL-10, CCL4, LCN2 |
| Dendritic cell maturation | FCGR1B, IL-15, MYD88, FCGR1A, STAT2, IL-6, NFKBIA, IL-1RN, STAT1, TNF, IFN-A1/IFN-A13, IL-10, HLA-B |
Ingenuity pathway analysis was applied to the top-scoring genes.
FIG 3 Validation of bioinformatic screening of a patient cohort. Significantly differentially expressed (SDE) genes from the clinical cohort were compared with the literature-derived gene list. Shown are overlaps between the gene lists expressed as pie charts, with directionality of agreement indicated for genes on the literature-derived list that are upregulated (A) or downregulated (B). Also shown are relative expression data from RSV-infected patients overlaid on the gene network derived from the literature-derived list (C). Upregulated genes are red, downregulated genes are green, genes identified in the bioinformatic study but not the clinical study are white, shading represents differential expression, and genes on the literature-derived list are indicated by bold outlining.
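The comparison summarized in panels A and B can be sketched as a three-way classification of each literature-derived gene: present in the cohort with the same direction of change, present with the opposite direction, or absent from the cohort's differentially expressed genes. The Python below is only an illustration under assumptions. The function name, the use of a signed log fold change to encode direction, and the placeholder gene names are hypothetical and not taken from the study.

```python
def compare_with_cohort(literature_genes, cohort_log_fc, expected_direction="up"):
    """Classify literature-derived genes by agreement with cohort results.

    `cohort_log_fc` maps each differentially expressed cohort gene to a
    signed log fold change (assumption: positive means higher in patients).
    """
    result = {"agree": [], "disagree": [], "not_in_cohort": []}
    for gene in sorted(literature_genes):
        if gene not in cohort_log_fc:
            result["not_in_cohort"].append(gene)
            continue
        is_up = cohort_log_fc[gene] > 0
        if is_up == (expected_direction == "up"):
            result["agree"].append(gene)
        else:
            result["disagree"].append(gene)
    return result


# Hypothetical toy data; names and values are placeholders.
literature_up = {"GENE_A", "GENE_B", "GENE_C"}
cohort = {"GENE_A": 1.8, "GENE_B": -0.9, "GENE_D": 2.2}
overlap = compare_with_cohort(literature_up, cohort, expected_direction="up")
```

The sizes of the three lists correspond to the slices of a pie chart for one direction. Running the function again with the downregulated list and `expected_direction="down"` gives the other panel.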
FIG 4 Flow cytometry confirmation of inhibitory functions of genes identified in silico. HEp-2 cells were transduced with lentiviral vectors expressing genes of interest identified by in silico screening, and 24 h later, the cells were infected with RSV expressing GFP. Cells were harvested at 48 h postinfection, and expression relative to that of control lentivirus-transfected wells was assessed. Each bar represents the mean value of three experiments ± the standard error of the mean. Red bars represent the top upregulated genes on the literature-derived list.
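As a worked illustration of how each bar could be derived, the sketch below normalizes a readout to its matched control and then reports the mean and standard error of the mean across replicate experiments. The caption does not state how normalization was done, so dividing each experiment by its own control is an assumption. The function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_to_control(sample_values, control_values):
    """Divide each experiment's readout by its matched control readout."""
    return [s / c for s, c in zip(sample_values, control_values)]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate the SEM")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical percentages of GFP-positive cells from three experiments.
gene_wells = [10.0, 30.0, 40.0]
control_wells = [40.0, 60.0, 80.0]
ratios = relative_to_control(gene_wells, control_wells)
bar_height, bar_error = mean_and_sem(ratios)
```

A ratio below 1 would indicate fewer infected cells than in control wells, which is the pattern expected for a gene with an inhibitory effect on RSV.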
FIG 5 IRF7, but not IRF1, is important in the control of RSV infection. IRF7 (A, C, E, G, I) or IRF1 (B, D, F, H, I) mice were infected with 5 × 10⁵ PFU of RSV strain A2 and compared to wild-type controls on the same background. Mice were weighed daily, and weight changes were recorded as percentages of the original weight (A, F). Lungs were excised, and viral loads were calculated by quantitative PCR on days 4 and 7 postinfection (B, G). Total lung cell counts (C, H) were calculated, along with the total numbers of CD3, CD4, and CD8 (T); CD19 (B); and DX5+ (NK) (D, I) cells measured in the lungs by flow cytometry on day 7 postinfection. Levels of the inflammatory cytokines IL-1β and IFN-γ in the lungs (E, J) were measured by enzyme-linked immunosorbent assay on day 7 postinfection. Results are mean values ± the standard error of the mean (n > 5). Statistical significance was assessed by Student t test (*, P < 0.05; **, P < 0.01; ***, P < 0.001).