Terezinha Souza, Panuwat Trairatphisan, Janet Piñero, Laura I Furlong, Julio Saez-Rodriguez, Jos Kleinjans, Danyel Jennen.
Abstract
In toxicogenomics, functional annotation is an important step to gain additional insights into genes with aberrant expression that drive pathophysiological mechanisms. Nevertheless, there is a gap in the annotation of these genes, which often hampers the interpretation of results and limits their applicability in translational medicine. In this study, we evaluated the coverage of functional annotations of differentially expressed genes (DEGs) induced by 10 selected compounds from the TG-GATEs database identified as high- or no-risk in causing drug-induced liver injury (most-DILI or no-DILI, respectively) using in vitro human data. Functional roles of DEGs not present in the most common biological annotation databases - termed "dark genes" - were unveiled via literature mining and via the identification of shared regulatory transcription factors or signaling pathways. Our results demonstrated that there were approximately 13% of dark genes induced by these compounds in vitro and we were able to obtain additional relevant information for up to 76% of those. Using interactome data from several sources, we uncovered genes such as LRBA and WDR26 that are highly connected in the protein network and play roles in drug response. Genes such as MALAT1, H19, and MIR29C - whose links to hepatotoxicity have been confirmed - were identified as markers for the most-DILI group and appeared as top hits across all literature-based mining methods. Furthermore, we investigated the potential impact of dark genes on liver toxicity by identifying their rat orthologs and their correlation with drug-induced liver pathologies observed in vivo following chemical exposure. We identified a set of important regulatory transcription factors of dark genes for all most-DILI compounds, including E2F1 and JUND, with supporting evidence in the literature, and we found that Magee1 correlated with chemically induced bile duct hyperplasia and adverse responses at 29 days in rats in vivo.
In conclusion, in this study we show the potential role of these poorly annotated genes in mechanisms underlying hepatotoxicity and offer a number of computational approaches that may help to minimize current gaps in gene annotation and highlight their value as potential biomarkers in toxicological studies.
Keywords: DILI; annotation; gene ontology; network biology; text mining; translational bioinformatics
Year: 2018 PMID: 30515189 PMCID: PMC6255978 DOI: 10.3389/fgene.2018.00527
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Number of differentially expressed genes (DEGs, absolute FC > 1.5 and FDR < 0.05) of compounds from most-DILI and no-DILI groups.
| Most-DILI | Number of DEGs | No-DILI | Number of DEGs |
|---|---|---|---|
| Acetaminophen | 2,280 | Caffeine | 2,316 |
| Diclofenac | 1,888 | Chloramphenicol | 108 |
| Isoniazid | 1,024 | Chlorpheniramine | 93 |
| Nimesulide | 1,697 | Hydroxyzine | 815 |
| Valproic acid | 2,290 | Theophylline | 2,918 |
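
The DEG thresholds in the caption above (absolute FC > 1.5 and FDR < 0.05) can be sketched as a simple filter. This is only an illustrative sketch, not the authors' pipeline: the `GeneResult` record, the `select_degs` helper, and the toy values are hypothetical, and the fold change is assumed to be a signed linear value (for example, -2.0 for a two-fold decrease).

```python
from typing import Iterable, List, NamedTuple


class GeneResult(NamedTuple):
    gene: str
    fold_change: float  # assumption: signed linear fold change (-2.0 = 2-fold down)
    fdr: float          # assumption: multiple-testing-adjusted p-value


def select_degs(results: Iterable[GeneResult],
                fc_cutoff: float = 1.5,
                fdr_cutoff: float = 0.05) -> List[str]:
    """Keep genes with |FC| above fc_cutoff and FDR below fdr_cutoff."""
    return [r.gene for r in results
            if abs(r.fold_change) > fc_cutoff and r.fdr < fdr_cutoff]


# Toy, made-up values purely for illustration (not data from the study).
toy = [
    GeneResult("GENE_A", 2.3, 0.001),   # passes both cutoffs
    GeneResult("GENE_B", -1.8, 0.03),   # passes both cutoffs (down-regulated)
    GeneResult("GENE_C", 1.2, 0.0001),  # fails the fold-change cutoff
    GeneResult("GENE_D", -3.0, 0.20),   # fails the FDR cutoff
]
print(select_degs(toy))
```

Both conditions must hold at once, so a strongly changed gene with a weak FDR (GENE_D in the toy data) is excluded.
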
Classification of genes without GO BP annotation and absent from the Reactome, MSigDB, OmniPath, and Pathway Commons databases (dark genes) modulated by compounds from most-DILI and no-DILI groups.
| Gene type | Array dark genes | Most-DILI | No-DILI | Dark DEGs |
|---|---|---|---|---|
| Protein coding | 1,756 | 444 | 278 | 567 |
| Antisense RNA | 527 | 69 | 33 | 78 |
| lincRNA | 722 | 53 | 33 | 63 |
| Processed transcript | 113 | 11 | 8 | 15 |
| Pseudogenes | 56 | 18 | 14 | 25 |
| snoRNA | 8 | 4 | 3 | 5 |
| Sense intronic | 25 | 3 | 1 | 3 |
| Sense overlapping | 10 | 1 | 1 | 2 |
| miRNA | 3 | 1 | 0 | 1 |
| TEC | 11 | 1 | 1 | 1 |
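
The definition used in this table (genes with no GO BP annotation that are also absent from Reactome, MSigDB, OmniPath, and Pathway Commons) amounts to a set difference. The sketch below illustrates that idea under stated assumptions: `find_dark_genes` is a hypothetical helper, and the gene names and database memberships are placeholders, not real annotation data.

```python
from typing import Dict, Iterable, Set


def find_dark_genes(array_genes: Iterable[str],
                    annotation_sources: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes that appear in none of the annotation sources."""
    annotated: Set[str] = set()
    for genes in annotation_sources.values():
        annotated |= set(genes)
    return set(array_genes) - annotated


# Hypothetical placeholder names and memberships, for illustration only.
array_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
sources = {
    "GO_BP": {"GENE_A"},
    "Reactome": {"GENE_A", "GENE_B"},
    "MSigDB": {"GENE_B"},
    "OmniPath": set(),
    "PathwayCommons": {"GENE_D"},
}
degs = {"GENE_B", "GENE_C", "GENE_D"}

dark = find_dark_genes(array_genes, sources)  # dark genes on the array
dark_degs = dark & degs                       # dark genes that are also DEGs
print(sorted(dark), sorted(dark_degs))
```

In this toy setup the "Array dark genes" column corresponds to `dark` and the "Dark DEGs" column to `dark_degs`.
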
FIGURE 1. Degree distribution of the dark genes in the human interactome databases BIANA, HIPPIE, INBIOMAP, and IntAct.
Degree of connectivity for top 10 genes in human protein–protein interaction databases.
| Gene symbol | Description | BIANA | HIPPIE | INBIOMAP | IntAct |
|---|---|---|---|---|---|
| | RNA binding motif protein 12 | 405 | 44 | 17 | 8 |
| | LPS responsive beige-like anchor protein | 402 | 28 | 16 | 9 |
| | Small glutamine rich tetratricopeptide repeat containing beta | 8 | 87 | 88 | 177 |
| | Transmembrane protein 25 | 3 | 85 | 77 | 2 |
| | Family with sequence similarity 189 member A2 | 10 | 78 | 75 | 10 |
| | Zinc finger CCHC-type containing 10 | 51 | 52 | 53 | 59 |
| | Chromosome 1 open reading frame 109 | 45 | 51 | 53 | 116 |
| | Tumor suppressing subtransferable candidate 4 | 13 | 68 | 59 | 17 |
| | WD repeat domain 26 | 3 | 79 | 51 | 33 |
| | Family with sequence similarity 90 member A1 | 33 | 49 | 50 | 122 |
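
A degree of connectivity like the one tabulated above can be derived from an interaction edge list. The following sketch assumes that degree means the number of distinct interaction partners and that self-interactions are ignored; the source does not spell out its exact definition, and `degree_by_gene` and the protein names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def degree_by_gene(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct interaction partners per gene in an undirected network."""
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # assumption: self-interactions do not count toward degree
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


# Made-up edges for illustration; note the duplicate P1-P2 pair and the self-loop.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P1"), ("P3", "P3"), ("P4", "P1")]
degrees = degree_by_gene(toy_edges)
print(sorted(degrees.items(), key=lambda kv: -kv[1]))
```

Because each database curates different interactions, running the same count on each source separately would naturally give different degrees for the same gene, as the table shows.
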
Top 10 genes associated with diseases in DisGeNET (curated data).
| Symbol | Description | Gene type | DILI risk group(s) | Number of diseases |
|---|---|---|---|---|
| | CAP-Gly domain containing linker protein 2 | Protein-coding | Most-DILI | 141 |
| | Imprinted in Prader-Willi syndrome (non-protein coding) | ncRNA | Most-DILI, no-DILI | 66 |
| | TDP-glucose 4,6-dehydratase | Protein-coding | Most-DILI | 62 |
| | LPS responsive beige-like anchor protein | Protein-coding | Most-DILI | 33 |
| | Alport syndrome, mental retardation, midface hypoplasia and elliptocytosis chromosomal region gene 1 | Protein-coding | Most-DILI, no-DILI | 27 |
| | Transmembrane protein 98 | Protein-coding | Most-DILI | 9 |
| | H19, imprinted maternally expressed transcript (non-protein coding) | ncRNA | Most-DILI | 7 |
| | Metastasis associated lung adenocarcinoma transcript 1 (non-protein coding) | ncRNA | Most-DILI | 7 |
| | WD repeat domain 11 | Protein-coding | Most-DILI | 6 |
| | Cardiomyopathy associated 5 | Protein-coding | no-DILI | 3 |
Top 10 dark genes by number of GeneRIFs with their corresponding number of publications indexed on PubTator.
| Symbol | Description | Gene type | DILI risk group(s) | GeneRIFs | Number of publications |
|---|---|---|---|---|---|
| | H19, imprinted maternally expressed transcript (non-protein coding) | ncRNA | Most-DILI | 193 | 1169 |
| | Metastasis associated lung adenocarcinoma transcript 1 (non-protein coding) | ncRNA | Most-DILI | 156 | 1203 |
| | microRNA 29c | ncRNA | Most-DILI | 77 | 234 |
| | Urothelial cancer associated 1 (non-protein coding) | ncRNA | Most-DILI, no-DILI | 63 | 152 |
| | Nuclear paraspeckle assembly transcript 1 (non-protein coding) | ncRNA | Most-DILI, no-DILI | 56 | 223 |
| | Pvt1 oncogene (non-protein coding) | ncRNA | Most-DILI | 56 | 182 |
| | Taurine up-regulated 1 (non-protein coding) | ncRNA | Most-DILI, no-DILI | 41 | 99 |
| | Microtubule associated scaffold protein 1 | Protein-coding | Most-DILI, no-DILI | 26 | 71 |
| | Transmembrane 4 L six family member 5 | Protein-coding | Most-DILI, no-DILI | 20 | 37 |
| | Family with sequence similarity 167 member A | Protein-coding | Most-DILI | 19 | 32 |
Overview of mapped dark genes based on transcriptional regulation (DoRothEA) and on signaling pathway signatures (PROGENy).
| Compound | Dark genes | Dark genes in DoRothEA | Number of mapped TFs | Dark genes in PROGENy | Number of mapped signaling pathways |
|---|---|---|---|---|---|
| Acetaminophen | 294 | 46 | 24 | 11 | 8 |
| Valproic acid | 330 | 51 | 28 | 11 | 6 |
| Isoniazid | 152 | 22 | 21 | 8 | 5 |
| Diclofenac | 221 | 32 | 26 | 10 | 7 |
| Nimesulide | 145 | 34 | 26 | 5 | 4 |
| | 732 | 115 | 40 | 29 | 10 |
| Theophylline | 326 | 48 | 29 | 16 | 6 |
| Caffeine | 271 | 47 | 30 | 16 | 7 |
| Hydroxyzine | 81 | 17 | 19 | 3 | 3 |
| Chloramphenicol | 6 | 2 | 4 | 0 | 0 |
| Chlorpheniramine | 7 | 1 | 1 | 1 | 1 |
| | 451 | 70 | 36 | 19 | 7 |
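
The two DoRothEA columns above ("Dark genes in DoRothEA" and "Number of mapped TFs") can be thought of as the result of overlapping a dark gene list with TF-to-target regulons. The sketch below is a hedged illustration of that overlap only: it does not call the DoRothEA resource, and `map_to_regulons`, the regulon dictionary, and all names in it are hypothetical placeholders.

```python
from typing import Dict, Iterable, Set, Tuple


def map_to_regulons(dark_genes: Iterable[str],
                    regulons: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (dark genes found as targets, TFs regulating at least one dark gene)."""
    dark = set(dark_genes)
    covered: Set[str] = set()
    mapped_tfs: Set[str] = set()
    for tf, targets in regulons.items():
        hits = dark & set(targets)
        if hits:
            covered |= hits
            mapped_tfs.add(tf)
    return covered, mapped_tfs


# Placeholder regulons (TF -> target genes), invented for illustration.
toy_regulons = {
    "TF_1": {"GENE_A", "GENE_X"},
    "TF_2": {"GENE_A", "GENE_B"},
    "TF_3": {"GENE_Y"},
}
covered, mapped_tfs = map_to_regulons(["GENE_A", "GENE_B", "GENE_C"], toy_regulons)
print(len(covered), len(mapped_tfs))
```

Since one gene can be a target of several TFs, the number of mapped TFs can exceed the number of mapped dark genes, as in the Chloramphenicol row (2 genes, 4 TFs).
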
FIGURE 2. Venn diagrams showing the intersection of transcription factors (TFs) and signaling pathways regulating at least one dark gene. The number accompanying each compound refers to the number of transcription factors and signaling pathways enriched by dark genes; the modules shared by all or most of the compounds are highlighted in the adjacent boxes. (A,B) Regulatory TFs of hepatotoxic and non-hepatotoxic compounds, respectively. (C,D) Regulatory signaling pathways of hepatotoxic and non-hepatotoxic compounds, respectively. No enriched signaling pathway was found for Chloramphenicol (absent in D).
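
The "shared by all or most of the compounds" selection described in this caption can be sketched as a count of how many compounds each TF (or pathway) appears in. This is an illustrative sketch with hypothetical names (`shared_across`, `TF_1`, `compound_1`, and so on); how many compounds counts as "most" is an assumption left to the `min_compounds` parameter.

```python
from collections import Counter
from typing import Dict, Optional, Set


def shared_across(items_by_compound: Dict[str, Set[str]],
                  min_compounds: Optional[int] = None) -> Set[str]:
    """Return items present in at least min_compounds sets (default: all of them)."""
    counts = Counter(item
                     for items in items_by_compound.values()
                     for item in set(items))
    needed = len(items_by_compound) if min_compounds is None else min_compounds
    return {item for item, n in counts.items() if n >= needed}


# Placeholder TF sets per compound, invented for illustration.
toy_tfs = {
    "compound_1": {"TF_1", "TF_2", "TF_3"},
    "compound_2": {"TF_1", "TF_2"},
    "compound_3": {"TF_1", "TF_4"},
}
print(sorted(shared_across(toy_tfs)))                   # shared by all compounds
print(sorted(shared_across(toy_tfs, min_compounds=2)))  # shared by at least two
```
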
Orthologs of human dark genes present in modules associated with pathologies in rats described by Sutherland et al. (2017).
| Module | Gene symbol | Pathology association | GO-BP |
|---|---|---|---|
| 13m | | Adverse at 29 days, Hematopoiesis | Complement activation; Inflammatory response, Leukocyte chemotaxis |
| 39 | | BDH | Extracellular matrix organization, Collagen fibril organization |
| 205 | | BDH, Adverse at 29 days | Cellular response to DNA damage stimulus, Signal transduction by p53 class mediator |
| 293 | | BDH | – |
| 55m | | Fibrosis, BDH, Necrosis | Membrane raft assembly, Regulation of cytoskeleton organization |
| 14m | | Hypertrophy | Protein folding, tRNA metabolic process |
| 10 | | Increased mitosis | Cell cycle, Mitotic cell cycle |
| 81 | | Increased mitosis, BDH | Actin polymerization or depolymerization |
| 70 | | Single cell necrosis | Cell cycle arrest |
| 309 | | Single cell necrosis | – |
| 147 | | Single cell necrosis | – |
| 27m | | Vacuolation | |
FIGURE 3. Association between H19, MALAT1, and MIR29C and liver-related disease phenotypes.