
DiseaseComps: a metric that discovers similar diseases based upon common toxicogenomic profiles at CTD.

Allan Peter Davis, Michael C Rosenstein, Thomas Conrad Wiegers, Carolyn J Mattingly.

Abstract

The Comparative Toxicogenomics Database (CTD) is a free resource that describes chemical-gene-disease networks to help understand the effects of environmental exposures on human health. The database contains more than 13,500 chemical-disease and approximately 14,200 gene-disease interactions. In CTD, chemicals and genes are associated with a disease via two types of relationships: as a biomarker or molecular mechanism for the disease (M-type) or as a real or putative therapy for the disease (T-type). We leveraged these curated datasets to compute similarity indices that can be used to produce lists of comparable diseases ("DiseaseComps") based upon shared toxicogenomic profiles. This new metric groups diseases by common molecular characteristics, rather than by the traditional approach of using histology or tissue of origin to define the disorder. In the dawning era of "personalized medicine", this feature provides a new way to view and describe diseases and will help develop testable hypotheses about chemical-gene-disease networks. AVAILABILITY: The database is available for free at http://ctd.mdibl.org/

Keywords:  chemical; curation; database; disease; gene

Year:  2011        PMID: 22125387      PMCID: PMC3220301          DOI: 10.6026/97320630007154

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding of the effects of environmental chemicals on human health [1]. CTD biocurators manually curate interactions from the scientific literature in a structured format using controlled vocabularies for chemicals, genes, diseases, molecular interactions, and organisms [2]. These datasets can be used to explore relationships and to generate novel, testable hypotheses about chemical-gene-disease pathways. Analyses of the human disease network strive to categorize diseases with respect to common genes and molecular pathways, enabling hypotheses about shared or predisposing co-disorders and putative genetic susceptibilities [3]. Disease names, especially those of cancers, were often originally based upon histology (e.g., carcinoma) or tissue of origin (e.g., liver cancer). With the advent of molecular genotyping, however, many cancers are now being further differentiated by their unique molecular signatures (e.g., HER2-positive vs. HER2-negative breast cancer). Bioinformatics analyses of human disease networks have also discovered sets of interacting proteins and molecular pathways often shared by multiple diseases [4]. This shift to analyzing, classifying, and describing diseases from a molecular perspective can help discover new therapeutic approaches not previously considered. For example, shared molecular connections between diabetes and dementia are now fueling research into the possible use of insulin to treat Alzheimer disease [5]. Personalized medicine (which seeks to improve a drug's therapeutic potential as well as minimize its side-effects) depends upon understanding the unique molecular profile of the patient and their disease [6]. Genes alone, however, are not responsible for all complex diseases, since the environment also plays an important role [7].
Thus CTD, which integrates datasets for all three components (environmental chemicals, genes, and diseases), can be uniquely leveraged to further advance hypotheses about human disorders. Discovering analogous diseases (based upon their shared toxicogenomic portrait) could promote alternative methods for classifying diseases that move beyond standard histological techniques and towards a molecular basis. Here we report a new bioinformatics approach to discovering analogous diseases based upon shared chemical and/or gene relationship profiles in CTD, which we call DiseaseComps (for comparable diseases). This metric parallels our previous implementation of GeneComps and ChemComps, which organized analogous genes and chemicals, respectively, based upon common toxicogenomic interactions [8].

Methodology:

CTD biocurators manually curate the literature to capture chemical-disease and gene-disease relationships [2]. A chemical or gene can have an M-type relationship (wherein the molecule either acts as a biomarker or plays a role in the molecular mechanism of the disease) or a T-type relationship (wherein the molecule is described as either a real or putative therapeutic for the disease). Here we used the data available in CTD in September 2011, which included 13,530 chemical-disease relationships (for 2,652 chemicals and 1,180 diseases) and 14,173 gene-disease relationships (for 5,470 genes and 4,149 diseases). Similarity indices were computed using a modification of the Jaccard index, whose value ranges between 0 and 1 [8]. DiseaseComps are delineated by the type of relationship (M- or T-type). For example, chemicals with a T-type relationship to a disease can be used to find comparable diseases wherein the chemicals have the same T-type relationship. Six types of DiseaseComps are generated: (1) via all chemical relationships (M- and T-type combined), (2) via only chemical M-type relationships, (3) via only chemical T-type relationships, (4) via all gene relationships (M- and T-type combined), (5) via only gene M-type relationships, and (6) via only gene T-type relationships.
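
The six profile types can be illustrated with a minimal Python sketch. It assumes the plain Jaccard index (size of the intersection divided by size of the union) as a stand-in, because the modification that CTD applies is described in [8] rather than here. The toy records, the molecule and disease names, and the helper names `profile`, `jaccard`, and `six_indices` are all hypothetical and are not CTD data or CTD code.

```python
from typing import NamedTuple

class Relationship(NamedTuple):
    molecule: str   # chemical or gene name
    kind: str       # "chemical" or "gene"
    rel_type: str   # "M" (marker/mechanism) or "T" (therapeutic)
    disease: str

# Hypothetical toy records for illustration only, not actual CTD curation.
RECORDS = [
    Relationship("GENE_A", "gene", "M", "disease_1"),
    Relationship("GENE_B", "gene", "M", "disease_1"),
    Relationship("GENE_C", "gene", "T", "disease_1"),
    Relationship("GENE_A", "gene", "M", "disease_2"),
    Relationship("GENE_C", "gene", "M", "disease_2"),
    Relationship("CHEM_X", "chemical", "T", "disease_1"),
    Relationship("CHEM_X", "chemical", "T", "disease_2"),
    Relationship("CHEM_Y", "chemical", "M", "disease_2"),
]

def profile(records, disease, kind, rel_type=None):
    """Molecules of one kind linked to a disease.

    rel_type=None pools M- and T-type relationships; this pooling rule
    is an assumption about how the "combined" profiles are built.
    """
    return {
        r.molecule
        for r in records
        if r.disease == disease
        and r.kind == kind
        and (rel_type is None or r.rel_type == rel_type)
    }

def jaccard(a, b):
    """Plain (unmodified) Jaccard index; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def six_indices(records, d1, d2):
    """One similarity value per profile type: {chemical, gene} x {M+T, M, T}."""
    out = {}
    for kind in ("chemical", "gene"):
        for rel_type in (None, "M", "T"):
            label = f"{kind}:{rel_type or 'M+T'}"
            out[label] = jaccard(
                profile(records, d1, kind, rel_type),
                profile(records, d2, kind, rel_type),
            )
    return out

indices = six_indices(RECORDS, "disease_1", "disease_2")
for label, value in indices.items():
    print(f"{label}: {value:.2f}")
```

In this toy data the two diseases are identical on the chemical T-type profile (index 1.0) but only weakly similar on the gene M-type profile (one shared gene out of three), which shows why the six lists can rank the same pair of diseases differently.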

Utility:

CTD computes values that reflect the degree of similarity between the molecular interaction profiles of each curated disease and generates a list of DiseaseComps, delineated by the six possible types of relationships. DiseaseComps provide a simple approach to view diseases that share common molecular interactions, allowing disorders to be classified in a novel manner without regard to histology or tissue of origin. Every curated disease in CTD now includes a DiseaseComps data tab that lists the top 20 comparable diseases based upon their ranked similarity index. For example, the disease schizophrenia has 160 genes curated with an M-type relationship in CTD. DiseaseComps identifies the top comparable diseases for schizophrenia that share the greatest number of those 160 M-type genes to produce a ranked list that includes bipolar disorder, autism, and psychoses, as well as non-intuitive diseases such as breast, lung, and colorectal cancers (Figure 1a), suggesting that schizophrenia shares many of the same molecular networks common to some cancers. The shared genes can be viewed by clicking on the hyperlinked numeral in the “Common Interacting Gene” column (Figure 1b). This new CTD metric provides researchers with additional predictive information that can help construct novel, testable hypotheses about the mechanisms (and potentially treatments or targets) underlying schizophrenia based upon its shared molecular profile with other diseases.
Figure 1

(a) DiseaseComps (orange tab) for schizophrenia via gene marker/mechanism (M-type) relationships include familiar diseases such as bipolar disorder, autism, and psychoses (ranked #1-3), but also reveal non-intuitive diseases like breast cancer (ranked #6), which shares ten M-type genes with schizophrenia (red circle). (b) The common curated genes can be viewed by clicking on the hyperlinked numeral (red circle) to produce a Venn diagram and a list of the ten genes shared between schizophrenia and breast cancer.
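
The ranked list and the shared-gene view described above can be approximated with a short Python sketch. It again assumes the plain Jaccard index rather than CTD's modified version, and it assumes that diseases with no shared genes are left off the list. The function name `rank_comps`, the tie-breaking rule, and the toy profiles are hypothetical; only the default of 20 entries comes from the text.

```python
def rank_comps(profiles, query, top_n=20):
    """Rank other diseases by similarity to the query disease.

    profiles maps a disease name to its set of molecules (for example,
    its M-type genes). Returns up to top_n tuples of
    (disease, index, sorted shared molecules).
    """
    query_set = profiles[query]
    rows = []
    for disease, molecules in profiles.items():
        if disease == query:
            continue
        common = query_set & molecules
        if not common:
            continue  # assumption: diseases sharing nothing are not listed
        index = len(common) / len(query_set | molecules)
        rows.append((disease, index, sorted(common)))
    # Highest index first; ties broken alphabetically (an arbitrary choice).
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows[:top_n]

# Hypothetical toy M-type gene profiles, not CTD data.
PROFILES = {
    "query_disease": {"G1", "G2", "G3", "G4"},
    "disease_a": {"G1", "G2", "G3"},
    "disease_b": {"G1", "G5", "G6"},
    "disease_c": {"G2", "G3", "G7", "G8"},
    "disease_d": {"G9"},
}

ranked = rank_comps(PROFILES, "query_disease")
for disease, index, common in ranked:
    print(f"{disease}\t{index:.3f}\t{len(common)} shared: {', '.join(common)}")
```

The third element of each tuple plays the role of the list behind the hyperlinked numeral: it names the molecules the two diseases have in common.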

Future development:

DiseaseComps find similar diseases based upon shared chemical or gene relationships. The algorithm can be reversed to find similar genes (or chemicals) based upon shared diseases, a feature we hope to add soon to our already established GeneComps and ChemComps data tabs at CTD [8]. CTD is also expanding its content via the targeted curation of over 50,000 additional toxicology publications selected for four disease areas (cardiovascular, renal, hepatic, and neurological disorders). This will provide significantly more data as input for DiseaseComps calculations.
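
One way to picture the reversed calculation is to invert the disease-to-gene mapping and compare genes by the diseases they share. The Python sketch below is only an illustration of that idea under the same plain-Jaccard assumption; it is not the planned CTD feature, and the names `invert`, `jaccard`, and the toy mapping are hypothetical.

```python
from collections import defaultdict

def invert(disease_to_genes):
    """Turn {disease: {genes}} into {gene: {diseases}}."""
    gene_to_diseases = defaultdict(set)
    for disease, genes in disease_to_genes.items():
        for gene in genes:
            gene_to_diseases[gene].add(disease)
    return dict(gene_to_diseases)

def jaccard(a, b):
    """Plain Jaccard index; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical toy mapping, not CTD data.
DISEASE_TO_GENES = {
    "disease_1": {"G1", "G2"},
    "disease_2": {"G1", "G2", "G3"},
    "disease_3": {"G3"},
}

gene_to_diseases = invert(DISEASE_TO_GENES)

# G1 and G2 are linked to exactly the same diseases; G3 overlaps with G1 only partly.
print(jaccard(gene_to_diseases["G1"], gene_to_diseases["G2"]))
print(jaccard(gene_to_diseases["G1"], gene_to_diseases["G3"]))
```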

References:

1.  The human disease network.

Authors:  Kwang-Il Goh; Michael E Cusick; David Valle; Barton Childs; Marc Vidal; Albert-László Barabási
Journal:  Proc Natl Acad Sci U S A       Date:  2007-05-14       Impact factor: 11.205

Review 2.  Interactome networks and human disease.

Authors:  Marc Vidal; Michael E Cusick; Albert-László Barabási
Journal:  Cell       Date:  2011-03-18       Impact factor: 41.582

3.  GeneComps and ChemComps: a new CTD metric to identify genes and chemicals with shared toxicogenomic profiles.

Authors:  Allan Peter Davis; Cynthia G Murphy; Cynthia A Saraceni-Richards; Michael C Rosenstein; Thomas C Wiegers; Thomas H Hampton; Carolyn J Mattingly
Journal:  Bioinformation       Date:  2009-10-15

4.  The Comparative Toxicogenomics Database: update 2011.

Authors:  Allan Peter Davis; Benjamin L King; Susan Mockus; Cynthia G Murphy; Cynthia Saraceni-Richards; Michael Rosenstein; Thomas Wiegers; Carolyn J Mattingly
Journal:  Nucleic Acids Res       Date:  2010-09-22       Impact factor: 16.971

Review 5.  Personalized medicine: new genomics, old lessons.

Authors:  Kenneth Offit
Journal:  Hum Genet       Date:  2011-06-26       Impact factor: 4.132

6.  The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database.

Authors:  Allan Peter Davis; Thomas C Wiegers; Michael C Rosenstein; Cynthia G Murphy; Carolyn J Mattingly
Journal:  Database (Oxford)       Date:  2011-09-20       Impact factor: 3.451

7.  Genetic and environmental pathways to complex diseases.

Authors:  Julia M Gohlke; Reuben Thomas; Yonqing Zhang; Michael C Rosenstein; Allan P Davis; Cynthia Murphy; Kevin G Becker; Carolyn J Mattingly; Christopher J Portier
Journal:  BMC Syst Biol       Date:  2009-05-05

