Allan Peter Davis, Cynthia Grondin Murphy, Robin Johnson, Jean M Lay, Kelley Lennon-Hopkins, Cynthia Saraceni-Richards, Daniela Sciaky, Benjamin L King, Michael C Rosenstein, Thomas C Wiegers, Carolyn J Mattingly.
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between environmental chemicals and gene products and their relationships to diseases. Chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature are integrated to generate expanded networks and predict many novel associations between different data types. CTD now contains over 15 million toxicogenomic relationships. To navigate this sea of data, we added several new features, including DiseaseComps (which finds comparable diseases that share toxicogenomic profiles), statistical scoring for inferred gene-disease and pathway-chemical relationships, filtering options for several tools to refine user analysis and our new Gene Set Enricher (which provides biological annotations that are enriched for gene sets). To improve data visualization, we added a Cytoscape Web view to our ChemComps feature, included color-coded interactions and created a 'slim list' for our MEDIC disease vocabulary (allowing diseases to be grouped for meta-analysis, visualization and better data management). CTD continues to promote interoperability with external databases by providing content and cross-links to their sites. Together, this wealth of expanded chemical-gene-disease data, combined with novel ways to analyze and view content, continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases.
Year: 2012 PMID: 23093600 PMCID: PMC3531134 DOI: 10.1093/nar/gks994
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Increase in CTD content from 2008 to 2012
| Data type | July 2012 | December 2010 | December 2008 |
|---|---|---|---|
| Curated data types | | | |
| Articles | 94 513 | 23 918 | 10 854 |
| Chemicals | 11 755 | 6217 | 4323 |
| Genes | 27 950 | 18 446 | 15 140 |
| Diseases | 5987 | 3703 | 3445 |
| Relationships | | | |
| Direct chemical–gene interactions | 599 182 | 283 976 | 147 285 |
| Direct gene–disease relationships | 23 395 | 12 505 | 7456 |
| Direct chemical–disease relationships | 176 627 | 9264 | 4181 |
| Inferred gene–disease relationships | 10 132 094 | 1 170 317 | 472 423 |
| Inferred chemical–disease relationships | 913 622 | 284 205 | 117 974 |
| Enriched chemical–GO relationships | 2 221 348 | 1 166 669 | n/a |
| Enriched chemical–pathway relationships | 211 782 | 213 261 | n/a |
| Inferred disease–pathway relationships | 46 912 | 24 258 | n/a |
| Gene–GO annotations | 807 848 | 855 215 | 685 781 |
| Gene–pathway annotations | 63 393 | 55 912 | 45 795 |
| Inferred disease–GO relationships | 465 797 | 229 810 | n/a |
| Total relationships | 15 662 000 | 4 305 392 | 1 480 895 |
a, Imported from external databases.
n/a, not available.
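As a quick consistency check, the relationship rows in the table can be summed and compared with the reported totals. The sketch below is illustrative only: the variable and function names are not part of CTD, the figures are transcribed by hand from the table, and "n/a" entries are assumed to contribute zero.

```python
# Relationship counts transcribed from the table above.
# Columns: July 2012, December 2010, December 2008; None marks "n/a".
relationships = {
    "Direct chemical-gene interactions": (599_182, 283_976, 147_285),
    "Direct gene-disease relationships": (23_395, 12_505, 7_456),
    "Direct chemical-disease relationships": (176_627, 9_264, 4_181),
    "Inferred gene-disease relationships": (10_132_094, 1_170_317, 472_423),
    "Inferred chemical-disease relationships": (913_622, 284_205, 117_974),
    "Enriched chemical-GO relationships": (2_221_348, 1_166_669, None),
    "Enriched chemical-pathway relationships": (211_782, 213_261, None),
    "Inferred disease-pathway relationships": (46_912, 24_258, None),
    "Gene-GO annotations": (807_848, 855_215, 685_781),
    "Gene-pathway annotations": (63_393, 55_912, 45_795),
    "Inferred disease-GO relationships": (465_797, 229_810, None),
}

# "Total relationships" row as reported in the table.
reported_totals = (15_662_000, 4_305_392, 1_480_895)


def column_totals(rows):
    """Sum each release column, counting n/a (None) as zero (an assumption)."""
    return tuple(sum(value or 0 for value in column) for column in zip(*rows.values()))


totals = column_totals(relationships)
fold_growth = totals[0] / totals[2]  # July 2012 relative to December 2008

print("computed totals:", totals)
print("matches reported totals:", totals == reported_totals)
print(f"growth 2008 to 2012: {fold_growth:.1f}-fold")
```

Under these assumptions the computed sums agree with the reported totals for all three releases.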
CTD’s links to external databases
| CTD page | Links to | Linking URL |
|---|---|---|
| Chemical | CCRIS | |
| | ChEBI | |
| | ChemIDplus | |
| | DrugBank | |
| | GENE-TOX | |
| | Household products DB | |
| | Hazardous substance DB | |
| | MeSH | |
| | PubChem | |
| | TOXLINE | |
| Gene | NCBI gene | |
| | UniProt | |
| | PharmGKB | |
| | WikiGenes | |
| Disease | MeSH | |
| | OMIM | |
| Organism | NCBI taxonomy | |
| Gene ontology | AmiGO | |
| | MGI | |
| | QuickGO | |
| | RGD | |
| | WormBase | |
| Pathway | KEGG | |
| | Reactome | |
| Reference | PubMed | |
| | DOI | |
Databases using CTD content or providing links to CTD
| Database | Description | Database URL |
|---|---|---|
| AutismKB | Autism knowledgebase | |
| BIAdb | Benzylisoquinoline alkaloids database | |
| BioGraph | Biomedical knowledge discovery server | |
| BioXM | BioXM™ Knowledge Management Environment | |
| BPAGenomics | Bisphenol A genomics data portal | |
| CancerResource | Cancer-related database | |
| Chem2Bio2RDF | Semantic system for chemical biology | |
| ChemIDplus | Chemical dictionary and structure database | |
| ChemProt | Annotated and predicted chemical–protein interactions | |
| ChemSpider | Chemical structures and property predictions | |
| DDSS | Drug Discovery and Diagnostic Support System | |
| GAD | Genetics Association Database | |
| Galaxy | Web-based platform for biomedical data analysis | |
| GeneSetDB | Meta-database integrating human disease and pharmacology | |
| GeneWeaver | Integrates functional genomics experiments | |
| GPSy | Gene Prioritization SYstem that prioritizes genes for functional analyses | |
| Harvester Portal | Aggregate portal of scientific sites | |
| HOMER | Human Organ-specific Molecular Electronic Repository | |
| MIRIAM | Pharmacogenomics data collections | |
| NCBI Gene | Gene LinkOuts | |
| PharmDB | Pharmacological network database | |
| PharmGKB | PharmacoGenomics KnowledgeBase | |
| PhenoHM | Human-Mouse comparative phenome-genome server | |
| PPDB | Pathogenic Pathway Database for Periodontitis | |
| PubChem | Database of chemical molecules | |
| Reactome | Pathway database | |
| RefGene | Index of genes and antibodies | |
| RGD | Rat Genome Database disease and pathway portals | |
| STITCH | Search Tool for InTeractions of CHemicals | |
| T3DB | Toxin, Toxin-Target Database | |
| ToppGene | Portal of gene information | |
| TOXLINE | Toxicology literature online | |
| TOXNET | Toxicology data network | |
| UCSC | UCSC genome browser | |
| UniProt | Universal Protein Resource | |
| WENDI | Web Engine for Non-obvious Drug Information | |
| WhichGenes | Gene-set building portal |
Figure 1. DiseaseComps finds similar disorders. CTD’s Disease page for autistic disorders contains a ‘DiseaseComps’ data tab that lets users see similar disorders based on shared chemicals or shared genes, via either marker/mechanism or therapeutic relationships. Users can toggle open any of the different representations of the comparable diseases, as shown here for ‘via gene marker/mechanism associations’. In addition to intuitive disorders such as intellectual disability and schizophrenia (the top two comparable diseases identified), the tool shows that autism shares many genes with non-obvious diseases (red boxes) such as prostatic neoplasms (30 genes), lung neoplasms (17 genes), hypotension (12 genes), obesity (10 genes) and hypertension (11 genes). Clicking on the hyperlinked gene count in the right-hand column opens another window listing the common interacting genes. The Similarity Index is derived from the Jaccard similarity coefficient (22).
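The Jaccard similarity coefficient mentioned in the caption is the size of the intersection of two sets divided by the size of their union. The sketch below illustrates the coefficient itself; the gene identifiers are placeholders rather than CTD data, and because the caption says only that the Similarity Index is "derived from" the coefficient, CTD's exact scaling is not assumed here.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient |A intersect B| / |A union B|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Placeholder gene sets for two diseases (hypothetical identifiers, not CTD data).
disease_x_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
disease_y_genes = {"GENE_C", "GENE_D", "GENE_E"}

shared = disease_x_genes & disease_y_genes
similarity = jaccard(disease_x_genes, disease_y_genes)

# Two shared genes out of five distinct genes overall.
print(sorted(shared), similarity)
```

Ranking candidate diseases by this value, highest first, would give an ordering comparable in spirit to the DiseaseComps list.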
Figure 2. Filtering GeneComps by type of interaction. CTD users can now filter ChemComps and GeneComps based on the direction and type of interaction, as shown here for gene HMOX1. The panel on the left displays other genes that are comparable to HMOX1 based on filtering for chemicals that increase the expression of the genes (red lariat). The panel on the right, however, produces a different set of comparable genes to HMOX1 based on chemicals that decrease the expression of genes (green lariat). Users can also filter for activity, binding or all (unfiltered) interaction types.
Figure 3. Enrichment analysis of genes in chemical inference networks. CTD’s Chemical page for the nerve agent Soman has the ‘Diseases’ data tab highlighted, listing the diseases to which Soman can be linked (either directly or by an inferred network of genes). By clicking the ‘GO’ button under the ‘Enrichment Analysis’ column for the first listed disease (Seizures), the tool automatically sends the 14 genes listed in the ‘Inference Network’ column (red dashed box) to the Gene Set Enricher tool (red arrow). The results (red inset box) include 84 enriched GO terms associated with these 14 genes. The list can be further revised by selecting corrected versus raw P-values, changing the P-value threshold itself and filtering the results for any of the three GO branches. Similar analysis can be performed for Pathway annotations by clicking the ‘Pathway’ button under the ‘Enrichment Analysis’ column.
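The caption does not state which statistical test the Gene Set Enricher applies. As a rough illustration of how raw and corrected P-values for an enriched term might be obtained, the sketch below assumes a one-sided hypergeometric test with Bonferroni correction, a common choice for gene set enrichment. All function names and the example counts other than the 14 genes and 84 terms are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when drawing `sample` genes from `population`,
    of which `annotated` carry the term (one-sided hypergeometric test)."""
    total = comb(population, sample)
    upper = min(sample, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected P-value, capped at 1.0."""
    return min(1.0, p_value * n_tests)


def is_enriched(p_value: float, n_tests: int, threshold: float = 0.01,
                corrected: bool = True) -> bool:
    """Apply the threshold to either the corrected or the raw P-value."""
    value = bonferroni(p_value, n_tests) if corrected else p_value
    return value <= threshold


# Hypothetical numbers: a 14-gene set, 3 of which carry a term that annotates
# 200 of 20 000 background genes; 84 terms tested in total.
raw = hypergeom_pvalue(population=20_000, annotated=200, sample=14, hits=3)
print(raw, bonferroni(raw, 84), is_enriched(raw, 84))
```

Switching `corrected` to `False` or changing `threshold` mirrors the options the caption describes for revising the result list.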
Figure 4. New visualization at CTD. (a) Manually curated interactions are now color-coded on web pages to rapidly discern between statements that describe an ‘increased’ interaction (red font), a ‘decreased’ interaction (green font) or one in which the directionality is not specified (brown font). (b) The ‘ChemComps’ data tab on a CTD Chemical page provides the option to visualize networks of common interacting genes for the top 10 ranked comparable chemicals using a web version of Cytoscape. The chemicals that form the ChemComps are depicted as blue triangles and the connecting genes are green nodes. The map is customizable by the user (data not shown). For larger networks, XGMML files can be downloaded and used on a desktop platform of Cytoscape (inset).
Figure 5. CTD disease landscape. CTD currently contains over 11 million disease relationships (both direct and inferred) for 5987 unique diseases. MEDIC-Slim reduces the complexity of this information into 36 generic disease categories (y-axis) to show the overall landscape of disease information at CTD for both direct relationships (blue bars) and inferred relationships (yellow bars), as a percentage of the total number of relationships.
Figure 6. MEDIC-Slim adds functionality, reduces complexity of disease information and eases data management. CTD biocurators use the MEDIC disease vocabulary to curate disease relationships. These MEDIC diseases are now mapped to 36 MEDIC-Slim generic disease categories, which help reduce complexity and add the functionality of allowing users to easily retrieve and manage the information. Under its ‘Diseases’ data tab, the chemical bisphenol A is associated with 1965 diseases (red box). This data set can be filtered for any of the 36 MEDIC-Slim categories from a pick-list, such as ‘Cardiovascular disease’ (red circle), to retrieve only the 188 cardiovascular diseases associated with bisphenol A (red arrow).
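The grouping and filtering that MEDIC-Slim enables can be pictured as a mapping from specific disease terms to generic categories. The sketch below is a minimal illustration under stated assumptions: the mapping is a hand-made toy, every category name other than ‘Cardiovascular disease’ is hypothetical, and a disease is assumed to be allowed more than one category. It does not reproduce CTD's actual vocabulary or data files.

```python
from collections import Counter

# Toy mapping from disease term to slim categories (hypothetical apart from
# the 'Cardiovascular disease' category named in the caption).
slim_map = {
    "Hypertension": {"Cardiovascular disease"},
    "Hypotension": {"Cardiovascular disease"},
    "Obesity": {"Category B"},
    "Prostatic Neoplasms": {"Category C", "Category D"},
}


def filter_by_slim(diseases, category, mapping=slim_map):
    """Keep only the diseases mapped to the given slim category."""
    return [d for d in diseases if category in mapping.get(d, set())]


def category_percentages(diseases, mapping=slim_map):
    """Share of category assignments per slim category, as percentages."""
    counts = Counter(c for d in diseases for c in mapping.get(d, set()))
    total = sum(counts.values())
    return {c: 100 * n / total for c, n in counts.items()} if total else {}


# Diseases associated with some chemical (placeholder list).
associated = ["Hypertension", "Obesity", "Hypotension", "Prostatic Neoplasms"]

print(filter_by_slim(associated, "Cardiovascular disease"))
print(category_percentages(associated))
```

The first call corresponds to choosing a category from the pick-list; the second gives the kind of per-category percentage that a landscape chart summarizes.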