Tsung-Jung Wu1, Lynn M Schriml1, Qing-Rong Chen1, Maureen Colbert1, Daniel J Crichton1, Richard Finney1, Ying Hu1, Warren A Kibbe1, Heather Kincaid1, Daoud Meerzaman1, Elvira Mitraka1, Yang Pan1, Krista M Smith1, Sudhir Srivastava1, Sari Ward1, Cheng Yan1, Raja Mazumder2.
Abstract
Bio-ontologies provide terminologies for the scientific community to describe biomedical entities in a standardized manner. Multiple initiatives are developing biomedical terminologies to provide better annotation, data integration and mining capabilities. Terminology resources devised for different purposes inherently diverge in content and structure. A major obstacle to biomedical data integration is the proliferation of overlapping terms, ambiguous classifications and inconsistencies across databases and publications. The Disease Ontology (DO) was developed over the past decade to address data integration, standardization and annotation issues for human disease data. We have established a DO cancer project to provide a focused view of cancer terms within the DO. The DO cancer project mapped 386 cancer terms from the Catalogue of Somatic Mutations in Cancer (COSMIC), The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), Integrative Oncogenomics (IntOGen) and the Early Detection Research Network (EDRN) into a cohesive set of 187 DO terms represented by 63 top-level DO cancer terms. For example, the COSMIC term 'kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma' and the TCGA term 'Kidney renal clear cell carcinoma' were both grouped to the term 'Disease Ontology Identification (DOID):4467 / renal clear cell carcinoma', which was mapped to the TopNodes_DOcancerslim term 'DOID:263 / kidney cancer'. Mapping diverse cancer terms to DO and using top-level terms (DO slims) will enable pan-cancer analysis across datasets generated from any of the cancer term sources, where pan-cancer means including or relating to all or multiple types of cancer. The terms can be browsed from the DO web site (http://www.disease-ontology.org) and downloaded from the DO's Apache Subversion or GitHub repositories.
Database URL: http://www.disease-ontology.org
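The two-step grouping described in the abstract (source term to DO term, then DO term to TopNodes_DOcancerslim term) can be pictured as a pair of lookup tables. The following is a minimal sketch, not the project's actual tooling: the dictionary and function names (`SOURCE_TO_DO`, `DO_TO_SLIM`, `to_slim`) are hypothetical, and only the one example quoted in the abstract is encoded.

```python
# Hypothetical lookup tables; they hold only the example given in the abstract.
SOURCE_TO_DO = {
    ("COSMIC", "kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma"): "DOID:4467",
    ("TCGA", "Kidney renal clear cell carcinoma"): "DOID:4467",
}

# DO term -> TopNodes_DOcancerslim term.
DO_TO_SLIM = {"DOID:4467": "DOID:263"}

DO_LABELS = {
    "DOID:4467": "renal clear cell carcinoma",
    "DOID:263": "kidney cancer",
}


def to_slim(source, term):
    """Return (DO term, slim term) for a source term, or None if unmapped."""
    doid = SOURCE_TO_DO.get((source, term))
    if doid is None:
        return None
    # Assumption: a DO term without a slim entry acts as its own top node.
    return doid, DO_TO_SLIM.get(doid, doid)


cosmic = to_slim("COSMIC", "kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma")
tcga = to_slim("TCGA", "Kidney renal clear cell carcinoma")
print(cosmic, tcga)
```

Both source terms resolve to the same pair of identifiers, which is what allows records from COSMIC and TCGA to be pooled under 'kidney cancer'.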
Year: 2015 PMID: 25841438 PMCID: PMC4385274 DOI: 10.1093/database/bav032
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. DO cancer tree plot presenting the hierarchical tree structure of the system. The summarized terms (DOIDs, level 1), TopNodes_DOcancerslim terms (DOIDs, level 2) and child terms (DOIDs, level 3) are included in the tree, with DOID:162 / cancer as the root. When the same term is used at more than one level, only the highest level is plotted. Branches of summarized terms with more than five nodes are colored as shown. The top-level terms and child terms are available in Supplementary Table S1. The summarized terms are derived from the level under cell type cancer and organ system cancer of DOID:162 / cancer in DO.
Figure 2. DO cancer Circos plot showing the hierarchical structure of the system. All mapped subsumed terms (the innermost layer), TopNodes_DOcancerslim level terms (the middle layer) and child terms (the outermost layer) are plotted with the full DOIDs/terms listed. The top-level terms and child terms are available in Supplementary Table S1. The summarized terms are derived from the level under cell type cancer and organ system cancer of DOID:162 / cancer in DO.
Table 1. DO TopNodes_DOcancerslim terms (detailed mapping is available in Supplementary Table S1)
| DOID | DO cancer-slim term | Child nodes | Sources |
|---|---|---|---|
| DOID:2531 | Hematologic cancer | 20 | COSMIC,EDRN,ICGC,IntOGen,TARGET,TCGA |
| DOID:1319 | Brain cancer | 11 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:1324 | Lung cancer | 11 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:263 | Kidney cancer | 10 | COSMIC,EDRN,ICGC,IntOGen,TARGET,TCGA |
| DOID:1793 | Pancreatic cancer | 8 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:4159 | Skin cancer | 8 | COSMIC,EDRN,TCGA |
| DOID:184 | Bone cancer | 6 | COSMIC,EDRN,TARGET |
| DOID:0060119 | Pharynx cancer | 5 | COSMIC,EDRN,IntOGen |
| DOID:2394 | Ovarian cancer | 5 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:1612 | Breast cancer | 4 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:201 | Connective tissue cancer | 4 | COSMIC,ICGC |
| DOID:3070 | Malignant glioma | 4 | COSMIC,TCGA |
| DOID:363 | Uterine cancer | 4 | COSMIC,EDRN,IntOGen,TCGA |
| DOID:3953 | Adrenal gland cancer | 4 | COSMIC,TCGA |
| DOID:5041 | Esophageal cancer | 4 | COSMIC,EDRN,ICGC,TCGA |
| DOID:8850 | Salivary gland cancer | 4 | COSMIC |
| DOID:10155 | Intestinal cancer | 3 | COSMIC |
| DOID:10283 | Prostate cancer | 3 | COSMIC,EDRN,ICGC,TCGA |
| DOID:10534 | Stomach cancer | 3 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:11054 | Urinary bladder cancer | 3 | COSMIC,EDRN,ICGC,IntOGen,TCGA |
| DOID:1192 | Peripheral nervous system neoplasm | 3 | COSMIC,TARGET |
| DOID:1781 | Thyroid cancer | 3 | COSMIC,EDRN,ICGC,TCGA |
| DOID:3571 | Liver cancer | 3 | COSMIC,EDRN,ICGC,TCGA |
| DOID:4362 | Cervical cancer | 3 | COSMIC,EDRN,ICGC,TCGA |
| DOID:5672 | Large intestine cancer | 3 | COSMIC |
| DOID:119 | Vaginal cancer | 2 | COSMIC,EDRN |
| DOID:11934 | Head and neck cancer | 2 | COSMIC,TCGA |
| DOID:1993 | Rectum cancer | 2 | COSMIC,EDRN,TCGA |
| DOID:2174 | Ocular cancer | 2 | COSMIC |
| DOID:219 | Colon cancer | 2 | COSMIC,EDRN,IntOGen,TCGA |
| DOID:2596 | Larynx cancer | 2 | COSMIC,EDRN |
| DOID:2994 | Germ cell cancer | 2 | COSMIC |
| DOID:3119 | Gastrointestinal system cancer | 2 | COSMIC |
| DOID:3277 | Thymus cancer | 2 | COSMIC |
| DOID:4045 | Muscle cancer | 2 | COSMIC |
| DOID:8618 | Oral cavity cancer | 2 | COSMIC,EDRN,ICGC |
| DOID:0060073 | Lymphatic system cancer | 1 | COSMIC |
| DOID:10021 | Duodenum cancer | 1 | COSMIC |
| DOID:10153 | Ileum cancer | 1 | COSMIC |
| DOID:10811 | Nasal cavity cancer | 1 | COSMIC |
| DOID:1115 | Sarcoma | 1 | COSMIC |
| DOID:11239 | Appendix cancer | 1 | COSMIC |
| DOID:11615 | Penile cancer | 1 | COSMIC |
| DOID:11819 | Ureter cancer | 1 | COSMIC |
| DOID:11920 | Tracheal cancer | 1 | COSMIC |
| DOID:1245 | Vulva cancer | 1 | COSMIC |
| DOID:13499 | Jejunal cancer | 1 | COSMIC |
| DOID:170 | Endocrine gland cancer | 1 | COSMIC |
| DOID:1725 | Peritoneum cancer | 1 | COSMIC |
| DOID:175 | Vascular cancer | 1 | COSMIC |
| DOID:1790 | Malignant mesothelioma | 1 | EDRN |
| DOID:1909 | Melanoma | 1 | COSMIC |
| DOID:1964 | Fallopian tube cancer | 1 | COSMIC |
| DOID:2998 | Testicular cancer | 1 | EDRN |
| DOID:3121 | Gallbladder cancer | 1 | EDRN |
| DOID:3565 | Meningioma | 1 | COSMIC |
| DOID:3996 | Urinary system cancer | 1 | COSMIC |
| DOID:4606 | Bile duct cancer | 1 | COSMIC |
| DOID:5099 | Middle ear cancer | 1 | COSMIC |
| DOID:5559 | Mediastinal cancer | 1 | COSMIC |
| DOID:5612 | Spinal cancer | 1 | COSMIC |
| DOID:5875 | Retroperitoneal cancer | 1 | COSMIC |
| DOID:9917 | Pleural cancer | 1 | COSMIC |
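As a hedged illustration of how the table might be queried, the sketch below encodes a handful of its rows and selects the slim terms that a given source contributes to. The names `SLIM_ROWS` and `slims_covered_by` are hypothetical, and only a subset of rows is included: every row that lists TARGET, plus two others for contrast.

```python
# Subset of the table rows: (DOID, slim term, child nodes, sources).
SLIM_ROWS = [
    ("DOID:2531", "Hematologic cancer", 20, "COSMIC,EDRN,ICGC,IntOGen,TARGET,TCGA"),
    ("DOID:1319", "Brain cancer", 11, "COSMIC,EDRN,ICGC,IntOGen,TCGA"),
    ("DOID:263", "Kidney cancer", 10, "COSMIC,EDRN,ICGC,IntOGen,TARGET,TCGA"),
    ("DOID:184", "Bone cancer", 6, "COSMIC,EDRN,TARGET"),
    ("DOID:1192", "Peripheral nervous system neoplasm", 3, "COSMIC,TARGET"),
    ("DOID:1790", "Malignant mesothelioma", 1, "EDRN"),
]


def slims_covered_by(rows, required_sources):
    """Return DOIDs of slim terms whose source list includes every required source."""
    required = set(required_sources)
    return [
        doid
        for doid, _label, _children, sources in rows
        if required <= set(sources.split(","))
    ]


target_slims = slims_covered_by(SLIM_ROWS, {"TARGET"})
cosmic_and_tcga = slims_covered_by(SLIM_ROWS, {"COSMIC", "TCGA"})
print(target_slims)
print(cosmic_and_tcga)
```

Because the four TARGET rows are the only ones in the table that list TARGET, the first query gives the same answer it would on the full table; the second query is limited to the rows encoded here.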
Figure 3. An example showing a pan-cancer view of gene mutations mapped to DO cancer terms. (A) Six oncogenes were mapped to 110 DO terms. The bandwidth represents the number of unique SNVs found in that gene in different cancer types. (B) TopNodes_DOcancerslim display of the same analysis, which shows 46 cancer terms associated with mutations found in the six oncogenes. Overall, panel B displays a clearer view, and the summarization enables large-scale analysis of an entire set of oncogenes or tumor suppressors across multiple cancer types. DOID terms are available in Table 1. HGNC gene symbols are used to represent the cancer genes.
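The summarization from panel A to panel B can be sketched as a roll-up of per-term SNV sets into per-slim-term counts. This is one plausible reading, not the authors' documented procedure: taking the union of SNVs so that a variant seen under two child terms is counted once is an assumption, the gene name `GENE_A` and the SNV identifiers are invented placeholders, and `roll_up` is a hypothetical helper. The only mapping used is the DOID:4467 to DOID:263 example from the abstract.

```python
from collections import defaultdict

# DO term -> TopNodes_DOcancerslim term (example taken from the abstract).
DO_TO_SLIM = {"DOID:4467": "DOID:263"}


def roll_up(snvs_by_gene_term, do_to_slim):
    """Collapse {(gene, DO term): SNV set} into {(gene, slim term): unique SNV count}."""
    rolled = defaultdict(set)
    for (gene, doid), snvs in snvs_by_gene_term.items():
        # Assumption: unmapped terms are treated as their own top node.
        slim = do_to_slim.get(doid, doid)
        rolled[(gene, slim)] |= set(snvs)
    return {key: len(snvs) for key, snvs in rolled.items()}


# Placeholder input: invented SNV identifiers, not real data.
toy_input = {
    ("GENE_A", "DOID:4467"): {"snv1", "snv2"},
    ("GENE_A", "DOID:263"): {"snv2", "snv3"},
}
counts = roll_up(toy_input, DO_TO_SLIM)
print(counts)
```

In this toy input, two DO terms collapse into a single slim term with three unique SNVs, mirroring in miniature how 110 DO terms reduce to 46 slim terms in the figure.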