Alokkumar Jha1, Yasar Khan1, Muntazir Mehdi1, Md Rezaul Karim1, Qaiser Mehmood1, Achille Zappa1, Dietrich Rebholz-Schuhmann1, Ratnesh Sahay2.
Abstract
BACKGROUND: Next Generation Sequencing (NGS) plays a key role in therapeutic decision making for cancer prognosis and treatment. NGS technologies produce massive amounts of sequencing data, and these datasets are often published by separate, isolated sequencing facilities. Consequently, sharing and aggregating multisite sequencing datasets is thwarted by issues such as the need to discover relevant data from different sources, build scalable repositories, automate data linkage, handle the volume of the data, provide efficient querying mechanisms, and offer information-rich, intuitive visualisation.
Keywords: Biomarkers; Cancer genomics; Gynecological cancer; Linked data; Multi-Omics; Pathways; Semantic technologies
Year: 2017 PMID: 28927463 PMCID: PMC5606033 DOI: 10.1186/s13326-017-0146-9
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 Links between COSMIC and TCGA datasets
Fig. 2 Links between COSMIC, TCGA, REACTOME, KEGG, and GO datasets
Fig. 3 BIOOPENER: Linking & Querying Cancer Genomic Resources
RDF Data Statistics
| No. | Data | Triples | Subjects | Predicates | Objects | Size (MB) |
|---|---|---|---|---|---|---|
| 1 | COSMIC GE | 1184971624 | 148121454 | 18 | 148240680 | 10000 |
| 2 | COSMIC GM | 83275111 | 3620658 | 23 | 9004153 | 1400 |
| 3 | COSMIC CNV | 8633104 | 863332 | 10 | 921690 | 122 |
| 4 | COSMIC Methylation | 170300300 | 8292057 | 22 | 603135 | 2800 |
| 5 | TCGA-OV | 81188714 | 10974200 | 15 | 4774584 | 3774 |
| 6 | TCGA-CESC | 3763470 | 627652 | 43 | 481227 | 49557 |
| 7 | TCGA-UCEC | 553271744 | 19233824 | 91 | 68370614 | 84687 |
| 8 | TCGA-UCS | 1120873 | 183602 | 36 | 188970 | 10018 |
| 9 | KEGG | 50197150 | 6533307 | 141 | 6792319 | 4302 |
| 10 | REACTOME | 12471494 | 2465218 | 237 | 4218300 | 957 |
| 11 | GOA | 28058541 | 5950074 | 36 | 6575678 | 5858 |
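The triple counts above can be aggregated per source to get a feel for the overall size of the linked data. The following is a minimal sketch, assuming only the figures in the table; the names `RDF_STATS`, `source_of`, and `triples_by_source` are hypothetical and are not part of BIOOPENER.

```python
# Triple counts copied from the "RDF Data Statistics" table: (dataset, triples).
RDF_STATS = [
    ("COSMIC GE", 1184971624),
    ("COSMIC GM", 83275111),
    ("COSMIC CNV", 8633104),
    ("COSMIC Methylation", 170300300),
    ("TCGA-OV", 81188714),
    ("TCGA-CESC", 3763470),
    ("TCGA-UCEC", 553271744),
    ("TCGA-UCS", 1120873),
    ("KEGG", 50197150),
    ("REACTOME", 12471494),
    ("GOA", 28058541),
]


def source_of(dataset: str) -> str:
    """Hypothetical helper: 'COSMIC GE' -> 'COSMIC', 'TCGA-OV' -> 'TCGA'."""
    return dataset.split()[0].split("-")[0]


def triples_by_source(rows):
    """Sum the triple counts of all datasets that share a source."""
    totals = {}
    for dataset, triples in rows:
        source = source_of(dataset)
        totals[source] = totals.get(source, 0) + triples
    return totals


# Grand total over all eleven datasets in the table.
total_triples = sum(triples for _, triples in RDF_STATS)

if __name__ == "__main__":
    for source, triples in triples_by_source(RDF_STATS).items():
        print(f"{source}: {triples:,} triples")
    print(f"Total: {total_triples:,} triples")
```

Grouping by source prefix is only one possible view; the same rows could equally be grouped by cancer type or data type.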
Fig. 4 Example Links between COSMIC, TCGA, KEGG, REACTOME, and GO Datasets
Fig. 5 Link Statistics
Fig. 6 Tree-based two level source selection
Fig. 9 Linked annotations for MYH7 - COSMIC
Query Execution Time (QE=Query Execution)
| Query | QE Time (msec) | Results (No. of Triples) | Datasets |
|---|---|---|---|
| Listing 2 | 2110 | 21390 | (TCGA)(COSMIC) |
| Listing 3 | 5732 | 33264 | (TCGA)(COSMIC) |
| Listing 4 | 43092 | 63765 | (TCGA)(COSMIC)(GOA) |
| Listing 5 | 263463 | 232848 | (TCGA)(GOA)(REACTOME)(KEGG) |
| Listing 6 | 3481 | 25669 | (TCGA)(COSMIC) |
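To compare the queries on a common scale, the execution times can be converted to seconds and related to the number of results returned. This is a minimal sketch based only on the table values; `QUERY_TIMES` and `results_per_second` are hypothetical names introduced for illustration, not part of the reported system.

```python
# Rows copied from the "Query Execution Time" table:
# (query, execution time in msec, number of result triples, datasets touched).
QUERY_TIMES = [
    ("Listing 2", 2110, 21390, ("TCGA", "COSMIC")),
    ("Listing 3", 5732, 33264, ("TCGA", "COSMIC")),
    ("Listing 4", 43092, 63765, ("TCGA", "COSMIC", "GOA")),
    ("Listing 5", 263463, 232848, ("TCGA", "GOA", "REACTOME", "KEGG")),
    ("Listing 6", 3481, 25669, ("TCGA", "COSMIC")),
]


def results_per_second(msec: int, results: int) -> float:
    """Result triples returned per second of query execution."""
    return results / (msec / 1000.0)


# The query with the longest execution time in the table.
slowest = max(QUERY_TIMES, key=lambda row: row[1])

if __name__ == "__main__":
    for query, msec, results, datasets in QUERY_TIMES:
        rate = results_per_second(msec, results)
        print(f"{query}: {msec / 1000:.3f} s, {rate:.0f} results/s, "
              f"{len(datasets)} datasets")
    print(f"Slowest query: {slowest[0]}")
```

The rate is a rough descriptive figure only; it does not separate network, source selection, and join costs.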
Fig. 7 HBM: List of genes expressed in all tissues and highly expressed
Fig. 8 TCGA query output from cBIO Portal (Blue: Deletion, Red: Amplification, Green: Mutation, Brown: Multiple Alterations) [43]
Loci information for highly expressed genes in ovarian cancer from HBM 2.0
| Chr | Start-End | Mutation Type | Genes | PMID |
|---|---|---|---|---|
| 19 | 90910-715430 | GAIN | FGF22, RNF126, TG | 20668451 20668450 |
| 9 | 4069657-4684967 591967-608659 11090336-11098891 8009428-8015596 8109010-8121257 1373387-1383725 11090336-11098891 10547511-10547923 3113846-3134738 8115293-8121487 9269903-9294415 46587-510700 5106680-5106800 | LOSS/GAIN | LKB1, P16INK4A, TRAF2, XPA, PTCH1, FANCC, DMRT3, WNK2, C9orf89, SYK, CKS2, CTSL1, NTRK2, KIF27, PTPRD, TLE4, CEP78, GNAQ, PRKACG | 21062161 17311676 16585170 20668451 21781307 |
| 6 | 149661-384546 | LOSS | TAP1, NOL7, CD83, POUF3, MYH7, PLN, PKIB, PDSS2, OSTM1, NUS1, TG, NT5DC1, NR2E1, NKAIN2 | 21062161 20668451 21781307 20668451 21720365 |
| 5 | 15532-24132 | GAIN | TRIP13, TRIO, TARS, SUB1, SLC12A7, SKP2, SDHA, RPL37, MYH7, RNASEN, RAI14, RAD1, POLS, PDCD6, PAIP1, OSMR, NNT | 18559093 21062161 |
| 14 | 23857092-23886486 23857082-23886607 | LOSS | MYH6, MYH7, TG, ACTA1 | 18559093 21062161 |
Fig. 10 Promoter level methylation changes in biomarker genes
Fig. 11 Three pathways causing promoter changes in four gynecological cancer types