Po-Jung Huang1,2, Ling-Ya Chiu3, Chi-Ching Lee2,4, Yuan-Ming Yeh2,3, Kuo-Yang Huang5, Cheng-Hsun Chiu2,6, Petrus Tang3,6.
Abstract
Cancer is a genetic disease caused by somatic mutations; however, understanding of the biological processes that generate these mutations is limited. A cancer genome bears the cumulative effects of the mutational processes active during tumor development. Deciphering mutational signatures is a new topic in cancer research. The Wellcome Trust Sanger Institute (WTSI) has categorized 30 reference signatures in the COSMIC database based on analyses of ∼10 000 sequencing datasets from TCGA and ICGC. Large cohorts and bioinformatics skills are required to perform the same analysis as WTSI. Quantifying known signatures in custom cohorts is not possible under the current framework of the COSMIC database, which motivated us to construct a database of mutational signatures in cancers and make such analyses more accessible to general researchers. mSignatureDB (http://tardis.cgu.edu.tw/msignaturedb) integrates R packages and in-house scripts to determine the contributions of the published signatures in 15 780 individual tumors from 73 TCGA/ICGC cancer projects, making it possible to compare signature patterns within and between projects. mSignatureDB also allows users to perform signature analysis on their own datasets and to quantify signature contributions at sample resolution, a feature not available in other related databases.
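To make "quantifying contributions of signatures at sample resolution" concrete, the following is a minimal sketch, not the method used by mSignatureDB or by the R packages it integrates. It assumes that a tumor's mutation profile is approximately a non-negative mixture of known reference signatures and estimates the mixture weights with multiplicative non-negative least-squares updates. The function name `fit_signatures` and the six-channel toy signatures are hypothetical; COSMIC reference signatures are defined over 96 trinucleotide channels.

```python
def fit_signatures(profile, signatures, iterations=2000, eps=1e-12):
    """Estimate non-negative weights of known signatures in one tumor profile.

    profile:    mutation counts (or fractions), one value per mutation channel
    signatures: list of reference signatures, each a list of channel probabilities
    Returns the weights normalized to sum to 1.
    Assumption: multiplicative non-negative least-squares updates, chosen for
    illustration only; this is not the algorithm of any specific package.
    """
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must contain at least one mutation")
    target = [x / total for x in profile]
    n, k = len(target), len(signatures)

    # Precompute S^T m and S^T S, where the rows of S are the signatures.
    stm = [sum(sig[c] * target[c] for c in range(n)) for sig in signatures]
    sts = [[sum(a[c] * b[c] for c in range(n)) for b in signatures]
           for a in signatures]

    weights = [1.0 / k] * k
    for _ in range(iterations):
        # Simultaneous update: w_i <- w_i * (S^T m)_i / (S^T S w)_i
        denominators = [sum(sts[i][j] * weights[j] for j in range(k)) + eps
                        for i in range(k)]
        weights = [weights[i] * stm[i] / denominators[i] for i in range(k)]

    scale = sum(weights)
    return [w / scale for w in weights]


# Toy example with two invented six-channel signatures (illustrative only).
sig_a = [0.70, 0.10, 0.05, 0.05, 0.05, 0.05]
sig_b = [0.05, 0.05, 0.10, 0.10, 0.60, 0.10]
tumor = [200 * (0.3 * a + 0.7 * b) for a, b in zip(sig_a, sig_b)]
contributions = fit_signatures(tumor, [sig_a, sig_b])
```

In this toy case the profile is an exact 30/70 mixture, so the estimated contributions should come out close to those proportions. Real profiles are noisy, and published tools typically add steps such as discarding signatures with very small weights.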
Year: 2018 PMID: 29145625 PMCID: PMC5753213 DOI: 10.1093/nar/gkx1133
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of mSignatureDB. Somatic mutation profiles were gathered from TCGA/ICGC large-scale genomics studies. mSignatureDB comprises four components: (i) browse; (ii) search; (iii) analysis and (iv) download. On the ‘Browse’ page, the landscapes of mutational signatures can be inspected by cancer project, primary site or country. Users can search the database by cancer project name. A hierarchically clustered heatmap reveals the dominant signatures in a cancer project according to the contribution of each signature. Because mutations are displayed both by substitution type and along a reference genome, users can easily identify dominant mutation types and localized mutation hotspots. The signature profiles and the clinical associations can be downloaded from the ‘Download’ page. Web interfaces for two popular mutational signature analysis tools, deconstructSigs and the WTSI Mutational Signature Framework, are provided to facilitate custom data analyses.
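As a brief illustration of what "displaying mutations according to substitution types" involves, the sketch below collapses single-base substitutions into the six pyrimidine-referenced classes commonly used in mutational signature work (C>A, C>G, C>T, T>A, T>C, T>G). A substitution whose reference base is a purine is reported as its reverse-complement equivalent. The function names are hypothetical and are not taken from mSignatureDB.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTION_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref, alt):
    """Return the pyrimidine-referenced class of a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: describe the change on the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def count_substitutions(mutations):
    """Count (ref, alt) pairs per substitution class."""
    counts = {name: 0 for name in SUBSTITUTION_CLASSES}
    for ref, alt in mutations:
        counts[substitution_class(ref, alt)] += 1
    return counts


example_counts = count_substitutions([("C", "T"), ("G", "A"), ("A", "C")])
```

Here `("G", "A")` is counted together with `("C", "T")`, because both describe the same change seen from opposite strands.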
Figure 2. Output features of mSignatureDB. (A) A dot matrix renders the landscape of mutational signatures in each project. Explanatory text, such as the associated etiology and the contribution of individual signatures, is integrated in the plot and shown in pop-up windows. Flexible control elements are also available for manipulating the dot matrix. (B) Since the TCGA/ICGC mutation profiles and clinical information have been compiled into mSignatureDB, users are able to compare mutational signatures between subsets of patients through the filters and the iterative heatmap. (C) Mutation hotspots are displayed as a rainfall plot along a reference genome and as box plots according to substitution type. (D) Functional profiling of the most frequently mutated genes can be performed for each substitution type to help users identify targets of interest or potential therapeutic targets.
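For readers unfamiliar with rainfall plots such as the one in panel (C), the sketch below computes the quantity they usually show: the distance from each mutation to the preceding mutation on the same chromosome, typically drawn on a log10 scale so that clusters of closely spaced mutations stand out. This is an illustration only; the input layout and the function names `intermutation_distances` and `log10_distance` are assumptions, not part of mSignatureDB.

```python
import math
from collections import defaultdict


def intermutation_distances(mutations):
    """Compute rainfall-plot points from (chromosome, position, substitution) tuples.

    Returns a list of (chromosome, position, substitution, distance), where
    distance is the gap in bases to the previous mutation on the same
    chromosome. The first mutation on each chromosome has no predecessor
    and is skipped.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, substitution in mutations:
        by_chrom[chrom].append((pos, substitution))

    points = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])
        for (prev_pos, _), (pos, substitution) in zip(ordered, ordered[1:]):
            points.append((chrom, pos, substitution, pos - prev_pos))
    return points


def log10_distance(distance):
    """Y-axis value for a rainfall plot; distances below 1 are clamped to 1."""
    return math.log10(max(distance, 1))


toy_mutations = [
    ("chr1", 100, "C>T"),
    ("chr1", 1100, "C>A"),
    ("chr1", 1110, "C>T"),
    ("chr2", 50, "T>C"),
]
rainfall_points = intermutation_distances(toy_mutations)
```

Plotting `log10_distance` against genomic position, colored by substitution type, gives the familiar rainfall layout; runs of low values suggest localized hotspots.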
Figure 3. Verification of mutational signatures. The reference mutational signatures categorized in the COSMIC database were identified by the WTSI Mutational Signature Framework. Although the WTSI framework can perform de novo signature analysis and decompose signatures from mutation profiles, signature assignment, which can be achieved by cosine similarity analysis, is neglected by existing tools, making it inconvenient to assign the decomposed signatures to published signatures. To address this issue and give more confidence in the similarity analysis, a bootstrapped cosine similarity method is used to calculate the statistical significance of the similarity between mutational signatures. As shown in this figure, a bootstrapped tree that summarizes the significance of cosine similarity between mutational signatures is provided to facilitate the assignment of known signatures and the identification of novel signatures, while alleviating the exhausting and error-prone task of visual inspection.
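The caption does not spell out how the bootstrap is carried out, so the following is a hedged sketch of one plausible scheme rather than the procedure used by mSignatureDB. It assumes that mutation channels are resampled with replacement and reports how often the best-matching reference signature stays the same across replicates. The function names, the toy six-channel vectors and the `replicates` and `seed` parameters are all hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bootstrap_support(query, references, replicates=1000, seed=0):
    """Assign a query signature to its closest reference, with bootstrap support.

    references: dict mapping a reference name to its channel vector.
    Returns (best_name, observed_similarity, support), where support is the
    fraction of channel-resampled replicates in which best_name still wins.
    Assumption: resampling channels with replacement is one possible
    bootstrap design; the source does not confirm this choice.
    """
    rng = random.Random(seed)
    n = len(query)
    best = max(references,
               key=lambda name: cosine_similarity(query, references[name]))
    observed = cosine_similarity(query, references[best])

    wins = 0
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_query = [query[i] for i in idx]
        winner = max(
            references,
            key=lambda name: cosine_similarity(
                resampled_query, [references[name][i] for i in idx]),
        )
        if winner == best:
            wins += 1
    return best, observed, wins / replicates


# Invented six-channel vectors, for illustration only.
toy_references = {
    "A": [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
    "B": [0.05, 0.05, 0.10, 0.10, 0.60, 0.10],
}
toy_query = list(toy_references["A"])
assignment = bootstrap_support(toy_query, toy_references)
```

A support value near 1 suggests that the assignment does not hinge on a few channels, whereas a low value would flag a decomposed signature that may be novel or a blend of several references.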
Figure 4. Example of use. We applied mSignatureDB to two public datasets reporting somatic mutation catalogs for 106 cases of OSCC from India and 510 cases of HNSC from America. Because mSignatureDB can determine the composition of COSMIC reference signatures in individual tumor specimens, signature landscapes can be compared at sample resolution. As shown in this figure, signatures of active mutational processes such as aging (COSMIC signature 1) and over-activity of APOBEC enzymes (COSMIC signatures 2 and 13) can be easily identified in the clustered heatmaps. Signatures originating from external mutagen exposures and habits (e.g. smoking and tobacco chewing) that differ between populations can also be identified using this visual analytic method.