Justin Y Newberg, Karen M Mann, Michael B Mann, Nancy A Jenkins, Neal G Copeland.
Abstract
Large-scale oncogenomic studies have identified few frequently mutated cancer drivers and hundreds of infrequently mutated drivers. Defining the biological context for rare driving events is fundamentally important to increasing our understanding of the druggable pathways in cancer. Sleeping Beauty (SB) insertional mutagenesis is a powerful gene discovery tool used to model human cancers in mice. Our lab and others have published a number of studies that identify cancer drivers from these models using various statistical and computational approaches. Here, we have integrated SB data from primary tumor models into an analysis and reporting framework, the Sleeping Beauty Cancer Driver DataBase (SBCDDB, http://sbcddb.moffitt.org), which identifies drivers in individual tumors or tumor populations. Unique to this effort, the SBCDDB utilizes a single, scalable, statistical analysis method that enables data to be grouped by different biological properties. This allows for SB drivers to be evaluated (and re-evaluated) under different contexts. The SBCDDB provides visual representations highlighting the spatial attributes of transposon mutagenesis and couples this functionality with analysis of gene sets, enabling users to interrogate relationships between drivers. The SBCDDB is a powerful resource for comparative oncogenomic analyses with human cancer genomics datasets for driver prioritization.
Year: 2018 PMID: 29059366 PMCID: PMC5753260 DOI: 10.1093/nar/gkx956
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Published and unpublished mouse models of human cancer in SBCDDB
| Datasets by Organ | Description | Ref. | Diagnoses within datasets | Tumors (n) | Mice (n) | Progression driver genes | Trunk driver genes |
|---|---|---|---|---|---|---|---|
| Bone | Osteosarcomas with | | Osteosarcoma | 45 | 36 | 131 | 41 |
| Brain | Medulloblastomas with | ( | Medulloblastoma | 93 | 93 | 37 | 18 |
| Breast | Breast tumors with | ( | Adenocarcinoma, adenoma, adenomyoepithelioma, adenosquamous carcinoma | 39 | 33 | 137 | 36 |
| Intestine | Intestine tumors with | ( | Intestine cancer (inferred) | 839 | 190 | 673 | 30 |
| Liver | Hepatocellular adenomas and carcinomas with | ( | Hepatocellular adenoma, hepatocellular carcinoma (inferred) | 597 | 110 | 2073 | 60 |
| Lymph node and Thymus | Lymphomas with | | B-cell lymphoma, lymphoblastic lymphoma (non-T non-B), T-cell lymphoblastic lymphoma, T-cell lymphoma | 39 | 39 | 34 | 7 |
| Muscle | Rhabdomyosarcomas with | | Rhabdomyosarcoma | 43 | 37 | 106 | 21 |
| Pancreas | Pancreatic ductal adenocarcinomas with | ( | Pancreatic ductal early carcinoma, pancreatic ductal adenocarcinoma | 172 | 122 | 813 | 50 |
| Prostate | Adenocarcinomas of the prostate with | | Adenocarcinoma | 54 | 53 | 182 | 32 |
| Skin | Cutaneous keratoacanthomas, melanomas, and squamous cell carcinomas with mutant | ( | Keratoacanthoma, melanoma, squamous cell carcinoma | 199 | 137 | 273 | 118 |
| Spleen | Myeloid leukemias with | ( | Undifferentiated myeloid leukemia | 168 | 168 | 34 | 29 |
| Stomach | Gastric tumors with | ( | Gastric cancer (inferred) | 66 | 24 | 770 | 27 |
| Total* | | | | | | | |
* The non-redundant sum of progression and trunk driver genes across the various datasets. Unpub. are unpublished studies.
The datasets are grouped by the organ in which cancer develops, with an accompanying description of the respective tumor type(s). Published datasets include references to original reports. All other datasets are provisional. For most datasets a veterinary pathologist diagnosed tumors; in some cases diagnoses were inferred.
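The "non-redundant sum" in the table footnote means that a gene identified as a driver in several datasets is counted once in the total. A minimal Python sketch of that counting rule follows; the dataset names and gene memberships are made up for illustration and are not taken from the SBCDDB.

```python
# Hypothetical driver-gene sets: these memberships are illustrative only
# and do not reflect actual SBCDDB results.
drivers_by_dataset = {
    "melanoma": {"Cdkn2a", "Pten", "Nf1", "Stag2"},
    "liver": {"Pten", "Nf1", "Setd2"},
    "pancreas": {"Cdkn2a", "Crebbp", "Setd2"},
}


def non_redundant_total(gene_sets):
    """Count each gene once, no matter how many datasets list it."""
    return len(set().union(*gene_sets.values()))


# Adding per-dataset counts double-counts genes shared between datasets.
naive_total = sum(len(genes) for genes in drivers_by_dataset.values())

# The set union counts every distinct gene exactly once.
unique_total = non_redundant_total(drivers_by_dataset)

print(f"naive sum: {naive_total}, non-redundant sum: {unique_total}")
```

Because drivers recur across tumor types, the non-redundant total is at most the sum of the per-dataset counts.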
Figure 1. SBCDDB functionality. SBCDDB comprises three main sections. The ‘Search’ section on the main page enables users to query the database for a gene of interest using gene symbols or locus information. The output shows the frequency of insertions across tumor types modeled by SB and in which biological context(s) the gene is defined as a driver. The ‘Analysis’ section is accessed by clicking on the hyperlinked ‘Search’ output. Here, users can interrogate individual tumor models or biological contexts for a gene of interest. Clicking on the ‘Dataset’ hyperlink for a given tumor type provides details of all the defined drivers and the individual tumors within the dataset. Drivers defined by the biological context (organ, transposon, sensitizing allele, etc.) can be found by clicking on the frequency charts. This analysis output can also be accessed from the main page by clicking on the pie chart under ‘Datasets.’ Finally, the ‘Documentation’ section of the SBCDDB provides a tutorial for users, defines the terms and allele information for the models, and provides additional information. This section can be accessed through a series of links at the top of each page.
Figure 2. Gene report for Stag2, a tumor suppressor and driver. A gene report details conditions in which a gene may be a cancer driver. (A) Genes are matched to official mouse and human gene symbols, and links to external annotation and cancer databases are provided for the given gene. (B) The frequency of tumors with SB insertions in Stag2 is shown across all tumor types, with an indication of whether the gene is a trunk driver (black bars), a progression driver (gray bars), or is not significantly altered (white bars). (C) An insertion map shows the various mapped insertions (triangles) in Stag2 transcripts across all of the primary tumor models. Right-facing arrows (above transcripts) show forward insertions, while left-facing arrows (below transcripts) correspond to reverse insertions. The presence of insertions across the Stag2 locus, in both the forward and reverse orientation, indicates that this locus is selectively inactivated. (D) A genomic context map roughly depicts insertion densities and genes around Stag2. Insertion densities are only shown in datasets in which Stag2 is a driver. Blue bars on the top track represent forward insertion density (with respect to the chromosome), while red bars represent reverse insertion density. Transcripts are drawn below this, and their strand is denoted by their color (blue = +, red = –). Genomic regions containing significant insertion densities are shown in the gray bars, with the darker bars associated with the gene of interest. (E) The insertion map for Stag2 is shown for melanoma. BRCA, breast cancer; GAS, gastric cancer; HCA, hepatocellular adenoma; HCC, hepatocellular carcinoma; INT, intestinal cancers; KA, keratoacanthoma; LYM, lymphoma; MB, medulloblastoma; MEL, melanoma; ML, myeloid leukemia; OS, osteosarcoma; PCA, prostate cancer; PDAC, pancreatic ductal adenocarcinoma; RMS, rhabdomyosarcoma; SCC, cutaneous squamous cell carcinoma.
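The reading of panel (C), insertions spread across the locus in both orientations suggesting inactivation, can be illustrated with a small Python sketch. This is not the SBCDDB's statistical method: the function names, the bin count, the orientation thresholds, and the coordinates are all hypothetical and serve only to make the pattern concrete.

```python
from collections import Counter


def summarize_insertions(insertions, gene_start, gene_end, n_bins=4):
    """Summarize orientation and spread of insertions inside a gene.

    insertions: list of (position, strand) pairs, strand being '+' or '-'.
    The gene is split into n_bins equal windows to measure spread.
    """
    inside = [(p, s) for p, s in insertions if gene_start <= p <= gene_end]
    strands = Counter(s for _, s in inside)
    total = len(inside)
    bin_width = (gene_end - gene_start + 1) / n_bins
    occupied = {
        min(int((p - gene_start) / bin_width), n_bins - 1) for p, _ in inside
    }
    return {
        "total": total,
        "forward": strands["+"],
        "reverse": strands["-"],
        "forward_fraction": strands["+"] / total if total else None,
        "bins_occupied": len(occupied),
    }


def looks_like_inactivation(summary, min_bins=3, low=0.25, high=0.75):
    """Heuristic flag (assumed thresholds): both orientations present
    in roughly balanced numbers, and insertions spread over the gene."""
    fraction = summary["forward_fraction"]
    if fraction is None:
        return False
    return low <= fraction <= high and summary["bins_occupied"] >= min_bins


# Hypothetical coordinates: a gene spanning 1000-1999 with scattered,
# mixed-orientation insertions, plus one insertion outside the gene.
scattered = [(1050, "+"), (1300, "-"), (1600, "+"), (1900, "-"), (2500, "+")]
# Hypothetical contrast: clustered, single-orientation insertions.
clustered = [(1010, "+"), (1020, "+"), (1030, "+")]

scattered_summary = summarize_insertions(scattered, 1000, 1999)
clustered_summary = summarize_insertions(clustered, 1000, 1999)
print(scattered_summary, looks_like_inactivation(scattered_summary))
print(clustered_summary, looks_like_inactivation(clustered_summary))
```

The clustered, single-orientation case is included only as a contrast; any real classification would need a proper statistical test against the background insertion rate.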
Figure 3. Dataset report for melanoma model highlighting sets of mutated cancer genes. Cancer models can be used to identify sets of significantly altered genes across tumors. (A) A summary table provides general information about the melanoma dataset with a list of significantly altered genes. Pathway analysis via Enrichr can be run on this set of genes. (B) The melanoma dataset consists of tumors derived from various transposons. (C) The top ten most frequently mutated statistically significant genes are shown along with the proportion of tumors for each transposon. Datasets can contain tumors with different transposons; the chromosome containing the original SB transposon array is called the donor. Genes that map to a donor chromosome in some tumors can still be identified as significant in other tumors with a different donor (asterisks). (D) A waterfall plot shows occurrences of frequently mutated melanoma genes. Rows are genes and columns are tumors. Blue and red hashmarks denote forward and reverse insertions, respectively. White gaps appear when a gene is on the donor chromosome. (E) A table of statistically significant melanoma genes is shown with metrics that can be used in subsequent analysis to rank genes of interest. (F) Significantly altered melanoma genes can be viewed in each melanoma tumor, and these gene sets can be run through Enrichr for pathway analysis. Cdkn2a, Cyclin-dependent kinase inhibitor 2A; Lpp, LIM domain containing preferred translocation partner in lipoma; Pten, Phosphatase and tensin homolog; Cep350, Centrosomal protein 350; Tcf12, Transcription factor 12; Gnaq, Guanine nucleotide binding protein, alpha q polypeptide; Nipbl, Nipped-B homolog (Drosophila); Crebbp, CREB binding protein; Fto, Fat mass and obesity associated; Setd2, SET domain containing 2; Nf1, Neurofibromin 1; Sae1, SUMO1 activating enzyme subunit 1.