Lisa Dressler, Michele Bortolomeazzi, Mohamed Reda Keddar, Hrvoje Misetic, Giulia Sartini, Amelia Acha-Sagredo, Lucia Montorsi, Neshika Wijewardhane, Dimitra Repana, Joel Nulsen, Jacki Goldman, Marc Pollitt, Patrick Davis, Amy Strange, Karen Ambrose, Francesca D Ciccarelli.
Abstract
BACKGROUND: Genetic alterations of somatic cells can drive non-malignant clone formation and promote cancer initiation. However, the link between these processes remains unclear and hampers our understanding of tissue homeostasis and cancer development.
Keywords: Cancer initiation; Driver genes; Somatic evolution; Systems-level properties
Year: 2022 PMID: 35078504 PMCID: PMC8790917 DOI: 10.1186/s13059-022-02607-z
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Collection of a comprehensive repertoire of cancer and healthy drivers. a Literature review and driver annotation workflow. Expert literature curation of 331 publications led to a repertoire of cancer and healthy drivers in a variety of cancer and non-cancer tissues. Combining multiple data sources, a set of properties and annotations was computed for all these drivers. b Intersection of canonical drivers from three sources [17–19] that passed our manual curation. c Classification of canonical cancer drivers into tumor suppressors and oncogenes. Eighty-one cancer drivers had a dual role or could not be classified. d Intersection of canonical and candidate driver genes from 310 sequencing screens. Genes whose driver role had only statistical support were considered candidate cancer drivers. e Intersection between cancer drivers with coding and non-coding alterations. f Level of support for the driver role of 531 cancer genes with non-coding driver alterations only. Level 1 means that the gene was predicted as a driver in only one cancer sequencing screen; levels 2, 3, and 4 mean that it was predicted by two, three, or four screens or that it had experimental support. Experimental support was gathered from the 19 publications reporting non-coding cancer drivers (Additional file 1, Table S1) and from the CNCDatabase [20] and included in vitro and in vivo experiments, modification of gene expression, and survival association. g Proportion of healthy drivers that are also canonical or candidate cancer drivers, classified as canonical and candidate healthy drivers, respectively
Fig. 2 Distribution of driver annotations by organ system. a Correlation between numbers of sequenced donors and identified cancer drivers across organ systems. Spearman correlation coefficient R and associated p-value are shown. b Number of canonical, candidate, and healthy drivers in each organ system. Horizontal lines indicate the median number of canonical (92), candidate (160), and healthy (17) drivers across organ systems. c Proportion of canonical drivers detected in each organ system over canonical drivers detected in all cancer screens (421). The horizontal line indicates the median across all organ systems (22%). d Proportion of genes with non-coding driver alterations over all cancer drivers in each organ system. The horizontal line indicates the median across all organ systems (4%). Number of canonical (e), candidate (f), and healthy (g) drivers across screens and organ systems. Representative genes with different recurrence between cancer and healthy tissues are indicated. h Organ system distribution of the top eight recurrent healthy drivers. The full list is provided as Additional file 6, Table S5. i Correlation between numbers of sequenced donors and identified healthy drivers across organ systems. Spearman correlation coefficient R and associated p-value are shown
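As an illustration of the Spearman correlation reported in Fig. 2a and 2i, the following is a minimal sketch that ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. It is not the authors' code: the function names and the toy donor and driver counts are hypothetical, and the p-value is not computed here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: sequenced donors and drivers per organ system.
donors = [120, 450, 800, 1500, 2300]
drivers = [40, 95, 90, 210, 300]
rho = spearman(donors, drivers)
```

A rank-based coefficient is a natural choice when the relationship between donors and drivers is expected to be monotonic but not necessarily linear.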
Fig. 3 Damaging alteration pattern of drivers in TCGA. a Identification of damaged drivers in 7953 TCGA samples. Mutations, gene deletions, and amplifications were annotated according to their predicted damaging effect. This made it possible to distinguish drivers acquiring loss-of-function (LoF) or gain-of-function (GoF) alterations. b Number of TCGA samples with damaging alterations (all, LoF, GoF) in canonical drivers that were detected (421) or undetected (170) by cancer driver detection methods. c Proportion of TCGA samples with GoF and LoF alterations in tumor suppressors, oncogenes, and canonical drivers with a dual or unclassified role. Proportion of TCGA samples with GoF and LoF alterations in (d) canonical drivers and (e) candidate drivers. Genes mentioned in the text are highlighted. The two-dimensional Gaussian kernel density estimations were calculated for each driver group using the R density function. f Number of TCGA samples with damaging alterations (all, LoF, GoF) in drivers previously reported in coding and non-coding sequences. g Proportion of samples with variable numbers of all damaged drivers or only canonical drivers. h Proportion of TCGA samples with GoF and LoF alterations in healthy drivers. Canonical and candidate healthy drivers correspond to genes with a known or predicted cancer driver role. i Number of TCGA samples with damaged canonical, candidate, and remaining healthy drivers and the rest of human genes. All distributions were compared using a two-sided Wilcoxon rank-sum test
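The two-sided Wilcoxon rank-sum test used for the distribution comparisons in Fig. 3 can be sketched as below. This is a simplified illustration, not the authors' implementation: it uses the normal approximation without tie or continuity correction, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Return (U statistic of x, two-sided p-value) via normal approximation.

    Assumption: no tie or continuity correction is applied, so the p-value
    is only approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p
```

For example, `wilcoxon_rank_sum(samples_group_a, samples_group_b)` would compare the number of TCGA samples with damaging alterations between two driver groups, where both argument names are placeholders for per-gene sample counts.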
Fig. 4 Systems-level properties of cancer and healthy drivers. Comparisons of systems-level properties between (a) canonical or candidate cancer drivers and the rest of human genes, (b) tumor suppressors and oncogenes, and (c) cancer genes with coding driver alterations and cancer genes with non-coding driver alterations. The normalized property score was calculated as the normalized difference between the median (continuous properties) or proportion (categorical properties) values in each driver group and the rest of human genes (see the “Methods” section). Comparisons of systems-level properties between (d) candidate oncogenes with non-coding driver alterations (324) and canonical tumor suppressors, (e) candidate oncogenes (1405) and canonical tumor suppressors, and (f) candidate tumor suppressors (1318) and canonical oncogenes. g Comparisons of systems-level properties between canonical healthy, candidate healthy, and remaining healthy drivers and the rest of human genes. Proportions of old (pre-metazoan), duplicated, and essential genes and of proteins involved in complexes were compared using a two-sided Fisher’s exact test. Distributions of gene and protein expression, protein-protein and miRNA-gene interactions, and germline variation were compared using a two-sided Wilcoxon rank-sum test. The false discovery rate (FDR) was controlled using the Benjamini-Hochberg procedure
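The Benjamini-Hochberg correction named in the Fig. 4 caption can be sketched as follows. This is a generic illustration of the standard procedure, with a hypothetical function name and made-up p-values; it is not taken from the authors' pipeline.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, one per systems-level property comparison.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
```

Each adjusted value is the smallest FDR threshold at which the corresponding comparison would be called significant.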
Fig. 5 NCGHD annotations of driver genes. a Example of the type of annotation provided in NCGHD for cancer and healthy drivers (in this case PTEN). Annotation boxes can be expanded for further details, with the possibility of intersecting data interactively (for example, in the case of protein-protein or miRNA-gene interactions) and downloading data for local use. b Proportion of Reactome levels 2–8 enriched pathways mapping to the respective level 1 in each driver group. Enrichment was measured by comparing the proportion of drivers in each pathway to that of the rest of human genes with a one-sided Fisher’s exact test. FDR was calculated using the Benjamini-Hochberg procedure. The numbers of drivers and enriched Reactome pathways are reported for each group. Proportion of canonical and candidate cancer drivers and rest of genes that are (c) targets of FDA-approved antineoplastic drugs or biomarkers of response or resistance to oncological drugs in (d) cancer cell lines and (e) clinical studies. The corresponding numbers for each group are also shown
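The one-sided Fisher's exact test described for the pathway enrichment in Fig. 5b can be sketched as an upper-tail hypergeometric sum over a 2x2 table. This is a minimal sketch under assumed conventions: the function name and the layout of the table (drivers versus rest of genes, in versus out of a pathway) are illustrative and not taken from the authors' code.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided (enrichment) Fisher's exact test on a 2x2 table.

    Assumed table layout:
        a = drivers in the pathway       b = drivers not in the pathway
        c = other genes in the pathway   d = other genes not in the pathway
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    n_drivers = a + b
    n_pathway = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_drivers)
    k_max = min(n_drivers, n_pathway)
    tail = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_drivers - k)
        for k in range(a, k_max + 1)
    )
    return tail / denom
```

With many pathways tested per driver group, the resulting p-values would then be corrected for multiple testing, as the caption states.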