Charles Lu1, Mingchao Xie1,2,3, Michael C Wendl1,3,4, Jiayin Wang1,2, Michael D McLellan1, Mark D M Leiserson5,6, Kuan-Lin Huang1,2, Matthew A Wyczalkowski1,2, Reyka Jayasinghe1,2, Tapahsama Banerjee7, Jie Ning1,2, Piyush Tripathi2, Qunyuan Zhang1, Beifang Niu1, Kai Ye1,3, Heather K Schmidt1, Robert S Fulton1,3, Joshua F McMichael1, Prag Batra1, Cyriac Kandoth1, Maheetha Bharadwaj1, Daniel C Koboldt1, Christopher A Miller1, Krishna L Kanchi1, James M Eldred1, David E Larson1,3, John S Welch2,8, Ming You9, Bradley A Ozenberger1,2, Ramaswamy Govindan2,8, Matthew J Walter2,8, Matthew J Ellis2,8, Elaine R Mardis1,2,3,8, Timothy A Graubert2,8, John F Dipersio2,8, Timothy J Ley1,2,3,8, Richard K Wilson1,2,3,8, Paul J Goodfellow7, Benjamin J Raphael5,6, Feng Chen2,8, Kimberly J Johnson10, Jeffrey D Parvin7,11, Li Ding1,2,3,8.
Abstract
Large-scale cancer sequencing data enable discovery of rare germline cancer susceptibility variants. Here we systematically analyse 4,034 cases from The Cancer Genome Atlas representing 12 cancer types. We find that the frequency of rare germline truncations in 114 cancer-susceptibility-associated genes varies widely, from 4% (acute myeloid leukaemia (AML)) to 19% (ovarian cancer), with a notably high frequency of 11% in stomach cancer. Burden testing identifies 13 cancer genes with significant enrichment of rare truncations, some associated with specific cancers (for example, RAD51C, PALB2 and MSH6 in AML, stomach and endometrial cancers, respectively). Significant, tumour-specific loss of heterozygosity occurs in nine genes (ATM, BAP1, BRCA1/2, BRIP1, FANCM, PALB2 and RAD51C/D). Moreover, our homology-directed repair assay of 68 BRCA1 rare missense variants supports the utility of allelic enrichment analysis for characterizing variants of unknown significance. The scale of this analysis and the somatic-germline integration enable the detection of rare variants that may affect individual susceptibility to tumour development, a critical step toward precision medicine.
Year: 2015 PMID: 26689913 PMCID: PMC4703835 DOI: 10.1038/ncomms10086
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Case numbers from individual cancer types and basic clinical features for cancer cases included in this study.
Figure 1: Characteristics of the data.
Data are distributed by age, cancer, cohort and carrier frequency. (a) Age of onset by cancer type. Average age varies across cancer types, from 43 years in LGG to 67.7 years in LUSC. Note that LGG, LUAD and STAD show clear bimodal characteristics. (b) Age distributions for discovery, validation and control cohorts. (c) Comparison of cancer gene truncation carrier frequencies across 12 cancer types. The distribution of rare germline truncation variants for 12 cancer types (represented as the per cent of cases in each cancer type with rare germline truncation mutation) in 2 different groups of cancer-associated genes (labelled on top of each bar plot): 114 cancer susceptibility genes from Rahman et al.1 and 47 genes associated with the DNA repair (Fanconi Anaemia) pathway3. There are 15 genes common to both groups. The total number of unique genes from these 2 groups is 131.
Figure 2: Burden analysis reveals distinct set of cancer susceptibility genes across 12 cancer types.
A total of 34 genes-of-interest were identified by burden analysis by comparing the frequencies of rare truncation variants in Caucasian cancer cases (n=3,125) versus their frequencies in the WHI control population (n=1,039). Two oncogenes (ABL2 and BCR) were omitted. (a) Significant genes across Pan-Cancer types. Data were analysed with the total frequency test (TFT) followed by false discovery rate (FDR) ranking. Dark horizontal line indicates the 5% FDR threshold, which is satisfied by five genes: BRCA1, BRCA2, ATM, BRIP1 and PALB2. Inset shows closer visual resolution. (b) Significant genes for specific cancer types. Each plot shows the top tested genes, by FDR, from the same TFT analysis procedure for all 12 individual cancer types. Eight genes in addition to the five shown in (a) are significant at the 5% FDR level from cancer-type-specific analysis. (c) Cohort frequencies of genes. Bubble plot shows frequency of rare truncation mutations as a percentage of cases in each cohort (all 4,034 cases included for frequency calculation). The x-axis denotes the test group of a specific cancer type, the Pan-Cancer discovery cohort (4,034) and the validation cohort (1,627). Genes found to be significant at 5% FDR using the Pan-Cancer discovery cohort are labelled in boldface. Rings indicate genes that are significant (TFT, FDR ≤5%) for a particular cohort on the x-axis. (d) Percentage of cases carrying rare truncations in the 34 genes-of-interest across 12 cancer types in the discovery cohort.
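The FDR ranking and 5% threshold described in this caption can be illustrated with a short sketch. The caption does not say which FDR procedure was applied, so the Benjamini-Hochberg adjustment used here is an assumption, and the gene labels and p-values are invented purely for illustration; they are not results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the source only says "FDR ranking"; BH is one common choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a burden test (illustrative only).
p_by_gene = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.5}
genes = list(p_by_gene)
q_values = benjamini_hochberg([p_by_gene[g] for g in genes])

# Genes passing a 5% FDR threshold, as in the caption's horizontal line.
significant = [g for g, q in zip(genes, q_values) if q <= 0.05]
```

Ranking genes by the adjusted value and cutting at 0.05 reproduces the kind of threshold line drawn in panel (a), under the stated assumption about the procedure.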
Figure 3: Analysis of loss of heterozygosity in rare truncation and missense variants.
(a) Bar plot shows individual truncations from nine genes (FDR shown) with lengths representing ratios of tumour-to-normal variant allele fractions (that is, the fraction of reads containing the variant allele). Statistically significant events, defined as FDR≤5%, are shaded boldly, while non-significant events are muted, with colours corresponding to genes. Cancer source of each truncation is shown underneath, for example, most BRCA1 variants occur in ovarian and breast cancers and all BAP1 variants in KIRC. (b) Bar plot for individual missense variants from four genes having elevated frequencies of such variants that show very significant LOH, that is, at the 1% FDR level. (c) Dot plot shows individual missense variants where abscissa and ordinate are amino acid positions and the ratio of tumour-to-normal variant allele fraction, respectively. Blue and red indicate significant (FDR ≤5%) and non-significant events, respectively, with size of dots proportional to negative log of the FDR. Annotated domains from the PFAM database are aligned with position, while shaded areas indicate ‘hotspot’ regions where variants having significant LOH cluster more than the rate explainable by chance. Plots are shown for ATM, BRCA1, BRCA2, FANCA and FANCM.
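The quantity plotted in this figure, the ratio of tumour-to-normal variant allele fractions, can be written out as a minimal sketch. The function names and read counts below are hypothetical, and the sketch computes only the ratio; the statistical test behind the reported FDR values is not described in the caption and is not reproduced here.

```python
def variant_allele_fraction(variant_reads, total_reads):
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def tumour_to_normal_ratio(t_variant, t_total, n_variant, n_total):
    """Ratio of the tumour variant allele fraction to the normal one."""
    normal_vaf = variant_allele_fraction(n_variant, n_total)
    if normal_vaf == 0:
        raise ValueError("variant not observed in the normal sample")
    return variant_allele_fraction(t_variant, t_total) / normal_vaf


# Hypothetical read counts: a heterozygous germline variant (30 of 60 reads
# in normal tissue) that is enriched in the tumour (45 of 60 reads).
ratio = tumour_to_normal_ratio(45, 60, 30, 60)
```

A ratio above 1 means the variant allele is enriched in the tumour relative to normal tissue, which is the pattern expected when the wild-type allele is lost.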
Figure 4: Molecular interactions between rare germline variants and somatic mutations within and across cancer types.
(a) Heatmap demonstrates the significance of interactions between 34 burden test significant genes and 54 cancer-associated genes (top 30 are shown) with recurrently mutated somatic variants across cancer types. Red–white colour scale and blue–white colour scale depict the negative log of P-value for mutual exclusivity and co-occurrence, respectively. Both are based on the MuSiC permutation test (n=10,000). (b) Abacus plot displays the distribution of significant, mutually exclusive rare germline variants and somatic mutations across all 12 cancer types. Unique combinations of germline and somatic variants contribute to the development of individual cancer types. Bigger dots indicate recurrent genes across cancer types, while smaller dots indicate cancer-type-enriched genes.
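The idea of a permutation test for mutual exclusivity can be sketched generically as follows. This is not the MuSiC implementation, whose details are not given in the caption; the function name, the overlap statistic and the toy carrier vectors are all assumptions made for illustration.

```python
import random


def mutual_exclusivity_p(germline, somatic, n_perm=10000, seed=0):
    """One-sided permutation p-value for mutual exclusivity.

    germline, somatic: per-sample booleans (True = sample carries an event).
    Statistic (an assumption of this sketch): number of samples with both.
    """
    if len(germline) != len(somatic):
        raise ValueError("inputs must cover the same samples")
    rng = random.Random(seed)
    observed = sum(g and s for g, s in zip(germline, somatic))
    shuffled = list(somatic)
    hits = 0
    for _ in range(n_perm):
        # Shuffle somatic labels across samples to break any association.
        rng.shuffle(shuffled)
        overlap = sum(g and s for g, s in zip(germline, shuffled))
        if overlap <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 20 germline carriers and 20 somatic carriers with no overlap.
germline = [True] * 20 + [False] * 20
somatic = [False] * 20 + [True] * 20
p_value = mutual_exclusivity_p(germline, somatic, n_perm=2000)
```

Counting permutations with overlap at least as large as the observed value instead would give the corresponding co-occurrence test.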
Figure 5: Germline variants correlate with somatic mutations and age at diagnosis.
(a) Barplot illustrates the distribution of BRCA1, BRCA2 and ATM somatic and germline mutations across cancer types. (b,c) Panels display genes significantly correlated with somatic mutation frequency and younger age of onset in different cancer types and in Pan-Cancer. The width of the shape indicates the density, and the horizontal line indicates the median. P value is calculated by the Wilcoxon rank-sum test and is indicated by the size of the uppermost circles.
Figure 6: Functional validation of BRCA1 missense and truncation variants.
(a) 68 rare missense and 4 truncation variant sites were tested by HDR assay. All samples were depleted of endogenous BRCA1 by transfection of a siRNA targeting the 3′-untranslated region. Indicated in the legend are the plasmids transfected to test for rescue of BRCA1 activity. ‘pcDNA3’ is empty vector and ‘WT’ represents wild-type BRCA1 plasmid. The y-axis denotes the HDR activity relative to the wild-type BRCA1 protein. Error bars depict s.d. from the mean. Dots on the x-axis represent LOH status, each dot corresponding to one case. Blue, red, dark grey and light grey denote statistical significance, non-significance, unknown LOH (due to lack of sufficient coverage) and untested, respectively. Variants in different functional domains are indicated with colours as follows: orange, RING domain; green, nuclear localization signal (NLS); blue, DNA-binding region; purple, a SQ/TQ cluster domain (SCD); and red, BRCA1 C-terminal domain (BRCT). All the HDR assays were tested in triplicate. (b) Crystal structures of the BRCA1 RING domain in complex with the BARD1 RING domain (labelled in grey; left panel) and of the BRCT domain (right panel) are displayed, with HDR-defective variants labelled in red and partial HDR-defective variants tagged in orange. Variants in yellow are functional in the HDR assay.
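The y-axis quantity in panel (a), HDR activity relative to wild type with s.d. error bars from triplicates, can be sketched as below. The readings are invented, the function name is hypothetical, and normalising each replicate to the wild-type mean and using the sample standard deviation are assumptions; the caption does not specify how the authors computed these values.

```python
from statistics import mean, stdev


def relative_hdr_activity(variant_replicates, wt_replicates):
    """Mean and sample s.d. of variant HDR readings normalised to WT mean."""
    wt_mean = mean(wt_replicates)
    if wt_mean == 0:
        raise ValueError("wild-type mean must be non-zero")
    relative = [v / wt_mean for v in variant_replicates]
    return mean(relative), stdev(relative)


# Hypothetical triplicate readings (arbitrary units, illustration only).
variant_readings = [10.0, 12.0, 14.0]
wt_readings = [40.0, 40.0, 40.0]
rel_mean, rel_sd = relative_hdr_activity(variant_readings, wt_readings)
```

A relative mean near 1 would indicate wild-type-like rescue, while a value near the empty-vector level would indicate an HDR-defective variant.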