Andrej Fischer, Chris Greenman, Ville Mustonen.
Abstract
A key goal in cancer research is to find the genomic alterations that underlie malignant cells. Genomics has proved successful in identifying somatic variants at a large scale. However, it has become evident that a typical cancer exhibits a heterogeneous mutation pattern across samples. Cases where the same alteration is observed repeatedly seem to be the exception rather than the norm. Thus, pinpointing the key alterations (driver mutations) against a background of variations with no direct causal link to cancer (passenger mutations) is difficult. Here we analyze somatic missense mutations from cancer samples and their healthy tissue counterparts (germline mutations) from the viewpoint of germline fitness. We calibrate a scoring system from protein domain alignments to score mutations and their target loci. We show first that this score predicts to a good degree the rate of polymorphism of the observed germline variation. The scoring is then applied to somatic mutations. We show that candidate cancer genes prone to copy number loss harbor mutations with germline fitness effects that are significantly more deleterious than expected by chance. This suggests that missense mutations play a driving role in tumor suppressor genes. Furthermore, these mutations fall preferentially onto loci in sequence neighborhoods that are high scoring in terms of germline fitness. In contrast, for somatic mutations in candidate oncogenes we do not observe a statistically significant effect. These results help to inform how to exploit germline fitness predictions in discovering new genes and mutations responsible for cancer.
Year: 2011 PMID: 21441214 PMCID: PMC3122307 DOI: 10.1534/genetics.111.127480
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
Figure: Mutation and locus scores. An example alignment window illustrates the scoring system described in the text. We want to score a mutation a_ref = C → A = a (colored red) at position i. First, we evaluate the difference in the position-specific score between the final and the initial states, as defined by the alignment column (left panel, vertical red box). Second, we evaluate a score for the target locus onto which the mutation falls by summing the scores of the amino acids within a window w (right panel, horizontal red box). Locus score information is derived from several loci and gives a scale for how evolutionarily important the target locus and its surroundings are.
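The two scores described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the calibration used in the paper: it assumes that the position-specific score of an amino acid is the log of its pseudocount-smoothed frequency in the alignment column, and that "a window w" means w positions on either side of position i. The function names, the toy alignment, and the pseudocount are all hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_scores(column, pseudocount=1.0):
    """Assumed scoring: log of the smoothed amino acid frequency in one column."""
    counts = Counter(c for c in column if c in AMINO_ACIDS)  # gaps are ignored
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: math.log((counts[a] + pseudocount) / total) for a in AMINO_ACIDS}

def build_scores(alignment):
    """One score table per alignment column (rows of the alignment are sequences)."""
    return [column_scores(col) for col in zip(*alignment)]

def mutation_score(scores, i, a_ref, a):
    """Score of the final state minus score of the initial state at position i."""
    return scores[i][a] - scores[i][a_ref]

def locus_score(scores, sequence, i, w):
    """Sum of the scores of the sequence's own residues within +/- w of position i
    (the symmetric window is an assumption)."""
    lo, hi = max(0, i - w), min(len(sequence), i + w + 1)
    return sum(scores[j][sequence[j]] for j in range(lo, hi))

# Toy alignment (hypothetical data): column 2 is a fully conserved cysteine.
alignment = ["MKCLA", "MKCIA", "MRCLA", "MKCLG"]
scores = build_scores(alignment)
print("mutation score C -> A at i=2:", mutation_score(scores, 2, "C", "A"))
print("locus score around i=2, w=1:", locus_score(scores, alignment[0], 2, 1))
```

Under this assumed scoring, a negative mutation score means the mutation moves the position to a residue that is rarer in the alignment column, and a locus score closer to zero indicates a more conserved neighborhood.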
Figure: Gene-level observables averaged over candidate tumor suppressor genes. Histograms denote the averages obtained in 10⁵ synthetic sets (null model) and blue dots denote the values in the data. (A–C) Germline mutations (green). (A) Count scores c show no significant effect. In contrast, scores for germline mutations are less deleterious than for mutations in the null. (B) Δs, P-value = 2 × 10⁻⁵, effect size 0.50 (evaluated as the data value divided by the mean of the synthetic sets). (C) Locus score, P-value = 2 × 10⁻⁵, effect size 0.45. (D–F) Somatic mutations (red). (D) Count scores c, P-value = 0.02, effect size 1.33: there is a surplus of counts over what would be expected under the null. Furthermore, germline fitness scores for somatic mutations are more deleterious than for mutations in the null. (E) Δs, P-value = 0.002, effect size 1.80. (F) Locus score, P-value = 0.0003, effect size 2.10.
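The caption defines the effect size as the data value divided by the mean of the synthetic sets. The sketch below shows that calculation together with a one-sided empirical P-value, taken here as the fraction of synthetic sets at least as extreme as the data. The P-value definition is an assumption, since this excerpt does not state how it was computed, and the function names and toy null distribution are hypothetical.

```python
import random
from statistics import mean

def effect_size(observed, null_values):
    """Data value divided by the mean of the synthetic (null) sets."""
    return observed / mean(null_values)

def empirical_p_value(observed, null_values, tail="greater"):
    """Assumed definition: fraction of null values at least as extreme as the data."""
    if tail == "greater":
        extreme = sum(v >= observed for v in null_values)
    elif tail == "less":
        extreme = sum(v <= observed for v in null_values)
    else:
        raise ValueError("tail must be 'greater' or 'less'")
    return extreme / len(null_values)

# Toy null distribution (hypothetical): 10,000 synthetic averages around 1.0.
rng = random.Random(0)
null_values = [rng.gauss(1.0, 0.2) for _ in range(10_000)]
observed = 1.5

print("effect size:", effect_size(observed, null_values))
print("P-value:", empirical_p_value(observed, null_values, tail="greater"))
```

With N synthetic sets this estimator cannot resolve P-values below 1/N, so a count of zero extreme sets is usually reported as an upper bound rather than as exactly zero.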
Number of (available) mutations in the different categories
| | Opportunity (average) (10⁵) | | | Somatic | | | Germline | | |
| | All | Tumor suppressor | Onco | All | Tumor suppressor | Onco | All | Tumor suppressor | Onco |
| Total | 29.37 | 3.63 | 3.68 | 620 | 100 | 83 | 2423 | 277 | 264 |
| Scored | 14.26 | 1.78 | 1.87 | 324 | 56 | 49 | 1018 | 125 | 102 |
Mutational biases
| Channel | Opportunity (%) | Somatic (%) | Germline (%) |
| A:T > T:A | 17 | 7 | 5 |
| A:T > C:G | 19 | 3 | 5 |
| A:T > G:C | 16 | 10 | 21 |
| C:G > G:C | 19 | 13 | 11 |
| C:G > A:T | 16 | 10 | 9 |
| C:G > T:A | 13 | 57 | 49 |
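One way to read this table is as fold enrichment of each channel relative to its opportunity, that is, the observed percentage divided by the opportunity percentage. The sketch below uses the percentages from the table; the fold-enrichment summary itself is an illustrative choice and is not stated to be the paper's method.

```python
# Percentages copied from the table: (opportunity, somatic, germline).
channels = {
    "A:T > T:A": (17, 7, 5),
    "A:T > C:G": (19, 3, 5),
    "A:T > G:C": (16, 10, 21),
    "C:G > G:C": (19, 13, 11),
    "C:G > A:T": (16, 10, 9),
    "C:G > T:A": (13, 57, 49),
}

def fold_enrichment(observed_pct, opportunity_pct):
    """Observed share of mutations divided by the share of opportunities."""
    return observed_pct / opportunity_pct

# Map each channel to its (somatic, germline) fold enrichment.
enrichment = {
    name: (fold_enrichment(som, opp), fold_enrichment(ger, opp))
    for name, (opp, som, ger) in channels.items()
}

for name, (som_fold, ger_fold) in enrichment.items():
    print(f"{name}: somatic {som_fold:.2f}x, germline {ger_fold:.2f}x")
```

On these numbers the C:G > T:A channel stands out in both sets, at roughly 4.4 times its opportunity share for somatic mutations (57/13) and 3.8 times for germline mutations (49/13).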
Figure: Germline polymorphism density. P(Δs) is shown in units of P(0): blue squares are data and the red line is the theory curve from Equation 11 with m = 210. Predicted polymorphism density is proportional to the values measured from the germline variation, somewhat underestimating the reduction of strongly deleterious mutations (error bars evaluated with ).
Germline variation in kinases
| Score | Level | Germline P-value | Germline effect size |
| Δs | Locus | <10⁻⁵ | 0.61 |
| Δs | Gene | <10⁻⁵ | 0.52 |
| Locus score | Locus | <10⁻⁵ | 0.56 |
| Locus score | Gene | <10⁻⁵ | 0.49 |
Genomic observables for candidate tumor suppressor genes
| Score | Level | Somatic P-value | Somatic effect size | Germline P-value | Germline effect size |
| c | Locus | 0.004 | 1.39 | NS | — |
| c | Gene | 0.02 | 1.33 | NS | — |
| Δs | Locus | 0.003 | 1.69 | 0.00006 | 0.55 |
| Δs | Gene | 0.002 | 1.80 | 0.00002 | 0.50 |
| Locus score | Locus | 0.0007 | 1.98 | 0.00007 | 0.49 |
| Locus score | Gene | 0.0003 | 2.10 | 0.00002 | 0.45 |
Results for s_SIFT and Δs_HMM scores
| Score | Level | Germline all: P-value | Germline all: effect size | Somatic candidate tumor suppressor: P-value | Somatic candidate tumor suppressor: effect size | Germline candidate tumor suppressor: P-value | Germline candidate tumor suppressor: effect size |
| s_SIFT | Locus | <10⁻⁵ | 0.61 | 0.02 | 1.61 | 0.0001 | 0.46 |
| s_SIFT | Gene | <10⁻⁵ | 0.55 | 0.02 | 1.73 | 0.00008 | 0.43 |
| Δs_HMM | Locus | <10⁻⁵ | 0.51 | 0.006 | 1.75 | <10⁻⁵ | 0.43 |
| Δs_HMM | Gene | <10⁻⁵ | 0.45 | 0.003 | 1.91 | <10⁻⁵ | 0.39 |