Sheli L Ostrow, Ruth Barshir, James DeGregori, Esti Yeger-Lotem, Ruth Hershberg.
Abstract
Cancer is an evolutionary process in which cells acquire new transformative, proliferative and metastatic capabilities. A full understanding of cancer requires learning the dynamics of the cancer evolutionary process. We present here a large-scale analysis of the dynamics of this evolutionary process within tumors, with a focus on breast cancer. We show that the cancer evolutionary process differs greatly from organismal (germline) evolution. Organismal evolution is dominated by purifying selection (that removes mutations that are harmful to fitness). In contrast, in the cancer evolutionary process the dominance of purifying selection is much reduced, allowing for a much easier detection of the signals of positive selection (adaptation). We further show that, as a group, genes that are globally expressed across human tissues show a very strong signal of positive selection within tumors. Indeed, known cancer genes are enriched for global expression patterns. Yet, positive selection is prevalent even on globally expressed genes that have not yet been associated with cancer, suggesting that globally expressed genes are enriched for yet undiscovered cancer-related functions. We find that the increased positive selection on globally expressed genes within tumors is not due to their expression in the tissue relevant to the cancer. Rather, such increased adaptation is likely due to globally expressed genes being enriched in important housekeeping and essential functions. Thus, our results suggest that tumor adaptation is most often mediated through somatic changes to those genes that are important for the most basic cellular functions. Together, our analysis reveals the uniqueness of the cancer evolutionary process and the particular importance of globally expressed genes in driving cancer initiation and progression.
Year: 2014 PMID: 24603726 PMCID: PMC3945297 DOI: 10.1371/journal.pgen.1004239
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Increased proportion of functional substitutions in BrCa compared to the germline.
Depicted are dN/dS and dMF/dLF values calculated based on germline mutations segregating at a frequency of >0.1 (black), and dN/dS and dMF/dLF values calculated based on BrCa somatic substitutions (gray). The dashed line represents a dN/dS and dMF/dLF ratio of 1. We focus on germline substitutions occurring at a higher frequency of >0.1 in the human population, because rare germline substitutions are expected to be less affected by natural selection [51]. This is because rare polymorphisms have not yet had time to be strongly affected by selection and therefore still contain many deleterious substitutions that with time would be removed from the population. The full data regarding numbers of non-synonymous, and synonymous, MF and LF substitutions, and also regarding dN/dS and dMF/dLF of all germline substitutions (including those appearing at frequencies lower than 0.1) is presented in Table S1.
Globally expressed genes are enriched for functional BrCa somatic substitutions compared to genes that are not globally expressed.
| Gene group | # non-syn | # syn | dN/dS | SIFT # MF | SIFT # LF | SIFT dMF/dLF | Polyphen # MF | Polyphen # LF | Polyphen dMF/dLF |
| Globally expressed genes | 10653 | 3463 | 0.91 | 5960 | 4355 | 1.91 | 5252 | 4415 | 1.01 |
| Non-globally expressed genes | 18014 | 6911 | 0.78 | 9210 | 7906 | 1.55 | 7922 | 7779 | 0.91 |
| Non-globally expressed genes, expressed in breast | 8863 | 3275 | 0.81 | 4706 | 3937 | 1.60 | 4149 | 3866 | 0.94 |
| Non-globally expressed genes, not expressed in breast | 9151 | 3636 | 0.76 | 4504 | 3969 | 1.51 | 3773 | 3913 | 0.88 |
| Globally expressed genes, known cancer genes removed | 9623 | 3310 | 0.86 | 5202 | 4091 | 1.77 | 4616 | 4074 | 0.97 |
| Non-globally expressed genes, known cancer genes removed | 17580 | 6765 | 0.78 | 8972 | 7717 | 1.55 | 7719 | 7592 | 0.91 |
| Cancer-associated genes | 1504 | 318 | 1.39 | 1026 | 463 | 3.08 | 851 | 537 | 1.34 |
| Genes not yet associated with cancer | 29022 | 10804 | 0.80 | 14828 | 12423 | 1.61 | 12718 | 12125 | 0.92 |
Cancer-associated genes are defined according to the Catalogue of Somatic Mutations in Cancer (COSMIC) [33].
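To make the comparison in the first two table rows concrete, the sketch below runs a simple 2x2 test on the raw non-synonymous and synonymous counts for globally versus non-globally expressed genes. This is an illustrative assumption, not the test reported by the authors: the tabulated dN/dS values are normalized ratios and cannot be recomputed from the raw counts alone, so the code works only with the counts. The helper names `chi2_2x2` and `p_value_1dof` are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1dof(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Raw BrCa substitution counts copied from the table above.
global_nonsyn, global_syn = 10653, 3463   # globally expressed genes
other_nonsyn, other_syn = 18014, 6911     # non-globally expressed genes

# Odds ratio > 1 means relatively more non-synonymous substitutions
# in globally expressed genes.
odds_ratio = (global_nonsyn / global_syn) / (other_nonsyn / other_syn)
chi2 = chi2_2x2(global_nonsyn, global_syn, other_nonsyn, other_syn)
p_value = p_value_1dof(chi2)

print(f"odds ratio = {odds_ratio:.3f}, chi2 = {chi2:.1f}, p = {p_value:.2e}")
```

The odds ratio on raw counts points in the same direction as the ratio of the two tabulated dN/dS values (0.91 versus 0.78), but the two quantities are not interchangeable.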
Higher dN/dS and dMF/dLF values for globally, compared to non-globally expressed genes in 13 additional cancer types.
| Cancer | Globally expressed | # non-syn | # syn | dN/dS | Significant difference (dN/dS) | SIFT # MF | SIFT # LF | SIFT dMF/dLF | Significant difference (SIFT) | Polyphen-2 # MF | Polyphen-2 # LF | Polyphen-2 dMF/dLF | Significant difference (Polyphen-2) |
| BLCA | Yes | 10008 | 3738 | 0.79 | Yes | 5721 | 4179 | 1.91 | Yes | 5169 | 4504 | 0.98 | Yes |
| BLCA | No | 15105 | 6507 | 0.70 | | 8087 | 6606 | 1.63 | | 7248 | 7066 | 0.92 | |
| COAD | Yes | 22212 | 10896 | 0.60 | No | 12531 | 9550 | 1.83 | Yes | 11251 | 10191 | 0.94 | Yes |
| COAD | No | 43179 | 22109 | 0.59 | | 22065 | 20515 | 1.44 | | 19599 | 20981 | 0.83 | |
| GBM | Yes | 3926 | 1404 | 0.82 | Yes | 2306 | 1580 | 2.03 | Yes | 2106 | 1650 | 1.09 | Yes |
| GBM | No | 10091 | 4041 | 0.75 | | 5293 | 4398 | 1.61 | | 4584 | 4626 | 0.88 | |
| HNSC | Yes | 12871 | 4661 | 0.81 | Yes | 7541 | 5196 | 2.02 | Yes | 6743 | 5537 | 1.04 | Yes |
| HNSC | No | 24719 | 10007 | 0.74 | | 13664 | 10273 | 1.77 | | 12036 | 10880 | 0.99 | |
| KIRC | Yes | 8600 | 3119 | 0.81 | No | 4859 | 3549 | 1.91 | Yes | 4453 | 3847 | 0.99 | Yes |
| KIRC | No | 13368 | 5090 | 0.79 | | 7154 | 5767 | 1.66 | | 6214 | 6180 | 0.90 | |
| LUAD | Yes | 31215 | 10392 | 0.89 | Yes | 13664 | 9007 | 2.11 | Yes | 16701 | 13267 | 1.07 | No |
| LUAD | No | 80079 | 28938 | 0.83 | | 33323 | 23390 | 1.90 | | 40593 | 33862 | 1.07 | |
| LUSC | Yes | 13014 | 4647 | 0.83 | No | 7637 | 5262 | 2.02 | Yes | 6897 | 5620 | 1.05 | No |
| LUSC | No | 28855 | 10788 | 0.81 | | 16462 | 11901 | 1.85 | | 14539 | 12530 | 1.04 | |
| OV | Yes | 6196 | 1884 | 0.97 | Yes | 694 | 1212 | 0.80 | Yes | 1533 | 1250 | 1.04 | No |
| OV | No | 10954 | 3697 | 0.89 | | 1128 | 2255 | 0.67 | | 2612 | 2304 | 1.01 | |
| READ | Yes | 6873 | 2229 | 0.91 | Yes | 3999 | 2835 | 1.97 | Yes | 3664 | 2970 | 1.05 | Yes |
| READ | No | 12988 | 4953 | 0.79 | | 7076 | 5779 | 1.63 | | 6227 | 5976 | 0.93 | |
| SKCM | Yes | 30030 | 16236 | 0.55 | No | 18112 | 11630 | 2.17 | Yes | 14825 | 12085 | 1.04 | Yes |
| SKCM | No | 91840 | 50015 | 0.55 | | 50791 | 38809 | 1.75 | | 42835 | 40112 | 0.95 | |
| STAD | Yes | 24671 | 10017 | 0.73 | No | 14321 | 10164 | 1.96 | Yes | 12984 | 10945 | 1.01 | No |
| STAD | No | 44461 | 18847 | 0.71 | | 24621 | 18808 | 1.75 | | 22150 | 19741 | 1.00 | |
| THCA | Yes | 2207 | 754 | 0.86 | Yes | 1364 | 821 | 2.31 | Yes | 1185 | 927 | 1.09 | Yes |
| THCA | No | 3292 | 1352 | 0.73 | | 1657 | 1426 | 1.55 | | 1415 | 1592 | 0.79 | |
| UCEC | Yes | 39676 | 13266 | 0.88 | Yes | 23313 | 15909 | 2.04 | Yes | 20915 | 16232 | 1.10 | Yes |
| UCEC | No | 67448 | 24386 | 0.83 | | 37068 | 28292 | 1.75 | | 31898 | 28458 | 1.00 | |
BLCA - Bladder Urothelial Carcinoma, COAD - Colon adenocarcinoma, GBM - Glioblastoma multiforme, HNSC - Head and Neck squamous cell carcinoma, KIRC - Kidney renal clear cell carcinoma, LUAD - Lung adenocarcinoma, LUSC - Lung squamous cell carcinoma, OV - Ovarian serous cystadenocarcinoma, READ - Rectum adenocarcinoma, SKCM - Skin Cutaneous Melanoma, STAD - Stomach adenocarcinoma, THCA - Thyroid carcinoma, UCEC - Uterine Corpus Endometrioid Carcinoma.
non-syn: non-synonymous. syn: synonymous.
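As a rough cross-check of the table caption, the sketch below counts in how many of the 13 cancer types the tabulated dN/dS is higher for globally expressed genes, and applies a two-sided sign test to the informative (non-tied) pairs. The sign test is an assumption chosen for illustration and is not the per-cancer significance test reported in the "Significant difference" columns. Values are the two-decimal figures from the table, so a tie may simply reflect rounding.

```python
from math import comb

# (cancer type, dN/dS in globally expressed genes, dN/dS in non-globally expressed genes)
# Values copied from the dN/dS column of the table above.
DNDS = [
    ("BLCA", 0.79, 0.70), ("COAD", 0.60, 0.59), ("GBM", 0.82, 0.75),
    ("HNSC", 0.81, 0.74), ("KIRC", 0.81, 0.79), ("LUAD", 0.89, 0.83),
    ("LUSC", 0.83, 0.81), ("OV", 0.97, 0.89), ("READ", 0.91, 0.79),
    ("SKCM", 0.55, 0.55), ("STAD", 0.73, 0.71), ("THCA", 0.86, 0.73),
    ("UCEC", 0.88, 0.83),
]

higher = [name for name, g, n in DNDS if g > n]
lower = [name for name, g, n in DNDS if g < n]
ties = [name for name, g, n in DNDS if g == n]

# Two-sided sign test on the non-tied pairs: under the null hypothesis,
# each pair is equally likely to go in either direction.
n_informative = len(higher) + len(lower)
k = max(len(higher), len(lower))
tail = sum(comb(n_informative, i) for i in range(k, n_informative + 1))
p_sign = min(1.0, 2 * tail / 2 ** n_informative)

print(f"higher: {len(higher)}, lower: {len(lower)}, ties: {ties}, sign-test p = {p_sign:.5f}")
```

The same loop can be pointed at the SIFT or Polyphen-2 dMF/dLF columns by swapping in those values.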
Figure 2. Cancer-associated genes tend to be globally expressed more frequently, and expressed in a tissue-specific manner less frequently, than other genes.
Genes that are known to be associated with cancer (black) and all remaining genes (gray) were grouped based on the number of tissues in which their expression has been detected (out of 16 examined tissues). The frequency of genes within each bin is depicted. Cancer genes display a significant (P<0.0001, according to a χ² test) enrichment for global expression patterns (defined as expression across all 16 examined tissues). At the same time, cancer-associated genes are ∼2.5 times less likely than other genes to not be expressed in any tissue, or be expressed in a tissue-specific manner (1–3 tissues, a significant depletion, P<0.0001).