Tatiana Cajuso, Päivi Sulo, Tomas Tanskanen, Riku Katainen, Aurora Taira, Ulrika A Hänninen, Johanna Kondelin, Linda Forsström, Niko Välimäki, Mervi Aavikko, Eevi Kaasinen, Ari Ristimäki, Selja Koskensalo, Anna Lepistö, Laura Renkonen-Sinisalo, Toni Seppälä, Teijo Kuopio, Jan Böhm, Jukka-Pekka Mecklin, Outi Kilpivaara, Esa Pitkänen, Kimmo Palin, Lauri A Aaltonen.
Abstract
Genomic instability pathways in colorectal cancer (CRC) have been extensively studied, but the role of retrotransposition in colorectal carcinogenesis remains poorly understood. Although retrotransposons are usually repressed, they become active in several human cancers, in particular those of the gastrointestinal tract. Here we characterize retrotransposon insertions in 202 colorectal tumor whole genomes and investigate their associations with molecular and clinical characteristics. We find highly variable retrotransposon activity among tumors and identify recurrent insertions in 15 known cancer genes. In approximately 1% of the cases we identify insertions in APC, likely to be tumor-initiating events. Insertions are positively associated with the CpG island methylator phenotype and the genomic fraction of allelic imbalance. Clinically, a high number of insertions is independently associated with poor disease-specific survival.
Year: 2019 PMID: 31492840 PMCID: PMC6731219 DOI: 10.1038/s41467-019-11770-0
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Distribution of somatic insertions across 202 colorectal tumors and over replication time. a Frequency of somatic insertion counts in 202 colorectal tumors. b Insertion density over replication time. The genome was stratified by replication time in five categories where 0 referred to the earliest replication timing. Each point represents insertion density in the corresponding category for each of the 202 tumors. Boxplot shows median, interquartile range (IQR), and whiskers extend to the most extreme data points which are no more than 1.5 times the IQR
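The whisker rule described in the caption (whiskers reach the most extreme points lying within 1.5 times the IQR of the box) can be illustrated with a minimal sketch. The function name `box_stats`, the toy values, and the inclusive quartile convention are assumptions made for illustration; the source does not state which quartile method or plotting software produced the figures.

```python
import statistics

def box_stats(values):
    """Median, quartiles, and 1.5*IQR whiskers for a list of numbers."""
    data = sorted(values)
    # Quartile convention is an assumption; plotting tools differ slightly here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median, "q1": q1, "q3": q3,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical insertion densities, not data from the study.
toy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
summary = box_stats(toy)
```

In this toy input the value 100 falls outside the upper fence, so it would be drawn as an individual point and the upper whisker would stop at 9.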
Fig. 2 Retrotransposon insertions in protein-coding genes. Gene expression (median TPM values from 34 tumors) over gene insertion count groups. Boxplot shows median, interquartile range (IQR), and whiskers extend to the most extreme data points which are no more than 1.5 times the IQR
Genes from the Cancer Gene Census with two or more insertions
| Gene ID | Gene name | Number of insertions | Cancer census role |
|---|---|---|---|
| ENSG00000168702 | *LRP1B* | 19 | TSG |
| ENSG00000178568 | *ERBB4* | 7 | Oncogene, TSG |
| ENSG00000171094 | *ALK* | 5 | Oncogene, fusion |
| ENSG00000196090 | *PTPRT* | 3 | TSG |
| ENSG00000046889 | *PREX2* | 3 | Oncogene |
| ENSG00000185811 | *IKZF1* | 3 | TSG, fusion |
| ENSG00000183454 | *GRIN2A* | 2 | TSG |
| ENSG00000144218 | *AFF3* | 2 | Oncogene, fusion |
| ENSG00000157168 | *NRG1* | 2 | TSG, fusion |
| ENSG00000079102 | *RUNX1T1* | 2 | Oncogene, TSG, fusion |
| ENSG00000151702 | *FLI1* | 2 | Oncogene, fusion |
| ENSG00000134982 | *APC* | 2 | TSG |
| ENSG00000189283 | *FHIT* | 2 | TSG, fusion |
| ENSG00000085276 | *MECOM* | 2 | Oncogene, fusion |
| ENSG00000196159 | *FAT4* | 2 | TSG |
Gene names are shown in italics. Cancer census role: role in cancer as defined by the Cancer Gene Census[30]
TSG tumor suppressor gene
Fig. 3 Retrotransposon insertion distribution in LRP1B and DLG2. Mapping of retrotransposon insertions identified in 202 colorectal tumors, HPV integration hotspots reported in 135 cervical cancers and allelic imbalance breakpoints identified in 1,699 CRCs[31,36]. Figure plotted with genoPlotR[64]. Source data are provided as a Source Data file
Fig. 4 Distribution of non-synonymous changes and LINE-1 insertions on the linear protein of APC. Non-synonymous changes in 187 MSS CRCs, small lollipops. LINE-1 insertions, larger lollipops. p.N809L1 (c1049.1T) and p.P1526L1 (c310.1T), turquoise lollipops; p.F1396L1 and p.P1526L1[24,25], black lollipops. Figure modified from cBio cancer genomics portal[59,60]
Fig. 5 Insertion and AI frequency in 21 fragile sites. a Insertion fraction over the fraction of allelic imbalance in 21 fragile sites. b Gene expression (median TPM values from 34 tumors) in fragile sites with high insertion fraction and fragile sites with high allelic imbalance fraction (Supplementary Data 3). Boxplot shows median, interquartile range (IQR), and whiskers extend to the most extreme data points which are no more than 1.5 times the IQR
Multiple linear regression model for log insertion counts
| Variable | Coefficient | Std. err. | Test statistic | P value | Signif. |
|---|---|---|---|---|---|
| Intercept | 0.408 | 0.647 | 0.630 | 5.29e-01 | |
| CIMP-H | 0.607 | 0.169 | 3.60 | 3.22e-04 | *** |
| Allelic Imbalance (/10% of reference) | 0.0826 | 0.0284 | 2.91 | 3.64e-03 | ** |
| TP53 mutation | −0.0684 | 0.134 | −0.509 | 6.10e-01 | |
| MSI | 0.150 | 0.285 | 0.527 | 5.98e-01 | |
| Mean coverage (/10 reads) | 0.309 | 0.0862 | 3.59 | 3.33e-04 | *** |
| Age at diagnosis (/10 years) | 0.0331 | 0.0603 | 0.549 | 5.83e-01 | |
| Male | 0.0570 | 0.123 | 0.464 | 6.42e-01 | |
| Dukes B | −0.0578 | 0.168 | −0.344 | 7.31e-01 | |
| Dukes C | −0.0483 | 0.186 | −0.259 | 7.96e-01 | |
| Dukes D | 0.00490 | 0.211 | 0.0232 | 9.81e-01 | |
| Proximal location | 0.206 | 0.144 | 1.43 | 1.52e-01 | |
MSI microsatellite instability, CIMP-H CpG island methylator phenotype high
Significance codes: *** ≤ 0.001 < ** ≤ 0.01 < * ≤ 0.05 < . ≤ 0.1
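The relation between the columns of the regression table can be checked with a short sketch: the statistic is the coefficient divided by its standard error, and the star codes follow the thresholds listed above. The function names and the use of a standard normal reference distribution are assumptions; the source does not say whether a normal or a t distribution was used, and values recomputed from rounded table entries can differ slightly from the published ones.

```python
import math

def wald_test(coef, std_err):
    """Return (statistic, two-sided p) under a standard normal reference."""
    z = coef / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

def signif_code(p):
    """Map a p-value to the significance codes listed under the table."""
    for cutoff, code in [(0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, ".")]:
        if p <= cutoff:
            return code
    return ""

# "Mean coverage (/10 reads)" row: coefficient 0.309, standard error 0.0862.
z, p = wald_test(0.309, 0.0862)
code = signif_code(p)
```

For this row the recomputed statistic and p-value land close to the tabulated 3.59 and 3.33e-04, and the code is `***`.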
Cox proportional hazards model for disease-specific survival
| Variable | Coefficient | Std. err. | Test statistic | P value | HR [95% CI] | Signif. |
|---|---|---|---|---|---|---|
| Insertion count (/10) | 0.108 | 0.0362 | 2.98 | 2.93e-03 | 1.11 [1.04, 1.20] | ** |
| MSI | −0.258 | 0.642 | −0.402 | 6.88e-01 | 0.773 [0.219, 2.72] | |
| CIMP-H | 0.174 | 0.341 | 0.510 | 6.10e-01 | 1.19 [0.610, 2.32] | |
| BRAF mutation | 0.790 | 0.447 | 1.77 | 7.68e-02 | 2.20 [0.918, 5.29] | . |
| Age [55, 75) years | −0.147 | 0.408 | −0.360 | 7.19e-01 | 0.863 [0.388, 1.92] | |
| Age ≥ 75 years | 0.188 | 0.427 | 0.439 | 6.60e-01 | 1.21 [0.523, 2.78] | |
| Male | 0.311 | 0.232 | 1.34 | 1.80e-01 | 1.37 [0.866, 2.15] | |
| Dukes B | 0.452 | 0.449 | 1.01 | 3.13e-01 | 1.57 [0.652, 3.79] | |
| Dukes C | 1.77 | 0.431 | 4.12 | 3.82e-05 | 5.89 [2.53, 13.7] | *** |
| Dukes D | 2.78 | 0.454 | 6.12 | 9.07e-10 | 16.2 [6.64, 39.4] | *** |
| Allelic Imbalance (/10% of reference) | −0.0583 | 0.0539 | −1.08 | 2.80e-01 | 0.943 [0.849, 1.05] | |
The model was stratified by tumor location
HR Hazard ratio, CI confidence interval, MSI microsatellite instability, CIMP-H CpG island methylator phenotype high
Significance codes: *** ≤ 0.001 < ** ≤ 0.01 < * ≤ 0.05 < . ≤ 0.1
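The hazard ratios in the Cox table follow from the coefficients by exponentiation. A minimal sketch, assuming the usual Wald interval with a normal critical value of 1.96 (the source does not state how the intervals were computed, and `hazard_ratio` is a hypothetical helper name):

```python
import math

def hazard_ratio(coef, std_err, z_crit=1.96):
    """Hazard ratio and Wald-type confidence interval from a Cox coefficient."""
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * std_err)
    upper = math.exp(coef + z_crit * std_err)
    return hr, lower, upper

# "Insertion count (/10)" row: coefficient 0.108, standard error 0.0362.
hr, lower, upper = hazard_ratio(0.108, 0.0362)
print(f"{hr:.2f} [{lower:.2f}, {upper:.2f}]")  # prints "1.11 [1.04, 1.20]"
```

Because the covariate is scaled per 10 insertions, the ratio of about 1.11 corresponds to each additional 10 insertions. Rows with larger coefficients may differ from the table in the last digit because the tabulated inputs are rounded.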
Fig. 6 Kaplan–Meier curves by insertion count. Tumors with fewer than 20 somatic insertions (blue line) and tumors with 20 or more insertions (red line)
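How curves of this kind are built can be sketched with the product-limit (Kaplan–Meier) estimator applied separately to the two groups defined by the threshold of 20 insertions. Everything below is illustrative: the function names, the record layout `(insertions, months, event)`, and the toy records are assumptions, not the study's data or code.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit survival estimate at each distinct event time."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival, curve = 1.0, []
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

def curves_by_insertion_count(records, threshold=20):
    """Split records at the insertion threshold and estimate one curve per group."""
    groups = {"low": [], "high": []}
    for insertions, months, event in records:
        groups["high" if insertions >= threshold else "low"].append((months, event))
    return {
        name: kaplan_meier([m for m, _ in rows], [e for _, e in rows])
        for name, rows in groups.items()
    }

# Hypothetical records: (insertion count, follow-up in months, 1 = disease-specific death).
toy_records = [(5, 60, 0), (12, 48, 1), (3, 72, 0), (25, 10, 1), (40, 24, 1), (31, 36, 0)]
curves = curves_by_insertion_count(toy_records)
```

Censored observations (event 0) contribute to the at-risk count until their follow-up ends but never lower the curve themselves.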