Alireza Hadj Khodabakhshi, Ryan D Morin, Anthony P Fejes, Andrew J Mungall, Karen L Mungall, Madison Bolger-Munro, Nathalie A Johnson, Joseph M Connors, Randy D Gascoyne, Marco A Marra, Inanc Birol, Steven J M Jones.
Abstract
Somatic hypermutation (SHM) in the variable region of immunoglobulin genes (IGV) naturally occurs in a narrow window of B cell development to provide high-affinity antibodies. However, SHM can also aberrantly target proto-oncogenes and cause genome instability. The role of aberrant SHM (aSHM) has been widely studied in various non-Hodgkin's lymphomas, particularly in diffuse large B-cell lymphoma (DLBCL). Although it has been speculated that aSHM targets a wide range of genomic loci, so far only twelve genes have been identified as targets of aSHM through the targeted sequencing of selected genes. A genome-wide study aiming to identify a comprehensive set of aSHM targets recurrently occurring in DLBCL has not previously been undertaken. Here, we present a comprehensive assessment of the somatically hypermutated genes in DLBCL identified through an analysis of genomic and transcriptome data derived from 40 DLBCL patients. Our analysis verifies that there are indeed many genes that are recurrently affected by aSHM. In particular, we have identified 32 novel targets that show the same or a higher level of aSHM activity than previously reported genes. Amongst these novel targets, 22 genes showed a significant correlation between mRNA abundance and aSHM.
Year: 2012 PMID: 23131835 PMCID: PMC3717795 DOI: 10.18632/oncotarget.653
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Recurrent SHM-targets in DLBCL
The list of SHM-targets that are mutated at a rate equal to or higher than that of known aSHM targets in B cells. The results are sorted by the number of mutations in the region (i.e. column 3). Columns 5, 6 and 7 are feature values reported as hallmarks of SHM. These features were calculated after correction for base composition in the region (i.e. they are normalized by the frequency of the bases in those regions). The p-value associated with each feature is calculated using Fisher's exact test. The last three columns are the transcript RPKM values corresponding to the target region, extracted from RNA-seq data of the available samples.
| Gene names | SHM indicator | Total SNVs | Mutated Samples | Transition/Transversion (P-value) | Motif Bias (P-value) | C:G over A:T (P-value) | RPKM fold change between mutated vs. unmutated samples | Average RPKM in Tumor | Average RPKM in Normal B cell |
|---|---|---|---|---|---|---|---|---|---|
| BCL6 | 0.1389 | 179 | 27 | 1.27(0.06) | 1.41(0.0919) | 0.77(0.5) | 0.55739 | 61.4600 | 160.93086 |
| BCL2 | 0.2642 | 146 | 11 | 0.8(0.5) | 1.47(0.0738) | 0.79(0.5) | 1.29298 | 20.7300 | 2.59639 |
| | 0.0123 | 55 | 18 | 1.04(0.45) | 2.78(0.0002) | 1.05(0.0172) | -0.27272 | 149.6800 | 223.5928 |
| | 0.0201 | 52 | 17 | 0.79(0.5) | 1.69(0.1114) | 1.41(0.0001) | 0.11158 | 1485.8800 | 1017.2736 |
| | 0.0000 | 52 | 16 | 1.17(0.29) | 4.18(0) | 1.26(0.0009) | 0.05879 | 50.4900 | 142.76265 |
| | 0.0509 | 42 | 17 | 0.68(0.5) | 2.91(0.0005) | 0.81(0.5) | 0.01346 | 76.7300 | 352.06877 |
| SERPINA9 | 0.1296 | 36 | 7 | 0.57(0.5) | 2.15(0.0345) | 1.03(0.1261) | 5.48905 | 277.4700 | 237.10067 |
| | 0.0006 | 34 | 8 | 1.13(0.37) | 3.49(0.0001) | 1.67(0) | 1.08042 | 162.1900 | 478.47502 |
| | 0.0000 | 34 | 5 | 0.62(0.5) | 5.5(0) | 1.37(0.0103) | 0.1586 | 2.9000 | 4.48411 |
| | 0.0083 | 32 | 14 | 1.46(0.14) | 4.29(0) | 0.9(0.5) | 0.73039 | 31.1700 | 96.05465 |
| BACH2 | 0.5000 | 30 | 8 | 0.25(0.5) | 0.67(0.5) | 0.75(0.5) | 0.30362 | 8.0700 | 52.5643 |
| | 0.0794 | 23 | 10 | 1.3(0.27) | 2.72(0.0156) | 1.15(0.1208) | 1.81466 | 142.6400 | 189.28412 |
| BIRC3 | 0.1158 | 21 | 12 | 1.1(0.41) | 2.03(0.0975) | 1.4(0.0385) | -0.10012 | 80.9500 | 175.95683 |
| | 0.0009 | 19 | 9 | 1.71(0.13) | 4.95(0) | 1.47(0.0123) | 0 | 0.2000 | 0.08058 |
| TCL1A | 0.2012 | 17 | 8 | 0.55(0.5) | 1.03(0.4869) | 1.48(0.0335) | -0.07685 | 248.7300 | 709.73845 |
| ST6GAL1 | 0.2318 | 15 | 8 | 0.88(0.5) | 2.17(0.1233) | 1.03(0.202) | 0.23782 | 64.4800 | 149.40245 |
| | 0.0032 | 14 | 8 | 0.56(0.5) | 5.18(0) | 1.7(0.0061) | 0.44198 | 10559.9000 | 8227.8865 |
| | 0.0272 | 14 | 5 | 1.33(0.3) | 3.3(0.0117) | 1.38(0.0058) | 0.16955 | 26.1800 | 39.5316 |
| IRF8 | 0.2448 | 13 | 9 | 1.6(0.2) | 1.19(0.4275) | 1.14(0.1694) | -0.0691 | 174.1000 | 462.84745 |
| | 0.0683 | 13 | 9 | 1.17(0.39) | 3.55(0.0076) | 1.22(0.1065) | 0.12187 | 191.6600 | 975.71198 |
| | 0.0008 | 13 | 9 | 1.6(0.2) | 6.69(0) | 1.11(0.2004) | 0 | 0.0000 | 0 |
| LRMP | 0.2823 | 13 | 7 | 0.63(0.5) | 1.08(0.4667) | 1.48(0.0965) | 0.22716 | 149.9900 | 276.99144 |
| | 0.0208 | 13 | 4 | 5.5(0.01) | 2.63(0.0714) | 1.28(0.0201) | 1.82701 | 106.0800 | 29.07161 |
| | 0.0003 | 12 | 9 | 1(0.5) | 6.29(0) | 1.78(0.001) | 0.49221 | 25.6600 | 23.75111 |
| | 0.0294 | 12 | 8 | 3(0.04) | 3.71(0.0059) | 1.26(0.1041) | 0.42032 | 87.7300 | 151.20776 |
| | 0.0025 | 12 | 7 | 0.71(0.5) | 5.9(0) | 1.68(0.002) | 0.42432 | 143.9600 | 968.41417 |
| | 0.0146 | 12 | 7 | 1(0.5) | 4.6(0.0003) | 1.47(0.0255) | 0.96916 | 84.0200 | 165.35743 |
| S1PR2 | 0.0183 | 11 | 7 | 1.75(0.18) | 5.25(0.0005) | 1.19(0.0689) | 0.59678 | 22.3300 | 96.04705 |
| MALAT1 | 0.1786 | 11 | 7 | 1.2(0.38) | 2.6(0.0729) | 1.21(0.2048) | 0 | 0.0000 | 0 |
| SPRED2 | 0.2356 | 11 | 6 | 0.57(0.5) | 2.89(0.0523) | 0.75(0.5) | 1.46507 | 12.2400 | 22.09212 |
| | 0.0114 | 10 | 7 | 1.5(0.26) | 6.39(0.0001) | 1.39(0.0726) | -0.2793 | 52.5200 | 127.01243 |
| | 0.0239 | 10 | 3 | 2.33(0.1) | 3.36(0.0301) | 2.28(0.0044) | 1.50279 | 10.5300 | 3.6875 |
| LLT1 | 0.2591 | 10 | 3 | 2.33(0.1) | 1.49(0.338) | 0.49(0.5) | -0.21925 | 47.9800 | 86.73398 |
| ETS1 | 0.1877 | 9 | 8 | 0.5(0.5) | 2.08(0.2211) | 1.61(0.0598) | 0.40109 | 58.3700 | 102.81003 |
| | 0.0040 | 9 | 4 | 2(0.16) | 6.18(0) | 1.18(0.0532) | 0.65633 | 119.7600 | 160.9238 |
| | 0.0609 | 8 | 5 | 0(0.5) | 4.1(0.0127) | 1.71(0.0355) | 0 | 0.0000 | 0 |
| POU2AF1 | 0.5000 | 7 | 6 | 0.75(0.5) | 0(0.5) | 0.61(0.5) | -0.12034 | 153.9300 | 429.77219 |
| GADD45B | 0.1136 | 7 | 6 | 6(0.03) | 2.58(0.1562) | 0.93(0.3192) | -0.04866 | 30.9900 | 132.9862 |
| MS4A1 | 0.1944 | 7 | 4 | 6(0.03) | 0(0.5) | 0.66(0.5) | 0.03938 | 644.0700 | 715.41695 |
| P2RY8 | 0.3182 | 7 | 3 | 1.33(0.35) | 2.34(0.1826) | 0.92(0.5) | 0 | 0.4900 | 1.30263 |
| GRHPR | 0.1429 | 6 | 5 | 2(0.21) | 0(0.5) | 1.81(0.0282) | -0.17425 | 57.6200 | 27.42158 |
| NCOA3 | 0.1770 | 6 | 4 | 5(0.05) | 0(0.5) | 1.39(0.2165) | 0.22822 | 42.8100 | 76.49762 |
| | 0.0140 | 6 | 3 | 6(0.01) | 5.29(0.0032) | 1.57(0.1199) | -0.31589 | 67.8200 | 239.48779 |
| | 0.0630 | 6 | 3 | 1(0.5) | 5.38(0.0029) | 1.42(0.1713) | 0.63538 | 22.5300 | 27.42303 |
Genes marked by an asterisk (*) have been previously reported as targets of aSHM.
Genes with an SHM indicator less than 0.1 are shown in bold.
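The caption above describes feature ratios normalized by base composition, with p-values from Fisher's exact test. The following is a minimal sketch of that kind of calculation for the C:G over A:T feature. The layout of the 2x2 table (mutated versus unmutated positions at C:G and at A:T bases), the function names, and the counts in the usage lines are all assumptions made for illustration; the caption does not specify how the contingency tables were built.

```python
from math import comb


def normalized_ratio(mut_cg, n_cg, mut_at, n_at):
    """Ratio of per-base mutation frequencies at C:G versus A:T positions.

    Dividing by the number of C:G and A:T bases in the region corrects
    for base composition (assumed form of the normalization).
    """
    return (mut_cg / n_cg) / (mut_at / n_at)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # total of the first row
    col1 = a + c          # total of the first column
    n = a + b + c + d     # grand total
    denom = comb(n, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += comb(row1, x) * comb(n - row1, col1 - x) / denom
    return p


# Hypothetical region: 400 C:G bases with 30 mutations, 600 A:T bases with 20.
ratio = normalized_ratio(30, 400, 20, 600)
p_value = fisher_exact_greater(30, 400 - 30, 20, 600 - 20)
print(f"C:G over A:T = {ratio:.2f}, one-sided p = {p_value:.4g}")
```

The test is written one-sided here because the feature asks whether C:G positions are enriched for mutations; a two-sided variant would be equally defensible.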
Average SHM feature values per group
The average feature values in each group of SHM-targets. The last row contains the IG loci. Groups I, II and III are divided based on the mutation rate in the SHM-targets.
| Groups | SHM indicator | Mutation enrichment in WRCY (P-value) | C:G over A:T (P-value) | Transition over Transversion (P-value) | Average RPKM in Mutated Samples | Average RPKM in Unmutated Samples | RPKM fold change | Average RPKM in Normal |
|---|---|---|---|---|---|---|---|---|
| Group 1 (mutation rate > 8e-5) | 0.11 | 3.12(0.13) | 1.25(0.17) | 1.67(0.32) | 502.7 | 357.1 | 0.59 | 463.3 |
| Group 2 (mutation rate > 4e-5) | 0.27 | 2.02(0.35) | 1.25(0.33) | 1.74(0.31) | 50.96 | 57.34 | 0.03 | 74.4 |
| Group 3 | 0.38 | 1.17(0.45) | 1.1(0.51) | 0.72(0.33) | 50.29 | 50 | 0.03 | 48.72 |
| IGH | 0.14 | 2.7(0.15) | 1.19(0.25) | 1.3(0.31) | 4482 | 2202 | 0.39 | 2846 |
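The group boundaries in the table can be expressed as a small classification rule. This is a sketch only: the thresholds come from the table, but the definition of the mutation rate (SNVs per base per sample) and the 12,000 bp region length in the usage lines are assumptions, since the caption does not state the units.

```python
def mutation_rate(total_snvs, region_length_bp, n_samples):
    """Assumed definition: SNVs per base per sample in the target region."""
    return total_snvs / (region_length_bp * n_samples)


def assign_group(rate):
    """Map a mutation rate to the groups used in the table above."""
    if rate > 8e-5:
        return "Group 1"
    if rate > 4e-5:
        return "Group 2"
    return "Group 3"


# Hypothetical example: 179 SNVs in a 12,000 bp region across 40 samples.
rate = mutation_rate(179, 12_000, 40)
print(assign_group(rate))
```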
Figure 1: Mutation density in SHM-targets
The mutation density curves in a 12 kb region downstream of transcription start sites. The red bars indicate the median SNV distance to the transcription start sites. As the plots show, the concentration of SNVs moves further away from the transcription start sites as we move from group I to group III. A two-sample Kolmogorov-Smirnov test (conducted using the R function ks.test) also suggests that the SNV distance distribution in group I is significantly different from those of groups II and III (P < 2.2e-16), while the distance distributions in groups II and III show a much higher degree of similarity (P = 0.03457).
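To make the comparison of SNV distance distributions concrete, the sketch below computes the two-sample Kolmogorov-Smirnov statistic and the standard asymptotic p-value in plain Python. It is not the authors' code: the analysis was run in R, and R may use an exact rather than asymptotic p-value for small samples. The function names and the distances in the usage lines are hypothetical.

```python
from bisect import bisect_right
from math import exp, sqrt


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Both ECDFs are step functions, so the gap only changes at data points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue_asymptotic(d, n_a, n_b, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = d * sqrt(n_a * n_b / (n_a + n_b))
    if lam < 0.2:
        # The series converges slowly here and the true value is ~1.
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical SNV distances (bp) from the transcription start site.
group_one = [150, 300, 420, 610, 800, 950, 1200]
group_two = [900, 1500, 2300, 3100, 4200, 5600, 7000]
d = ks_statistic(group_one, group_two)
print(d, ks_pvalue_asymptotic(d, len(group_one), len(group_two)))
```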
Figure 2: Transcription rate in SHM genes
The left and middle plots depict the RPKM fold change between mutated and unmutated samples in the SHM-target regions across IGH and non-IGH loci in group I, respectively. Here a positive value indicates up-regulation in samples with mutations. Expression change is set to zero for genes with a low level of expression (i.e. RPKM less than 5). As the data in the middle plot suggest, there are more targets with a positive expression change amongst those with a high mutation rate. More precisely, while over 70% of the target regions in group I are up-regulated in mutated samples, this ratio is 50% for targets in the other groups (i.e. as expected on a random basis). The right plot depicts the average RPKM values for all genes that have at least two mutations in their SHM-target region. The data in this plot show that the absolute expression level of genes with higher SHM activity is also higher on average. The red smooth curves in the plots are local polynomial regression fits computed using the R function loess. The targets on the x-axis are sorted by mutation density in their SHM-target regions.
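The expression-change rule in this caption (a signed value, set to zero for weakly expressed genes) can be sketched as follows. The caption does not give the exact formula, so this example assumes a log2 ratio of mean RPKM values and assumes the low-expression cutoff applies when either group's mean falls below 5; both choices, and all names and numbers below, are illustrative only.

```python
from math import log2


def expression_change(rpkm_mutated, rpkm_unmutated, min_rpkm=5.0):
    """Assumed log2 ratio of mean RPKM in mutated vs. unmutated samples.

    Returns 0.0 when either group's mean RPKM is below min_rpkm
    (one possible reading of the low-expression rule).
    """
    mean_mut = sum(rpkm_mutated) / len(rpkm_mutated)
    mean_unmut = sum(rpkm_unmutated) / len(rpkm_unmutated)
    if min(mean_mut, mean_unmut) < min_rpkm:
        return 0.0
    return log2(mean_mut / mean_unmut)


def fraction_upregulated(changes):
    """Share of targets with a positive expression change."""
    return sum(1 for c in changes if c > 0) / len(changes)


# Hypothetical RPKM values for three target regions.
changes = [
    expression_change([40.0, 60.0], [20.0, 30.0]),   # higher in mutated samples
    expression_change([10.0, 14.0], [25.0, 23.0]),   # lower in mutated samples
    expression_change([0.4, 0.6], [1.0, 2.0]),       # low expression, set to 0
]
print(changes, fraction_upregulated(changes))
```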
Figure 3: Correlation between mutations and rearrangements
Distribution of somatic mutations in SHM-targets and correlation with genome rearrangements. A Circos diagram [47] showing the distribution of somatic mutations in recurrently mutated SHM-targets and genomic rearrangements such as translocations and inversions. The purple circles represent the count of SNVs in the corresponding SHM-targets, and the arcs represent chromosomal translocation events. The red and purple arcs represent translocations involving IGH loci and non-IGH loci, respectively. The size of the circles and the gene labels are proportional to the number of mutations in the SHM-target.
Somatic hypermutation and genomic rearrangements
Our observations show that somatic hypermutation commonly occurs in the absence of genomic rearrangements. Even for BCL2, where aSHM was previously reported in the context of the t(14;18) translocation, we observed aSHM in the absence of any genomic rearrangement in several cases.
| Gene | Samples with mutations and rearrangements | Samples with mutations only | Samples without mutations or rearrangement |
|---|---|---|---|
| BCL6 | 7 | 20 | 13 |
| BCL2 | 8 | 3 | 29 |
| BTG2 | 0 | 18 | 22 |
| TMSL2 | 0 | 17 | 23 |
| ZFP36L1 | 0 | 16 | 24 |
| RHOH | 0 | 17 | 23 |
| SERPINA9 | 0 | 7 | 33 |
| CD83 | 0 | 8 | 32 |
| SGK1 | 0 | 5 | 35 |
| BCL7A | 0 | 14 | 26 |
| BACH2 | 1 | 7 | 32 |
| LTB | 0 | 10 | 30 |
| BIRC3 | 0 | 12 | 28 |
| HIST1H2AC | 0 | 9 | 31 |
| TCL1A | 0 | 8 | 32 |
| ST6GAL1 | 0 | 8 | 32 |
| CD74 | 0 | 8 | 32 |
| SOCS1 | 0 | 5 | 35 |
| IRF8 | 0 | 9 | 31 |
| BTG1 | 0 | 9 | 31 |
| LRMP | 0 | 7 | 33 |
| IRF4 | 0 | 4 | 36 |
| CIITA | 0 | 9 | 31 |
| DTX1 | 0 | 8 | 32 |
| CXCR4 | 0 | 7 | 33 |
| PIM1 | 1 | 6 | 33 |
| S1PR2 | 0 | 7 | 33 |
| SPRED2 | 0 | 6 | 34 |
| PAX5 | 0 | 7 | 33 |
| DMD | 0 | 3 | 37 |
| CLEC2D | 0 | 3 | 37 |
| ETS1 | 0 | 8 | 32 |
| DUSP2 | 0 | 4 | 36 |
| POU2AF1 | 0 | 6 | 34 |
| GADD45B | 0 | 6 | 34 |
| MS4A1 | 0 | 4 | 36 |
| P2RY8 | 0 | 3 | 37 |
| GRHPR | 0 | 5 | 35 |
| NCOA3 | 0 | 4 | 36 |
| UBE2J1 | 0 | 3 | 37 |
| MYC | 1 | 2 | 37 |