Jianmei Zhao1,2, Xuecang Li1, Qianlan Yao3, Meng Li1, Jian Zhang1, Bo Ai1, Wei Liu4, Qiuyu Wang5, Chenchen Feng1, Yuejuan Liu1, Xuefeng Bai1, Chao Song5, Shang Li6, Enmin Li2, Liyan Xu2, Chunquan Li1,2.
Abstract
Although gene fusions are increasingly detected in human cancers by methods based on next-generation sequencing (NGS) technologies, these methods are limited in their ability to identify driver fusions. In addition, existing methods for identifying driver gene fusions either ignore the specificity among different cancers or consider only the local, rather than global, topological features of networks. Here, we propose a novel network-based method, RWCFusion, to identify phenotype-specific cancer driver gene fusions. To evaluate its performance, we used leave-one-out cross-validation in 35 cancers and achieved a high AUC of 0.925 for overall cancers and an average AUC of 0.929 for single cancers. Furthermore, we classified the 35 cancers into two classes, haematological and solid, of which the haematological class reached an AUC of up to 0.968. Finally, we applied RWCFusion to breast cancer and found that the top 13 gene fusions, such as BCAS3-BCAS4, NOTCH1-NUP214, MED13-BCAS3 and CARM1-SMARCA4, have previously been shown to be drivers of breast cancer. Additionally, 8 of the top 10 remaining candidate gene fusions, such as SULF2-ZNF217, MED1-ACSF2 and ACACA-STAC2, were inferred to be potential driver gene fusions of breast cancer.
Keywords: cancer; driver; gene fusion; network
Year: 2016 PMID: 27506935 PMCID: PMC5308635 DOI: 10.18632/oncotarget.11064
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Figure 1. The performance of RWCFusion evaluated by leave-one-out cross-validation
(A, B and C) show the AUC results for overall cancers and for the haematological and solid classes, respectively, based on leave-one-out cross-validation. (D, E and F) show the number of previously known high-risk gene fusions identified by RWCFusion.
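The evaluation idea can be illustrated with a minimal sketch. It assumes that, in each fold, one known driver fusion is withheld from the seed set and then ranked against the candidate fusions, and that the per-fold AUCs are averaged; the source does not spell out these details. The helper names (`rank_auc`, `leave_one_out_auc`, `toy_score`), the toy scoring rule and the single-letter gene names are all hypothetical and stand in for the real RWCFusion scoring.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Fusion = Tuple[str, str]


def rank_auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def leave_one_out_auc(
    known: List[Fusion],
    candidates: List[Fusion],
    score_fn: Callable[[List[Fusion], List[Fusion]], Dict[Fusion, float]],
) -> float:
    """Withhold each known fusion in turn, rescore, and average the per-fold AUCs."""
    fold_aucs = []
    for i, held_out in enumerate(known):
        seeds = known[:i] + known[i + 1:]  # every known fusion except the held-out one
        scores = score_fn(seeds, [held_out, *candidates])
        fold_aucs.append(rank_auc([scores[held_out]], [scores[c] for c in candidates]))
    return sum(fold_aucs) / len(fold_aucs)


def toy_score(seeds: List[Fusion], items: List[Fusion]) -> Dict[Fusion, float]:
    """Hypothetical stand-in scorer: number of partner genes shared with the seeds."""
    seed_genes = {gene for fusion in seeds for gene in fusion}
    return {item: float(sum(gene in seed_genes for gene in item)) for item in items}


# Hypothetical toy data, not taken from the paper.
KNOWN = [("A", "B"), ("B", "C"), ("C", "D")]
CANDIDATES = [("X", "Y"), ("A", "X")]

if __name__ == "__main__":
    print(leave_one_out_auc(KNOWN, CANDIDATES, toy_score))
```

Any scoring function with the same signature can be substituted for `toy_score`, so the same loop could compare two methods on identical folds.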
AUC performance of RWCFusion and the existing method, fusion centrality, for the 35 single cancers and for the haematological, solid and overall phenotypes
| OMIM ID | Disease name/class | AUC (RWCFusion) | AUC (Centrality) | AUC difference (RWCFusion − Centrality) |
|---|---|---|---|---|
| 159595 | MYELOPROLIFERATIVE SYNDROME, TRANSIENT | 0.997 | 0.970 | 0.027 |
| 613024 | FOLLICULAR LYMPHOMA, SUSCEPTIBILITY TO, 1; FL1 | 0.997 | 0.969 | 0.028 |
| 613065 | LEUKEMIA, ACUTE LYMPHOBLASTIC; ALL | 0.949 | 0.873 | 0.076 |
| 113970 | BURKITT LYMPHOMA; BL | 0.997 | 0.997 | 0 |
| 114480 | BREAST CANCER | 0.784 | 0.700 | 0.084 |
| 131440 | MYELOPROLIFERATIVE DISORDER, CHRONIC, WITH EOSINOPHILIA | 0.997 | 0.975 | 0.022 |
| 137800 | GLIOMA SUSCEPTIBILITY 1; GLM1 | 0.759 | 0.763 | −0.004 |
| 144700 | RENAL CELL CARCINOMA, NONPAPILLARY; RCC | 0.978 | 0.746 | 0.232 |
| 150699 | LEIOMYOMA, UTERINE; UL | 0.996 | 0.690 | 0.306 |
| 151400 | LEUKEMIA, CHRONIC LYMPHOCYTIC; CLL | 0.995 | 0.992 | 0.003 |
| 155601 | MELANOMA, CUTANEOUS MALIGNANT, SUSCEPTIBILITY TO, 2; CMM2 | 0.703 | 0.883 | −0.18 |
| 167000 | OVARIAN CANCER | 0.624 | 0.598 | 0.026 |
| 176807 | PROSTATE CANCER | 0.914 | 0.794 | 0.12 |
| 181030 | SALIVARY GLAND ADENOMA, PLEOMORPHIC | 0.993 | 0.838 | 0.155 |
| 188550 | THYROID CARCINOMA, PAPILLARY | 0.995 | 0.954 | 0.041 |
| 211980 | LUNG CANCER | 0.914 | 0.843 | 0.071 |
| 215300 | CHONDROSARCOMA | 0.964 | 0.781 | 0.183 |
| 236000 | LYMPHOMA, HODGKIN | 0.657 | 0.667 | −0.01 |
| 259500 | OSTEOGENIC SARCOMA | 0.73 | 0.707 | 0.023 |
| 268210 | RHABDOMYOSARCOMA 1; RMS1 | 0.997 | 0.959 | 0.038 |
| 268220 | RHABDOMYOSARCOMA 2; RMS2 | 1 | 0.968 | 0.032 |
| 300854 | RENAL CELL CARCINOMA, Xp11-ASSOCIATED; RCCX1 | 1 | 0.731 | 0.269 |
| 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.98 | 0.903 | 0.077 |
| 605027 | LYMPHOMA, NON-HODGKIN, FAMILIAL | 0.999 | 0.923 | 0.076 |
| 607685 | HYPEREOSINOPHILIC SYNDROME, IDIOPATHIC; HES | 0.999 | 0.971 | 0.028 |
| 607785 | JUVENILE MYELOMONOCYTIC LEUKEMIA; JMML | 0.994 | 0.981 | 0.013 |
| 608232 | LEUKEMIA, CHRONIC MYELOID; CML | 0.971 | 0.933 | 0.038 |
| 612160 | HISTIOCYTOMA, ANGIOMATOID FIBROUS | 1 | 0.995 | 0.005 |
| 612219 | EWING SARCOMA; ES | 1 | 0.900 | 0.1 |
| 612237 | CHONDROSARCOMA, EXTRASKELETAL MYXOID | 1 | 0.768 | 0.232 |
| 612376 | ACUTE PROMYELOCYTIC LEUKEMIA; APL | 0.996 | 0.954 | 0.042 |
| 613488 | MYXOID LIPOSARCOMA | 0.936 | 0.858 | 0.078 |
| 614286 | MYELODYSPLASTIC SYNDROME; MDS | 0.979 | 0.893 | 0.086 |
| 254700 | MYELOPROLIFERATIVE DISEASE, AUTOSOMAL RECESSIVE | 0.735 | 0.929 | −0.194 |
| 300813 | SARCOMA, SYNOVIAL | 0.996 | 0.427 | 0.569 |
| | Solid | 0.968 | 0.903 | 0.065 |
| | Haematological | 0.867 | 0.77 | 0.097 |
| | Overall | 0.925 | 0.845 | 0.080 |
Figure 2. Performance comparison between RWCFusion and the existing method, fusion centrality
(A–C) show the AUC comparison for overall cancers and for the haematological and solid classes, respectively. (D) shows the top 10 AUC gaps for single cancers.
Comparison between RWCFusion and the existing method, fusion centrality, when scoring a candidate gene fusion that occurs in several different cancers
| gene1 | gene2 | OMIM ID | Disease name | Cen-score | Our score |
|---|---|---|---|---|---|
| FGFR1OP | FGFR1 | 159595 | MYELOPROLIFERATIVE SYNDROME, TRANSIENT | 0.028 | 0.0019 |
| | | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.028 | 0.00017 |
| FGFR1OP2 | FGFR1 | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.024 | 0.00012 |
| | | 159595 | MYELOPROLIFERATIVE SYNDROME, TRANSIENT | 0.024 | 0.00098 |
| NUP98 | TOP1 | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.027 | 0.00046 |
| | | 159595 | MYELOPROLIFERATIVE SYNDROME, TRANSIENT | 0.027 | 0.00028 |
| PAPOLA | AK7 | 114480 | BREAST CANCER | 0.0087 | 0.000010 |
| | | 176807 | PROSTATE CANCER | 0.0087 | 0.000019 |
| MYO9B | FCHO1 | 176807 | PROSTATE CANCER | 0.0042 | 0.000003 |
| | | 114480 | BREAST CANCER | 0.0042 | 0.000016 |
| PAX3 | NCOA1 | 268210 | RHABDOMYOSARCOMA 1; RMS1 | 0.022 | 0.00044 |
| | | 268220 | RHABDOMYOSARCOMA 2; RMS2 | 0.022 | 0.000027 |
| MLL | CREBBP | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.036 | 0.00027 |
| | | 114480 | BREAST CANCER | 0.036 | 0.00010 |
| PAX3 | FOXO1 | 268210 | RHABDOMYOSARCOMA 1; RMS1 | 0.031 | 0.00050 |
| | | 268220 | RHABDOMYOSARCOMA 2; RMS2 | 0.031 | 0.00069 |
| BCR | ABL1 | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.040 | 0.00037 |
| | | 608232 | LEUKEMIA, CHRONIC MYELOID; CML | 0.040 | 0.00128 |
| BCR | JAK2 | 131440 | MYELOPROLIFERATIVE DISORDER, CHRONIC, WITH EOSINOPHILIA | 0.049 | 0.00057 |
| | | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.049 | 0.00033 |
| CBFB | MYH11 | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.015 | 0.00024 |
| | | 608232 | LEUKEMIA, CHRONIC MYELOID; CML | 0.015 | 0.00025 |
| FIP1L1 | PDGFRA | 607685 | HYPEREOSINOPHILIC SYNDROME, IDIOPATHIC; HES | 0.021 | 0.00072 |
| | | 601626 | LEUKEMIA, ACUTE MYELOID; AML | 0.021 | 0.00014 |
| CCDC6 | PDGFRB | 607785 | JUVENILE MYELOMONOCYTIC LEUKEMIA; JMML | 0.025 | 0.00044 |
| | | 608232 | LEUKEMIA, CHRONIC MYELOID; CML | 0.025 | 0.00020 |
| NDE1 | PDGFRB | 608232 | LEUKEMIA, CHRONIC MYELOID; CML | 0.030 | 0.00021 |
| | | 607785 | JUVENILE MYELOMONOCYTIC LEUKEMIA; JMML | 0.030 | 0.00042 |
| PAX7 | FOXO1 | 268210 | RHABDOMYOSARCOMA 1; RMS1 | 0.026 | 0.00042 |
| FOXO1 | PAX7 | 268220 | RHABDOMYOSARCOMA 2; RMS2 | 0.026 | 0.00064 |
| PDGFRB | NIN | 607685 | HYPEREOSINOPHILIC SYNDROME, IDIOPATHIC; HES | 0.025 | 0.00034 |
| NIN | PDGFRB | 131440 | MYELOPROLIFERATIVE DISORDER, CHRONIC, WITH EOSINOPHILIA | 0.025 | 0.00075 |
Cen-score: the score assigned to a gene fusion by the fusion centrality method.
Our score: the score assigned to a gene fusion by our method, RWCFusion.
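The pattern in this table can be checked with a short sketch: for a fusion reported in several cancers, the Cen-score is identical across phenotypes, whereas the RWCFusion score differs. The rows below are transcribed from three fusions in the table; the function name `summarize` and the output field names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (gene1, gene2, OMIM ID, Cen-score, RWCFusion score), transcribed from the table.
ROWS: List[Tuple[str, str, int, float, float]] = [
    ("FGFR1OP", "FGFR1", 159595, 0.028, 0.0019),
    ("FGFR1OP", "FGFR1", 601626, 0.028, 0.00017),
    ("BCR", "ABL1", 601626, 0.040, 0.00037),
    ("BCR", "ABL1", 608232, 0.040, 0.00128),
    ("PAX3", "NCOA1", 268210, 0.022, 0.00044),
    ("PAX3", "NCOA1", 268220, 0.022, 0.000027),
]


def summarize(rows: List[Tuple[str, str, int, float, float]]) -> Dict[Tuple[str, str], dict]:
    """Group rows by fusion and compare the two scores across phenotypes."""
    by_fusion = defaultdict(list)
    for gene1, gene2, omim_id, cen_score, rwc_score in rows:
        by_fusion[(gene1, gene2)].append((omim_id, cen_score, rwc_score))

    summary = {}
    for fusion, entries in by_fusion.items():
        cen_values = {cen for _, cen, _ in entries}
        # Phenotype (OMIM ID) in which the fusion receives its highest RWCFusion score.
        top_omim = max(entries, key=lambda entry: entry[2])[0]
        summary[fusion] = {
            "cen_is_constant": len(cen_values) == 1,
            "top_phenotype": top_omim,
        }
    return summary


if __name__ == "__main__":
    for fusion, info in summarize(ROWS).items():
        print(fusion, info)
```

For BCR-ABL1, for example, the phenotype with the higher RWCFusion score in these rows is OMIM 608232 (chronic myeloid leukemia), while the Cen-score cannot distinguish the two phenotypes.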
The previously known high-risk gene fusions of breast cancer identified by RWCFusion
| left gene | right gene | score |
|---|---|---|
| BCAS3 | BCAS4 | 0.006677 |
| NOTCH1 | NUP214 | 0.006672 |
| MED13 | BCAS3 | 0.006652 |
| CARM1 | SMARCA4 | 0.006652 |
| RPS6KB1 | SNF8 | 0.006636 |
| VMP1 | RPS6KB1 | 0.006635 |
| ARFGEF2 | SULF2 | 0.006611 |
| GLB1 | CMTM7 | 0.006606 |
| MED1 | STXBP4 | 0.006604 |
| VAPB | IKZF3 | 0.006588 |
| PKIA | RARA | 0.006582 |
| MYO9B | RAB22A | 0.006577 |
| CYTH1 | EIF3H | 0.006574 |
The top 10 remaining gene fusions of breast cancer ranked by RWCFusion, excluding the previously known high-risk gene fusions in Table 3
| left gene | left chr | left coordinates | right gene | right chr | right coordinates | RWCFusion score |
|---|---|---|---|---|---|---|
| SULF2 | 20q13.12-q13.13 | 46415148 | ZNF217 | 20q13 | 52210294 | 0.000333 |
| | | | | | 52210645 | |
| MED1 | 17q12 | 37595417 | ACSF2 | 17q21 | 48548388 | 0.00332 |
| ACACA | 17q12 | 35479452 | STAC2 | 17q12 | 37374425 | 0.00329 |
| STARD3 | 17q12 | 37793483 | DOK5 | 20q13 | 53259996 | 0.00329 |
| USP32 | 17q23 | 58342772 | PPM1D | 17q23 | 58679978 | 0.0000642 |
| | | | | | 46371087 | |
| THRA | 17q21 | 38243105 | SKAP1 | 17q21 | 46371708 | 0.0000567 |
| | | | | | 46384692 | |
| PPP1R12A | 12q21 | 80211173 | SEPT10 | 2q13 | 110343414 | 0.0000322 |
| AHCTF1 | 1q44 | 247094879 | NAAA | 4q21 | 76846963 | 0.0000241 |
| TOB1 | 17q21 | 48943418 | SYNRG | 17q12 | 35880750 | 0.0000238 |
| SUMF1 | 3p26 | 4418013 | LRRFIP2 | 3p22 | 37170639 | 0.0000161 |
Partner genes that are also involved in a previously known high-risk gene fusion of breast cancer according to ChiTaRS.
Partner genes that play important roles in the occurrence and development of breast cancer according to evidence from the literature.
Figure 3. The fusion sites of the top 2 potential driver gene fusions of breast cancer identified by RWCFusion
(A, C) show the chromosomal locations where the fusions occurred. (B, D) show the reads that cover the breakpoints.
Figure 4. The flow diagram of RWCFusion
(A) Candidate gene fusions. (B) High-risk gene fusions. (C) and (D) The process of scoring by RWCFusion. Rectangular nodes represent partner genes of candidate gene fusions predicted by TopHat-Fusion, and their color reflects their random walk with restart (RWR) scores: gray represents the initial score of 0, and shades of red reflect the scores after RWR, with a deeper red indicating a higher RWR score. Circular nodes represent partner genes of previously known high-risk gene fusions; their red color represents their initial score of 1 before RWR. A straight line between two nodes (B, D) indicates that they are known or predicted to be fused.
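The scoring step in panels (C) and (D) can be illustrated with a minimal random walk with restart sketch on a toy network. Several details are assumptions rather than the published implementation: the restart probability of 0.7, the normalisation of the seed scores so that they sum to 1, the degree-normalised transition rule, and scoring a fusion as the sum of its two partner gene scores. The node names `S`, `A`, `B` and `C` and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def random_walk_with_restart(
    edges: Iterable[Tuple[str, str]],
    seeds: List[str],
    restart: float = 0.7,  # assumed value; the source does not state it
    tol: float = 1e-12,
    max_iter: int = 10000,
) -> Dict[str, float]:
    """Iterate p <- (1 - restart) * W p + restart * p0 until the scores stabilise."""
    neighbours: Dict[str, Set[str]] = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seed_set = set(seeds)
    if not seed_set or not seed_set <= set(neighbours):
        raise ValueError("seeds must be a non-empty subset of the network nodes")

    # Seed genes start with equal weight (normalised to sum to 1); all others start at 0.
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in neighbours}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in neighbours:
            # Each neighbour m passes on its score split evenly over its own edges.
            inflow = sum(p[m] / len(neighbours[m]) for m in neighbours[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in neighbours)
        p = nxt
        if change < tol:
            break
    return p


def fusion_score(gene_scores: Dict[str, float], fusion: Tuple[str, str]) -> float:
    """Assumed aggregation: a fusion's score is the sum of its partner gene scores."""
    return gene_scores[fusion[0]] + gene_scores[fusion[1]]


# Hypothetical toy network: a chain S - A - B - C, with S as a known driver partner gene.
TOY_EDGES = [("S", "A"), ("A", "B"), ("B", "C")]

if __name__ == "__main__":
    scores = random_walk_with_restart(TOY_EDGES, ["S"])
    print(scores)
    print(fusion_score(scores, ("A", "B")))
```

Because the restart vector depends on the seed set, running the same network with the known fusions of a different phenotype as seeds yields a different ranking, which is one way a score of this kind can be phenotype-specific.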