Heewon Park, Seiya Imoto, Satoru Miyano.
Abstract
Uncovering driver genes is crucial for understanding heterogeneity in cancer. L1-type regularization approaches have been widely used for uncovering cancer driver genes based on genome-scale data. Although the existing methods have been widely applied in the field of bioinformatics, they have several drawbacks: subset size limitations, erroneous estimation results, multicollinearity, and heavy time consumption. We introduce a novel statistical strategy, called Recursive Random Lasso (RRLasso), for high dimensional genomic data analysis and investigation of driver genes. For time-effective analysis, we consider a recursive bootstrap procedure in line with the random lasso. Furthermore, we introduce a parametric statistical test for driver gene selection based on bootstrap regression modeling results. The proposed RRLasso is not only fast but also performs well for high dimensional genomic data analysis. Monte Carlo simulations and analysis of the "Sanger Genomics of Drug Sensitivity in Cancer dataset from the Cancer Genome Project" show that the proposed RRLasso is an effective tool for high dimensional genomic data analysis. The proposed methods provide reliable and biologically relevant results for cancer driver gene selection.
Year: 2015 PMID: 26544691 PMCID: PMC4636151 DOI: 10.1371/journal.pone.0141869
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
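The abstract refers to a bootstrap procedure built on the random lasso without spelling out its steps. As a rough illustration only, the sketch below implements a generic two-stage random-lasso-style procedure: stage 1 fits the lasso on bootstrap samples with randomly drawn predictor subsets and takes the absolute averaged coefficient as an importance measure; stage 2 repeats the fits, drawing predictors with probability proportional to that importance. This is not the authors' RRLasso: the recursive update and the parametric test are not reproduced here. The paper reports using glmnet in R; plain Python is used to keep the sketch self-contained. All function names, the coordinate-descent solver, the toy data, and the tuning values (`lam`, `n_boot`, `q`) are assumptions made for illustration.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in lasso coordinate descent."""
    return z - t if z > t else z + t if z < -t else 0.0

def centre(X, y):
    """Centre each column of X and the response y (no intercept is fitted)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    return [[row[j] - means[j] for j in range(p)] for row in X], [v - y_mean for v in y]

def lasso_cd(X, y, lam, n_iter=50):
    """Lasso by cyclic coordinate descent: min (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta, resid = [0.0] * p, list(y)
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def draw_predictors(p, q, rng, weights=None):
    """Draw q distinct predictor indices, uniformly or proportionally to weights."""
    if weights is None:
        return rng.sample(range(p), q)
    pool = list(range(p))
    w = [weights[j] + 1e-12 for j in pool]  # small offset keeps every weight positive
    chosen = []
    for _ in range(q):
        k = rng.choices(range(len(pool)), weights=w)[0]
        chosen.append(pool.pop(k))
        w.pop(k)
    return chosen

def bootstrap_stage(X, y, lam, n_boot, q, rng, weights=None):
    """Average lasso coefficients over n_boot bootstrap fits on q predictors each.
    Predictors left out of a fit contribute a zero coefficient to the average."""
    n, p = len(X), len(X[0])
    sums = [0.0] * p
    for _ in range(n_boot):
        rows = [rng.randrange(n) for _ in range(n)]   # bootstrap sample of rows
        cols = draw_predictors(p, q, rng, weights)    # random predictor subset
        Xb, yb = centre([[X[i][j] for j in cols] for i in rows], [y[i] for i in rows])
        for j, b in zip(cols, lasso_cd(Xb, yb, lam)):
            sums[j] += b
    return [s / n_boot for s in sums]

def random_lasso(X, y, lam=0.1, n_boot=30, q=3, seed=0):
    """Two-stage random-lasso-style fit: importance measures, then final coefficients."""
    rng = random.Random(seed)
    importance = [abs(b) for b in bootstrap_stage(X, y, lam, n_boot, q, rng)]
    coef = bootstrap_stage(X, y, lam, n_boot, q, rng, weights=importance)
    return importance, coef

# Toy data (hypothetical): only predictors 0 and 1 have non-zero true coefficients.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [5 * r[0] - 4 * r[1] + data_rng.gauss(0, 0.5) for r in X]
importance, coef = random_lasso(X, y)
print([round(v, 2) for v in importance])
```

On this toy data the two informative predictors should receive clearly larger importance measures than the four noise predictors, so stage 2 draws them far more often. In practice the penalty `lam` would be tuned per fit, for example by cross-validation, rather than fixed as it is here.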
Running times for regression modeling with σ = 1 using the glmnet package in R (unit: minutes).
| | Type 1 | Type 2 | Type 3 | Type 4 | Type 5 | Type 6 | Type 7 | Type 8 |
|---|---|---|---|---|---|---|---|---|
| RCS.RD.LA | 14.9 | 16.1 | 16.3 | 16.2 | 9.2 | 9.1 | 8.7 | 8.7 |
| RD.LA | 116.1 | 123.1 | 121.6 | 122.2 | 58.7 | 58.9 | 58.3 | 58.5 |
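To make the size of the gap concrete, the ratio of the two rows can be computed directly. The short Python snippet below is only a convenience calculation on the numbers in the table; the variable names are illustrative.

```python
# Running times in minutes, copied from the table above (sigma = 1).
rcs_rd_la = [14.9, 16.1, 16.3, 16.2, 9.2, 9.1, 8.7, 8.7]
rd_la = [116.1, 123.1, 121.6, 122.2, 58.7, 58.9, 58.3, 58.5]

# Ratio of RD.LA time to RCS.RD.LA time for each simulation type.
speedups = [slow / fast for slow, fast in zip(rd_la, rcs_rd_la)]
for k, s in enumerate(speedups, start=1):
    print(f"Type {k}: RD.LA takes {s:.1f}x as long as RCS.RD.LA")
```

Across the eight simulation types, RD.LA takes roughly 6 to 8 times as long as RCS.RD.LA.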
Average of importance measures for predictor variables with non-zero and zero coefficients.
| | RCS.RD.EL Non.ZERO | RCS.RD.EL ZERO | RCS.RD.LA Non.ZERO | RCS.RD.LA ZERO | RD.LA Non.ZERO | RD.LA ZERO |
|---|---|---|---|---|---|---|
| | | | | | | |
| Type1 | 0.33(0.32) | 0.07(0.10) | 0.34(0.29) | 0.06(0.05) | 0.27(0.24) | 0.05(0.06) |
| Type2 | 0.29(0.28) | 0.06(0.09) | 0.30(0.24) | 0.05(0.05) | 0.24(0.21) | 0.05(0.05) |
| Type3 | 0.29(0.33) | 0.12(0.19) | 0.28(0.22) | 0.11(0.09) | 0.23(0.21) | 0.11(0.01) |
| Type4 | 0.23(0.27) | 0.11(0.17) | 0.22(0.18) | 0.10(0.08) | 0.19(0.18) | 0.09(0.10) |
| Type5 | 0.08(0.09) | 0.02(0.04) | 0.08(0.08) | 0.02(0.02) | 0.05(0.07) | 0.02(0.02) |
| Type6 | 0.08(0.09) | 0.02(0.04) | 0.08(0.08) | 0.02(0.02) | 0.05(0.07) | 0.02(0.02) |
| Type7 | 0.07(0.12) | 0.05(0.09) | 0.07(0.07) | 0.04(0.04) | 0.06(0.08) | 0.04(0.05) |
| Type8 | 0.07(0.11) | 0.05(0.08) | 0.06(0.06) | 0.04(0.04) | 0.05(0.07) | 0.04(0.05) |
| | | | | | | |
| Type1 | 0.33(0.31) | 0.07(0.01) | 0.34(0.73) | 0.06(0.18) | 0.26(0.24) | 0.05(0.06) |
| Type2 | 0.28(0.29) | 0.06(0.09) | 0.29(0.26) | 0.05(0.05) | 0.23(0.20) | 0.05(0.05) |
| Type3 | 0.29(0.34) | 0.13(0.19) | 0.28(0.22) | 0.11(0.09) | 0.23(0.21) | 0.11(0.11) |
| Type4 | 0.22(0.27) | 0.11(0.17) | 0.21(0.18) | 0.10(0.08) | 0.18(0.18) | 0.09(0.10) |
| Type5 | 0.07(0.09) | 0.02(0.04) | 0.07(0.08) | 0.02(0.02) | 0.05(0.07) | 0.02(0.02) |
| Type6 | 0.07(0.08) | 0.02(0.04) | 0.07(0.07) | 0.02(0.02) | 0.05(0.07) | 0.02(0.02) |
| Type7 | 0.07(0.12) | 0.05(0.09) | 0.07(0.07) | 0.04(0.04) | 0.06(0.08) | 0.04(0.05) |
| Type8 | 0.07(0.11) | 0.04(0.08) | 0.06(0.06) | 0.04(0.04) | 0.05(0.07) | 0.04(0.05) |
| | | | | | | |
| Type1 | 0.30(0.32) | 0.07(0.11) | 0.31(0.26) | 0.06(0.05) | 0.24(0.22) | 0.06(0.06) |
| Type2 | 0.28(0.28) | 0.07(0.10) | 0.29(0.25) | 0.06(0.05) | 0.22(0.21) | 0.06(0.06) |
| Type3 | 0.29(0.34) | 0.13(0.19) | 0.27(0.23) | 0.11(0.09) | 0.23(0.23) | 0.11(0.11) |
| Type4 | 0.23(0.27) | 0.12(0.16) | 0.22(0.18) | 0.10(0.08) | 0.18(0.18) | 0.09(0.10) |
| Type5 | 0.07(0.09) | 0.02(0.04) | 0.07(0.07) | 0.02(0.02) | 0.06(0.07) | 0.02(0.02) |
| Type6 | 0.07(0.09) | 0.02(0.04) | 0.07(0.07) | 0.02(0.02) | 0.05(0.07) | 0.02(0.02) |
| Type7 | 0.08(0.12) | 0.05(0.10) | 0.07(0.07) | 0.04(0.05) | 0.06(0.08) | 0.04(0.05) |
| Type8 | 0.07(0.10) | 0.05(0.08) | 0.06(0.06) | 0.04(0.04) | 0.05(0.07) | 0.04(0.05) |
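The Non.ZERO and ZERO columns average the importance measure over predictors whose true coefficient is non-zero and zero, respectively. A minimal Python sketch of that grouping follows, with hypothetical importance values and true coefficients that are not taken from the paper; the function name is illustrative.

```python
from statistics import mean

def mean_importance_by_group(importance, true_coef):
    """Average importance separately over truly non-zero and truly zero predictors."""
    nonzero = [m for m, b in zip(importance, true_coef) if b != 0]
    zero = [m for m, b in zip(importance, true_coef) if b == 0]
    return mean(nonzero), mean(zero)

# Hypothetical values for five predictors; the first two are truly non-zero.
importance = [0.40, 0.30, 0.05, 0.10, 0.05]
true_coef = [3.0, -2.0, 0.0, 0.0, 0.0]
avg_nonzero, avg_zero = mean_importance_by_group(importance, true_coef)
print(round(avg_nonzero, 2), round(avg_zero, 2))
```

A large gap between the two averages indicates that the importance measure separates informative predictors from noise predictors.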
Fig 1. Prediction error: Root mean squared error.
Fig 2. Variable selection results: Average of T.P and T.N.
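The two evaluation criteria named in these captions can be sketched as follows. This assumes that T.P and T.N denote true positives (truly non-zero predictors that are selected) and true negatives (truly zero predictors that are left out), which the captions do not state explicitly. The function names and example numbers are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted responses."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def selection_counts(true_coef, est_coef, tol=0.0):
    """Count true positives and true negatives of a variable selection result.
    A predictor counts as selected when |estimate| exceeds tol (assumed convention)."""
    tp = sum(1 for t, e in zip(true_coef, est_coef) if t != 0 and abs(e) > tol)
    tn = sum(1 for t, e in zip(true_coef, est_coef) if t == 0 and abs(e) <= tol)
    return tp, tn

# Hypothetical example: two truly non-zero predictors, three truly zero ones.
true_coef = [3.0, 0.0, -2.0, 0.0, 0.0]
est_coef = [2.7, 0.0, 0.0, 0.1, 0.0]
print(selection_counts(true_coef, est_coef))   # (1, 2)
print(round(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), 4))
```

In a simulation study these counts would be averaged over repeated datasets, which is presumably what "Average of T.P and T.N" refers to.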
Average root mean squared error of 99 regression models and average running time (unit: minutes).
| | RCS.RD.EL | RCS.RD.LA | RD.LA | ELA | AD.LA | LASSO |
|---|---|---|---|---|---|---|
| MSE | 1.70 | 1.70 | 1.70 | 1.80 | 1.74 | 1.83 |
| Running timings | 211.2 | 32.3 | 398.9 | - | - | - |
Identified potential driver genes of anti-cancer drugs and their supporting evidence.
| Drug | Gene | Reference | Disease |
|---|---|---|---|
| Doxorubicin | | [ | Breast carcinoma cells, Colon cancer |
| | | [ | Liver, Breast cancers |
| | | [ | Colorectal cancer |
| | | [ | Oral, Breast, Lung, Pancreatic, Gastric cancers |
| | | [ | Muscles of electrically stunned chickens |
| | | [ | Breast, Liver cancers |
| | | [ | Ovarian carcinoma |
| | | [ | Lung cancer |
| | | [ | Pancreatic cancer |
| | | [ | Lung cancer |
| Docetaxel | | [ | Breast, Bladder cancers |
| | | [ | Lung squamous cell carcinoma, Breast cancer |
| | | [ | Lung cancer |
| | | [ | Pancreatic ductal carcinoma |
| | | [ | Esophageal, Ovarian cancers |
| | | [ | Chronic lymphocytic leukemia |
| | | [ | Breast, Ovarian cancers |
| | | [ | Breast, Ovarian cancers |
| | F | [ | Breast carcinoma |
| | | [ | Prostate cancer |
| Gemcitabine | | [ | Breast, Lung cancers |
| | | [ | Renal carcinoma |
| | | [ | Bladder cancer, Lung squamous cell cancer |
| | | [ | Lung cancer |
| | | [ | Breast, Gastric cancers |
| | | [ | Gastric cancer |
| | | [ | Lung cancer |
| | | [ | Ovarian, Bladder cancers |
| | | [ | Breast, Ovarian cancers |
| | | [ | Breast cancer |
| Vinorelbine | | [ | Laryngeal, Head and neck, Gastric cancers |
| | | [ | Breast cancer |
| | | [ | Breast cancer |
| | | [ | Gastric, Colorectal cancers |
| | | [ | Glioblastoma multiforme |
| | | [ | Prostate cancer |
| | | [ | Breast cancer |
| | | [ | Prostate, Lung cancers |
| | | [ | Ovarian cancer |
| | | [ | Colorectal adenocarcinomas |
| Cisplatin | | [ | Colorectal cancer |
| | | [ | Colorectal cancer |
| | | [ | Squamous, Ovarian cancers |
| | | [ | Colorectal cancer |
| | | [ | Breast cancer |
| | | [ | Lung cancer |
| | | [ | Colorectal, Colon cancers |
| | | [ | Colorectal cancer |
| | | [ | Prostate, Colorectal cancers |
| | | [ | Parkinson’s disease, Breast cancer |
Fig 3. Network for selected driver genes and genes having PPI with identified driver genes.
Importance measures for genes with large subnetworks.
| | Drug | | | |
|---|---|---|---|---|
| | Vinorelbine | 0.0139 | 0.0028 | 0.0013 |
| | Vinorelbine | 0.0134 | 0.0028 | 0.0013 |
| | Doxorubicin | 0.0405 | 0.0032 | 0.0015 |
| | Doxorubicin | 0.0347 | 0.0032 | 0.0015 |
| | Docetaxel | 0.0204 | 0.0048 | 0.0015 |
| | Gemcitabine | 0.1997 | 0.0160 | 0.0062 |
| | Cisplatin | 0.0022 | 0.0039 | 0.0019 |