Heewon Park, Teppei Shimamura, Satoru Miyano, Seiya Imoto.
Abstract
The personal genomics era has drawn considerable attention to patient-specific analysis for anti-cancer therapy. Patient-specific analysis enables the discovery of individual genomic characteristics for each patient, so that individual genetic risk of disease can be predicted effectively and anti-cancer therapy can be personalized. Although existing methods for patient-specific analysis have successfully uncovered crucial biomarkers, their performance deteriorates sharply in the presence of outliers because they rely on non-robust estimation. In practice, clinical and genomic alteration datasets usually contain outliers from various sources (e.g., experimental error, coding error), and these outliers may significantly affect the results of patient-specific analysis. We propose a robust methodology for patient-specific analysis in line with NetworkProfiler. In the proposed method, outliers in high-dimensional gene expression levels and drug response datasets are simultaneously controlled by a robust Mahalanobis distance in robust principal component space. Thus, we can effectively predict anti-cancer drug sensitivity and identify sensitivity-specific biomarkers for individual patients. Monte Carlo simulations show that the proposed robust method achieves outstanding performance in predicting the response variable in the presence of outliers. We also apply the proposed methodology to the Sanger dataset to uncover cancer biomarkers and predict anti-cancer drug sensitivity, and demonstrate the effectiveness of our method.
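To make the idea of controlling outliers by a robust Mahalanobis distance in principal component space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes median/MAD scaling as a crude stand-in for a proper robust PCA, a diagonal Mahalanobis distance on the component scores, and a hypothetical down-weighting rule (`outlier_weights`); none of these function names or choices come from the paper.

```python
import numpy as np

def robust_pc_distances(X, n_components=2):
    """Robust distance of each row of X in a robustly scaled PC space.

    Assumption: median/MAD scaling plus an ordinary SVD is used here as a
    simple substitute for a dedicated robust PCA algorithm.
    """
    # Robust centring and scaling: median and MAD instead of mean and std.
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    mad[mad == 0] = 1.0  # guard against constant columns
    Z = (X - med) / mad

    # Principal directions from the SVD of the robustly scaled data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T

    # Diagonal Mahalanobis distance in score space, again with
    # median/MAD as robust location and scale of each component.
    s_med = np.median(scores, axis=0)
    s_mad = 1.4826 * np.median(np.abs(scores - s_med), axis=0)
    s_mad[s_mad == 0] = 1.0
    return np.sqrt((((scores - s_med) / s_mad) ** 2).sum(axis=1))

def outlier_weights(d, cutoff):
    """Hypothetical rule: weight 1 up to the cutoff, (cutoff / d)^2 beyond it."""
    d = np.asarray(d, dtype=float)
    return np.where(d <= cutoff, 1.0, (cutoff / np.maximum(d, 1e-12)) ** 2)
```

With two components, a cutoff of about 2.716 (the square root of the 0.975 quantile of a chi-square distribution with 2 degrees of freedom) is one common choice; observations beyond it would then contribute less to a weighted regression fit.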
Year: 2014 PMID: 25329067 PMCID: PMC4201473 DOI: 10.1371/journal.pone.0108990
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Iteration for coefficients in the regularized regression modeling with lasso (i.e., δ = 1) penalty.
Figure 2. Coefficient functions of varying coefficient model.
Comparison of prediction accuracy of the models with p = 1000 and p = 200.
| 5% | 10% | 15% | 20% | |||||
|
|
|
|
|
|
|
|
| |
|
|
| 0.333 | 0.266 |
| 0.259 | 0.289 | 0.251 | 0.274 |
|
| 0.280 |
| 0.266 | 0.290 |
|
| 0.251 |
|
Results of simulation 1 with outliers from N(5, 1).
| | | Type 1 | | | Type 2 | | |
| | | T.P | T.N | P.E | T.P | T.N | P.E |
| 5% | ELA | - | - | 0.338 | - | - | 0.324 |
| 5% | NP | 0.71 | 1.00 | 0.290 | 0.70 | 1.00 | 0.276 |
| 5% | R | 0.71 | 1.00 | | 0.70 | 1.00 | |
| 10% | ELA | - | - | 0.325 | - | - | 0.329 |
| 10% | NP | 0.69 | 1.00 | 0.290 | 0.70 | 1.00 | 0.310 |
| 10% | R | 0.69 | 1.00 | | 0.70 | 1.00 | |
| 15% | ELA | - | - | 0.289 | - | - | 0.294 |
| 15% | NP | 0.71 | 1.00 | 0.288 | 0.70 | 1.00 | 0.264 |
| 15% | R | 0.71 | 1.00 | | 0.70 | 1.00 | |
| 20% | ELA | - | - | 0.285 | - | - | 0.259 |
| 20% | NP | 0.71 | 1.00 | 0.254 | 0.69 | 1.00 | 0.258 |
| 20% | R | 0.71 | 1.00 | | 0.69 | 1.00 | |
Results of simulation 2 with outliers from N(5, 5).
| | | Type 1 | | | Type 2 | | |
| | | T.P | T.N | P.E | T.P | T.N | P.E |
| 5% | ELA | - | - | 0.321 | - | - | 0.314 |
| 5% | NP | 0.69 | 1.00 | 0.280 | 0.70 | 1.00 | 0.277 |
| 5% | R | 0.69 | 1.00 | | 0.70 | 1.00 | |
| 10% | ELA | - | - | 0.298 | - | - | 0.280 |
| 10% | NP | 0.70 | 1.00 | 0.266 | 0.70 | 1.00 | 0.251 |
| 10% | R | 0.70 | 1.00 | | 0.70 | 1.00 | |
| 15% | ELA | - | - | 0.261 | - | - | 0.255 |
| 15% | NP | 0.71 | 1.00 | 0.227 | 0.69 | 1.00 | 0.240 |
| 15% | R | 0.71 | 1.00 | | 0.69 | 1.00 | |
| 20% | ELA | - | - | 0.290 | - | - | 0.229 |
| 20% | NP | 0.71 | 1.00 | 0.251 | 0.70 | 1.00 | 0.214 |
| 20% | R | 0.71 | 1.00 | | 0.70 | 1.00 | |
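The T.P, T.N and P.E columns of the simulation tables are not defined in this record. The sketch below shows one common way such quantities are computed, assuming that T.P and T.N denote the true positive and true negative rates of variable selection and that P.E is the mean squared prediction error. The function name and these definitions are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def selection_and_prediction_metrics(beta_true, beta_hat, y_true, y_pred):
    """Return (T.P, T.N, P.E) under the assumed definitions.

    T.P: share of truly non-zero coefficients that are estimated non-zero.
    T.N: share of truly zero coefficients that are estimated as zero.
    P.E: mean squared error between observed and predicted responses.
    """
    true_nz = np.asarray(beta_true) != 0
    est_nz = np.asarray(beta_hat) != 0

    # Guard the denominators so an empty class does not divide by zero.
    tp = (true_nz & est_nz).sum() / max(true_nz.sum(), 1)
    tn = (~true_nz & ~est_nz).sum() / max((~true_nz).sum(), 1)

    resid = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    pe = float(np.mean(resid ** 2))
    return float(tp), float(tn), pe
```

Under these assumed definitions, a T.N of 1.00 would mean that no truly irrelevant variable was selected, while a T.P near 0.70 would mean that roughly 70% of the truly relevant variables were recovered.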
Figure 3. Sorted C values of 138 dataset.
Prediction results of drug sensitivity by using NetworkProfiler based on 133 and 500 genes.
| | FTI.277 | DMOG | NSC.87877 | AKT.inhibitor.VIII | Midostaurin |
| p500 | 0.402 | | 0.208 | 0.303 | 0.263 |
| p133 | | 0.239 | 0.211 | | |
Comparison of prediction accuracy of drug sensitivity.
| | FTI.277 | DMOG | NSC.87877 | AKT.inhibitor.VIII | Midostaurin |
| R | 0.293 | | | | |
| NP | 0.291 | 0.239 | 0.211 | 0.232 | 0.134 |
| Elastic net | | 0.561 | 0.323 | 0.447 | 0.477 |
Figure 4. Identified biomarkers for each anti-cancer drug.
The 10 most frequently identified biomarkers.
| Gene | Freq | Reference | Disease |
| FN1 | 1,019 | | breast cancer, colorectal cancer |
| TACSTD2 | 962 | | breast cancer |
| | 960 | | metastatic cancers, germ cell tumors |
| IGKCIGKV1-5 | 957 | | leukocytes in human peripheral blood, breast cancer |
| | 939 | | germ cell tumors |
| COL1A2 | 935 | | breast cancer, prostate cancer |
| SERPINE2 | 935 | | chronic obstructive pulmonary disease |
| CD24 | 855 | | breast cancer |
| IFITM3 | 855 | | breast cancer, colon cancer |
| LDHB | 833 | | breast cancer |