Oneeb Rehman, Hanqi Zhuang, Ali Muhamed Ali, Ali Ibrahim, Zhongwei Li.
Abstract
Certain small noncoding microRNAs (miRNAs) are differentially expressed in normal tissues and cancers, which makes them strong candidates for cancer biomarkers. A selected subset of miRNAs has previously been experimentally verified to be linked to breast cancer. In this paper, we validated the importance of these miRNAs using a machine learning approach on miRNA expression data. We performed feature selection on this set of relevant miRNAs, using Information Gain (IG), Chi-Squared (CHI2) and the Least Absolute Shrinkage and Selection Operator (LASSO), to rank them by importance. We then performed cancer classification with these miRNAs as features, using Random Forest (RF) and Support Vector Machine (SVM) classifiers. Our results demonstrated that the miRNAs ranked higher by our analysis yielded higher classifier performance, and that performance decreased as the rank of the miRNA decreased, confirming that these miRNAs have different degrees of importance as biomarkers. Furthermore, we found that using as few as three miRNAs as biomarkers for breast cancer can be as effective as using the entire set of 1800 miRNAs. This work suggests that machine learning is a useful tool for functional studies of miRNAs for cancer detection and diagnosis.
Keywords: breast cancer detection; cancer biomarkers; classification; feature selection; machine learning; miRNAs
Year: 2019 PMID: 30917548 PMCID: PMC6468888 DOI: 10.3390/cancers11030431
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. Schematics for Cancer Detection with Machine Learning.
Clinically Verified miRNA.
| miRNA | | | |
|---|---|---|---|
| hsa-mir-10b | hsa-let-7d | hsa-mir-206 | hsa-mir-34a |
| hsa-mir-125b-1 | hsa-let-7f-1 | hsa-mir-17 | hsa-mir-27b |
| hsa-mir-145 | hsa-let-7f-2 | hsa-mir-335 | hsa-mir-126 |
| hsa-mir-21 | hsa-mir-206 | hsa-mir-373 | hsa-mir-101-1 |
| hsa-mir-125a | hsa-mir-30a | hsa-mir-520c | hsa-mir-101-2 |
| hsa-mir-17 | hsa-mir-30b | hsa-mir-27a | hsa-mir-146a |
| hsa-mir-125b-2 | hsa-mir-203a | hsa-mir-221 | hsa-mir-146b |
| hsa-let-7a-2 | hsa-mir-203b | hsa-mir-222 | hsa-mir-205 |
| hsa-let-7a-3 | hsa-mir-213 | hsa-mir-200c | |
| hsa-let-7c | hsa-mir-155 | hsa-mir-31 | |
Performance Metrics across different thresholds of miRNA Features (3, 5, 10).
| Classifier | Method | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|
| RF | | 0.996 | 1.000 | 0.952 | 0.999 |
| | IG-10 | 0.995 | 0.998 | 0.962 | 0.996 |
| | IG-5 | 0.996 | 0.997 | 0.977 | 0.998 |
| | IG-3 | 0.997 | 0.997 | 0.990 | 0.999 |
| | CHI2-10 | 0.995 | 0.999 | 0.952 | 0.995 |
| | CHI2-5 | 0.996 | 0.999 | 0.979 | 0.996 |
| | CHI2-3 | 0.996 | 0.997 | 0.981 | 0.999 |
| | LASS-10 | 0.996 | 0.998 | 0.971 | 0.997 |
| | LASS-5 | 0.995 | 0.997 | 0.965 | 0.998 |
| | LASS-3 | 0.994 | 0.997 | 0.962 | 0.999 |
| SVM-RBF | | 0.989 | 1.000 | 0.875 | 0.938 |
| | IG-10 | 0.994 | 0.998 | 0.952 | 0.995 |
| | IG-5 | 0.996 | 1.000 | 0.990 | 0.985 |
| | IG-3 | 0.998 | 0.998 | 0.990 | 0.980 |
| | CHI2-10 | 0.994 | 0.999 | 0.951 | 0.995 |
| | CHI2-5 | 0.996 | 0.998 | 0.983 | 0.993 |
| | CHI2-3 | 0.998 | 0.999 | 0.990 | 0.980 |
| | LASS-10 | 0.995 | 0.998 | 0.962 | 0.996 |
| | LASS-5 | 0.995 | 0.999 | 0.974 | 0.985 |
| | LASS-3 | 0.996 | 0.999 | 0.962 | 0.980 |
| SVM | | 0.997 | 0.999 | 0.971 | 0.985 |
| | IG-10 | 0.997 | 0.999 | 0.971 | 0.997 |
| | IG-5 | 0.997 | 0.999 | 0.985 | 0.989 |
| | IG-3 | 0.998 | 0.999 | 0.990 | 0.981 |
| | CHI2-10 | 0.997 | 0.999 | 0.971 | 0.997 |
| | CHI2-5 | 0.996 | 1.000 | 0.988 | 0.987 |
| | CHI2-3 | 0.998 | 0.999 | 0.990 | 0.991 |
| | LASS-10 | 0.994 | 0.997 | 0.962 | 0.996 |
| | LASS-5 | 0.995 | 0.999 | 0.956 | 0.993 |
| | LASS-3 | 0.997 | 1.000 | 0.962 | 0.981 |
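For readers less familiar with the metric columns, the following minimal sketch shows the standard definitions of accuracy, sensitivity and specificity computed from confusion-matrix counts. It assumes that tumor is the positive class; the counts are hypothetical and are not taken from the paper's data. AUC is omitted because it requires per-sample scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # tumor samples predicted as tumor (assumed positive class)
    fn: int  # tumor samples predicted as normal
    tn: int  # normal samples predicted as normal
    fp: int  # normal samples predicted as tumor


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all samples classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: fraction of tumor samples detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: fraction of normal samples recognised."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts, chosen only to illustrate the arithmetic.
counts = ConfusionCounts(tp=995, fn=5, tn=96, fp=4)

print(f"accuracy    = {accuracy(counts):.3f}")
print(f"sensitivity = {sensitivity(counts):.3f}")
print(f"specificity = {specificity(counts):.3f}")
```

With counts like these, where one class dominates, accuracy tracks sensitivity closely, so specificity is the column that separates otherwise similar rows.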
Top Ranked Features Under Different Feature Selection Techniques.
| Info Gain | CHI2 | Lasso |
|---|---|---|
| hsa-mir-10b | hsa-mir-10b | hsa-let-7a-3 |
| hsa-let-7c | hsa-let-7c | hsa-let-7c |
| hsa-mir-145 | hsa-mir-145 | hsa-let-7d |
| hsa-mir-125b-1 | hsa-mir-125b-2 | hsa-mir-101-1 |
| hsa-mir-125b-2 | hsa-mir-125b-1 | hsa-mir-10b |
| hsa-mir-335 | hsa-mir-335 | hsa-mir-125b-2 |
| hsa-mir-126 | hsa-mir-126 | hsa-mir-145 |
| hsa-mir-125a | hsa-mir-125a | hsa-mir-206 |
| hsa-let-7a-2 | hsa-let-7a-2 | hsa-mir-27b |
| hsa-let-7a-3 | hsa-let-7a-3 | hsa-mir-335 |
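As an illustration of how an Information Gain ranking like the first column could be produced, here is a minimal pure-Python sketch. It assumes expression values have already been discretized into "high" and "low" levels, which is an assumption of this example rather than a procedure stated in the source. The feature names `mirna_a`, `mirna_b`, `mirna_c` and all values are hypothetical toy data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 4 tumor and 4 normal samples.
labels = ["tumor"] * 4 + ["normal"] * 4
features = {
    # separates the classes perfectly
    "mirna_a": ["high", "high", "high", "high", "low", "low", "low", "low"],
    # partially informative
    "mirna_b": ["high", "high", "high", "low", "high", "low", "low", "low"],
    # uninformative: same class mix at both levels
    "mirna_c": ["high", "low", "high", "low", "high", "low", "high", "low"],
}

# Rank features from most to least informative.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)
print(ranking)
```

On real expression data the discretization step matters; library implementations typically estimate mutual information for continuous features directly instead of thresholding by hand.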
Subset Selection of Ranked miRNA.
| Subset 1 | Subset 2 | Subset 3 | Subset 4 | Subset 5 | Subset 6 | Subset 7 | Subset 8 |
|---|---|---|---|---|---|---|---|
| hsa-mir-10b | hsa-let-7c | hsa-mir-145 | hsa-mir-125b-1 | hsa-mir-125b-2 | hsa-mir-335 | hsa-mir-126 | hsa-mir-125a |
| hsa-let-7c | hsa-mir-145 | hsa-mir-125b-1 | hsa-mir-125b-2 | hsa-mir-335 | hsa-mir-126 | hsa-mir-125a | hsa-let-7a-2 |
| hsa-mir-145 | hsa-mir-125b-1 | hsa-mir-125b-2 | hsa-mir-335 | hsa-mir-126 | hsa-mir-125a | hsa-let-7a-2 | hsa-let-7a-3 |
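Reading the table column by column, the eight subsets appear to be consecutive windows of three miRNAs slid down the Info Gain ranking from the previous table. That is an observation from the two tables, not a procedure stated here. Under that assumption, the subsets can be regenerated with the short sketch below; the function name `sliding_subsets` is introduced for illustration only.

```python
# Info Gain ranking, copied from the "Top Ranked Features" table.
ig_ranking = [
    "hsa-mir-10b",
    "hsa-let-7c",
    "hsa-mir-145",
    "hsa-mir-125b-1",
    "hsa-mir-125b-2",
    "hsa-mir-335",
    "hsa-mir-126",
    "hsa-mir-125a",
    "hsa-let-7a-2",
    "hsa-let-7a-3",
]


def sliding_subsets(ranked, size=3):
    """Return every window of `size` consecutive items from a ranked list."""
    return [ranked[i:i + size] for i in range(len(ranked) - size + 1)]


subsets = sliding_subsets(ig_ranking)
for number, subset in enumerate(subsets, start=1):
    print(f"Subset {number}: {', '.join(subset)}")
```

Each successive subset drops the highest-ranked miRNA of the previous one and adds the next-ranked one, so classifier performance per subset can be compared against decreasing rank.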
Figure 2. Specificity Across Different Clinical miRNA Subsets.