Mingxin Tao, Tianci Song, Wei Du, Siyu Han, Chunman Zuo, Ying Li, Yan Wang, Zekun Yang.
Abstract
Exploring the intrinsic differences among breast cancer subtypes is important because these differences are closely related to clinical diagnosis and the design of treatment plans. As biological and medical datasets accumulate, many types of omics data have become available, each capturing a different aspect of the disease, and they can be downloaded from a number of public databases. Combining multiple omics data types can improve prediction accuracy. In this article, we use estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status to define breast cancer subtypes, and we classify each pair of subtypes using the SMO-MKL algorithm. We collected mRNA, methylation, and copy number variation (CNV) data from TCGA for this task. Multiple Kernel Learning (MKL) is employed to treat each omics data type separately. Using three omics data types with multiple kernels gives better results than using a single omics data type with multiple kernels. Furthermore, the significant genes and pathways discovered during feature selection are analyzed. In experiments, the proposed method outperforms other state-of-the-art methods and offers rich biological interpretation.
Keywords: CNV; MKL; breast cancer subtypes; mRNA; methylation data
Year: 2019 PMID: 30866472 PMCID: PMC6471546 DOI: 10.3390/genes10030200
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
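The abstract describes using one set of kernels per omics data type and combining them with MKL. The core idea of such a combination can be sketched as a non-negative weighted sum of Gram matrices. This is only an illustration under stated assumptions: the toy feature values, the RBF kernel choice, the `gamma` values, and the fixed weights are all hypothetical, and the helper names `rbf_kernel` and `combine_kernels` are not from the paper. In SMO-MKL the weights are learned together with the classifier rather than fixed by hand.

```python
import math


def rbf_kernel(X, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def combine_kernels(kernels, weights):
    """Weighted sum of Gram matrices: K = sum_m d_m * K_m with d_m >= 0."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    n = len(kernels[0])
    return [
        [sum(w * K_m[i][j] for K_m, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): three samples, one feature matrix per omics type.
mrna = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.8]]
methylation = [[0.5], [0.6], [0.1]]
cnv = [[1.0, 0.0, -1.0], [1.0, 0.0, 0.0], [-1.0, 1.0, 0.0]]

kernels = [
    rbf_kernel(mrna, gamma=1.0),
    rbf_kernel(methylation, gamma=1.0),
    rbf_kernel(cnv, gamma=0.5),
]

# Hypothetical fixed weights; an MKL solver would learn these from the data.
weights = [0.5, 0.3, 0.2]
K = combine_kernels(kernels, weights)
```

The combined matrix `K` stays symmetric and can be passed to any kernel classifier that accepts a precomputed Gram matrix.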
The definition of distinct breast cancer subtypes. TNBC: Triple-negative breast cancer; HER2: human epidermal growth factor receptor 2.
| Breast Cancer Subtypes | Definition |
|---|---|
| Luminal A | ER/PR+, Her2− |
| Luminal B | ER/PR+, Her2+ |
| TNBC | ER/PR−, Her2− |
| HER2 (+) | ER/PR−, Her2+ |
| Unclear | Other samples |
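The definitions in the table above can be read as a small decision rule. The sketch below is one possible encoding, assuming that the combined ER/PR status and the HER2 status are each available as a single boolean and that a missing status (`None`) places a sample in the "Unclear" group. The function name `assign_subtype` and the `None` convention are assumptions made for illustration, not details given in the source.

```python
def assign_subtype(er_pr_positive, her2_positive):
    """Map receptor status to a subtype label from the definition table.

    Each argument is True, False, or None (status unknown; an assumption
    used here to model the 'Other samples' row).
    """
    if er_pr_positive is None or her2_positive is None:
        return "Unclear"
    if er_pr_positive:
        return "Luminal B" if her2_positive else "Luminal A"
    return "HER2 (+)" if her2_positive else "TNBC"


# Example: label a few hypothetical samples.
samples = [(True, False), (True, True), (False, False), (False, True), (None, True)]
labels = [assign_subtype(er_pr, her2) for er_pr, her2 in samples]
```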
Figure 1. The process of breast cancer subtype prediction. TCGA: The Cancer Genome Atlas; CNV: copy number variation.
The numbers of distinct breast cancer subtypes.
| Breast Cancer Subtypes | Patients |
|---|---|
| Luminal A | 277 |
| Luminal B | 40 |
| TNBC | 70 |
| HER2 (+) | 11 |
| Unclear | 208 |
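Because the study classifies any two subtypes against each other, the five groups above give ten binary tasks of very different sizes. A minimal sketch that enumerates these tasks from the patient counts in the table follows; the variable names are illustrative, and the minority-class share is computed here only to make the class imbalance visible, not because the source reports it.

```python
from itertools import combinations

# Patient counts copied from the table above.
counts = {
    "Luminal A": 277,
    "Luminal B": 40,
    "TNBC": 70,
    "HER2 (+)": 11,
    "Unclear": 208,
}

# Every unordered pair of subtypes is one binary classification task.
pairs = list(combinations(counts, 2))

for first, second in pairs:
    total = counts[first] + counts[second]
    minority_share = min(counts[first], counts[second]) / total
    print(f"{first} vs. {second}: {total} samples, minority share {minority_share:.3f}")
```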
The accuracies of any two breast cancer subtypes with three kernels. Bold numbers are the best performance of this binary classification.
| Breast Cancer Subtypes | mRNA | Methylation | CNV | MKL |
|---|---|---|---|---|
| Luminal A vs. Luminal B | 0.436 | 0.436 | 0.490 | |
| Luminal A vs. HER2 (+) | 0.739 | 0.566 | 0.739 | |
| Luminal A vs. TNBC | | 0.867 | 0.604 | 0.859 |
| Luminal A vs. Unclear | 0.760 | | 0.473 | 0.831 |
| Luminal B vs. HER2 (+) | 0.732 | 0.776 | 0.485 | |
| Luminal B vs. TNBC | 0.871 | | 0.855 | 0.873 |
| Luminal B vs. Unclear | 0.696 | 0.748 | | 0.747 |
| HER2 (+) vs. TNBC | 0.5 | 0.5 | 0.5 | |
| HER2 (+) vs. Unclear | 0.495 | 0.498 | 0.5 | |
| TNBC vs. Unclear | 0.806 | 0.836 | 0.717 | |
| Mean | 0.690 | 0.696 | 0.613 | |
The AUC of any two breast cancer subtypes with three kernels. Bold numbers are the best performance of this binary classification.
| Breast Cancer Subtypes | mRNA | Methylation | CNV | MKL |
|---|---|---|---|---|
| Luminal A vs. Luminal B | 0.835 | 0.632 | 0.810 | |
| Luminal A vs. HER2 (+) | 0.973 | 0.903 | 0.979 | |
| Luminal A vs. TNBC | | 0.926 | 0.909 | 0.930 |
| Luminal A vs. Unclear | 0.824 | 0.878 | 0.589 | |
| Luminal B vs. HER2 (+) | 0.843 | 0.824 | 0.725 | |
| Luminal B vs. TNBC | | 0.932 | 0.941 | 0.945 |
| Luminal B vs. Unclear | 0.875 | 0.808 | 0.835 | |
| HER2 (+) vs. TNBC | 0.867 | 0.778 | 0.741 | |
| HER2 (+) vs. Unclear | 0.925 | 0.873 | 0.859 | |
| TNBC vs. Unclear | 0.902 | 0.918 | 0.834 | |
| Mean | 0.893 | 0.847 | 0.822 | |
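As a quick consistency check on the AUC table, the Mean row can be recomputed from the ten pairwise rows. The sketch below does this for the Methylation and CNV columns only, because those are the two columns whose ten values are all present in this copy of the table. The helper name `column_mean` is illustrative.

```python
# Methylation and CNV AUC values, one per pairwise task, in table order.
methylation_auc = [0.632, 0.903, 0.926, 0.878, 0.824, 0.932, 0.808, 0.778, 0.873, 0.918]
cnv_auc = [0.810, 0.979, 0.909, 0.589, 0.725, 0.941, 0.835, 0.741, 0.859, 0.834]


def column_mean(values):
    """Arithmetic mean rounded to three decimals, as in the table."""
    return round(sum(values) / len(values), 3)


methylation_mean = column_mean(methylation_auc)
cnv_mean = column_mean(cnv_auc)
print(methylation_mean, cnv_mean)
```

Both recomputed means agree with the Mean row of the table (0.847 and 0.822).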
Figure 2. The gene expression of ESR1.
Figure 3. The accuracy of multi-classification in breast cancer subtypes.
Figure 4. The accuracy of multi-classification in TNBC subtypes.
Figure 5. The accuracy of Random Forest, Neural Network, and SMO-MKL on any two breast cancer subtypes.
Figure 6. The AUC of Random Forest, Neural Network, and SMO-MKL on any two breast cancer subtypes.
Figure 7. The heatmap of breast cancer subtypes in mRNA data.
Figure 8. The heatmap of breast cancer subtypes in methylation data.
Figure 9. The heatmap of breast cancer subtypes in CNV data.