Taobo Hu, Guiyang Zhao, Yiqiang Liu, Mengping Long.
Abstract
Triple-negative breast cancer is a heterogeneous disease with different molecular and histological subtypes. The androgen receptor is expressed in a portion of triple-negative breast cancer cases, and activation of the androgen receptor pathway is thought to be both a molecular subtyping signature and a therapeutic target for triple-negative breast cancer. Thus, identification of the androgen receptor pathway status is important for both molecular characterization and clinical management. In this study, we investigated the expression of the androgen receptor pathway in the metaplastic breast cancer and luminal androgen receptor subtypes of triple-negative breast cancer and found that the pathway was downregulated in metaplastic breast cancer compared with the luminal androgen receptor subtype. Using random forest, we found that the two subtypes can be molecularly classified by the expression of androgen receptor pathway genes.
Keywords: AR; LAR; TNBC; metaplastic breast cancer; random forest
Year: 2021 PMID: 34159849 PMCID: PMC8226237 DOI: 10.1177/15330338211027900
Source DB: PubMed Journal: Technol Cancer Res Treat ISSN: 1533-0338
Clinical Features of Selected Patients.
| Characteristic | Category | LAR, n (%) | MBC, n (%) | P value |
|---|---|---|---|---|
| Age | <50 | 8 (21.1) | 2 (14.3) | 0.879 |
| | ≥50 | 30 (78.9) | 12 (85.7) | |
| Ethnicity | Hispanic or Latino | 2 (5.3) | 1 (7.1) | 0.546 |
| | Not Hispanic or Latino | 33 (86.8) | 13 (92.9) | |
| | Not reported | 3 (7.9) | 0 (0.0) | |
| Tumor stage | Stage I | 7 (18.4) | 2 (21.4) | 0.942 |
| | Stage IIa | 15 (39.5) | 5 (35.7) | |
| | Stage IIb | 6 (15.8) | 3 (21.4) | |
| | Stage III | 1 (2.6) | 0 (0.0) | |
| | Stage IIIa | 4 (10.5) | 2 (14.3) | |
| | Stage IIIb | 1 (2.6) | 1 (7.1) | |
| | Stage IIIc | 3 (7.9) | 0 (0.0) | |
| | Not reported | 1 (2.6) | 0 (0.0) | |
| Tumor | T1 | 5 (13.2) | 2 (14.3) | 0.200 |
| | T2 | 17 (44.7) | 2 (14.3) | |
| | T3 | 1 (2.6) | 0 (0.0) | |
| | Tx | 1 (2.6) | 0 (0.0) | |
| | Not reported | 14 (36.8) | 10 (71.4) | |
| Lymph node | N0 | 9 (23.7) | 4 (28.6) | 0.082 |
| | N1 | 9 (23.7) | 0 (0.0) | |
| | N2 | 3 (7.9) | 0 (0.0) | |
| | N3 | 3 (7.9) | 0 (0.0) | |
| | Not reported | 14 (36.8) | 10 (71.4) | |
Differentially Expressed AR Pathway Genes Between MBC and LAR Cancers.a
| Name | Ensembl ID | log FC | Ave expr | t | P value | B |
|---|---|---|---|---|---|---|
| RUNX2 | ENSG00000124813 | 1.69 | 15.63 | 5.84 | 3.51E-07 | 6.48 |
| SPDEF | ENSG00000124664 | −4.41 | 18.47 | −5.06 | 5.67E-06 | 3.86 |
| FOXA1 | ENSG00000129514 | −3.81 | 17.56 | −4.35 | 6.44E-05 | 1.59 |
| DDC | ENSG00000132437 | −5.73 | 11.54 | −4.30 | 7.47E-05 | 1.45 |
| AR | ENSG00000169083 | −2.79 | 16.30 | −3.87 | 0.000308289 | 0.14 |
| FKBP4 | ENSG00000004478 | −0.81 | 20.09 | −3.61 | 0.00069105 | −0.60 |
| SLC25A4 | ENSG00000151729 | −0.70 | 16.84 | −3.50 | 0.000974849 | −0.92 |
| ETV5 | ENSG00000244405 | 1.25 | 16.41 | 3.26 | 0.001946062 | −1.55 |
| FLNA | ENSG00000196924 | 0.84 | 21.03 | 3.20 | 0.002348338 | −1.72 |
| SMAD3 | ENSG00000166949 | 0.86 | 16.82 | 3.15 | 0.002696079 | −1.85 |
| SIRT1 | ENSG00000096717 | −0.70 | 17.01 | −3.00 | 0.00413892 | −2.23 |
| RCHY1 | ENSG00000163743 | −0.49 | 16.04 | −2.99 | 0.004228911 | −2.25 |
| TGIF1 | ENSG00000177426 | −0.51 | 17.64 | −2.96 | 0.004681079 | −2.34 |
| TGFB1I1 | ENSG00000140682 | 0.96 | 16.89 | 2.84 | 0.006513312 | −2.64 |
| NCOR2 | ENSG00000196498 | 0.48 | 18.05 | 2.81 | 0.006993992 | −2.70 |
| NCOA4 | ENSG00000266412 | −0.47 | 19.99 | −2.80 | 0.007114918 | −2.72 |
| HSP90AA1 | ENSG00000080824 | −0.61 | 22.56 | −2.76 | 0.007991213 | −2.82 |
| SVIL | ENSG00000197321 | 0.69 | 17.68 | 2.72 | 0.008962078 | −2.92 |
| SF1 | ENSG00000168066 | 0.23 | 19.67 | 2.69 | 0.009689098 | −2.99 |
| PRDX1 | ENSG00000117450 | −0.56 | 22.56 | −2.62 | 0.011373878 | −3.13 |
| HDAC1 | ENSG00000116478 | 0.39 | 19.47 | 2.54 | 0.014042583 | −3.32 |
| GPER1 | ENSG00000164850 | 1.05 | 13.97 | 2.50 | 0.015470474 | −3.40 |
| GTF2H2 | ENSG00000145736 | 0.88 | 12.40 | 2.46 | 0.017417516 | −3.51 |
| CASP8 | ENSG00000064012 | −0.56 | 16.41 | −2.39 | 0.020523257 | −3.65 |
| CDC25B | ENSG00000101224 | 0.63 | 18.83 | 2.27 | 0.027069795 | −3.89 |
| KAT5 | ENSG00000172977 | −0.27 | 17.29 | −2.13 | 0.038277859 | −4.18 |
| AHR | ENSG00000106546 | 0.67 | 18.12 | 2.11 | 0.039491891 | −4.21 |
| CDK1 | ENSG00000170312 | −0.75 | 18.05 | −2.10 | 0.040386094 | −4.23 |
| CAV1 | ENSG00000105974 | 0.75 | 18.77 | 2.07 | 0.043886593 | −4.30 |
| NR0B2 | ENSG00000131910 | −3.18 | 5.85 | −2.03 | 0.047145316 | −4.36 |
| GTF2F2 | ENSG00000188342 | 0.34 | 17.31 | 2.02 | 0.048112999 | −4.37 |
| FHL2 | ENSG00000115641 | 0.75 | 17.56 | 2.02 | 0.048381632 | −4.38 |
a The columns of the table are the gene name, the Ensembl gene ID, the estimated contrast (log fold change), the mean expression over both groups, the contrast t-value, the contrast P-value, and the estimated log-odds (B) that the gene is differentially expressed.
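The log FC and B columns are easier to read on their natural scales. The following is a minimal sketch, assuming that log FC is a base-2 logarithm and that B is a natural-log odds; the source states neither explicitly, and the function names are illustrative. The three rows are copied from the table above.

```python
import math


def fold_change(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear-scale ratio (assumes base 2)."""
    return 2.0 ** log2_fc


def prob_from_log_odds(b: float) -> float:
    """Convert a natural-log odds value to a probability (assumed scale of B)."""
    return 1.0 / (1.0 + math.exp(-b))


# (log FC, B) pairs copied from the table for three of the listed genes.
rows = {"RUNX2": (1.69, 6.48), "SPDEF": (-4.41, 3.86), "AR": (-2.79, 0.14)}

for gene, (log_fc, b) in rows.items():
    print(
        f"{gene}: fold change {fold_change(log_fc):.2f}, "
        f"probability of differential expression {prob_from_log_odds(b):.3f}"
    )
```

Under these assumptions, a B near 0 (as for AR) corresponds to roughly even odds, while the sign of log FC gives the direction of the contrast. Because RUNX2 has a positive log FC and is reported as upregulated in MBC, positive values appear to mean higher expression in MBC.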
Figure 1. AR pathway genes were differentially expressed in MBC and LAR. AR was highly expressed in the LAR group, while its expression in MBC was low (left panel). The membrane-bound estrogen receptor GPER1 showed higher expression in MBC than in LAR (middle panel). RUNX2, the gene with the most significant expression difference, was upregulated in MBC and downregulated in LAR (right panel).
Figure 2. Classifying MBC and LAR using the random forest algorithm. Clustering of MBC samples (blue) and LAR samples (red) using 167 AR pathway genes (A). The genes that contributed most to the classification are listed according to 2 different parameters (B and C).
Classification Accuracy of the Random Forest Model.
| Actual classification | Predicted MBC | Predicted LAR |
|---|---|---|
| MBC | 38 | 0 |
| LAR | 0 | 14 |
| Prediction accuracy | 100% | |
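The reported accuracy follows directly from the counts in the table. A minimal sketch of that arithmetic is given here; the helper names are illustrative, and the source does not say whether the counts come from out-of-bag or training-set predictions.

```python
# Rows are actual classes, columns are predicted classes (counts from the table).
confusion = {
    "MBC": {"MBC": 38, "LAR": 0},
    "LAR": {"MBC": 0, "LAR": 14},
}


def accuracy(matrix: dict) -> float:
    """Fraction of all samples that lie on the diagonal of the matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


def recall(matrix: dict, label: str) -> float:
    """Fraction of samples of one actual class that were predicted correctly."""
    return matrix[label][label] / sum(matrix[label].values())


print(f"accuracy = {accuracy(confusion):.1%}")
for label in confusion:
    print(f"recall for {label} = {recall(confusion, label):.1%}")
```

Per-class recall is worth reporting next to overall accuracy because the two groups differ in size, so a classifier could reach a high overall accuracy while misclassifying most of the smaller group.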
Figure 3. Visualization of 2 representative trees generated by random forest, those with the maximum and minimum numbers of nodes. The tree with the most nodes used the SPDEF expression value as the root node and the expression of 9 other genes as internal nodes, for a total of 21 nodes. Each root and internal node made a binary split determined by the expression value of the gene at that node, and the cutoff value for each split was calculated automatically (A). The tree with the fewest nodes used RUNX2 expression as its single split at the root, generating 2 leaf nodes (B).
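The node counts in the caption can be checked with the usual property of binary trees: if every split node has exactly two children, a tree with k split nodes has k + 1 leaves. The sketch below assumes that each of the 9 other genes appears once as an internal node. The cutoff in the single-split example is a hypothetical value for illustration, not one reported in the source.

```python
def total_nodes(split_nodes: int) -> int:
    """Total nodes in a binary tree whose split nodes each have two children."""
    leaves = split_nodes + 1
    return split_nodes + leaves


def stump_predict(expression: float, cutoff: float,
                  high_label: str, low_label: str) -> str:
    """Single-split tree: one root node deciding between two leaf labels."""
    return high_label if expression > cutoff else low_label


largest = total_nodes(1 + 9)  # SPDEF root plus 9 other genes as internal nodes
smallest = total_nodes(1)     # RUNX2 root only
print(largest, smallest)

# Hypothetical cutoff, chosen only to illustrate the split on RUNX2 expression.
HYPOTHETICAL_RUNX2_CUTOFF = 15.0
print(stump_predict(16.2, HYPOTHETICAL_RUNX2_CUTOFF, "MBC", "LAR"))
```

The stump assigns the high-expression side to MBC because RUNX2 is reported as upregulated in MBC relative to LAR.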
Figure 4. Cross-validation of the random forest algorithm for classification of MBC and LAR. Five-fold cross-validation was repeated 100 times with the number of variables ranging from 1 to 166. The mean and standard deviation of the cross-validation results were plotted.
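The repeated cross-validation procedure in the caption can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the authors' pipeline: the data are synthetic (two groups of 38 and 14 samples, matching the group sizes in the tables), a one-feature nearest-centroid rule stands in for the random forest, and the number of variables is not varied. All function names are hypothetical.

```python
import random
import statistics


def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds, spreading each class across folds."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def centroid_error(x, labels, train, test):
    """Test error of a nearest-centroid rule (stand-in for random forest)."""
    means = {}
    for y in set(labels[i] for i in train):
        means[y] = statistics.mean(x[i] for i in train if labels[i] == y)
    wrong = 0
    for i in test:
        pred = min(means, key=lambda y: abs(x[i] - means[y]))
        wrong += pred != labels[i]
    return wrong / len(test)


def repeated_cv(x, labels, k=5, repeats=100, seed=0):
    """Mean and standard deviation of the k-fold error over many repeats."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        folds = stratified_folds(labels, k, rng)
        fold_errors = []
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            fold_errors.append(centroid_error(x, labels, train, test))
        errors.append(statistics.mean(fold_errors))
    return statistics.mean(errors), statistics.stdev(errors)


# Synthetic expression values for one gene; not data from the study.
data_rng = random.Random(1)
labels = ["A"] * 38 + ["B"] * 14
x = [data_rng.gauss(14.0, 1.0) if y == "A" else data_rng.gauss(17.0, 1.0)
     for y in labels]

mean_err, sd_err = repeated_cv(x, labels)
print(f"mean CV error = {mean_err:.3f}, SD = {sd_err:.3f}")
```

Repeating the 5-fold split many times and reporting the mean with its standard deviation reduces the dependence of the estimate on any single random partition, which matters with only 52 samples.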