Rossana Castaldo¹, Nunzia Garbino¹, Carlo Cavaliere¹, Mariarosaria Incoronato¹, Luca Basso¹, Renato Cuocolo²,³,⁴, Leonardo Pace⁵, Marco Salvatore¹, Monica Franzese¹, Emanuele Nicolai¹.
Abstract
Radiomics is rapidly advancing in precision diagnostics and cancer treatment. However, there are several challenges that need to be addressed before translation to clinical use. This study presents an ad-hoc weighted statistical framework to explore radiomic biomarkers for a better characterization of the radiogenomic phenotypes in breast cancer. Thirty-six female patients with breast cancer were enrolled in this study. Radiomic features were extracted from MRI and PET imaging techniques for malignant and healthy lesions in each patient. To reduce within-subject bias, the ratio of radiomic features extracted from both lesions was calculated for each patient. Radiomic features were further normalized, comparing the z-score, quantile, and whitening normalization methods to reduce between-subjects bias. After feature reduction by Spearman's correlation, a methodological approach based on a principal component analysis (PCA) was applied. The results were compared and validated on twenty-seven patients to investigate the tumor grade, Ki-67 index, and molecular cancer subtypes using classification methods (LogitBoost, random forest, and linear discriminant analysis). The classification techniques achieved high area-under-the-curve values with one PC that was calculated by normalizing the radiomic features via the quantile method. This pilot study helped us to establish a robust framework of analysis to generate a combined radiomic signature, which may lead to more precise breast cancer prognosis.
Keywords: PCA; breast cancer; machine learning; molecular biomarkers; normalization; radiomic features
Year: 2022; PMID: 35204589; PMCID: PMC8871349; DOI: 10.3390/diagnostics12020499
Source DB: PubMed; Journal: Diagnostics (Basel); ISSN: 2075-4418
Figure 1. Analysis workflow.
Figure 2. Workflow for PC reduction. PCA: principal component analysis; QNT: quantile normalization method; WHT: whitening normalization method; PCs: principal components; Th: threshold; MD: median.
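The preprocessing chain described in the abstract (lesion ratio, normalization, Spearman-based feature reduction, PCA) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names, the 0.9 correlation cutoff, and the greedy filtering strategy are assumptions, and only the z-score normalization is shown out of the three methods compared.

```python
import numpy as np

def lesion_ratio(malignant, healthy):
    """Per-patient ratio of malignant to healthy feature values."""
    return np.asarray(malignant, dtype=float) / np.asarray(healthy, dtype=float)

def zscore(X):
    """Column-wise z-score normalization (sample standard deviation)."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def rank_columns(X):
    """Ordinal ranks per column; ties are not averaged (fine for continuous data)."""
    return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)

def spearman_filter(X, threshold=0.9):
    """Greedily keep a feature only if |Spearman rho| with every kept feature
    is at most `threshold` (the cutoff value is an assumption)."""
    rho = np.corrcoef(rank_columns(X), rowvar=False)
    keep = []
    for j in range(X.shape[1]):
        if all(abs(rho[j, k]) <= threshold for k in keep):
            keep.append(j)
    return keep

def pca_scores(X):
    """PCA via SVD of the centered matrix: returns PC scores and
    the fraction of variance explained by each component."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U * s, s**2 / np.sum(s**2)

# Synthetic stand-in for 27 patients and 5 radiomic features (not study data).
rng = np.random.default_rng(0)
healthy = rng.uniform(1.0, 2.0, size=(27, 5))
malignant = rng.uniform(1.0, 4.0, size=(27, 5))
ratios = lesion_ratio(malignant, healthy)

# Add a redundant feature: a monotone transform of feature 0 (Spearman rho = 1).
ratios = np.column_stack([ratios, ratios[:, 0] ** 3])

keep = spearman_filter(ratios, threshold=0.9)
scores, explained = pca_scores(zscore(ratios[:, keep]))
print("kept features:", keep)
print("explained variance ratios:", explained.round(3))
```

The redundant sixth column is dropped by the filter because its ranks are identical to those of the first column, which is the kind of redundancy a Spearman-based reduction is meant to remove before PCA.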
Patient characteristics.
| Variable [N] | Missing patients (n) | Median | Range (min–max) | Mean | SD |
|---|---|---|---|---|---|
| Age (years) [N = 27] | 0 | 57 | 35–82 | 55.259 | 13.75 |
| Circulating miR-125b-5p [N = 22] | 5 | 0.017 | 0.006–0.102 | 0.026 | 0.024 |
| Circulating miR-143-3p [N = 22] | 5 | 0.009 | 0.002–0.061 | 0.018 | 0.018 |
| Circulating miR-145-5p [N = 22] | 5 | 0.006 | 0.002–0.045 | 0.012 | 0.012 |
| Circulating miR-100-5p [N = 19] | 8 | 0.010 | 0.004–0.051 | 0.017 | 0.014 |
| Circulating miR-23a-3p [N = 19] | 8 | 0.155 | 0.039–0.438 | 0.19 | 0.13 |
| Estrogen receptor status (%) [N = 23] | 4 | 90 | 0.5–99 | 75.87 | 32.289 |
| Progesterone receptor status (%) [N = 24] | 3 | 55 | 0.5–99 | 52.979 | 38.606 |
| HER2 status (%) [N = 10] | 17 | 90 | 60–99 | 84.2 | 15.747 |
| Ki-67 (%) [N = 24] | 3 | 40 | 5–80 | 41.25 | 26.996 |

| Categorical variable [N] | Missing patients (n) | Count | Percentage (%) |
|---|---|---|---|
| Molecular subtype classification ER/PR/HER [N = 24] | 3 | | |
| +/−/+ | | 1 | 4.17 |
| +/+/− | | 13 | 54.17 |
| +/+/+ | | 10 | 41.67 |
| Grading [N = 19] | 8 | | |
| G2 | | 11 | 57.89 |
| G3 | | 8 | 42.11 |

Figure 3. Cumulative variance plots for the 4 datasets, based on a threshold of 0.6. (a) Normalized only as the ratio of malignant and healthy radiomic features; (b) z-score; (c) quantile; and (d) whitening.
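A cumulative variance plot of this kind is typically read by finding the first component at which the running sum of explained variance crosses the threshold. The sketch below assumes the 0.6 threshold refers to cumulative explained variance; the function name and the variance ratios are illustrative and are not values from the study.

```python
from itertools import accumulate

def components_to_reach(explained_ratio, threshold=0.6):
    """Return the smallest number of leading PCs whose cumulative
    explained-variance ratio is at least `threshold`."""
    for k, total in enumerate(accumulate(explained_ratio), start=1):
        if total >= threshold:
            return k
    # Threshold never reached (e.g. rounding): fall back to all components.
    return len(explained_ratio)

# Illustrative ratios only; cumulative sums are 0.35, 0.55, 0.70, ...
illustrative = [0.35, 0.20, 0.15, 0.12, 0.10, 0.08]
n_keep = components_to_reach(illustrative, threshold=0.6)
print(n_keep)  # 3
```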
Figure 4. Box plots for tumor grade (i.e., G2 and G3). (a) Circulating miR-125b-5p; (b) PC3 from the quantile dataset.
Figure 5. Performance measures to classify tumor grade. (a) Algorithm performance via PC3 to classify tumor grade. (b) ROC curves of the three classifiers. AUC: area under the curve; RF: random forest; LDA: linear discriminant analysis.
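The AUC reported for each classifier equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; the function name, the scores, and the coding of G3 as 1 and G2 as 0 are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison:
    the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for six patients (1 = G3, 0 = G2).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))  # 0.889 (8 of 9 pairs ranked correctly)
```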
Figure 6. Box plots for high and low values of Ki-67 (i.e., Class 1 and 2). (a) Age; (b) Progesterone status; (c) PC3 from the quantile dataset.
Figure 7. Performance measures to classify high and low values of Ki-67. (a) Algorithm performance via PC3 to classify high and low values of Ki-67. (b) ROC curves of the three classifiers. AUC: area under the curve; RF: random forest; LDA: linear discriminant analysis.
Figure 8. Box plots for the tumor subtypes (i.e., Luminal A and Luminal B). (a) Ki-67; (b) PC6 from the non-normalized dataset; (c) PC6 from the z-score dataset; (d) PC3 from the quantile dataset; (e) PC4 from the quantile dataset.
Figure 9. Performance measures to classify the tumor subtypes. (a) Algorithm performance via PC3 to classify the tumor subtypes. (b) ROC curves of the three classifiers. AUC: area under the curve; RF: random forest; LDA: linear discriminant analysis.