Camilla Nero, Francesca Ciccarone, Luca Boldrini, Jacopo Lenkowicz, Ida Paris, Ettore Domenico Capoluongo, Antonia Carla Testa, Anna Fagotti, Vincenzo Valentini, Giovanni Scambia.
Abstract
Radiogenomics is a specific application of radiomics in which imaging features are linked to genomic profiles. We aim to develop a radiogenomics model based on ovarian ultrasound (US) images for predicting germline BRCA1/2 (gBRCA1/2) gene status in women with healthy ovaries. From January 2013 to December 2017, a total of 255 patients referred for germline BRCA1/2 testing, with a pelvic US documenting normal ovaries, were retrospectively included. Feature selection for univariate analysis was carried out via correlation analysis. Multivariable analysis for classification of germline BRCA1/2 status was then carried out via logistic regression, support vector machine, ensemble of decision trees and automated machine learning pipelines. Data were split into a training (75%) and a testing (25%) set. The four strategies obtained similar accuracy on the testing set (from 0.54 for logistic regression to 0.64 for the automated machine learning pipeline). Data from one of the tested US machines showed generally higher performance, particularly with the automated machine learning pipeline (testing set specificity 0.87, negative predictive value 0.73 and accuracy 0.72; training set accuracy 0.79). The study shows that a radiogenomics model based on machine learning techniques is feasible and potentially useful for predicting gBRCA1/2 status in women with healthy ovaries.
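The 75%/25% split described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the split was randomized or whether it was stratified, so the plain shuffle, the fixed seed, the function name `split_train_test` and the integer patient identifiers below are all assumptions made for illustration only.

```python
import random

def split_train_test(items, test_fraction=0.25, seed=0):
    """Shuffle items and split them into (train, test) lists."""
    rng = random.Random(seed)          # fixed seed is an assumption, for reproducibility
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical identifiers standing in for the 255 included patients.
patient_ids = list(range(255))
train, test = split_train_test(patient_ids)
print(len(train), len(test))           # 191 64
```

With 255 patients, a 25% test fraction rounds to 64 test and 191 training cases, which agrees with the group sizes reported in the clinical characteristics table.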
Year: 2020 PMID: 33020566 PMCID: PMC7536234 DOI: 10.1038/s41598-020-73505-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. A ROI mask applied to the image used to extract the pixel values (three orthogonal diameters of left ovary).
Figure 2. Flow chart of the study.
Clinical characteristics of the testing and training datasets.
| Characteristic | Overall | Testing dataset: g-BRCA 1–2 p | Testing dataset: g-BRCA WT | Training dataset: g-BRCA 1–2 p | Training dataset: g-BRCA WT | p-value |
|---|---|---|---|---|---|---|
| Number (%) | 255 (100) | 25 (39) | 39 (61) | 73 (33) | 118 (67) | 1.000 |
| Mean age (year) | 45.5 | 45 | 46 | 47 | 44 | 0.878 |
| Age range (year) | 23–79 | 23–73 (whole testing dataset) | | 26–79 (whole training dataset) | | |
| Fertile, no E/P | 92 (36) | 8 (32) | 7 (18) | 30 (41.1) | 47 (39.8) | 0.454 |
| Iatrogenic menopause | 97 (38) | 9 (36) | 23 (59) | 18 (24.6) | 47 (39.8) | 1.000 |
| Menopause | 61 (24) | 7 (28) | 7 (18) | 24 (32.9) | 22 (18.7) | 0.941 |
| HT in menopause | 1 (0.4) | – | 1 (2.5) | – | – | – |
| Unknown | 4 (1.6) | 1 (4) | 1 (2.5) | 1 (1.4) | 2 (1.7) | – |
| Family history of cancer | 45 (17.5) | 6 (24) | 2 (5.1) | 24 (32.9) | 12 (10.2) | 0.460 |
| Personal history of cancer | 204 (80) | 19 (76) | 35 (89.7) | 49 (67.1) | 102 (86.4) | 0.843 |
| Unknown | 6 (2.5) | – | 2 (5.2) | – | 4 (3.4) | – |
| Yes | 15 (6) | 1 (4) | 3 (7.7) | 7 (9.6) | 4 (3.4) | 0.458 |
| No | 240 (94) | 24 (96) | 36 (92.3) | 66 (90.4) | 114 (96.6) | 0.758 |
| GE Healthcare Voluson E10 | 101 (40) | 7 (28) | 19 (48.7) | 29 (39.7) | 47 (40) | 0.425 |
| Canon Toshiba Aplio-i900 | 53 (21) | 4 (16) | 9 (23.1) | 17 (23.3) | 23 (20) | 0.670 |
| Samsung Elite | 62 (24) | 10 (40) | 5 (12.8) | 20 (27.4) | 27 (23) | 0.183 |
| Esaote-My Lab™ Twice | 37 (15) | 4 (16) | 6 (15.4) | 7 (9.6) | 20 (17) | 0.669 |
| 1 | 5 (1.9) | 0 (0.0) | 1 (2.5) | 4 (5.4) | 0 (0.0) | 0.401 |
| 2 | 42 (16.5) | 4 (16.0) | 6 (15.3) | 14 (19.2) | 18 (15.2) | 1.000 |
| 3 | 36 (14.1) | 3 (12.0) | 6 (15.3) | 6 (8.2) | 21 (17.8) | 0.824 |
| 4 | 172 (67.4) | 18 (72.0) | 26 (66.7) | 49 (67.1) | 79 (66.9) | 0.897 |
| Right | 22.1 | 21.3 | 18.3 | 21.9 | 23.6 | 0.771 |
| Left | 20.4 | 20.7 | 18.7 | 19.2 | 21.6 | 0.794 |
| Right | 4878.0 | 3545.7 | 2973.3 | 7138.3 | 4536.2 | 0.340 |
| Left | 4228.8 | 3787.6 | 3045.4 | 3715.9 | 4998.9 | 0.788 |
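Number of features selected at each step for each dataset. For the Wilcoxon–Mann–Whitney test, the count is also given as a percentage of the number of features selected with the Pearson correlation test.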
| Dataset | Number of features selected with Pearson correlation test | Number of features selected with Wilcoxon–Mann–Whitney test |
|---|---|---|
| All | 61 | 6 (9.8%) |
| Voluson | 45 | 16 (35%) |
| Toshiba | 42 | 3 (7.1%) |
| Samsung | 44 | 3 (6.8%) |
| Esaote | 40 | 2 (5.0%) |
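A correlation-based selection step of the kind summarized in the table above can be sketched as follows. This is only an illustration of one common approach (greedily discarding a feature when its absolute Pearson correlation with an already kept feature is too high). The threshold of 0.9, the greedy order, the function names and the toy feature values are assumptions, not details taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0                     # constant feature: treat as uncorrelated
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every kept feature is below threshold.

    features: dict mapping feature name -> list of values (one per patient).
    The threshold value is an assumption made for this sketch.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy values, not study data; "f_b" is almost a multiple of "f_a".
features = {
    "f_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "f_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_correlated(features))       # ['f_a', 'f_c']
```

The greedy pass depends on the order of the features, so a different ordering may retain a different representative from each correlated group.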
Radiomics features whose association test with the binary outcome was statistically significant, with the corresponding p-value, for the different datasets.
| US type | Feature name | p-value |
|---|---|---|
| All | F_cm.info.corr.2 | 2.00 e−04 |
| All | F_cm.joint.var | 2.55 e−02 |
| All | L_major | 2.72 e−02 |
| All | F_rlm.gl.var | 3.02 e−02 |
| All | F_cm.sum.entr | 3.03 e−02 |
| All | FD_0.100 | 3.40 e−02 |
| Voluson | F_cm.info.corr.2 | 3.24 e−06 |
| Voluson | F_cm.inv.diff | 7.93 e−06 |
| Voluson | F_cm.sum.avg | 9.36 e−06 |
| Voluson | F_cm.inv.diff.norm | 4.13 e−05 |
| Voluson | F_rlm.sre | 4.42 e−04 |
| Voluson | F_cm.energy | 5.38 e−04 |
| Voluson | F_rlm.lrlrlm | 6.70 e−04 |
| Voluson | F_rlm.lgre | 1.05 e−03 |
| Voluson | F_rlm.srlge | 1.62 e−03 |
| Voluson | F_cm.joint.max | 6.80 e−03 |
| Voluson | F_rlm.lrhge | 6.85 e−03 |
| Voluson | F_szm.sze | 7.42 e−03 |
| Voluson | F_szm.zsnu.norm | 9.19 e−03 |
| Voluson | F_cm.corr | 1.36 e−02 |
| Voluson | F_cm.clust.prom | 2.38 e−02 |
| Voluson | FD_0.100 | 2.60 e−02 |
| Toshiba | F_stat.range | 2.81 e−03 |
| Toshiba | F_rlm.rl.entr | 3.79 e−02 |
| Toshiba | F_szm.lze | 4.38 e−02 |
| Samsung | F_morph.surface | 7.20 e−03 |
| Samsung | F_szm.z.entr | 1.46 e−02 |
| Samsung | F_rlm.glnu | 4.54 e−02 |
| Esaote | F_cm.2.5Dmerged.corr | 1.80 e−02 |
| Esaote | F_cm.clust.prom | 4.58 e−02 |
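To show how a p-value of the kind listed above could be obtained for a single feature, here is a minimal sketch of a two-sided Wilcoxon–Mann–Whitney test comparing feature values between two outcome groups. It uses the normal approximation without tie or continuity correction, which is a simplification chosen for brevity. The function names and the toy values are hypothetical and are not taken from the study.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1    # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Two-sided test; returns (U statistic of x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction (simplification)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided normal tail
    return u1, p

# Toy feature values for two hypothetical outcome groups (not study data).
group_a = [1.2, 2.3, 2.9, 3.1, 3.8]
group_b = [4.0, 4.4, 5.1, 5.6, 6.3]
u, p = mann_whitney_u(group_a, group_b)
print(u, round(p, 4))
```

For small samples or many ties, an exact or tie-corrected implementation from an established statistics library would be preferable to this approximation.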
Figure 3. Performance of the different machine learning models for the different datasets. Metrics from left to right: accuracy on training set, accuracy on testing set, specificity on testing set, sensitivity on testing set, negative predictive value on testing set, positive predictive value on testing set.
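The metrics named in the caption above follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions. Treating carriers of a gBRCA1/2 pathogenic variant as the positive class is an assumption, and the counts are hypothetical rather than results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Zero denominators are not handled in this sketch.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts for a small testing set (not taken from the study).
metrics = classification_metrics(tp=6, fp=2, tn=14, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because predictive values depend on the proportion of positive cases in the evaluated set, they are not directly comparable across datasets with different class balance.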