Jacopo Troisi, Antonio Raffone, Antonio Travaglino, Gaetano Belli, Carmen Belli, Santosh Anand, Luigi Giugliano, Pierpaolo Cavallo, Giovanni Scala, Steven Symes, Sean Richards, David Adair, Alessio Fasano, Vincenzo Bottigliero, Maurizio Guida.
Abstract
Importance: Endometrial carcinoma (EC) is the most commonly diagnosed gynecologic cancer. Its early detection is advisable because 20% of women have advanced disease at the time of diagnosis.

Objective: To clinically validate a metabolomics-based classification algorithm as a screening test for EC.

Design, Setting, and Participants: This diagnostic study enrolled 2 cohorts. A multicenter prospective cohort, with 50 cases (postmenopausal women with EC; International Federation of Gynecology and Obstetrics stage I-III and grade G1-G3) and 70 controls (no EC but matched on age, years from menopause, tobacco use, and comorbidities), was used to train multiple classification models. The accuracy of each trained model was then used as a statistical weight to produce an ensemble machine learning algorithm for testing, which was validated with a subsequent prospective cohort of 1430 postmenopausal women. The study was conducted at the San Giovanni di Dio e Ruggi d'Aragona University Hospital of Salerno (Italy) and Lega Italiana per la Lotta contro i Tumori clinic in Avellino (Italy). Data collection was conducted from January 2018 to February 2019, and analysis was conducted from January to March 2019.

Main Outcomes and Measures: The presence or absence of EC based on evaluation of the blood metabolome. Metabolites were extracted from dried blood samples from all participants and analyzed by gas chromatography-mass spectrometry. A confusion matrix was used to summarize test results. Performance indices included sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, and accuracy. Confirmation or exclusion of EC in women with a positive test result was by means of hysteroscopy. Participants with negative results were followed up 1 year after enrollment to investigate the appearance of EC signs.
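The abstract states that each trained model's accuracy served as its statistical weight in the ensemble. One way such a scheme could look is sketched below. This is not the authors' implementation: the coding of votes as +1 (EC) or -1 (no EC), the normalization by total weight, the function name `ensemble_score`, and the example votes are all assumptions made for illustration. The three accuracies are the training-set values reported for those models in the performance table, expressed as fractions.

```python
from typing import Mapping


def ensemble_score(votes: Mapping[str, int], accuracy: Mapping[str, float]) -> float:
    """Accuracy-weighted average of model votes coded +1 (EC) or -1 (no EC).

    The result lies in [-1, 1]; values above 0 mean the weighted
    majority of models voted for EC.
    """
    total_weight = sum(accuracy[name] for name in votes)
    if total_weight == 0:
        raise ValueError("at least one model needs a nonzero weight")
    weighted_votes = sum(accuracy[name] * vote for name, vote in votes.items())
    return weighted_votes / total_weight


# Training-set accuracies from the performance table, as fractions.
accuracy = {"decision_tree": 0.967, "naive_bayes": 0.858, "random_forest": 0.958}
# Hypothetical votes for a single participant (assumption, not study data).
votes = {"decision_tree": +1, "naive_bayes": -1, "random_forest": +1}

score = ensemble_score(votes, accuracy)
predicted_ec = score > 0  # assumed decision rule: positive score means EC
```

With this coding, a score of 0 is the natural decision boundary, and a single dissenting model cannot overturn two agreeing models of similar accuracy.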
Year: 2020 PMID: 32986110 PMCID: PMC7522698 DOI: 10.1001/jamanetworkopen.2020.18327
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Enrollment Flow Chart
The first enrollment was the training set, and the second was the test set. EC indicates endometrial cancer.
Study Population Features
Values are participants, No. (%), unless otherwise indicated.

| Characteristic | Training set, with no EC (n = 70) | Training set, with EC (n = 50) | Test set (n = 1430) |
|---|---|---|---|
| Age, mean (SD), y | 68.2 (11.7) | 69.4 (13.8) | 59.7 (7.7) |
| Age at last period, mean (SD), y | 52.3 (3.9) | 50.1 (4.4) | 49.8 (4.4) |
| Time from menopause, mean (SD), y | 11.5 (7.4) | 11.9 (8.8) | 12.0 (8.2) |
| Blood pressure, mean (SD), mm Hg | |||
| Systolic | 128.5 (4.1) | 129.1 (6.5) | 120.5 (11.2) |
| Diastolic | 81.2 (5.3) | 82.9 (6.0) | 75.6 (7.8) |
| Hypertension | |||
| Yes | 27 (38.6) | 23 (45.0) | 582 (40.7) |
| No | 43 (61.4) | 28 (55.0) | 848 (59.3) |
| Heart rate, mean (SD), bpm | 78.5 (4.3) | 81.0 (6.7) | 70.4 (6.4) |
| Weight, mean (SD), kg | 73.2 (10.4) | 75.6 (11.8) | 69.1 (12.5) |
| Height, mean (SD), cm | 162.1 (4.8) | 160.7 (5.1) | 160.6 (5.7) |
| BMI | |||
| Mean (SD) | 27.6 (4.3) | 29.3 (4.9) | 26.8 (4.6) |
| Underweight, No. (%) | 1 (1.4) | 1 (2.0) | 6 (0.4) |
| Normal weight, No. (%) | 13 (18.6) | 9 (18.0) | 285 (19.9) |
| Overweight, No. (%) | 41 (58.6) | 29 (57.5) | 829 (58.0) |
| Obesity, No. (%) | 15 (20.8) | 12 (23.0) | 310 (21.7) |
| Endometrial thickness, mean (SD), mm | <4 | 22.5 (14.0) | <4 |
| Abdominal circumference, mean (SD), cm | 78.2 (16.5) | 82.3 (18.9) | 79.5 (24.5) |
| Tobacco use | |||
| Current | 15 (21.4) | 13 (26.0) | 349 (24.4) |
| Never | 40 (57.1) | 26 (52.0) | 831 (58.1) |
| Former | 15 (21.5) | 11 (22.0) | 250 (17.5) |
| ≥30 Packages/y among all participants | 4 (5.3) | 2 (4.0) | 83 (5.8) |
| ≥30 Packages/y among participants with current tobacco use, No./total No. (%) | 4/15 (26.7) | 2/13 (15.4) | 83/349 (23.8) |
| Cigarette packs per y, mean (SD) | 12.2 (7.6) | 15.7 (12.8) | 16.4 (14.1) |
| Metrorrhagia in last year | |||
| Yes | 2 (2.8) | 47 (94.0) | 51 (3.6) |
| No | 68 (97.2) | 3 (6.0) | 1379 (96.4) |
| Diabetes | |||
| Yes | 6 (8.6) | 5 (10.0) | 127 (8.9) |
| No | 64 (91.4) | 45 (90.0) | 1303 (91.1) |
| Hypertriglyceridemia | |||
| Yes | 4 (5.7) | 4 (8.0) | 76 (5.3) |
| No | 66 (94.3) | 46 (92.0) | 1354 (94.7) |
| Hyperuricemia | |||
| Yes | 1 (1.4) | 1 (2.0) | 13 (0.9) |
| No | 69 (98.6) | 49 (98.0) | 1417 (99.1) |
| Vasculopathies | |||
| Yes | 6 (8.6) | 5 (10.0) | 142 (9.9) |
| No | 64 (91.4) | 45 (90.0) | 1288 (90.1) |
| Cholecystectomies | |||
| Yes | 6 (8.6) | 6 (12.0) | 134 (9.4) |
| No | 64 (91.4) | 44 (88.0) | 1296 (90.6) |
| CABG | |||
| Yes | 0 (0.0) | 1 (2.0) | 11 (0.8) |
| No | 70 (100.0) | 49 (98.0) | 1419 (99.2) |
Abbreviations: BMI, body mass index (calculated as weight in kilograms divided by height in meters squared); CABG, coronary artery bypass graft surgery; EC, endometrial carcinoma.
P < .001.
P < .01.
P < .05.
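The BMI definition given in the abbreviations (weight in kilograms divided by height in meters squared) can be written as a small helper. The category boundaries below are the standard WHO adult cutoffs (18.5, 25, and 30), which this excerpt does not state, so treating them as the boundaries used in the table is an assumption. The function names are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Classify a BMI value with the WHO adult cutoffs (assumed, not from the source)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obesity"


# Mean weight and height of the training-set controls, used only as sample input.
example = bmi(73.2, 162.1)
example_category = bmi_category(example)
```

The BMI of the mean weight and mean height is not the same quantity as the mean BMI reported in the table, so a small difference between the two is expected.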
Figure 2. Partial Least Square–Discriminant Analysis (PLS-DA) Model
A, Separation between the training set data, indicating that the measured metabolomic profiles of patients with vs without endometrial carcinoma (EC) differ. B, The best performance (denoted with a) was achieved when using 5 principal components as the basis for classification. C, PLS-DA model validation by permutation test based on separation distance. Perm indicates permutation-based.
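Panel C refers to a permutation test based on separation distance. The general idea can be sketched as follows: compute a separation statistic for the true labels, recompute it many times after shuffling the labels, and report how often the shuffled statistic is at least as large. The caption does not define the statistic, so the Euclidean distance between group centroids used here is a stand-in assumption, and the 2-D scores are made-up values rather than study data.

```python
import random
from statistics import mean


def centroid_distance(points, labels):
    """Euclidean distance between the centroids of groups labeled 1 and 0."""
    dims = range(len(points[0]))

    def centroid(flag):
        return [mean(p[d] for p, l in zip(points, labels) if l == flag) for d in dims]

    a, b = centroid(1), centroid(0)
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def permutation_p_value(points, labels, n_perm=500, seed=0):
    """Fraction of label shufflings whose separation is >= the observed one."""
    rng = random.Random(seed)
    observed = centroid_distance(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffling keeps the group sizes unchanged
        if centroid_distance(points, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical component scores: six participants with EC, six without.
points = [(2.1, 1.8), (1.7, 2.4), (2.5, 2.0), (1.9, 1.6), (2.3, 2.2), (2.0, 1.9),
          (-2.2, -1.7), (-1.8, -2.3), (-2.4, -2.1), (-1.6, -1.9), (-2.0, -2.2), (-2.1, -1.8)]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

p_value = permutation_p_value(points, labels)
```

A small p-value indicates that the observed separation is unlikely to arise from an arbitrary assignment of labels, which is the property the validation in panel C is meant to check.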
Classification Models and EML Diagnostic Performance
| Enrollment | Classification model | Sensitivity, % (SE) | Specificity, % (SE) | PPV, % (SE) | NPV, % (SE) | Positive likelihood ratio | Negative likelihood ratio | Accuracy, % |
|---|---|---|---|---|---|---|---|---|
| Training set | Decision tree | 95.0 (3.4) | 97.5 (1.7) | 95.0 (3.4) | 97.5 (1.7) | 38.0 | 0.05 | 96.7 |
| Training set | Naive Bayes | 65.0 (7.5) | 96.3 (2.1) | 89.7 (5.7) | 84.6 (3.8) | 17.3 | 0.36 | 85.8 |
| Training set | Random forest | 87.5 (5.2) | 100.0 (0.0) | 100.0 (0.0) | 94.1 (2.6) | ND | 0.13 | 95.8 |
| Training set | k–Nearest neighbors | 100.0 (0.0) | 100.0 (0.0) | 100.0 (0.0) | 100.0 (0.0) | ND | 0.00 | 100.0 |
| Training set | Artificial neural network | 92.5 (4.2) | 100.0 (0.0) | 100.0 (0.0) | 96.4 (2.0) | ND | 0.08 | 97.5 |
| Training set | Linear discriminant analysis | 50.0 (7.9) | 100.0 (0.0) | 100.0 (0.0) | 80.0 (4.0) | ND | 0.50 | 83.3 |
| Training set | Support vector machine | 55.0 (7.9) | 100.0 (0.0) | 100.0 (0.0) | 81.6 (3.9) | ND | 0.45 | 85.0 |
| Training set | Linear regression | 100.0 (0.0) | 100.0 (0.0) | 100.0 (0.0) | 100.0 (0.0) | ND | 0.00 | 100.0 |
| Training set | Deep learning | 97.5 (2.5) | 98.8 (1.2) | 97.5 (2.5) | 98.8 (1.2) | 78.0 | 0.03 | 98.3 |
| Training set | Partial least squares–discriminant analysis | 92.5 (4.2) | 100.0 (0.0) | 100.0 (0.0) | 96.4 (2.0) | ND | 0.08 | 97.5 |
| Training set | EML | 100.0 (0.0) | 100.0 (0.0) | 100.0 (0.0) | 100.0 (0.0) | ND | 0.00 | 100.0 |
| Test set | EML | 100.0 (0.0) | 99.9 (1.0) | 88.9 (7.4) | 100.0 (0.0) | 707.0 | 0.0 | 99.9 |
Abbreviations: EML, ensemble machine learning; ND, not determinable; NPV, negative predictive value; PPV, positive predictive value.
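The performance indices in the table all derive from the four cells of a confusion matrix. The sketch below computes them and returns `None` where an index is not determinable (ND), for example the positive likelihood ratio when there are no false positives. The test-set counts used in the example (16 true positives, 2 false positives, 0 false negatives, 1412 true negatives) are not stated in this excerpt. They are inferred here as the counts that reproduce the reported test-set row for 1430 participants, and should be read as an assumption. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # EC present, test positive
    fp: int  # EC absent, test positive
    fn: int  # EC present, test negative
    tn: int  # EC absent, test negative


def ratio(num: Optional[float], den: Optional[float]) -> Optional[float]:
    """Divide, returning None when the result is not determinable."""
    if num is None or not den:
        return None
    return num / den


def diagnostic_indices(cm: ConfusionMatrix) -> dict:
    sensitivity = ratio(cm.tp, cm.tp + cm.fn)
    specificity = ratio(cm.tn, cm.tn + cm.fp)
    false_positive_rate = ratio(cm.fp, cm.fp + cm.tn)
    false_negative_rate = ratio(cm.fn, cm.fn + cm.tp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ratio(cm.tp, cm.tp + cm.fp),
        "npv": ratio(cm.tn, cm.tn + cm.fn),
        "lr_positive": ratio(sensitivity, false_positive_rate),
        "lr_negative": ratio(false_negative_rate, specificity),
        "accuracy": ratio(cm.tp + cm.tn, cm.tp + cm.fp + cm.fn + cm.tn),
    }


# Counts inferred from the reported test-set percentages (assumption).
test_set = diagnostic_indices(ConfusionMatrix(tp=16, fp=2, fn=0, tn=1412))
# A hypothetical classifier with no false positives: LR+ is not determinable.
no_false_positives = diagnostic_indices(ConfusionMatrix(tp=40, fp=0, fn=0, tn=80))
```

Under these assumed counts, the positive predictive value is 16/18 and the positive likelihood ratio is 1 divided by 2/1414, which is consistent with the 88.9% and 707.0 shown in the test-set row.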
Figure 3. Endometrial Cancer Ensemble Machine Learning (EC-EML) Score
A, Circles represent EC-EML scores. The orange line represents the cutoff value evaluated by Youden index optimization, while the blue line represents EC-EML score of 0, which was the projected cutoff. B, The dotted lines represent the sensitivity while the continuous lines represent the specificity. C, Receiver operating characteristic (ROC) curve of the EC-EML score.
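The caption mentions a cutoff chosen by Youden index optimization. The Youden index is J = sensitivity + specificity - 1, and the optimal cutoff is the score threshold that maximizes J. A minimal sketch follows, assuming a participant is called positive when the score is at or above the cutoff. The scores and labels are invented for illustration and are not EC-EML scores from the study.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= cutoff.
    Ties are resolved in favor of the lowest cutoff.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical scores; label 1 means EC, label 0 means no EC.
scores = [-0.9, -0.7, -0.4, -0.2, 0.1, 0.2, 0.3, 0.6, 0.8]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j_value = youden_cutoff(scores, labels)
```

In this toy data one control scores above the lowest-scoring case, so no cutoff separates the groups perfectly and the maximum J stays below 1.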