Sofiane Bendifallah, Anne Puchar, Stéphane Suisse, Léa Delbos, Mathieu Poilblanc, Philippe Descamps, Francois Golfier, Cyril Touboul, Yohann Dabi, Emile Daraï.
Abstract
Endometriosis, a systemic and chronic condition occurring in women of childbearing age, is a highly enigmatic disease with unresolved questions. While multiple biomarkers, genomic analysis, questionnaires, and imaging techniques have been advocated as screening and triage tests for endometriosis to replace diagnostic laparoscopy, none have been implemented routinely in clinical practice. We investigated the use of machine learning algorithms (MLA) in the diagnosis and screening of endometriosis based on 16 key clinical and patient-based symptom features. In the training set, the sensitivity, specificity, F1-score, and AUC of the MLA to diagnose endometriosis ranged from 0.82 to 1, 0 to 0.8, 0 to 0.88, and 0.5 to 0.89, respectively. In the validation set, sensitivity, specificity, and F1-score ranged from 0.91 to 0.95, 0.66 to 0.92, and 0.77 to 0.92, respectively; the validation AUCs listed in the model comparison table range from 0.78 to 0.93. Our data suggest that MLA could be a promising screening test for general practitioners, gynecologists, and other front-line health care providers. Introducing MLA in this setting represents a paradigm change in clinical practice as it could replace diagnostic laparoscopy. Furthermore, this patient-based screening tool empowers patients with endometriosis to self-identify potential symptoms and initiate dialogue with physicians about diagnosis and treatment, and hence contribute to shared decision making.
Year: 2022 PMID: 35022502 PMCID: PMC8755739 DOI: 10.1038/s41598-021-04637-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flow chart of the population for model development and validation.
Demographic characteristics of the training dataset for patients with and without endometriosis.
| Characteristic | Patients with endometriosis | Patients without endometriosis | P value |
|---|---|---|---|
| Age (mean ± SD) | 29 ± 8 | 28 ± 9 | < 0.001 |
| BMI (body mass index) (mean ± SD) | 23.41 ± 4.88 | 23.10 ± 4.56 | 0.12 |
| Mother/daughter history of endometriosis | |||
| Yes | 21 (1.9%) | 4 (0.7%) | |
| No | 1105 (98.1%) | 604 (99.3%) | 0.056 |
| Dysmenorrhea/VAS of Dysmenorrhea (mean ± SD) | 6 ± 3.4 | 5 ± 3.2 | < 0.001 |
| Maximum length of periods (mean ± SD) | 6 ± 4 | 5 ± 3 | < 0.001 |
| Abdominal pain outside menstruation | |||
| Yes | 721 (64.1%) | 179 (29.4%) | < 0.001 |
| No | 405 (35.9%) | 429 (70.6%) | |
| Pain suggesting sciatica | |||
| Yes | 427 (37.9%) | 61 (10.1%) | |
| No | 699 (62.1%) | 547 (89.9%) | < 0.001 |
| Pain on sexual intercourse (mean ± SD) | 3.8 ± 3.5 | 2.3 ± 3.0 | < 0.001 |
| Lower back pain outside menstruation | |||
| Yes | 693 (61.5%) | 200 (32.9%) | |
| No | 433 (38.5%) | 408 (67.1%) | < 0.001 |
| Painful defecation (mean ± SD) | 3.2 ± 3.3 | 1.5 ± 2.4 | < 0.001 |
| Alternating diarrhea/constipation during menstruation | |||
| Yes | 718 (63.7%) | 234 (38.5%) | |
| No | 408 (36.3%) | 374 (61.5%) | < 0.001 |
| Urinary pain during menstruation (mean ± SD) | 1.4 ± 2.5 | 0.5 ± 1.4 | < 0.001 |
| Blood in the stools during menstruation | |||
| Yes | 179 (15.9%) | 45 (7.4%) | < 0.001 |
| No | 947 (84.1%) | 563 (92.6%) | |
| Blood in urine during menstruation | |||
| Yes | 150 (13.3%) | 61 (10.1%) | |
| No | 976 (86.7%) | 547 (89.9%) | 0.046 |
| Absenteeism duration in the last 6 months (mean ± SD) | 7 ± 22 | 3 ± 12 | < 0.001 |
| Number of non-hormonal pain treatments used (mean ± SD) | 1 ± 1 | 0 ± 1 | < 0.001 |
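
The source does not say which statistical test produced the P values for the yes/no rows. As a minimal sketch, assuming an uncorrected Pearson chi-square test on each 2x2 table (with an optional Yates correction for comparison), the counts above can be checked as follows. The function name `chi2_2x2` is illustrative and not taken from the paper.

```python
import math

def chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    Assumption: the paper does not state its test; this is one common choice.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction, floored at zero.
        diff = max(0.0, diff - n / 2)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * diff ** 2 / denom
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts copied from the table. Rows: Yes / No. Columns: with / without endometriosis.
tables = {
    "Abdominal pain outside menstruation": (721, 179, 405, 429),
    "Blood in urine during menstruation": (150, 61, 976, 547),
}

for name, cells in tables.items():
    stat, p_value = chi2_2x2(*cells)
    print(f"{name}: chi2 = {stat:.2f}, p = {p_value:.3g}")
```

Comparing the printed values with the table's P column shows which rows are compatible with this test. A row that disagrees may simply reflect a different test (for example Fisher's exact test) rather than an error.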
Demographic characteristics of the training and validation datasets.
| Characteristic | Training set | Validation set | P value |
|---|---|---|---|
| Age (mean ± SD) | 29 ± 8 | 31 ± 5 | < 0.001 |
| BMI (body mass index) (mean ± SD) | 23.41 ± 4.88 | 24.3 ± 4.82 | < 0.001 |
| Mother/daughter history of endometriosis | |||
| Yes | 21 (1.9%) | 8 (8%) | |
| No | 1105 (98.1%) | 92 (92%) | 0.001 |
| Dysmenorrhea/VAS of dysmenorrhea (mean ± SD) | 6 ± 3.4 | 7.3 ± 3 | < 0.001 |
| Maximum length of periods (mean ± SD) | 6 ± 4 | 8 ± 4 | < 0.001 |
| Abdominal pain outside menstruation | |||
| Yes | 721 (64.1%) | 67 (67%) | |
| No | 405 (35.9%) | 33 (33%) | 0.5527 |
| Pain suggesting sciatica | |||
| Yes | 427 (37.9%) | 53 (53%) | 0.003 |
| No | 699 (62.1%) | 47 (47%) | |
| Pain on sexual intercourse (mean ± SD) | 3.8 ± 3.5 | 5.1 ± 3.5 | < 0.001 |
| Lower back pain outside menstruation | |||
| Yes | 693 (61.5%) | 79 (79%) | 0.00053 |
| No | 433 (38.5%) | 21 (21%) | |
| Painful defecation (mean ± SD) | 3.2 ± 3.3 | 4.2 ± 3.3 | < 0.001 |
| Alternating diarrhea/constipation during menstruation | |||
| Yes | 718 (63.7%) | 80 (80%) | |
| No | 408 (36.3%) | 20 (20%) | 0.0010 |
| Urinary pain during menstruation (mean ± SD) | 1.4 ± 2.5 | 1.9 ± 2.9 | < 0.001 |
| Blood in the stools during menstruation | |||
| Yes | 179 (15.9%) | 20 (20%) | 0.2862 |
| No | 947 (84.1%) | 80 (80%) | |
| Blood in urine during menstruation | |||
| Yes | 150 (13.3%) | 17 (17%) | 0.3040 |
| No | 976 (86.7%) | 83 (83%) | |
| Absenteeism duration in the last 6 months (mean ± SD) | 7 ± 22 | 23 ± 31 | < 0.001 |
| Number of non-hormonal pain treatments used (mean ± SD) | 1 ± 1 | 2 ± 2 | < 0.001 |
A summary of the 16 dataset features considered in the training approach.
| Feature |
|---|
| Mother/daughter history of endometriosis |
| History of surgery for endometriosis |
| Age |
| BMI (body mass index) |
| Dysmenorrhea/VAS of dysmenorrhea |
| Abdominal pain outside menstruation |
| Pain suggesting sciatica |
| Pain during sexual intercourse |
| Lower back pain outside menstruation |
| Painful defecation |
| Urinary pain during menstruation |
| Right shoulder pain near or during menstruation |
| Blood in the stools during menstruation |
| Blood in urine during menstruation |
| Absenteeism duration in the last 6 months |
| Number of non-hormonal pain treatments used |
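
To make the shape of the model input concrete, the 16 features can be held in a single record and flattened into a numeric vector. This is only a sketch: the field names are hypothetical, and the types (yes/no versus numeric score) are inferred from how each feature is reported in the demographic tables, or assumed where a feature does not appear there.

```python
from dataclasses import dataclass, astuple

@dataclass
class SymptomRecord:
    """One questionnaire response; field order follows the feature list.

    All field names are hypothetical. Types are inferred or assumed,
    not taken from the paper's code.
    """
    family_history: bool                  # mother/daughter history
    prior_endometriosis_surgery: bool     # assumed yes/no
    age: float
    bmi: float
    dysmenorrhea_vas: float
    abdominal_pain_outside_menses: bool
    sciatica_like_pain: bool
    dyspareunia_score: float
    lower_back_pain_outside_menses: bool
    painful_defecation_score: float
    urinary_pain_score: float
    right_shoulder_pain: bool             # assumed yes/no
    blood_in_stools: bool
    blood_in_urine: bool
    absenteeism_last_6_months: float
    non_hormonal_treatments: int

    def to_vector(self):
        """Flatten to 16 floats (True -> 1.0, False -> 0.0)."""
        return [float(value) for value in astuple(self)]

# Entirely made-up example values, for illustration only.
example = SymptomRecord(
    True, False, 30, 22.5, 7, True, False, 4,
    True, 3, 1, False, False, False, 5, 2,
)
print(example.to_vector())
```

Keeping the field order fixed in one place avoids silently shuffling columns between training and prediction.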
Comparison between classification metrics of the different models in the training and validation sets.
| Model | Training sensitivity | Training specificity | Training F1-score | Training AUC | Validation sensitivity | Validation specificity | Validation F1-score | Validation AUC |
|---|---|---|---|---|---|---|---|---|
| Random forest (RF) | 0.98 | 0.8 | 0.88 | 0.89 | 0.92 | 0.92 | 0.92 | 0.92 |
| Logistic regression (LR) | 1 | 0 | 0 | 0.5 | 0.95 | 0.81 | 0.87 | 0.88 |
| Decision tree (DT) | 0.82 | 0.8 | 0.81 | 0.82 | 0.91 | 0.66 | 0.77 | 0.78 |
| eXtreme gradient boosting (XGB) | 0.98 | 0.8 | 0.88 | 0.89 | 0.93 | 0.92 | 0.92 | 0.93 |
| Voter classifier soft | 0.98 | 0.6 | 0.74 | 0.75 | 0.93 | 0.88 | 0.9 | 0.90 |
| Voter classifier hard | 0.95 | 0.8 | 0.87 | 0.88 | 0.91 | 0.92 | 0.91 | 0.92 |
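
For reference, the usual confusion-matrix definitions of these metrics are sketched below, with a small consistency check on the tabulated values. The source does not state how its F1-score was computed. The check only shows that each reported F1 value lies within rounding distance of the harmonic mean of the reported sensitivity and specificity, which differs from the standard precision/recall definition. Treat this as an observation about the numbers, not as the authors' method.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

def f1_score(tp, fp, fn):
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)

def harmonic_mean(a, b):
    """Harmonic mean of two rates, defined as 0 when both are 0."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)

# (sensitivity, specificity, reported F1) copied from the table.
reported = {
    "RF train": (0.98, 0.80, 0.88), "RF valid": (0.92, 0.92, 0.92),
    "LR train": (1.00, 0.00, 0.00), "LR valid": (0.95, 0.81, 0.87),
    "DT train": (0.82, 0.80, 0.81), "DT valid": (0.91, 0.66, 0.77),
    "XGB train": (0.98, 0.80, 0.88), "XGB valid": (0.93, 0.92, 0.92),
    "Soft vote train": (0.98, 0.60, 0.74), "Soft vote valid": (0.93, 0.88, 0.90),
    "Hard vote train": (0.95, 0.80, 0.87), "Hard vote valid": (0.91, 0.92, 0.91),
}

for name, (sens, spec, f1) in reported.items():
    hm = harmonic_mean(sens, spec)
    print(f"{name}: reported F1 = {f1:.2f}, harmonic mean = {hm:.3f}")

# Hypothetical counts: a classifier that labels every patient as positive.
tp, fn, fp, tn = 10, 0, 5, 0
print(sensitivity(tp, fn), specificity(tn, fp), f1_score(tp, fp, fn))
```

With the hypothetical all-positive counts, sensitivity is 1 and specificity is 0, yet the standard F1 is above 0, so the logistic regression training row cannot be reproduced by the standard formula under that scenario.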
Figure 2. Correlation matrix of the 16 features for the training set.
Figure 3. Correlation matrix of the 16 features for the validation set.
Figure 4. ROC curve analysis of the different models in the training set.
Figure 5. ROC curve analysis of the different models in the validation set.
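
The area under a ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes AUC that way in plain Python. The labels and scores are invented for illustration, and the function name `roc_auc` is not from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 (1 = endometriosis); scores: model outputs.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Invented example: three positives, three negatives.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(f"AUC = {roc_auc(labels, scores):.3f}")

# A model that gives every patient the same score carries no ranking information.
print(roc_auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable at the dataset sizes reported here; a rank-based formulation would be the usual choice for larger data.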