Asadi F, Salehnasab C, Ajori L.
Abstract
BACKGROUND: Among genital cancers, cervical cancer is the most prevalent and the leading cause of death in women in third-world countries. Its risk is influenced by several factors, including smoking, poor nutritional status, immunodeficiency, and long-term use of contraceptives.
Keywords: Cervical Cancer; Decision Trees; Machine Learning; Neural networks; Prediction; Support Vector Machine
Year: 2020 PMID: 32802799 PMCID: PMC7416093 DOI: 10.31661/jbpe.v0i0.1912-1027
Source DB: PubMed Journal: J Biomed Phys Eng ISSN: 2251-7200
Figure 1. Mechanism of pre-processing the data and developing machine learning classification algorithms
Important variables identified from the literature review.
| Row | Variable | Type | Role |
|---|---|---|---|
| 1 | Age | Continuous | Input |
| 2 | Marital status | Nominal | Input |
| 3 | Education level | Nominal | Input |
| 4 | Social status | Nominal | Input |
| 5 | Economic status | Nominal | Input |
| 6 | Personal health level | Nominal | Input |
| 7 | Family history of cervical cancer | Nominal | Input |
| 8 | The dose of contraceptives used | Continuous | Input |
| 9 | Age at the first childbirth | Continuous | Input |
| 10 | Number of childbirths by caesarean | Nominal | Input |
| 11 | Number of pregnancies | Continuous | Input |
| 12 | Period of smoking consumption | Continuous | Input |
| 13 | Period of alcohol consumption | Continuous | Input |
| 14 | Immunodeficiency | Nominal | Input |
| 15 | HPV | Nominal | Input |
| 16 | | Nominal | Input |
| 17 | Number of sex partners | Nominal | Input |
| 18 | Marriage Age | Continuous | Input |
| 19 | | Nominal | Input |
| 20 | Chlamydia | Nominal | Input |
| 21 | Number of sexually-transmitted diseases | Nominal | Input |
| 22 | History of chronic diseases | Nominal | Input |
| 23 | Given/Not Given cervical cancer | Flag | Target |
Excluded in the pre-processing stage
Predictors excluded in the first stage
| Predictor | SVM | C&R Tree | QUEST | RBF | MLP | Occurrence |
|---|---|---|---|---|---|---|
| Number of sexually-transmitted diseases | √ | 1 | ||||
| Number of sex partners | √ | 1 | ||||
| Marriage Age | √ | 1 | ||||
| HPV | √ | 1 | ||||
| History of chronic diseases | √ | 1 | ||||
| Economic status | √ | 1 | ||||
| Family history of cervical cancer | | | | | | 0 |
| Duration of smoking | | | | | | 0 |
| Chlamydia | | | | | | 0 |
Evaluation of the algorithms in the second modelling stage, ordered by accuracy on the test data
| Row | ML algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|
| 1 | QUEST Tree | 95.55 | 90.48 | 100.00 | 95.20 |
| 2 | C&R Tree | 95.55 | 90.48 | 100.00 | 95.20 |
| 3 | RBF-ANNs | 95.45 | 90.00 | 100.00 | 91.50 |
| 4 | SVM | 93.33 | 90.48 | 95.83 | 95.80 |
| 5 | MLP-ANNs | 90.90 | 90.00 | 91.67 | 91.50 |
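Accuracy, sensitivity, and specificity in this table all derive from a binary confusion matrix. The sketch below is a minimal illustration, not the authors' code. The counts are an assumption: the source does not report confusion matrices, and the values used here are simply one set of whole-number counts (45 test cases, 21 positive and 24 negative) that reproduce the QUEST/C&R Tree and SVM rows to within rounding. The class name `ConfusionMatrix` is likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive = cervical cancer)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives

    def sensitivity(self) -> float:
        # Share of actual positives that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were correctly rejected.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Share of all cases classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# ASSUMED counts, inferred from the reported percentages only.
quest = ConfusionMatrix(tp=19, fn=2, tn=24, fp=0)
svm = ConfusionMatrix(tp=19, fn=2, tn=23, fp=1)

for name, cm in (("QUEST / C&R Tree", quest), ("SVM", svm)):
    print(
        f"{name}: accuracy={100 * cm.accuracy():.2f}% "
        f"sensitivity={100 * cm.sensitivity():.2f}% "
        f"specificity={100 * cm.specificity():.2f}%"
    )
```

With these assumed counts, a single false positive is the entire difference between the SVM row and the two tree rows, which shows how sensitive the reported percentages are on a test set of this size.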
Figure 2. The ROC curve for the classification algorithms
Significant predictors in the second stage of modelling
| Predictor | SVM | C&R Tree | QUEST | RBF | MLP | Occurrence |
|---|---|---|---|---|---|---|
| Personal health level | √ | √ | √ | √ | √ | 5 |
| Marital status | √ | √ | √ | √ | √ | 5 |
| Social status | √ | √ | √ | √ | √ | 5 |
| Dose of contraceptives used | √ | √ | √ | √ | √ | 5 |
| Education level | √ | √ | √ | √ | √ | 5 |
| Number of childbirths by caesarean | √ | √ | √ | √ | √ | 5 |
| Age | √ | √ | √ | 3 | ||
| Age at the first childbirth | √ | √ | √ | 3 | ||
| Number of pregnancies | √ | √ | √ | 3 | ||
| Immunodeficiency | √ | √ | √ | 3 | ||
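The Occurrence column is a vote count: for each predictor, the number of algorithms that marked it as significant. A minimal sketch of that tally follows. The function names `count_occurrences` and `keep_predictors`, the dictionary layout, and the `min_votes=2` threshold are illustrative assumptions (the tables show predictors with 0 or 1 occurrences excluded and those with 3 or 5 retained, but no explicit cut-off is stated). The example uses only rows whose votes are unambiguous in the tables: predictors marked by all five algorithms and predictors marked by none.

```python
from collections import Counter

ALGORITHMS = ("SVM", "C&R Tree", "QUEST", "RBF", "MLP")


def count_occurrences(selected_by):
    """Return how many algorithms selected each predictor."""
    votes = Counter()
    for predictors in selected_by.values():
        # set() guards against a predictor listed twice by one algorithm.
        votes.update(set(predictors))
    return votes


def keep_predictors(votes, candidates, min_votes):
    """Keep candidates selected by at least min_votes algorithms."""
    return [p for p in candidates if votes[p] >= min_votes]


# Rows with Occurrence = 5 in the second-stage table.
all_five = ["Personal health level", "Marital status", "Social status"]
# Rows with Occurrence = 0 in the first-stage table.
no_votes = ["Family history of cervical cancer", "Chlamydia"]

selected_by = {algorithm: list(all_five) for algorithm in ALGORITHMS}
votes = count_occurrences(selected_by)

# min_votes=2 is an ASSUMED threshold, not one stated in the source.
kept = keep_predictors(votes, all_five + no_votes, min_votes=2)
print(kept)
```

A `Counter` returns 0 for predictors that no algorithm selected, so unselected candidates need no special handling.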
Figure 3. Importance of the significant predictors in the algorithms