| Literature DB >> 34268095 |
Elham Nikookar1, Ebrahim Naderi2, Ali Rahnavard3.
Abstract
BACKGROUND: Cervical cancer is a significant cause of cancer mortality in women, particularly in low-income countries. In routine cervical screening methods, such as colposcopy, an image is taken of a patient's cervix. This image can be used by computer-aided diagnosis (CAD) systems, trained with artificial intelligence algorithms, to predict the likelihood of cervical cancer. Artificial intelligence models have been highlighted in a number of cervical cancer studies. However, only a limited number of studies investigate the simultaneous use of the three colposcopic screening modalities: Greenlight, Hinselmann, and Schiller.
Keywords: Aggregation strategy; artificial intelligence; cervical cancer; ensemble classifier; machine learning
Year: 2021 PMID: 34268095 PMCID: PMC8253312 DOI: 10.4103/jmss.JMSS_16_20
Source DB: PubMed Journal: J Med Signals Sens ISSN: 2228-7477
Review of machine-learning and image-processing applications in cervical cancer domain
| Reference | Research focus | Method used | One modality | Three modalities |
|---|---|---|---|---|
| [ | Determination of a patient’s cervical type | Development and implementation of a CNN-based algorithm for distinguishing between three cervical types | √ | |
| [ | Cell detection in cervix images for the automation of cell-based experiments in cervical cancer domain | Identifying a set of candidate cell-like regions in cervix image and evaluating each candidate region using a statistical model of the cell appearance. Dynamic programming is used to pick a set of nonoverlapping regions that match the model | √ | |
| [ | Classifying cervical cells as normal or cancer | Provide a large ensemble of segmentations which separate normal and cancer cases by using votes of different segments | √ | |
| [ | Improving the predictive performance of artificial intelligence-based system for screening of cervical cancer | Creating a hybrid ensemble which is, in fact, an ensemble of ensemble classifiers | √ | |
| [ | Predict cross-modality individual risk and cross-expert subjective quality assessment of colposcopic images for different modalities | Transfer knowledge gained from one modality to another | √ | |
| [ | Investigating the performance of CNN features for cervical disease classification | Applying different classifiers to their data to find optimal parameters of each classifier | √ |
CNN – Convolutional neural network; √ – indicates whether a one-modality or three-modality dataset is used
Figure 1Greenlight, Hinselmann, and Schiller colposcopy modalities
Figure 2The general flowchart of the proposed model
Parameters in Aggregation function
| Symbol | Description |
|---|---|
| | Subjective judgment from expert physician j for a cervical image from modality i |
| | Consensus for a cervical image from modality i |
| | Total number of modalities in this study |
| | Number of subjective judgments for each modality |
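The exact aggregation functions (Fmvs, Fmvc, Fps, Fpc, Fopc) are not spelled out in this excerpt, but the parameters above describe combining N subjective expert judgments per modality into one consensus per modality. As a hypothetical illustration only (a majority-vote sketch, not the paper's definitions), such an aggregation might look like:

```python
# Hypothetical majority-vote aggregation sketch; the paper's actual
# F_* functions are not reproduced here, so this is an assumption.

def majority_vote_consensus(judgments):
    """Aggregate binary subjective judgments (0 = normal, 1 = suspicious)
    from several expert physicians into a single consensus label."""
    votes = sum(judgments)
    # Consensus is positive only on a strict majority of the N judgments.
    return 1 if votes * 2 > len(judgments) else 0

def aggregate_modalities(per_modality_judgments):
    """per_modality_judgments maps each of the M modalities to its list of
    N expert judgments; returns one consensus label per modality."""
    return {modality: majority_vote_consensus(judgments)
            for modality, judgments in per_modality_judgments.items()}
```

For example, `aggregate_modalities({"Hinselmann": [1, 1, 0]})` yields a positive consensus for that modality, while a tie (no strict majority) resolves to negative in this sketch.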
Figure 3A general flowchart of applying the algorithm on single colposcopy modality datasets
Results of applying seven single classifiers on DSmvs dataset
| Method | Sensitivity (%) | Specificity (%) | F-score (%) | ROC area | Mean±STD |
|---|---|---|---|---|---|
| Naive Bayes | 72 | 43 | 66 | 0.43 | 0.8±0.16 |
| AdaBoost | 68 | 32 | 56 | 0.32 | 0.8±0.22 |
| Random forest | 67 | 35 | 56 | 0.32 | 0.8±0.10 |
| SVM | 68 | 32 | 56 | 0.32 | 0.8±0.08 |
| Decision tree | 63 | 29 | 53 | 0.49 | 0.8±0.28 |
| Logit boost | 63 | 46 | 53 | 0.29 | 0.8±0.14 |
Random tree is the best performing classifier on the dataset acquired by applying Fmvs aggregation function. STD – Standard deviation; SVM – Support vector machine; ROC – Receiver operating characteristic
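The sensitivity and specificity columns reported throughout these tables are standard confusion-matrix rates. As a small self-contained sketch (assuming binary labels with 1 = cancer-positive), they can be computed as:

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true-positive rate) and specificity
    (true-negative rate) from binary ground-truth and predicted labels,
    matching the metrics reported in the classifier tables."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

For instance, `sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 1])` gives 0.5 for both rates: one of two positives and one of two negatives are classified correctly.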
Results of applying seven single classifiers on DSopc dataset
| Method | Sensitivity (%) | Specificity (%) | F-score (%) | ROC area | Mean±STD |
|---|---|---|---|---|---|
| Naive Bayes | 68 | 52 | 69 | 0.69 | 0.96±0.28 |
| AdaBoost | 58 | 52 | 58 | 0.57 | 0.96±0.15 |
| Random forest | 68 | 32 | 56 | 0.32 | 0.96±0.22 |
| SVM | 68 | 24 | 60 | 0.46 | 0.96±0.14 |
| Decision tree | 59 | 21 | 54 | 0.43 | 0.96±0.23 |
| Logit boost | 63 | 29 | 53 | 0.29 | 0.96±0.18 |
Random tree is the best performing classifier on the dataset acquired by applying Fopc aggregation function. STD – Standard deviation; SVM – Support vector machine; ROC – Receiver operating characteristic
Results of applying different classifier selection methods on each merged dataset (corresponding to noted aggregation function)
| Method | Metric | | | | | |
|---|---|---|---|---|---|---|
| Best classifier | Sensitivity (%) | 79 | 77 | 74 | 79 | 77 |
| | Specificity (%) | 64 | 52 | 52 | 82 | 56 |
| | F-score (%) | 75 | 73 | 73 | 79 | 72 |
| | ROC area | 0.68 | 0.68 | 0.63 | 0.78 | 0.77 |
| All classifiers | Sensitivity (%) | 80 | 73 | 88 | 78 | |
| | Specificity (%) | 76 | 63 | 82 | 56 | |
| | F-score (%) | 75 | 62 | 80 | 72 | |
| | ROC area | 0.78 | 0.65 | 0.83 | 0.77 | |
| PCA on classifiers | Sensitivity (%) | 83 | 83 | 79 | | |
| | Specificity (%) | 75 | 67 | 63 | | |
| | F-score (%) | 70 | 76 | 66 | | |
| | ROC area | 0.80 | 0.70 | 0.74 | | |
| Forward selection | Sensitivity (%) | 77 | 80 | 86 | 73 | |
| | Specificity (%) | 63 | 71 | 67 | 71 | |
| | F-score (%) | 74 | 76 | 82 | 54 | |
| | ROC area | 0.81 | 0.70 | 0.83 | 0.79 | |
| Backward selection | Sensitivity (%) | 84 | 76 | 81 | 84 | |
| | Specificity (%) | 70 | 63 | 71 | 66 | |
| | F-score (%) | 72 | 64 | 76 | 80 | |
| | ROC area | 0.79 | 0.66 | 0.72 | 0.86 | |
Each cell contains sensitivity, specificity, F-score, and ROC area for corresponding setup. PCA – Principal component analysis; ROC – Receiver operating characteristic
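Among the classifier selection methods above, forward selection greedily grows the ensemble one classifier at a time, keeping an addition only if it improves the validation score. A minimal sketch of that greedy loop (the scoring function `score_fn` is hypothetical; the paper's exact selection criterion is not given in this excerpt):

```python
def forward_select(classifier_names, score_fn):
    """Greedy forward selection over an ensemble of classifiers.
    Starts from the empty ensemble and repeatedly adds the classifier
    whose inclusion most improves score_fn; stops when no candidate
    improves the current best score. score_fn maps a tuple of names to
    a validation score (a hypothetical stand-in for real evaluation)."""
    selected = []
    best = score_fn(())
    improved = True
    while improved:
        improved = False
        for name in classifier_names:
            if name in selected:
                continue
            score = score_fn(tuple(selected + [name]))
            if score > best:
                # Track the single best addition found in this pass.
                best, pick, improved = score, name, True
        if improved:
            selected.append(pick)
    return selected, best
```

Backward selection is the mirror image: start from all classifiers and greedily drop the one whose removal improves (or least hurts) the score, stopping when no removal helps.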
Figure 4Results of the best classifier selection method for each aggregation strategy (corresponding to the noted aggregation function)
Results of applying different classification algorithms on a single dataset of each modality
| Method | Greenlight (%) | Hinselmann (%) | Schiller (%) |
|---|---|---|---|
| Naive Bayes | 58/15 | 68/67 | |
| 58/0.55 | 68/0.67 | ||
| AdaBoost | 58/36 | 68/18 | 63/56 |
| 59/0.59 | 64/0.76 | 61/0.72 | |
| Random forest | 63/38 | 79/21 | 68/57 |
| 59/0.59 | 70/0.75 | 62/0.69 | |
| Random tree | 68/50 | 79/39 | 58/56 |
| 66/0.59 | 76/0.59 | 58/0.57 | |
| SVM | 63/47 | 90/61 | 58/42 |
| 62/0.55 | 88/0.75 | 43/0.50 | |
| Decision tree | 63/38 | ||
| 59/0.51 | |||
| Logit boost | 53/33 | 74/20 | 63/60 |
| 51/0.47 | 67/0.71 | 63/0.69 | |
| All | 66/54 | 74/39 | 59/60 |
| 63/0.65 | 75/0.59 | 58/0.57 | |
| PCA on classifiers | 79/39 | ||
| 72/0.66 | |||
| Forward | 63/66 | 74/67 | |
| 59/0.73 | 72/0.67 | ||
| Backward | 70/62 | 74/50 | 63/60 |
| 63/0.69 | 67/0.65 | 60/0.62 | |
| Proposed ( |
Each cell contains sensitivity, specificity, F-score, and ROC area for the corresponding setup. PCA – Principal component analysis; SVM – Support vector machine; ROC – Receiver operating characteristic
Comparison of cervical cancer prediction systems
| Method | Sensitivity (%) | Specificity (%) | F-score (%) | ROC area |
|---|---|---|---|---|
| [ | 88 | 83 | NA | 0.87 |
| [ | NA | NA | NA | 0.91 |
| [ | NA | NA | 89 | NA |
| Best single-modality model | 86 | 78 | 80 | 0.83 |
| Proposed model | 96 | 94 | 91 | 0.94 |
NA – Not available; ROC – Receiver operating characteristic
Results of applying seven single classifiers on DSmvc dataset
| Method | Sensitivity (%) | Specificity (%) | F-score (%) | ROC area | Mean±STD |
|---|---|---|---|---|---|
| Naive Bayes | 74 | 52 | 73 | 0.59 | 0.83±0.15 |
| AdaBoost | 72 | 26 | 63 | 0.36 | 0.83±0.10 |
| Random forest | 74 | 26 | 61 | 0.64 | 0.83±0.23 |
| SVM | 68 | 24 | 60 | 0.46 | 0.83±0.22 |
| Decision tree | 58 | 21 | 54 | 0.43 | 0.83±0.08 |
| Logit boost | 68 | 24 | 60 | 0.47 | 0.83±0.09 |
Random tree is the best performing classifier on the dataset acquired by applying Fmvc aggregation function. STD – Standard deviation; SVM – Support vector machine; ROC – Receiver operating characteristic
Results of applying seven single classifiers on DSps dataset
| Method | Sensitivity (%) | Specificity (%) | F-score (%) | ROC area | Mean±STD |
|---|---|---|---|---|---|
| Naive Bayes | 67 | 43 | 66 | 0.43 | 0.09±0.23 |
| AdaBoost | 68 | 32 | 56 | 0.32 | 0.09±0.16 |
| Random forest | 68 | 32 | 56 | 0.32 | 0.09±0.10 |
| SVM | 68 | 24 | 60 | 0.46 | 0.09±0.10 |
| Decision tree | 58 | 21 | 54 | 0.43 | 0.09±0.27 |
| Logit boost | 68 | 24 | 60 | 0.49 | 0.09±0.14 |
Random tree is the best performing classifier on the dataset acquired by applying Fps aggregation function. STD – Standard deviation; ROC – Receiver operating characteristic; SVM – Support vector machine
Results of applying seven single classifiers on DSpc dataset
| Method | Sensitivity (%) | Specificity (%) | F-score (%) | ROC area | Mean±STD |
|---|---|---|---|---|---|
| Naive Bayes | 68 | 76 | 69 | 0.77 | 0.48±0.10 |
| AdaBoost | 58 | 52 | 58 | 0.57 | 0.48±0.20 |
| Random forest | 63 | 73 | 63 | 0.71 | 0.48±0.16 |
| Random tree | 63 | 61 | 64 | 0.62 | 0.48±0.22 |
| SVM | 58 | 70 | 57 | 0.64 | 0.48±0.24 |
| Decision tree | 47 | 46 | 48 | 0.38 | 0.48±0.10 |
Logit Boost is the best performing classifier on the dataset acquired by applying Fpc aggregation function. STD – Standard deviation; ROC – Receiver operating characteristic; SVM – Support vector machine