Mizuho Nishio, Mitsuo Nishizawa, Osamu Sugiyama, Ryosuke Kojima, Masahiro Yakami, Tomohiro Kuroda, Kaori Togashi.
Abstract
We aimed to evaluate a computer-aided diagnosis (CADx) system for lung nodule classification, focussing on (i) the usefulness of the conventional CADx system (hand-crafted imaging features combined with a machine learning algorithm), (ii) a comparison between support vector machine (SVM) and gradient tree boosting (XGBoost) as machine learning algorithms, and (iii) the effectiveness of parameter optimization using Bayesian optimization and random search. Data on 99 lung nodules (62 lung cancers and 37 benign lung nodules) were included from public databases of CT images. A variant of the local binary pattern was used to calculate a feature vector. SVM or XGBoost was trained using the feature vector and its corresponding label. The Tree Parzen Estimator (TPE) was used for Bayesian optimization of the SVM and XGBoost parameters. Random search was performed for comparison with TPE. Leave-one-out cross-validation was used for optimizing and evaluating the performance of our CADx system. Performance was evaluated using the area under the curve (AUC) of receiver operating characteristic analysis. AUC was calculated 10 times, and its average was obtained. The best averaged AUCs of SVM and XGBoost were 0.850 and 0.896, respectively; both were obtained using TPE. XGBoost was generally superior to SVM. Optimal parameters for achieving high AUC were obtained with fewer trials when using TPE than when using random search. Bayesian optimization of SVM and XGBoost parameters was more efficient than random search. In an observer study, the AUC values of two board-certified radiologists were 0.898 and 0.822. The results show that the diagnostic accuracy of our CADx system was comparable to that of radiologists with respect to classifying lung nodules.
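The evaluation scheme named in the abstract (leave-one-out cross-validation scored by AUC) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the nearest-mean classifier, the function names (`auc`, `loo_scores`, `fit_means`, `score_means`), and the six toy data points are assumptions made for illustration, standing in for the LBP-based features and the SVM or XGBoost models.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loo_scores(features, labels, fit, score):
    """Score each case with a model trained on all the other cases."""
    out = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        out.append(score(model, features[i]))
    return out


# Hypothetical stand-in classifier: compares distance to the two class means.
def fit_means(xs, ys):
    return (mean(x for x, y in zip(xs, ys) if y == 0),
            mean(x for x, y in zip(xs, ys) if y == 1))


def score_means(model, x):
    neg_mean, pos_mean = model
    return abs(x - neg_mean) - abs(x - pos_mean)  # higher = more like class 1


toy_x = [0.1, 0.3, 0.6, 0.5, 0.8, 0.9]  # made-up 1-D feature values
toy_y = [0, 0, 0, 1, 1, 1]              # toy labels: 1 = cancer, 0 = benign
pooled = loo_scores(toy_x, toy_y, fit_means, score_means)
print(auc(toy_y, pooled))
```

Each case is scored by a model that never saw it, and the held-out scores are pooled into a single AUC. With only 99 nodules, this scheme uses nearly all the data for training in every fold.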
Year: 2018 PMID: 29672639 PMCID: PMC5908232 DOI: 10.1371/journal.pone.0195875
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Outline of our CADx system.
Abbreviations: CADx, computer-aided diagnosis; LBP-TOP, local binary pattern on three orthogonal planes; SVM, support vector machine.
Results of CADx when using SVM and parameter optimization.
| Algorithm | Number of trials | Validation loss | AUC | Accuracy |
|---|---|---|---|---|
| Random | 10 | 0.528 | 0.792 | 0.734 |
| Random | 100 | 0.481 | 0.832 | 0.780 |
| Random | 200 | 0.460 | 0.848 | 0.794 |
| Random | 1000 | 0.451 | 0.849 | 0.789 |
| TPE | 10 | 0.515 | 0.797 | 0.724 |
| TPE | 100 | 0.461 | 0.847 | 0.802 |
| TPE | 200 | 0.458 | 0.846 | 0.792 |
| TPE | 1000 | 0.453 | 0.850 | 0.797 |
Abbreviations: CADx, computer-aided diagnosis; SVM, support vector machine; TPE, Tree Parzen Estimator; AUC, area under the curve.
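The "Random" rows in this table come from random search: parameter settings are drawn independently and the setting with the lowest validation loss is kept. A minimal sketch of that idea follows. The parameter names `C` and `gamma`, their ranges, the log-uniform sampling, and the `toy_loss` function are all assumptions for illustration; the record does not state the authors' search space, and a real run would replace `toy_loss` with the cross-validated loss of the classifier.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def random_search(loss_fn, n_trials, seed=0):
    """Return (best_loss, best_params) after n_trials independent draws."""
    rng = random.Random(seed)
    best_loss, best_params = math.inf, None
    for _ in range(n_trials):
        params = {
            "C": sample_log_uniform(rng, 1e-2, 1e3),      # assumed range
            "gamma": sample_log_uniform(rng, 1e-4, 1e1),  # assumed range
        }
        loss = loss_fn(params)
        if loss < best_loss:
            best_loss, best_params = loss, params
    return best_loss, best_params


def toy_loss(params):
    """Hypothetical stand-in for a validation loss; minimum at C=10, gamma=0.01."""
    return ((math.log10(params["C"]) - 1.0) ** 2
            + (math.log10(params["gamma"]) + 2.0) ** 2)


for n in (10, 100, 1000):
    loss, _ = random_search(toy_loss, n)
    print(n, round(loss, 4))
```

Because every draw ignores the outcome of earlier trials, the best loss can only improve by chance as the budget grows. A Bayesian method such as TPE instead uses past trials to decide where to sample next.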
Results of CADx when using XGBoost and parameter optimization.
| Algorithm | Number of trials | Validation loss | AUC | Accuracy |
|---|---|---|---|---|
| Random | 10 | 0.488 | 0.838 | 0.756 |
| Random | 100 | 0.451 | 0.864 | 0.771 |
| Random | 200 | 0.440 | 0.868 | 0.784 |
| Random | 1000 | 0.422 | 0.878 | 0.806 |
| TPE | 10 | 0.494 | 0.838 | 0.762 |
| TPE | 100 | 0.427 | 0.876 | 0.811 |
| TPE | 200 | 0.419 | 0.881 | 0.804 |
| TPE | 1000 | 0.394 | 0.896 | 0.820 |
Abbreviations: CADx, computer-aided diagnosis; SVM, support vector machine; TPE, Tree Parzen Estimator; AUC, area under the curve.
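One way to read the two tables against the abstract's claim that TPE reaches a high AUC in fewer trials is to ask, for a given target AUC, what is the smallest tabulated trial budget that meets it. The sketch below does this with the AUC values transcribed from the tables above. The helper name `trials_to_reach` and the target values 0.845 and 0.875 are illustrative choices, not taken from the paper.

```python
# (number of trials, AUC) pairs transcribed from the SVM and XGBoost tables.
results = {
    ("SVM", "Random"): [(10, 0.792), (100, 0.832), (200, 0.848), (1000, 0.849)],
    ("SVM", "TPE"): [(10, 0.797), (100, 0.847), (200, 0.846), (1000, 0.850)],
    ("XGBoost", "Random"): [(10, 0.838), (100, 0.864), (200, 0.868), (1000, 0.878)],
    ("XGBoost", "TPE"): [(10, 0.838), (100, 0.876), (200, 0.881), (1000, 0.896)],
}


def trials_to_reach(rows, target_auc):
    """Smallest tabulated trial budget whose AUC meets the target, else None."""
    for n_trials, auc_value in sorted(rows):
        if auc_value >= target_auc:
            return n_trials
    return None


# Illustrative targets: one near the best SVM AUC, one near the best random-search XGBoost AUC.
for model, target in (("SVM", 0.845), ("XGBoost", 0.875)):
    for search in ("Random", "TPE"):
        budget = trials_to_reach(results[(model, search)], target)
        print(model, search, target, budget)
```

Only four budgets (10, 100, 200, and 1000 trials) are tabulated, so this comparison is coarse: it shows which tabulated budget first meets the target, not the exact trial at which it was reached.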
Fig 2. Validation loss of CADx.
Abbreviations: CADx, computer-aided diagnosis; SVM, support vector machine.
Fig 4. Accuracy of CADx.
Abbreviations: CADx, computer-aided diagnosis; SVM, support vector machine.
Fig 5. ROC curves of two radiologists.
Note: (A) radiologist 1 and (B) radiologist 2. Abbreviation: ROC, receiver operating characteristic.