Fei Li, Zhe Wang, Guoxiang Qu, Diping Song, Ye Yuan, Yang Xu, Kai Gao, Guangwei Luo, Zegu Xiao, Dennis S C Lam, Hua Zhong, Yu Qiao, Xiulan Zhang.
Abstract
BACKGROUND: To develop a deep neural network able to differentiate glaucomatous from non-glaucomatous visual field (VF) test results, we collected VF tests from 3 different ophthalmic centers in mainland China.
Keywords: Deep learning; Glaucoma; Visual field
Year: 2018 PMID: 30286740 PMCID: PMC6172715 DOI: 10.1186/s12880-018-0273-5
Source DB: PubMed Journal: BMC Med Imaging ISSN: 1471-2342 Impact factor: 1.930
Fig. 1 Diagram showing the modified VGG network. VGG [15] was adopted as our network structure. We modified the output dimension of the penultimate layer fc7 from 4096 to 200, and the last layer was modified to output a two-dimensional vector corresponding to the prediction scores for healthy VF and glaucoma VF. The network was first pre-trained on ImageNet [16], a large-scale natural image classification dataset, to initialize its parameters. We then modified the last two layers as described above and initialized their parameters by drawing from a Gaussian distribution
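The two replaced layers can be sketched numerically. A minimal NumPy illustration using the layer sizes from the caption (the Gaussian standard deviation of 0.01 is an assumed value, not stated in the source, and the convolutional backbone is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

# Modified fc7: input 4096 (from fc6), output reduced from 4096 to 200
W_fc7 = rng.normal(0.0, 0.01, size=(200, 4096))  # Gaussian init (std assumed)
b_fc7 = np.zeros(200)

# Modified last layer: 200 -> 2 scores (healthy VF vs. glaucoma VF)
W_fc8 = rng.normal(0.0, 0.01, size=(2, 200))
b_fc8 = np.zeros(2)

def head_forward(x):
    """Forward pass through the two replaced layers (ReLU between them)."""
    h = np.maximum(W_fc7 @ x + b_fc7, 0.0)  # fc7 + ReLU
    return W_fc8 @ h + b_fc8                # two-dimensional score vector

scores = head_forward(rng.normal(size=4096))
print(scores.shape)  # (2,)
```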
Baseline characteristics of participants
| | Non-glaucoma Group | Glaucoma Group | P value |
|---|---|---|---|
| No. of images | 1623 | 2389 | – |
| Age, years (SD) | 47.2 (17.4) | 49.2 (16.3) | 0.0022* |
| Left/right eyes | 635/919 | 607/911 | 0.6211 |
| VFI (SD) | 0.917 (0.126) | 0.847 (0.162) | 0.0001* |
| MD, dB (SD) | −5.0 (23.5) | −9.0 (44.8) | 0.0039* |
| PSD, dB (SD) | 3.6 (3.3) | 6.7 (22.2) | 0.0001* |
*indicates a statistically significant difference
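The P values above presumably come from two-sample comparisons of the group summary statistics. A sketch of the age comparison using a Welch-type test with a large-sample normal approximation (the exact test used is not stated in this record):

```python
import math

def welch_z_test(m1, s1, n1, m2, s2, n2):
    """Two-sample comparison from summary statistics (mean, SD, n).

    Uses a normal approximation for the two-sided p-value, which is
    reasonable at these sample sizes.
    """
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # Welch standard error
    z = (m2 - m1) / se
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Age: non-glaucoma 47.2 (17.4), n=1623; glaucoma 49.2 (16.3), n=2389
z, p = welch_z_test(47.2, 17.4, 1623, 49.2, 16.3, 2389)
print(z > 3, p < 0.05)  # the difference is significant, as the table marks
```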
Performance of the algorithm and the compared methods
| Group | Method | Accuracy | Specificity | Sensitivity |
|---|---|---|---|---|
| Ophthalmologists | Resident #1 | 0.640 | 0.767 | 0.513 |
| | Resident #2 | 0.593 | 0.680 | 0.507 |
| | Resident #3 | 0.587 | 0.630 | 0.540 |
| | Attending #1 | 0.533 | 0.213 | 0.853 |
| | Attending #2 | 0.570 | 0.670 | 0.473 |
| | Attending #3 | 0.653 | 0.547 | 0.760 |
| | Glaucoma expert #1 | 0.663 | 0.700 | 0.647 |
| | Glaucoma expert #2 | 0.607 | 0.527 | 0.687 |
| | Glaucoma expert #3 | 0.607 | 0.913 | 0.300 |
| Rule-based methods | AGIS | 0.459 | 0.560 | 0.343 |
| | GSS2 | 0.523 | 0.500 | 0.550 |
| Traditional machine learning methods | SVM | 0.670 | 0.618 | 0.733 |
| | RF | 0.644 | 0.453 | 0.863 |
| | k-NN | 0.591 | 0.347 | 0.870 |
| | CNN | 0.876 | 0.826 | 0.932 |
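The three metrics in the table follow the standard confusion-matrix definitions. A minimal sketch with hypothetical counts (the per-class composition of the 300-VF validation set is not given in this record; the counts below are illustrative only, chosen to roughly match the CNN row):

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, specificity, sensitivity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    specificity = tn / (tn + fp)   # true-negative rate
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    return accuracy, specificity, sensitivity

# Hypothetical counts for a 300-image validation set (not from the paper)
acc, spec, sens = classification_metrics(tp=140, fp=26, tn=124, fn=10)
print(round(acc, 3), round(spec, 3), round(sens, 3))  # 0.88 0.827 0.933
```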
Fig. 2 Validation set performance for glaucoma diagnosis. Performance of the CNN, ophthalmologists, and traditional algorithms is presented. Nine ophthalmologists participated in evaluating the VFs. On the validation set of 300 VFs, the CNN achieved an accuracy of 0.876, with a specificity of 0.826 and a sensitivity of 0.932. The average accuracies were 0.607, 0.585, and 0.626 for resident ophthalmologists, attending ophthalmologists, and glaucoma experts, respectively. Neither AGIS nor GSS2 achieved satisfactory results. Three traditional machine learning algorithms were also included in the experiments; SVM performed best among them, but still much worse than the CNN. We also examined the receiver operating characteristic (ROC) curves of the CNN and the compared methods. The CNN achieved an AUC of 0.966 (95% CI, 0.948–0.985), outperforming all the ophthalmologists, rule-based methods, and traditional machine learning methods by a large margin
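The AUC reported for the CNN summarizes its ranking of glaucoma over healthy VFs. A minimal sketch using the Mann–Whitney interpretation of AUC (the scores below are a toy example, not the paper's data):

```python
def auc_mann_whitney(scores, labels):
    """AUC = probability that a random positive outranks a random negative,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: glaucoma VFs (label 1) mostly scored higher than healthy (0)
toy_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]
auc = auc_mann_whitney(toy_scores, toy_labels)
print(auc)
```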
Fig. 3 Relative validation set accuracy versus number of training images. We studied the relative validation set accuracy as a function of the number of images in the training set. Subsets of the original training set were randomly sampled at rates of 5%, 10%, …, 100%, with each subset containing all the images of the smaller subsets. As shown in the figure, performance does not improve substantially once the training set includes more than 3712 images
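The nested-subset sampling described in the caption (every larger subset contains all images of the smaller ones) can be obtained by shuffling once and taking growing prefixes. A minimal sketch with an assumed training-set size (the exact size is not stated in this record):

```python
import random

def nested_subsets(items, percents, seed=0):
    """Shuffle once, then take growing prefixes so every larger subset
    contains all items of every smaller one."""
    order = list(items)
    random.Random(seed).shuffle(order)
    return {p: order[: len(order) * p // 100] for p in percents}

# Assumed training-set size for illustration (not from the paper)
train_ids = list(range(4000))
percents = list(range(5, 105, 5))  # 5%, 10%, ..., 100%
subs = nested_subsets(train_ids, percents)

# Verify the nesting property: each subset is a prefix of the next larger one
nested = all(
    subs[percents[i]] == subs[percents[i + 1]][: len(subs[percents[i]])]
    for i in range(len(percents) - 1)
)
print(nested, len(subs[100]))  # True 4000
```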