Felix Yu, Gianluca Silva Croso, Tae Soo Kim, Ziang Song, Felix Parker, Gregory D Hager, Austin Reiter, S Swaroop Vedula, Haider Ali, Shameema Sikder.
Abstract
Importance: Competence in cataract surgery is a public health necessity, and videos of cataract surgery are routinely available to educators and trainees but currently are of limited use in training. Machine learning and deep learning techniques can yield tools that efficiently segment videos of cataract surgery into constituent phases for subsequent automated skill assessment and feedback.

Objective: To evaluate machine learning and deep learning algorithms for automated phase classification of manually presegmented phases in videos of cataract surgery.

Design, Setting, and Participants: This was a cross-sectional study using a data set of videos from a convenience sample of 100 cataract procedures performed by faculty and trainee surgeons in an ophthalmology residency program from July 2011 to December 2017. Demographic characteristics for surgeons and patients were not captured. Ten standard labels in the procedure and 14 instruments used during surgery were manually annotated, which served as the ground truth.

Exposures: Five algorithms with different input data: (1) a support vector machine input with cross-sectional instrument label data; (2) a recurrent neural network (RNN) input with a time series of instrument labels; (3) a convolutional neural network (CNN) input with cross-sectional image data; (4) a CNN-RNN input with a time series of images; and (5) a CNN-RNN input with time series of images and instrument labels. Each algorithm was evaluated with 5-fold cross-validation.

Main Outcomes and Measures: Accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, and precision.
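The 5-fold cross-validation named under Exposures can be illustrated with a minimal sketch. It assumes the split is made at the level of whole videos, which the abstract does not state; the function name `k_fold_splits` and the fixed seed are hypothetical choices for illustration, not the authors' code.

```python
import random

def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    folds = [indices[i::k] for i in range(k)]  # deal indices round-robin into k folds
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            i for j, fold in enumerate(folds) if j != held_out for i in fold
        )
        yield train, test

# Assumption: one item per video, 100 videos as in the convenience sample.
splits = list(k_fold_splits(100, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} training videos, {len(test)} held-out videos")
```

Each video is held out exactly once, so every reported measure can be computed on data the model did not see during training for that fold.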
Year: 2019 PMID: 30951163 PMCID: PMC6450320 DOI: 10.1001/jamanetworkopen.2019.1860
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Algorithms Evaluated for Classification of Phases in Cataract Surgery
CNN indicates convolutional neural network; RNN, recurrent neural network; SVM, support vector machine.
Instances of Each Phase Within Cataract Surgery in Data Set
| Phase in Cataract Surgery | Faculty Surgeon, No. (%) | Trainee Surgeon, No. (%) | Total, No. |
|---|---|---|---|
| Incision | | | |
| Side | 31 (26.5) | 86 (73.5) | 117 |
| Main | 40 (30.5) | 91 (69.5) | 131 |
| Capsulorrhexis | 29 (28.7) | 72 (71.3) | 101 |
| Hydrodissection | 28 (28.5) | 70 (71.5) | 98 |
| Phacoemulsification | 31 (29.8) | 73 (70.2) | 104 |
| Cortical removal | 35 (28.9) | 86 (71.1) | 121 |
| Lens insertion | 31 (28.7) | 77 (71.3) | 108 |
| Ocular viscoelastic device removal | 33 (30.0) | 77 (70.0) | 110 |
| Wound closure | | | |
| Corneal hydration | 35 (27.3) | 93 (72.7) | 128 |
| Suture incision | 17 (29.3) | 41 (70.7) | 58 |
Summary Measures of Algorithm Performance for Phase Classification
| Metric | SVM, Algorithm 1, Instrument Labels | RNN, Algorithm 2, Instrument Labels | CNN, Algorithm 3, Images | CNN-RNN, Algorithm 4, Images | CNN-RNN, Algorithm 5, Images and Instrument Labels |
|---|---|---|---|---|---|
| Unweighted accuracy (95% CI) | 0.938 (0.937-0.939) | 0.959 (0.958-0.960) | 0.956 (0.954-0.957) | 0.921 (0.920-0.923) | 0.915 (0.913-0.916) |
| Frequency-weighted accuracy (95% CI) | 0.935 (0.934-0.936) | 0.957 (0.956-0.958) | 0.955 (0.953-0.956) | 0.919 (0.918-0.920) | 0.913 (0.912-0.914) |
| Inverse variance-weighted accuracy (95% CI) | 0.963 (0.962-0.965) | 0.976 (0.975-0.978) | 0.958 (0.957-0.960) | 0.928 (0.926-0.930) | 0.920 (0.918-0.922) |
| Unweighted AUC (95% CI) | 0.737 (0.730-0.744) | 0.773 (0.770-0.776) | 0.712 (0.704-0.719) | 0.752 (0.750-0.755) | 0.737 (0.735-0.739) |
Abbreviations: AUC, area under the receiver operating characteristic curve; CNN, convolutional neural network; RNN, recurrent neural network; SVM, support vector machine.
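One way to read the unweighted and frequency-weighted rows is as averages of the per-phase accuracies. The sketch below assumes that reading, and further assumes that the frequency weights are the total phase instances from the data set table; neither assumption is confirmed by the text. The inverse variance-weighted row is not attempted because per-phase variances are not reported. The values are the SVM (algorithm 1) per-phase accuracies and the phase totals as they appear in the tables of this record.

```python
# Per-phase accuracy for the SVM (algorithm 1), copied from the per-phase table.
svm_accuracy = {
    "Side incision": 0.985,
    "Main incision": 0.930,
    "Capsulorrhexis": 0.963,
    "Hydrodissection": 0.910,
    "Phacoemulsification": 0.958,
    "Cortical removal": 0.882,
    "Lens insertion": 0.987,
    "Ocular viscoelastic device removal": 0.899,
    "Corneal hydration": 0.893,
    "Suture incision": 0.968,
}

# Total instances of each phase, copied from the data set table.
# Assumption: these counts are the frequency weights.
phase_counts = {
    "Side incision": 117,
    "Main incision": 131,
    "Capsulorrhexis": 101,
    "Hydrodissection": 98,
    "Phacoemulsification": 104,
    "Cortical removal": 121,
    "Lens insertion": 108,
    "Ocular viscoelastic device removal": 110,
    "Corneal hydration": 128,
    "Suture incision": 58,
}

def weighted_mean(values, weights):
    """Weighted arithmetic mean; equal weights give the plain mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

phases = list(svm_accuracy)
accuracies = [svm_accuracy[p] for p in phases]

unweighted = weighted_mean(accuracies, [1] * len(phases))
frequency_weighted = weighted_mean(accuracies, [phase_counts[p] for p in phases])

print(f"unweighted accuracy: {unweighted:.4f}")
print(f"frequency-weighted accuracy: {frequency_weighted:.4f}")
```

Under these assumptions the two averages come out near 0.9375 and 0.9348, which agrees with the 0.938 and 0.935 reported for the SVM to within rounding. The agreement is suggestive only and does not establish how the published values were computed.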
Accuracy, Sensitivity, Specificity, and Precision for Algorithms Across Phases
| Algorithm and Measure | Side Incision | Main Incision | Capsulorrhexis | Hydrodissection | Phacoemulsification | Cortical Removal | Lens Insertion | Ocular Viscoelastic Device Removal | Wound Closure, Corneal Hydration | Wound Closure, Suture Incision |
|---|---|---|---|---|---|---|---|---|---|---|
| SVM, algorithm 1, instrument labels (95% CI) | | | | | | | | | | |
| Accuracy | 0.985 (0.982-0.987) | 0.930 (0.927-0.933) | 0.963 (0.962-0.964) | 0.910 (0.907-0.914) | 0.958 (0.955-0.961) | 0.882 (0.878-0.886) | 0.987 (0.986-0.988) | 0.899 (0.897-0.900) | 0.893 (0.892-0.895) | 0.968 (0.964-0.971) |
| Sensitivity | 0.936 (0.917-0.955) | 0.949 (0.949-0.949) | 0.890 (0.890-0.890) | 0.809 (0.799-0.819) | 0.904 (0.894-0.914) | 0.475 (0.446-0.504) | 0.920 (0.920-0.920) | 0.247 (0.247-0.247) | 0.005 (0.000-0.015) | 0.852 (0.784-0.920) |
| Specificity | 0.990 (0.989-0.991) | 0.927 (0.924-0.931) | 0.972 (0.971-0.973) | 0.923 (0.919-0.926) | 0.965 (0.962-0.968) | 0.932 (0.930-0.934) | 0.996 (0.994-0.997) | 0.976 (0.975-0.978) | 0.999 (0.998-1.000) | 0.972 (0.970-0.975) |
| Precision | 0.906 (0.896-0.916) | 0.615 (0.604-0.625) | 0.798 (0.791-0.805) | 0.555 (0.542-0.568) | 0.759 (0.742-0.776) | 0.459 (0.442-0.476) | 0.963 (0.953-0.973) | 0.556 (0.539-0.573) | 0.517 (0.000-1.000) | 0.553 (0.524-0.582) |
| RNN, algorithm 2, instrument labels (95% CI) | | | | | | | | | | |
| Accuracy | 0.989 (0.987-0.991) | 0.985 (0.982-0.987) | 0.973 (0.971-0.976) | 0.960 (0.957-0.962) | 0.962 (0.959-0.965) | 0.915 (0.911-0.919) | 0.988 (0.987-0.990) | 0.927 (0.924-0.930) | 0.909 (0.905-0.914) | 0.984 (0.981-0.987) |
| Sensitivity | 0.940 (0.925-0.956) | 0.974 (0.957-0.991) | 0.925 (0.915-0.935) | 0.716 (0.706-0.727) | 0.812 (0.795-0.829) | 0.583 (0.552-0.614) | 0.943 (0.934-0.952) | 0.508 (0.494-0.521) | 0.765 (0.745-0.784) | 0.800 (0.728-0.873) |
| Specificity | 0.994 (0.992-0.996) | 0.986 (0.985-0.987) | 0.979 (0.977-0.981) | 0.989 (0.986-0.991) | 0.980 (0.978-0.983) | 0.956 (0.952-0.959) | 0.994 (0.992-0.996) | 0.977 (0.974-0.980) | 0.926 (0.922-0.931) | 0.991 (0.990-0.993) |
| Precision | 0.942 (0.927-0.957) | 0.893 (0.885-0.902) | 0.847 (0.834-0.860) | 0.883 (0.859-0.907) | 0.836 (0.819-0.852) | 0.615 (0.594-0.637) | 0.951 (0.938-0.964) | 0.722 (0.695-0.749) | 0.554 (0.537-0.570) | 0.781 (0.749-0.813) |
| CNN, algorithm 3, images (95% CI) | | | | | | | | | | |
| Accuracy | 0.962 (0.958-0.966) | 0.970 (0.966-0.974) | 0.928 (0.923-0.932) | 0.957 (0.955-0.958) | 0.959 (0.956-0.962) | 0.940 (0.935-0.944) | 0.964 (0.960-0.967) | 0.959 (0.956-0.963) | 0.953 (0.948-0.958) | 0.966 (0.963-0.970) |
| Sensitivity | 0.723 (0.689-0.756) | 0.870 (0.845-0.895) | 0.920 (0.920-0.920) | 0.623 (0.613-0.634) | 0.884 (0.874-0.894) | 0.793 (0.762-0.823) | 0.813 (0.796-0.830) | 0.799 (0.782-0.816) | 0.749 (0.722-0.775) | 0.279 (0.210-0.347) |
| Specificity | 0.987 (0.984-0.990) | 0.982 (0.979-0.985) | 0.929 (0.924-0.934) | 0.996 (0.995-0.998) | 0.968 (0.965-0.971) | 0.958 (0.954-0.961) | 0.982 (0.979-0.985) | 0.978 (0.975-0.982) | 0.978 (0.973-0.982) | 0.994 (0.992-0.996) |
| Precision | 0.858 (0.829-0.886) | 0.856 (0.835-0.877) | 0.614 (0.597-0.630) | 0.952 (0.932-0.972) | 0.770 (0.753-0.787) | 0.696 (0.677-0.716) | 0.850 (0.828-0.873) | 0.816 (0.793-0.838) | 0.802 (0.769-0.834) | 0.646 (0.555-0.738) |
| CNN-RNN, algorithm 4, images (95% CI) | | | | | | | | | | |
| Accuracy | 0.939 (0.934-0.945) | 0.930 (0.926-0.935) | 0.931 (0.928-0.934) | 0.936 (0.932-0.939) | 0.938 (0.935-0.940) | 0.841 (0.837-0.845) | 0.930 (0.927-0.933) | 0.900 (0.896-0.903) | 0.916 (0.911-0.921) | 0.951 (0.948-0.955) |
| Sensitivity | 0.609 (0.569-0.649) | 0.745 (0.720-0.770) | 0.745 (0.735-0.755) | 0.557 (0.557-0.557) | 0.692 (0.682-0.702) | 0.546 (0.521-0.571) | 0.486 (0.477-0.496) | 0.545 (0.531-0.558) | 0.594 (0.568-0.621) | 0.414 (0.352-0.477) |
| Specificity | 0.974 (0.971-0.978) | 0.953 (0.949-0.957) | 0.954 (0.950-0.957) | 0.981 (0.977-0.984) | 0.968 (0.965-0.970) | 0.877 (0.874-0.880) | 0.984 (0.982-0.987) | 0.942 (0.939-0.946) | 0.955 (0.950-0.959) | 0.973 (0.970-0.976) |
| Precision | 0.715 (0.684-0.747) | 0.659 (0.639-0.679) | 0.665 (0.649-0.682) | 0.775 (0.743-0.806) | 0.723 (0.709-0.737) | 0.351 (0.340-0.363) | 0.795 (0.764-0.825) | 0.529 (0.513-0.545) | 0.613 (0.586-0.640) | 0.378 (0.334-0.422) |
| CNN-RNN, algorithm 5, images and instrument labels (95% CI) | | | | | | | | | | |
| Accuracy | 0.932 (0.927-0.937) | 0.947 (0.943-0.951) | 0.902 (0.899-0.906) | 0.900 (0.896-0.904) | 0.920 (0.918-0.922) | 0.861 (0.858-0.865) | 0.914 (0.910-0.917) | 0.909 (0.905-0.913) | 0.925 (0.922-0.929) | 0.937 (0.933-0.941) |
| Sensitivity | 0.512 (0.475-0.550) | 0.709 (0.681-0.737) | 0.625 (0.615-0.635) | 0.495 (0.495-0.495) | 0.666 (0.653-0.680) | 0.520 (0.503-0.537) | 0.547 (0.530-0.563) | 0.619 (0.597-0.642) | 0.518 (0.499-0.537) | 0.421 (0.338-0.503) |
| Specificity | 0.976 (0.973-0.980) | 0.976 (0.973-0.979) | 0.937 (0.933-0.940) | 0.949 (0.944-0.953) | 0.951 (0.949-0.953) | 0.903 (0.900-0.906) | 0.959 (0.956-0.963) | 0.943 (0.940-0.947) | 0.974 (0.970-0.978) | 0.957 (0.955-0.960) |
| Precision | 0.697 (0.661-0.734) | 0.782 (0.762-0.802) | 0.549 (0.535-0.564) | 0.535 (0.513-0.558) | 0.624 (0.615-0.633) | 0.396 (0.385-0.407) | 0.624 (0.603-0.644) | 0.566 (0.549-0.583) | 0.704 (0.674-0.733) | 0.283 (0.241-0.325) |
Abbreviations: CNN, convolutional neural network; RNN, recurrent neural network; SVM, support vector machine.
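The four per-phase measures can be understood as one-vs-rest statistics: each phase in turn is treated as the positive class and all other phases as negative. The sketch below assumes that framing, which the record does not spell out; the function name and the toy labels are hypothetical.

```python
def one_vs_rest_metrics(true_labels, predicted_labels, phase):
    """Accuracy, sensitivity, specificity, and precision for one phase vs all others."""
    tp = fp = tn = fn = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == phase and prediction == phase:
            tp += 1  # phase present and predicted
        elif truth != phase and prediction == phase:
            fp += 1  # phase predicted but absent
        elif truth == phase:
            fn += 1  # phase present but missed
        else:
            tn += 1  # phase absent and not predicted

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are returned as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }

# Toy data (not from the study): a rare phase that is missed half the time.
truth = ["hydration", "hydration"] + ["other"] * 8
predicted = ["other", "hydration"] + ["other"] * 8
metrics = one_vs_rest_metrics(truth, predicted, "hydration")
print(metrics)
```

In the toy data, accuracy is 0.9 while sensitivity is only 0.5, because the many true negatives dominate the accuracy. The same effect would explain rows in the table where accuracy is high but sensitivity is very low, such as corneal hydration under the SVM.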
Figure 2. Differences in Area Under the Receiver Operating Characteristic Curve Between Pairs of Algorithms for Phase Classification
Area under the receiver operating characteristic curve of algorithm in column subtracted from area under the receiver operating characteristic curve of algorithm in row. CNN indicates convolutional neural network; RNN, recurrent neural network; and SVM, support vector machine.
a P = .008.
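The row-minus-column convention in the caption can be sketched with the rounded unweighted AUC values from the summary table. This is an illustration of the convention only: it is an assumption that the figure was built from these summary values, and the differences in the published figure may differ because they would come from unrounded data.

```python
# Unweighted AUC per algorithm, copied from the summary table (rounded values).
unweighted_auc = {
    "SVM (1)": 0.737,
    "RNN (2)": 0.773,
    "CNN (3)": 0.712,
    "CNN-RNN (4)": 0.752,
    "CNN-RNN (5)": 0.737,
}

def pairwise_differences(scores):
    """Return differences[row][column] = score of row minus score of column."""
    return {
        row: {column: scores[row] - scores[column] for column in scores}
        for row in scores
    }

differences = pairwise_differences(unweighted_auc)
for row, columns in differences.items():
    print(f"{row:12s}", " ".join(f"{d:+.3f}" for d in columns.values()))
```

The resulting matrix is antisymmetric with zeros on the diagonal, so only the cells on one side of the diagonal carry independent information.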