| Literature DB >> 32344829 |
Kenya Kusunose, Akihiro Haga, Mizuki Inoue, Daiju Fukuda, Hirotsugu Yamada, Masataka Sata.
Abstract
A proper echocardiographic study requires several video clips recorded from different acquisition angles to capture the complex cardiac anatomy. However, these clips are not necessarily labeled in a database, so identifying the acquired view is the first step in analyzing an echocardiogram. There is currently no consensus on whether mislabeled samples can be used to build a clinically feasible prediction model of ejection fraction (EF). The aim of this study was to test two types of input methods for image classification, and to test the accuracy of the EF prediction model when the learning database contains mislabeled images that were not checked by observers. We enrolled 340 patients with five standard views (long axis, short axis, 3-chamber, 4-chamber, and 2-chamber views) and 10 images per cycle, used to train a convolutional neural network (CNN) to classify views (17,000 labeled images in total). All DICOM images were rigidly registered and rescaled to a reference image to fit the size of echocardiographic images. We employed 5-fold cross-validation to examine model performance, testing models trained on two types of data: averaged images and 10 selected images. Our best model (trained on 10 selected images) classified video views with 98.1% overall test accuracy in the independent cohort; that is, 1.9% of the images were mislabeled by the view classification model. To determine whether this 98.1% accuracy is acceptable for building a clinical prediction model from echocardiographic data, we tested the EF prediction model using learning data with a 1.9% error rate. The accuracy of the EF prediction model held up even with training data containing 1.9% mislabeled images. The CNN algorithm can classify images into the five standard views in a clinical setting, and our results suggest that this approach may provide a clinically feasible level of view classification accuracy for the analysis of echocardiographic data.
Keywords: artificial intelligence; echocardiography; view classification
Year: 2020 PMID: 32344829 PMCID: PMC7277840 DOI: 10.3390/biom10050665
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
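The 5-fold cross-validation protocol mentioned in the abstract can be sketched as follows. This is an illustrative outline, not the authors' code; the patient count (340) is from the study, while the seeding and splitting strategy are assumptions.

```python
import numpy as np

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into n_folds disjoint
    validation folds, as in the study's 5-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return np.array_split(idx, n_folds)

# 340 patients, as in the study cohort; 5 folds of 68 patients each
folds = five_fold_indices(340)
for k, val_idx in enumerate(folds):
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    # train a model on train_idx, evaluate on val_idx
```

Splitting at the patient level (rather than the image level) keeps all 10 frames of a given clip in the same fold, which avoids leakage between training and validation.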
Baseline characteristics of the study population.
| Characteristic | Control |
|---|---|
| Number | 340 |
| Age, years | 66 ± 14 |
| Male, % | 58 |
| Ischemic Cardiomyopathy, % | 48 |
| Heart rate, bpm | 77 ± 16 |
| LVEDVi, mL/m² | 74 (53–105) |
| LVESVi, mL/m² | 40 (20–74) |
| LVEF, % | 45 (29–62) |
Data are presented as number of patients (percentage), mean ± SD, or median (interquartile range). Abbreviations: LVEDVi, left ventricular end-diastolic volume index; LVESVi, left ventricular end-systolic volume index; LVEF, left ventricular ejection fraction.
Figure 1. Standard views: The apical 2-chamber (AP2), apical 4-chamber (AP4), apical 3-chamber (AP3), parasternal long-axis (PLAX), and parasternal short-axis (PSAX) views were stored digitally for playback and analysis. The echocardiographic images shown here are averages of 10 consecutive images.
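The "averaged images" input described in the abstract and shown in Figure 1 can be produced by a simple per-pixel mean over the 10 consecutive frames of a clip. A minimal numpy sketch (the 224×224 frame size is an assumption for illustration, not from the paper):

```python
import numpy as np

def average_frames(clip):
    """clip: (n_frames, H, W) uint8 array of consecutive echo frames.
    Returns a single float image averaged over the frames, as used
    for the 'averaged images' input type."""
    return clip.astype(np.float32).mean(axis=0)

# illustrative clip: 10 frames of 224x224 grayscale pixels
clip = np.random.randint(0, 256, size=(10, 224, 224), dtype=np.uint8)
avg = average_frames(clip)  # one (224, 224) averaged image
```

Averaging suppresses frame-to-frame speckle noise but blurs moving structures, which is one reason the authors compared it against feeding the 10 selected frames directly.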
Figure 2. Neural network for view classification: We designed and trained convolutional neural network models to recognize 5 different standard echocardiographic views.
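A CNN for 5-way view classification, as in Figure 2, has the following general shape. This PyTorch sketch is purely illustrative: the layer sizes, depth, and input resolution are assumptions, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class ViewClassifier(nn.Module):
    """Minimal CNN sketch mapping a grayscale echo image to logits
    over the 5 standard views (AP2, AP3, AP4, PLAX, PSAX)."""
    def __init__(self, n_views=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> (N, 32, 1, 1)
        )
        self.classifier = nn.Linear(32, n_views)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = ViewClassifier()
logits = model(torch.zeros(1, 1, 224, 224))  # one logit per view
```

At inference time, a clip-level label can be obtained by averaging the per-frame logits over the 10 frames before taking the argmax.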
Figure 3. Echocardiogram view classification by the deep-learning model: Actual view labels are on the y-axis and neural network-predicted view labels on the x-axis, by view category, for video classification.
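The overall test accuracy reported in the abstract (98.1%) is the trace of a confusion matrix like Figure 3's divided by its total count. A small numpy sketch with made-up example labels (the toy data here is not from the study):

```python
import numpy as np

VIEWS = ["AP2", "AP3", "AP4", "PLAX", "PSAX"]

def confusion_matrix(y_true, y_pred, n_classes=5):
    """Rows = actual view, columns = predicted view, as in Figure 3."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# toy example: 6 clips, one AP2 clip misclassified as AP3
y_true = [0, 1, 2, 3, 4, 0]
y_pred = [0, 1, 2, 3, 4, 1]
cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()  # diagonal / total = 5/6 here
```

Off-diagonal entries identify which view pairs are confused, which is what Figure 4 examines case by case.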
Figure 4. Well-classified and misclassified cases: For the misclassified cases, it appears difficult even for expert observers to determine the correct view.