Wenjia Bai1, Matthew Sinclair2, Giacomo Tarroni2, Ozan Oktay2, Martin Rajchl2, Ghislain Vaillant2, Aaron M Lee3, Nay Aung3, Elena Lukaschuk4, Mihir M Sanghvi3, Filip Zemrak3, Kenneth Fung3, Jose Miguel Paiva3, Valentina Carapella4, Young Jin Kim4, Hideaki Suzuki5, Bernhard Kainz2, Paul M Matthews5, Steffen E Petersen3, Stefan K Piechnik4, Stefan Neubauer4, Ben Glocker2, Daniel Rueckert2.
Abstract
BACKGROUND: Cardiovascular magnetic resonance (CMR) imaging is a standard imaging modality for assessing cardiovascular diseases (CVDs), the leading cause of death globally. CMR enables accurate quantification of cardiac chamber volumes, ejection fraction and myocardial mass, providing information for the diagnosis and monitoring of CVDs. However, for years, clinicians have relied on manual approaches to CMR image analysis, which are time-consuming and prone to subjective errors. Automatically deriving quantitative and clinically relevant information from CMR images remains a major clinical challenge.
Keywords: CMR image analysis; Fully convolutional networks; Machine learning
Year: 2018 PMID: 30217194 PMCID: PMC6138894 DOI: 10.1186/s12968-018-0471-x
Source DB: PubMed Journal: J Cardiovasc Magn Reson ISSN: 1097-6647 Impact factor: 5.364
Fig. 1. The network architecture. A fully convolutional network (FCN) is used, which takes the cardiovascular magnetic resonance (CMR) image as input, learns image features from fine to coarse scales through a series of convolutions, concatenates multi-scale features and finally predicts a pixelwise image segmentation
The network architecture
| Scale | Size | Convolution |
|---|---|---|
| 1 | 192×192 | 3×3, 16 |
| | | 3×3, 16 |
| 2 | 96×96 | 3×3, 32 |
| | | 3×3, 32 |
| 3 | 48×48 | 3×3, 64 |
| | | 3×3, 64 |
| | | 3×3, 64 |
| 4 | 24×24 | 3×3, 128 |
| | | 3×3, 128 |
| | | 3×3, 128 |
| 5 | 12×12 | 3×3, 256 |
| | | 3×3, 256 |
| | | 3×3, 256 |
| upsample and concatenate scale 1 to 5 features | | |
| predict | 192×192 | 1×1, 64 |
| | | 1×1, 64 |
| | | 1×1, K |
The first two columns list the resolution scale and feature map size. The third column lists the convolutional layer parameters, with "3×3, 16" denoting a 3×3 kernel and 16 output features. The last convolutional layer outputs K features, with K denoting the number of label classes
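The layer specification in the table can be written down as a small data structure. The following is a minimal sketch in plain Python, not the authors' implementation: the names `ENCODER`, `feature_map_size` and `upsample_factor` are hypothetical, and the only thing it checks is that halving the resolution at each scale reproduces the sizes listed in the table.

```python
# Encoder specification transcribed from the table:
# (scale, number of 3x3 convolutions, output features per convolution).
ENCODER = [(1, 2, 16), (2, 2, 32), (3, 3, 64), (4, 3, 128), (5, 3, 256)]
INPUT_SIZE = 192  # feature map side length at scale 1, from the table


def feature_map_size(scale: int, input_size: int = INPUT_SIZE) -> int:
    """Side length of the feature map at a scale, halving once per scale step."""
    return input_size // (2 ** (scale - 1))


def upsample_factor(scale: int) -> int:
    """Factor needed to bring a scale back to the scale-1 (192x192) grid."""
    return 2 ** (scale - 1)


def describe_encoder():
    """List (scale, size, upsample factor, convolutions, features) per scale."""
    return [
        (s, feature_map_size(s), upsample_factor(s), n_conv, n_feat)
        for s, n_conv, n_feat in ENCODER
    ]


if __name__ == "__main__":
    for scale, size, factor, n_conv, n_feat in describe_encoder():
        print(f"scale {scale}: {size}x{size}, {n_conv} convs with {n_feat} "
              f"features, upsample x{factor} before concatenation")
```

How the upsampled features are combined before the 1×1 prediction layers (for example, whether each scale is first projected to a common width) is not specified in the table, so the sketch deliberately stops at the shapes.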
Fig. 2. Illustration of the Dice metric and contour distance metrics. A and B are two sets representing automated segmentation and manual segmentation. The Dice metric calculates the ratio of the intersection |A∩B| over the average area of the two sets (|A|+|B|)/2. The mean contour distance first calculates, for each point p on one contour, its distance to the other contour d(p,∂), then calculates the mean across all the points p. The Hausdorff distance calculates the maximum distance between the two contours
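The three metrics described in the caption can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the evaluation code used in the paper: contours are assumed to be given as arrays of point coordinates (contour extraction is not shown), both distance metrics are symmetrised over the two contours (the caption describes one direction only), and `spacing_mm` is a hypothetical parameter for converting pixel units to millimetres.

```python
import numpy as np


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice = |A ∩ B| / ((|A| + |B|) / 2) for two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(a, b).sum() / total)


def _nearest_distances(p: np.ndarray, q: np.ndarray) -> np.ndarray:
    """For each point in p (N x 2), the distance to its nearest point in q (M x 2)."""
    diff = p[:, None, :] - q[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def mean_contour_distance(ca: np.ndarray, cb: np.ndarray,
                          spacing_mm: float = 1.0) -> float:
    """Mean point-to-contour distance, averaged over both directions (assumption)."""
    d_ab = _nearest_distances(ca, cb)
    d_ba = _nearest_distances(cb, ca)
    return float(0.5 * (d_ab.mean() + d_ba.mean()) * spacing_mm)


def hausdorff_distance(ca: np.ndarray, cb: np.ndarray,
                       spacing_mm: float = 1.0) -> float:
    """Maximum point-to-contour distance over both directions."""
    d_ab = _nearest_distances(ca, cb)
    d_ba = _nearest_distances(cb, ca)
    return float(max(d_ab.max(), d_ba.max()) * spacing_mm)
```

The brute-force pairwise distance matrix is chosen for clarity; for long contours a spatial index would be cheaper.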
Fig. 3. Illustration of the segmentation results for short-axis and long-axis images. The top row shows the automated segmentation, whereas the bottom row shows the manual segmentation. The automated method segments all the time frames. However, only end-diastolic (ED) and end-systolic (ES) frames are shown, as manual analysis only annotates ED and ES frames. The cardiac chambers are represented by different colours. (a) Short-axis. (b) Long-axis (2-chamber view). (c) Long-axis (4-chamber view)
The Dice metric, mean contour distance (MCD) and Hausdorff distance (HD) between automated segmentation and manual segmentation for short-axis images
| | Dice | MCD (mm) | HD (mm) |
|---|---|---|---|
| (a) The full test set ( | | | |
| LV cavity | 0.94 (0.04) | 1.04 (0.35) | 3.16 (0.98) |
| LV myocardium | 0.88 (0.03) | 1.14 (0.40) | 3.92 (1.37) |
| RV cavity | 0.90 (0.05) | 1.78 (0.70) | 7.25 (2.70) |
| (b) Cases with CVDs ( | | | |
| LV cavity | 0.94 (0.04) | 1.19 (0.41) | 3.62 (1.14) |
| LV myocardium | 0.87 (0.04) | 1.23 (0.40) | 4.28 (1.18) |
| RV cavity | 0.90 (0.04) | 2.02 (0.88) | 8.19 (2.94) |
The mean and standard deviation (in parentheses) are reported
CVD: cardiovascular diseases, LV: left ventricle, RV: right ventricle
Qualitative visual assessment of automated segmentation
| Analyst | Slice | Agreement (%) | Disagreement: auto. better (%) | Disagreement: man. better (%) | Disagreement: not sure (%) |
|---|---|---|---|---|---|
| Analyst 1 | Basal | 40.0 | 26.2 | 20.6 | 13.2 |
| | Mid-ventricular | 84.8 | 12.2 | 2.4 | 0.6 |
| | Apical | 44.0 | 29.0 | 22.0 | 5.0 |
| Analyst 2 | Basal | 33.0 | 27.4 | 17.4 | 22.2 |
| | Mid-ventricular | 91.6 | 6.6 | 1.8 | 0.0 |
| | Apical | 80.8 | 8.8 | 9.6 | 0.8 |
Two experienced image analysts visually compared automated segmentation to manual segmentation for 250 test subjects and assessed whether the two segmentations achieved good agreement (visually close to each other) or not. Where the two disagreed, the analysts scored the case in one of three categories: automated segmentation performs better; manual segmentation performs better; not sure which one is better. The visual assessment was performed for basal, mid-ventricular and apical slices. The percentage of each score category is reported
The difference in clinical measures between automated segmentation and manual segmentation, as well as between measurements by different human observers
| | Auto vs Manual | O1 vs O2 | O2 vs O3 | O3 vs O1 |
|---|---|---|---|---|
| | ( | ( | ( | ( |
| (a) Absolute difference | | | | |
| LVEDV (mL) | 6.1 (5.3) | 6.1 (4.4) | 8.8 (4.8) | 4.8 (3.1) |
| LVESV (mL) | 5.3 (4.9) | 4.1 (4.2) | 6.7 (4.2) | 7.1 (3.8) |
| LVM (g) | 6.9 (5.5) | 4.2 (3.2) | 6.6 (4.9) | 6.5 (4.8) |
| RVEDV (mL) | 8.5 (7.1) | 11.1 (7.2) | 6.2 (4.6) | 8.7 (5.8) |
| RVESV (mL) | 7.2 (6.8) | 15.6 (7.8) | 6.6 (5.5) | 11.7 (6.9) |
| (b) Relative difference | | | | |
| LVEDV (%) | 4.1 (3.5) | 4.2 (3.1) | 6.3 (3.3) | 3.4 (2.2) |
| LVESV (%) | 9.5 (9.5) | 6.8 (7.5) | 12.5 (8.5) | 11.7 (5.1) |
| LVM (%) | 8.3 (7.6) | 4.4 (3.3) | 6.0 (3.7) | 6.7 (4.6) |
| RVEDV (%) | 5.6 (4.6) | 8.0 (5.0) | 4.2 (3.1) | 5.7 (3.6) |
| RVESV (%) | 11.8 (12.2) | 30.6 (15.5) | 10.9 (8.3) | 16.9 (9.2) |
The first column shows the difference between automated and manual segmentations on a test set of 600 subjects. The second to fourth columns show the inter-observer variability, which is evaluated on a randomly selected set of 50 subjects, each being analysed by three different human observers (O1, O2, O3) independently. The mean and standard deviation (in parentheses) of the absolute difference and relative difference are reported
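A summary of this kind (mean and standard deviation of the per-subject absolute and relative differences) could be computed as sketched below. The denominator of the relative difference is not stated in this record; the sketch assumes the mean of the two measurements, and it also assumes the sample standard deviation (`ddof=1`). The function name `difference_summary` is hypothetical.

```python
import numpy as np


def difference_summary(x, y):
    """Mean and SD of absolute and relative differences between paired measurements.

    Assumptions (not confirmed by the source): the relative difference is taken
    with respect to the mean of the two measurements, and the standard deviation
    is the sample standard deviation (ddof=1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    abs_diff = np.abs(x - y)
    rel_diff = 100.0 * abs_diff / ((x + y) / 2.0)
    return {
        "abs_mean": float(abs_diff.mean()),
        "abs_sd": float(abs_diff.std(ddof=1)),
        "rel_mean": float(rel_diff.mean()),
        "rel_sd": float(rel_diff.std(ddof=1)),
    }


if __name__ == "__main__":
    # Made-up volumes in mL for two subjects, for illustration only.
    auto = [100.0, 200.0]
    manual = [110.0, 190.0]
    print(difference_summary(auto, manual))
```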
Fig. 4. Bland-Altman plots of clinical measures between automated measurement and manual measurement, as well as between measurements by different human observers. The first column shows the agreement between automated and manual measurements on a test set of 600 subjects. The second to fourth columns show the inter-observer variability evaluated on the randomly selected set of 50 subjects. In each Bland-Altman plot, the x-axis denotes the average of two measurements and the y-axis denotes the difference between them. The dark dashed line denotes the mean difference (bias) and the two light dashed lines denote ± 1.96 standard deviations from the mean
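The quantities plotted in a Bland-Altman plot follow directly from the caption: per-subject averages and differences, the bias, and the limits at ± 1.96 standard deviations. A minimal NumPy sketch is given below; the function name `bland_altman` and the use of the sample standard deviation (`ddof=1`) are assumptions, and plotting is left out.

```python
import numpy as np


def bland_altman(m1, m2):
    """Return the points and reference lines of a Bland-Altman plot.

    x-axis: average of the two measurements; y-axis: their difference.
    The bias is the mean difference; the limits are bias ± 1.96 SD.
    Assumption: sample standard deviation (ddof=1).
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    average = (m1 + m2) / 2.0
    difference = m1 - m2
    bias = float(difference.mean())
    sd = float(difference.std(ddof=1))
    return {
        "average": average,
        "difference": difference,
        "bias": bias,
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
    }


if __name__ == "__main__":
    # Made-up paired measurements, for illustration only.
    result = bland_altman([10.0, 12.0, 14.0], [9.0, 12.0, 15.0])
    print(result["bias"], result["lower_limit"], result["upper_limit"])
```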
The Dice metric and contour distance metrics between automated segmentation and manual segmentation for long-axis images, as well as between segmentations by different human observers
| | Auto vs Manual | O1 vs O2 | O2 vs O3 | O3 vs O1 |
|---|---|---|---|---|
| | ( | ( | ( | ( |
| (a) Dice metric | | | | |
| LA cavity (2Ch) | 0.93 (0.05) | 0.92 (0.02) | 0.90 (0.04) | 0.90 (0.04) |
| LA cavity (4Ch) | 0.95 (0.02) | 0.95 (0.03) | 0.94 (0.02) | 0.94 (0.03) |
| RA cavity (4Ch) | 0.96 (0.02) | 0.95 (0.02) | 0.95 (0.02) | 0.95 (0.02) |
| (b) Mean contour distance (mm) | | | | |
| LA cavity (2Ch) | 1.46 (1.06) | 1.57 (0.39) | 1.94 (0.68) | 1.95 (0.57) |
| LA cavity (4Ch) | 1.04 (0.38) | 1.08 (0.40) | 1.21 (0.33) | 1.23 (0.35) |
| RA cavity (4Ch) | 0.99 (0.43) | 1.13 (0.35) | 1.22 (0.37) | 1.16 (0.37) |
| (c) Hausdorff distance (mm) | | | | |
| LA cavity (2Ch) | 5.76 (5.85) | 5.66 (1.97) | 7.16 (3.12) | 6.78 (2.53) |
| LA cavity (4Ch) | 4.03 (2.26) | 3.89 (1.85) | 4.29 (1.97) | 4.06 (1.44) |
| RA cavity (4Ch) | 3.89 (2.39) | 4.31 (2.20) | 4.20 (2.16) | 4.08 (2.06) |
The first column shows the difference between automated and manual segmentations on a test set of 600 subjects. The second to fourth columns show the inter-observer variability, which is evaluated on a randomly selected set of 50 subjects, each being analysed by three different human observers (O1, O2, O3) independently. The mean and standard deviation (in parentheses) of the metrics are reported
An exemplar study of cardiac function on large-scale datasets using automatically derived clinical measures
| | Normal | Obese | p-value |
|---|---|---|---|
| | ( | ( | |
| LVEDV (mL) | 143 (31) | 158 (34) | < 0.001 |
| LVESV (mL) | 60 (19) | 67 (20) | < 0.001 |
| LVM (g) | 85 (20) | 103 (26) | < 0.001 |
| RVEDV (mL) | 152 (36) | 167 (38) | < 0.001 |
| RVESV (mL) | 67 (20) | 75 (22) | < 0.001 |
The study compares the normal weight group (18.5 ≤ BMI < 25) to the obese group (BMI ≥ 30). The mean and standard deviation (in parentheses) are reported
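The group definition above can be expressed as a small helper. This is a sketch for illustration only: the function names are hypothetical, BMI is computed with the standard formula (weight in kg divided by height in metres squared), and subjects falling outside both ranges are simply returned as `None` because they belong to neither group in this comparison.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / (height_m ** 2)


def weight_group(bmi_value: float) -> Optional[str]:
    """Assign a subject to one of the two groups compared in the table.

    Returns "normal" for 18.5 <= BMI < 25, "obese" for BMI >= 30,
    and None for subjects in neither group.
    """
    if 18.5 <= bmi_value < 25.0:
        return "normal"
    if bmi_value >= 30.0:
        return "obese"
    return None


if __name__ == "__main__":
    # Made-up subject, for illustration only.
    print(weight_group(bmi(weight_kg=95.0, height_m=1.75)))
```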
BMI: body mass index, LVEDV: left ventricular end-diastolic volume, LVESV: left ventricular end-systolic volume, LVM: left ventricular mass, RVEDV: right ventricular end-diastolic volume, RVESV: right ventricular end-systolic volume
The difference in derived clinical measures between automated segmentation and manual segmentation, as well as between measurements by different human observers
| | Auto vs Manual | O1 vs O2 | O2 vs O3 | O3 vs O1 |
|---|---|---|---|---|
| | ( | ( | ( | ( |
| (a) Absolute difference | | | | |
| LVSV (mL) | 6.1 (5.6) | 6.6 (4.1) | 5.6 (4.1) | 4.2 (3.2) |
| LVEF (%) | 3.2 (2.9) | 3.1 (2.1) | 3.0 (2.4) | 3.8 (1.8) |
| LVCO (L/min) | 0.4 (0.3) | 0.4 (0.2) | 0.3 (0.2) | 0.3 (0.2) |
| RVSV (mL) | 8.1 (6.8) | 7.1 (5.5) | 5.3 (4.2) | 5.4 (4.8) |
| RVEF (%) | 4.3 (3.6) | 7.8 (4.4) | 3.7 (2.7) | 5.7 (3.9) |
| RVCO (L/min) | 0.5 (0.4) | 0.4 (0.3) | 0.3 (0.2) | 0.3 (0.3) |
| (b) Relative difference | | | | |
| LVSV (%) | 7.0 (5.8) | 7.4 (4.1) | 6.5 (4.8) | 4.8 (3.3) |
| LVEF (%) | 5.4 (4.8) | 5.1 (3.7) | 4.9 (3.8) | 6.6 (3.2) |
| LVCO (%) | 7.0 (5.8) | 7.4 (4.1) | 6.5 (4.8) | 4.8 (3.3) |
| RVSV (%) | 9.6 (8.3) | 8.1 (6.9) | 6.1 (4.4) | 7.1 (8.5) |
| RVEF (%) | 7.5 (6.2) | 12.3 (6.6) | 6.5 (5.0) | 10.7 (7.9) |
| RVCO (%) | 9.6 (8.3) | 8.1 (6.9) | 6.1 (4.4) | 7.1 (8.5) |
LVSV: left ventricular stroke volume, LVEF: left ventricular ejection fraction, LVCO: left ventricular cardiac output, RVSV: right ventricular stroke volume, RVEF: right ventricular ejection fraction, RVCO: right ventricular cardiac output
The first column shows the difference between automated and manual segmentations on a test set of 600 subjects. The second to fourth columns show the inter-observer variability, which is evaluated on a randomly selected set of 50 subjects, each being analysed by three different human observers (O1, O2, O3) independently. The mean and standard deviation (in parentheses) of the absolute difference and relative difference are reported
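The derived measures in this table follow from the ventricular volumes through the standard definitions: stroke volume SV = EDV − ESV, ejection fraction EF = SV / EDV × 100 and cardiac output CO = SV × heart rate. The sketch below uses hypothetical function names and a made-up heart rate. Because CO is SV scaled by the same heart rate for both measurements being compared, the relative differences of SV and CO would coincide, which is consistent with the identical LVSV and LVCO rows in panel (b).

```python
def stroke_volume(edv_ml: float, esv_ml: float) -> float:
    """Stroke volume in mL: end-diastolic volume minus end-systolic volume."""
    return edv_ml - esv_ml


def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction in percent: stroke volume relative to end-diastolic volume."""
    return 100.0 * stroke_volume(edv_ml, esv_ml) / edv_ml


def cardiac_output(edv_ml: float, esv_ml: float, heart_rate_bpm: float) -> float:
    """Cardiac output in L/min: stroke volume (mL) times heart rate (beats/min) / 1000."""
    return stroke_volume(edv_ml, esv_ml) * heart_rate_bpm / 1000.0


if __name__ == "__main__":
    # Illustrative inputs only; the heart rate of 60 bpm is a made-up value.
    edv, esv, hr = 143.0, 60.0, 60.0
    print(stroke_volume(edv, esv))       # 83.0 mL
    print(ejection_fraction(edv, esv))   # about 58 %
    print(cardiac_output(edv, esv, hr))  # 4.98 L/min
```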
Fig. 5. Segmentation results on other datasets. The first two cases come from the LVSC 2009 dataset, whereas the last two cases come from the ACDC 2017 dataset. The four cases are respectively of heart failure, LV hypertrophy, dilated cardiomyopathy and abnormal right ventricle. The top row shows the segmentation results by directly applying the UK Biobank-trained network to the LVSC and ACDC data. The bottom row shows the segmentation results after fine-tuning the network to the new data
Dice overlap metrics for segmentations on LVSC 2009 and ACDC 2017 datasets
| | LVSC 2009 | | ACDC 2017 | |
|---|---|---|---|---|
| | validation set ( | | training set split ( | |
| | w.o. fine-tune | w. fine-tune | w.o. fine-tune | w. fine-tune |
| LV cavity | 0.72 (0.22) | 0.90 (0.08) | 0.74 (0.29) | 0.94 (0.04) |
| LV myocardium | 0.56 (0.18) | 0.81 (0.05) | 0.65 (0.24) | 0.88 (0.05) |
| RV cavity | - | - | 0.60 (0.35) | 0.88 (0.08) |
The performance of the UK Biobank-trained network without fine-tuning and after fine-tuning is compared. The mean and standard deviation (in parentheses) are reported
The Dice metric and contour distance metrics between automated segmentation and manual segmentation for short-axis images, as well as between segmentations by different human observers
| | Auto vs Manual | O1 vs O2 | O2 vs O3 | O3 vs O1 |
|---|---|---|---|---|
| | ( | ( | ( | ( |
| (a) Dice metric | | | | |
| LV cavity | 0.94 (0.04) | 0.94 (0.04) | 0.92 (0.04) | 0.93 (0.04) |
| LV myocardium | 0.88 (0.03) | 0.88 (0.02) | 0.87 (0.03) | 0.88 (0.02) |
| RV cavity | 0.90 (0.05) | 0.87 (0.06) | 0.88 (0.05) | 0.89 (0.05) |
| (b) Mean contour distance (mm) | | | | |
| LV cavity | 1.04 (0.35) | 1.00 (0.25) | 1.30 (0.37) | 1.21 (0.48) |
| LV myocardium | 1.14 (0.40) | 1.16 (0.34) | 1.19 (0.25) | 1.21 (0.36) |
| RV cavity | 1.78 (0.70) | 2.00 (0.79) | 1.78 (0.45) | 1.87 (0.74) |
| (c) Hausdorff distance (mm) | | | | |
| LV cavity | 3.16 (0.98) | 2.84 (0.70) | 3.31 (0.90) | 3.25 (0.96) |
| LV myocardium | 3.92 (1.37) | 3.70 (1.16) | 3.82 (1.07) | 3.76 (1.21) |
| RV cavity | 7.25 (2.70) | 7.56 (2.51) | 7.35 (2.19) | 7.14 (2.20) |
The first column shows the difference between automated and manual segmentations on a test set of 600 subjects. The second to fourth columns show the inter-observer variability, which is evaluated on a randomly selected set of 50 subjects, each being analysed by three different human observers (O1, O2, O3) independently. The mean and standard deviation (in parentheses) of the metrics are reported
The Dice metric, mean contour distance (MCD) and Hausdorff distance (HD) between automated segmentation and manual segmentation for long-axis images
| | Dice | MCD (mm) | HD (mm) |
|---|---|---|---|
| LA cavity (2Ch) | 0.93 (0.05) | 1.46 (1.06) | 5.76 (5.85) |
| LA cavity (4Ch) | 0.95 (0.02) | 1.04 (0.38) | 4.03 (2.26) |
| RA cavity (4Ch) | 0.96 (0.02) | 0.99 (0.43) | 3.89 (2.39) |
The mean and standard deviation (in parentheses) are reported on a test set of 600 subjects
LA: left atrium, RA: right atrium