Jiacheng Li, Ruirui Li, Ruize Han, Song Wang.
Abstract
BACKGROUND: Retinal vessel segmentation benefits significantly from deep learning. Its performance relies on sufficient training images with accurate ground-truth segmentation, which is usually annotated manually in the form of binary pixel-wise label maps. Manually annotated ground-truth label maps inevitably contain errors for some of the pixels. Because retinal vessels are thin structures, such errors are more frequent and more serious in their manual annotations, which negatively affects deep learning performance.
Keywords: Label map correction; Noise-tolerant; Reliability estimation; Retina image segmentation; Temporal statistics
Year: 2022 PMID: 35022020 PMCID: PMC8753937 DOI: 10.1186/s12880-021-00732-y
Source DB: PubMed Journal: BMC Med Imaging ISSN: 1471-2342 Impact factor: 1.930
Fig. 1 An illustration of the proposed framework
Fig. 2 An illustration of the proposed training schedule
Fig. 3 Noisy label maps from different noise sources: (a) correct label map without noise, (b) label map with synthetic noise, (c) label map with pseudo-labeling noise, (d) label map with manual labeling noise
Comparative results of prediction on the testing set (%)
| Group | Method | DRIVE(R) F1 | DRIVE(R) PR | STARE(VK) F1 | STARE(VK) PR | CHASE F1 | CHASE PR |
|---|---|---|---|---|---|---|---|
| | Baseline | 73.2 | 76.0 | 75.7 | 77.7 | 81.9 | 82.8 |
| | Cas | 76.1 | 83.5 | 72.6 | 78.8 | 79.9 | 87.8 |
| | SF | 78.4 | 87.8 | 78.0 | 87.1 | 85.5 | 93.9 |
| | Ours | | | | | | |
| | Baseline | 70.2 | 72.8 | 72.3 | 74.2 | 77.2 | 78.3 |
| | Cas | 75.1 | 82.0 | 72.9 | 79.8 | 77.4 | 85.4 |
| | SF | 75.7 | 84.5 | 77.3 | 86.0 | 83.6 | 92.2 |
| | Ours | | | | | | |
| | Baseline | 67.2 | 69.6 | 69.4 | 71.3 | 73.8 | 74.9 |
| | Cas | 75.1 | 82.3 | 71.1 | 76.9 | 77.7 | 85.4 |
| | SF | 73.2 | 82.4 | 75.0 | 83.5 | 82.4 | 91.0 |
| | Ours | | | | | | |
| Pseudo | Baseline | 79.3 | 87.2 | 76.0 | 83.9 | / | / |
| | Cas | 75.7 | 82.8 | 74.0 | 81.3 | / | / |
| SF | 80.0 | 88.1 | / | / | |||
| Ours | 76.5 | 84.2 | / | / | |||
| Manual | Baseline | 78.9 | 79.9 | 72.2 | 76.0 | 76.3 | 78.2 |
| | Cas | 78.1 | 85.8 | 73.4 | 79.8 | 78.8 | 81.8 |
| | SF | 82.8 | 91.2 | 76.7 | 84.1 | 82.7 | 90.4 |
| | Ours | | | | | | |
Bold values denote the best performance in each group.
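This record does not define the F1 and PR columns. As a reading aid for the F1 values, the following is a minimal sketch that assumes the standard pixel-wise F1 (harmonic mean of precision and recall) between two binary label maps. The function name `f1_score` and the toy maps are hypothetical and are not taken from the paper; its exact evaluation protocol (masks, thresholds, averaging) may differ. PR is left out because its definition is not given here.

```python
from typing import Sequence


def f1_score(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Pixel-wise F1 between two binary label maps (1 = vessel, 0 = background).

    Assumption: the standard F1 definition; the paper's protocol may differ.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # vessel pixel correctly predicted
            elif p == 1 and t == 0:
                fp += 1  # background pixel predicted as vessel
            elif p == 0 and t == 1:
                fn += 1  # vessel pixel missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing overlaps
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 3x4 toy maps: one missed vessel pixel, one spurious vessel pixel.
truth = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0]]

score = f1_score(pred, truth)
print(f"F1 = {100 * score:.1f}%")  # reported as a percentage, like the tables
```

With thin structures such as vessels, a single mislabeled pixel already changes both precision and recall noticeably, which is why annotation errors weigh heavily on this metric.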
Comparative results of prediction on the testing set (%)
| Group | Method | DRIVE(R) F1 | DRIVE(R) PR | STARE(VK) F1 | STARE(VK) PR | CHASE F1 | CHASE PR |
|---|---|---|---|---|---|---|---|
| U-Net | 73.9 | 82.9 | 79.6 | 87.4 | 77.4 | |
| | Cas | 73.6 | 81.9 | 75.1 | 82.1 | 73.4 | 80.7 |
| | SF | 76.3 | 84.9 | 80.4 | 88.2 | 76.5 | 84.7 |
| Ours | 85.3 | ||||||
| | U-Net | 74.5 | 81.3 | 79.1 | 87.3 | 70.7 | 78.4 |
| | Cas | 73.4 | 82.0 | 74.4 | 82.0 | 73.1 | 80.5 |
| | SF | 73.7 | 82.4 | 80.0 | 87.6 | 75.3 | 82.9 |
| | Ours | | | | | | |
| | U-Net | 72.3 | 81.4 | 78.6 | 86.4 | 70.1 | 77.0 |
| | Cas | 72.2 | 80.2 | 69.7 | 76.1 | 69.9 | 76.6 |
| | SF | 72.0 | 80.3 | 76.9 | 85.3 | 74.4 | 81.5 |
| | Ours | | | | | | |
| Pseudo | U-Net | 78.1 | 86.5 | 80.1 | 88.3 | / | / |
| | Cas | 74.0 | 82.2 | 75.8 | 83.8 | / | / |
| | SF | 78.5 | 87.0 | 80.3 | 88.5 | / | / |
| | Ours | | | | | / | / |
| Manual | U-Net | 80.2 | 88.7 | 80.0 | 87.9 | 77.7 | 85.0 |
| | Cas | 76.5 | 85.1 | 73.8 | 81.1 | 73.9 | 81.0 |
| | SF | 80.9 | 89.5 | 81.3 | 88.7 | 79.4 | 87.2 |
| | Ours | | | | | | |
Bold values denote the best performance in each group.
Cross-validation on the DRIVE(R) and STARE(VK) datasets (%)
| Group | Method | DRIVE(R) F1 | DRIVE(R) PR | STARE(VK) F1 | STARE(VK) PR |
|---|---|---|---|---|---|
| | U-Net | 68.5 | 75.6 | 70.3 | 78.6 |
| | Cas | 36.0 | 36.0 | 70.8 | 74.8 |
| | SF | 65.0 | 71.2 | 72.2 | 80.2 |
| | Ours | | | | |
| | U-Net | 68.7 | 74.6 | 70.1 | 77.3 |
| | Cas | 47.0 | 48.5 | 72.4 | 79.8 |
| | SF | 66.7 | 72.6 | 70.9 | 78.3 |
| | Ours | | | | |
| | U-Net | 49.0 | 46.1 | 67.0 | 74.7 |
| Cas | 50.0 | 54.5 | 77.9 | |||
| | SF | 67.8 | 73.5 | 68.7 | 76.4 |
| | Ours | | | | |
| Pseudo | U-Net | 57.4 | 62.7 | 72.2 | 78.1 |
| | Cas | 49.6 | 52.2 | 76.9 | 83.5 |
| | SF | 70.9 | 76.9 | 74.7 | 81.1 |
| | Ours | | | | |
| Manual | U-Net | 55.3 | 59.7 | 70.6 | 77.3 |
| | Cas | 61.9 | 68.3 | 75.5 | 82.0 |
| | SF | 65.9 | 71.3 | 74.9 | 82.7 |
| | Ours | | | | |
Bold values denote the best performance in each group.
Fig. 4 Visual results of the corrected label maps on the training set for the comparison methods and the proposed method
Fig. 6 Full visual results of the corrected label maps on the training set for the comparison methods and the proposed method
Fig. 5 Curves of training loss, test loss, and label correction performance as the training epoch increases
Ablation study of the label correction task (%)
| Group | Method | DRIVE(R) F1 | DRIVE(R) PR | STARE(VK) F1 | STARE(VK) PR | CHASE F1 | CHASE PR |
|---|---|---|---|---|---|---|---|
| | w/o TML | 78.5 | 87.5 | 79.0 | 87.4 | 85.5 | 93.9 |
| | w | 79.3 | 87.7 | 78.9 | 87.2 | 85.5 | 93.8 |
| w | 88.1 | 79.5 | 87.7 | ||||
| Ours | 79.6 | 85.9 | 94.1 | ||||
| | w/o TML | 76.4 | 85.9 | 78.0 | 86.5 | 83.6 | 92.2 |
| | w | 76.7 | 85.7 | 78.3 | 86.7 | 83.8 | 92.4 |
| w | 77.1 | 84.7 | 77.9 | 87.0 | |||
| Ours | 84.3 | ||||||
| | w/o TML | 75.6 | 84.0 | 76.9 | 85.0 | 82.4 | 91.0 |
| | w | 74.4 | 82.9 | 76.0 | 84.4 | 81.3 | 89.0 |
| w | 84.7 | ||||||
| Ours | 77.0 | 77.6 | 86.4 | 83.1 | 91.4 | |
| Pseudo | w/o TML | 79.9 | 88.0 | 76.2 | 83.9 | / | / |
| | w | 80.0 | 88.0 | 76.3 | 84.0 | / | / |
| w | 80.1 | 87.9 | / | / | |||
| Ours | 76.5 | 84.2 | / | / | |||
| Manual | w/o TML | 82.6 | 90.5 | 76.2 | 83.4 | 82.7 | 90.6 |
| | w | 82.5 | 90.9 | 75.8 | 83.1 | 82.5 | 90.4 |
| w | 82.6 | 90.8 | 77.6 | 82.8 | 90.1 | ||
| Ours | 84.3 | ||||||
Bold values denote the best performance in each group.