Lucas D Eggert, Jens Sommer, Andreas Jansen, Tilo Kircher, Carsten Konrad.
Abstract
Automated gray matter segmentation of magnetic resonance imaging data is essential for morphometric analyses of the brain, particularly when large sample sizes are investigated. However, although detection of small structural brain differences may fundamentally depend on the method used, both accuracy and reliability of different automated segmentation algorithms have rarely been compared. Here, performance of the segmentation algorithms provided by SPM8, VBM8, FSL and FreeSurfer was quantified on simulated and real magnetic resonance imaging data. First, accuracy was assessed by comparing segmentations of twenty simulated and 18 real T1 images with corresponding ground truth images. Second, reliability was determined in ten T1 images from the same subject and in ten T1 images of different subjects scanned twice. Third, the impact of preprocessing steps on segmentation accuracy was investigated. VBM8 showed a very high accuracy and a very high reliability. FSL achieved the highest accuracy but demonstrated poor reliability, and FreeSurfer showed the lowest accuracy but high reliability. A universally valid recommendation on how to implement morphometric analyses is not warranted due to the vast number of scanning and analysis parameters. However, our analysis suggests that researchers can optimize their individual processing procedures with respect to final segmentation quality and exemplifies adequate performance criteria.
Year: 2012 PMID: 23028771 PMCID: PMC3445568 DOI: 10.1371/journal.pone.0045081
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Image quality parameters of the BrainWeb data set.
| Data set | SNR (white matter) | SNR (gray matter) | CNR |
| Image 4 | 59.13 | 44.48 | 14.65 |
| Image 5 | 59.08 | 46.74 | 12.34 |
| Image 6 | 56.80 | 45.72 | 11.08 |
| Image 18 | 50.97 | 40.80 | 10.17 |
| Image 20 | 54.75 | 43.67 | 11.08 |
| Image 38 | 56.09 | 45.97 | 10.12 |
| Image 41 | 50.28 | 39.95 | 10.33 |
| Image 42 | 50.72 | 40.65 | 10.07 |
| Image 43 | 50.26 | 39.62 | 10.64 |
| Image 44 | 53.17 | 42.86 | 10.31 |
| Image 45 | 54.71 | 43.78 | 10.93 |
| Image 46 | 52.02 | 41.09 | 10.93 |
| Image 47 | 50.01 | 40.10 | 9.91 |
| Image 48 | 53.00 | 42.32 | 10.68 |
| Image 49 | 53.06 | 41.23 | 11.83 |
| Image 50 | 53.26 | 41.89 | 11.37 |
| Image 51 | 49.86 | 38.38 | 11.48 |
| Image 52 | 53.33 | 42.95 | 10.38 |
| Image 53 | 47.13 | 37.35 | 9.78 |
| Image 54 | 53.23 | 41.33 | 11.90 |
Note. SNR = signal-to-noise ratio; CNR = contrast-to-noise ratio.
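The tabulated values are consistent with CNR being the difference between the white matter and gray matter SNR (for example, 59.13 − 44.48 = 14.65 for Image 4). This relation is inferred from the numbers and is not stated in the text. A minimal Python sketch of the check, using three rows copied from the table above; the function name `cnr_from_snr` is hypothetical:

```python
import math

# Three rows copied from the BrainWeb table: (label, SNR white matter, SNR gray matter, CNR).
rows = [
    ("Image 4", 59.13, 44.48, 14.65),
    ("Image 5", 59.08, 46.74, 12.34),
    ("Image 6", 56.80, 45.72, 11.08),
]


def cnr_from_snr(snr_wm: float, snr_gm: float) -> float:
    """Assumed relation (inferred from the table, not stated in the text):
    CNR = SNR(white matter) - SNR(gray matter)."""
    return snr_wm - snr_gm


# Compare the recomputed value with the tabulated one, allowing for two-decimal rounding.
checks = [
    math.isclose(cnr_from_snr(wm, gm), cnr, abs_tol=0.005)
    for _, wm, gm, cnr in rows
]
print(checks)
```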
Image quality parameters of the IBSR data set.
| Data set | SNR (white matter) | SNR (gray matter) | CNR |
| Image 1 | 47.27 | 36.28 | 10.99 |
| Image 2 | 116.98 | 94.92 | 22.06 |
| Image 3 | 40.01 | 27.01 | 13.00 |
| Image 4 | 23.48 | 15.86 | 7.62 |
| Image 5 | 113.07 | 80.23 | 32.84 |
| Image 6 | 100.85 | 68.55 | 32.30 |
| Image 7 | 69.17 | 34.42 | 34.75 |
| Image 8 | 75.83 | 43.87 | 31.96 |
| Image 9 | 83.92 | 48.52 | 35.40 |
| Image 10 | 51.07 | 25.02 | 26.05 |
| Image 11 | 106.32 | 58.95 | 47.37 |
| Image 12 | 57.38 | 42.79 | 14.59 |
| Image 13 | 39.13 | 27.52 | 11.61 |
| Image 14 | 61.94 | 40.74 | 21.20 |
| Image 15 | 103.84 | 70.01 | 33.83 |
| Image 16 | 111.24 | 71.60 | 39.64 |
| Image 17 | 40.41 | 26.67 | 13.74 |
| Image 18 | 39.43 | 26.81 | 12.62 |
Note. SNR = signal-to-noise ratio; CNR = contrast-to-noise ratio.
Image quality parameters of the Single Subject data set.
| Data set | SNR (white matter) | SNR (gray matter) | CNR |
| Image 1 | 132.49 | 86.01 | 46.48 |
| Image 2 | 137.74 | 85.82 | 51.92 |
| Image 3 | 128.06 | 78.71 | 49.35 |
| Image 4 | 141.60 | 89.15 | 52.45 |
| Image 5 | 125.76 | 78.81 | 46.95 |
| Image 6 | 143.16 | 83.24 | 59.92 |
| Image 7 | 143.88 | 84.14 | 59.74 |
| Image 8 | 144.12 | 83.93 | 60.19 |
| Image 9 | 131.53 | 81.86 | 49.67 |
| Image 10 | 142.81 | 64.14 | 78.67 |
Note. SNR = signal-to-noise ratio; CNR = contrast-to-noise ratio.
Image quality parameters of the OASIS data set.
| Data set | SNR (white matter) | SNR (gray matter) | CNR |
| Image 61/1 | 17.83 | 9.30 | 8.53 |
| Image 62/2 | 22.44 | 12.06 | 10.38 |
| Image 92/1 | 18.71 | 10.23 | 8.48 |
| Image 92/2 | 19.53 | 10.71 | 8.82 |
| Image 111/1 | 26.10 | 12.84 | 13.26 |
| Image 111/2 | 22.62 | 10.00 | 12.62 |
| Image 145/1 | 21.13 | 9.10 | 12.03 |
| Image 145/2 | 22.92 | 10.45 | 12.47 |
| Image 150/1 | 28.05 | 13.80 | 14.25 |
| Image 150/2 | 28.08 | 14.39 | 13.69 |
| Image 156/1 | 29.91 | 15.30 | 14.61 |
| Image 156/2 | 28.22 | 14.15 | 14.07 |
| Image 236/1 | 29.09 | 14.28 | 14.81 |
| Image 236/2 | 27.96 | 15.80 | 12.16 |
| Image 249/1 | 27.06 | 13.35 | 13.71 |
| Image 249/2 | 27.25 | 13.09 | 14.16 |
| Image 285/1 | 22.05 | 11.48 | 10.57 |
| Image 285/2 | 18.99 | 9.90 | 9.09 |
| Image 379/1 | 25.89 | 13.17 | 12.72 |
| Image 379/2 | 32.27 | 15.79 | 16.48 |
Note. SNR = signal-to-noise ratio; CNR = contrast-to-noise ratio.
Figure 1. Overview of the study design.
In total, we processed fifty data sets: (i) twenty simulated brains of the Simulated Brain Database with different anatomical models (“BrainWeb data set”), (ii) 18 different real subjects with corresponding expert segmentations (“IBSR data set”), (iii) ten T1-weighted scans of the same individual (“Single Subject data set”), and (iv) ten pairs of images of subjects who were scanned twice within a maximum of twelve days (“OASIS data set”). We constructed thirty segmentation pathways (2 × 3 × 5 combinations), each consisting of an intensity non-uniformity correction step (no intensity correction or N3), a skull-stripping step (no skull-stripping, BET, or WS), and the segmentation of gray matter (via Segment, New Segment, VBM8, FAST, or FreeSurfer). Of these thirty pathways, 23 were feasible and were evaluated in the analysis (infeasible pathways are represented with a dot as end marker). To determine the accuracy of the segmentation pathways, we calculated the Dice coefficient between the gray matter maps and the corresponding ground truth images for the BrainWeb and IBSR data sets. We tested the reliability of the segmentation pathways by (i) determining the variability of gray matter volume, in terms of standard deviation and coefficient of variation, on the Single Subject data set, and (ii) calculating the test-retest reliability of gray matter volume on the OASIS data set.
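The Dice coefficient used as the accuracy measure can be sketched as follows. This is a minimal illustration of the standard definition, 2|A ∩ B| / (|A| + |B|), and not the authors' implementation. It assumes both gray matter maps are already binary masks of equal shape; how probabilistic tissue maps were binarized is not described here. The function name and the toy arrays are hypothetical.

```python
import numpy as np


def dice_coefficient(segmentation, ground_truth) -> float:
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    seg = np.asarray(segmentation, dtype=bool)
    ref = np.asarray(ground_truth, dtype=bool)
    if seg.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = seg.sum() + ref.sum()
    if total == 0:
        # Both masks empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(seg, ref).sum() / total)


# Toy 2 x 3 masks (made-up values, not study data).
seg = np.array([[1, 1, 0],
                [0, 1, 0]])
ref = np.array([[1, 0, 0],
                [0, 1, 1]])
dice = dice_coefficient(seg, ref)  # 2 shared voxels, 3 voxels in each mask
print(round(dice, 4))
```

A value of 1 indicates identical masks and 0 indicates no overlap, so higher values correspond to higher segmentation accuracy.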
Figure 2. Overview of the results for the five segmentation algorithms in default mode.
Depicted is the mean Dice coefficient reached by each of the five standard segmentation algorithms in its default mode on the BrainWeb data set (Panel A) and on the IBSR data set (Panel B). Panels C and D summarize the average false positive rate fp and the average false negative rate fn for each segmentation algorithm on the BrainWeb data set (Panel C) and on the IBSR data set (Panel D). Panel E shows the coefficient of variation of the gray matter volumes detected in the Single Subject data set. Panel F depicts the test-retest reliability of each segmentation algorithm, determined on the OASIS data set.
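The coefficient of variation shown in Panel E is the standard deviation of the gray matter volumes divided by their mean. A minimal sketch under stated assumptions: the sample standard deviation (`ddof=1`) is used, which the text does not specify, and the volumes below are made-up illustrative numbers, not study results. The function name is hypothetical.

```python
import numpy as np


def coefficient_of_variation(volumes) -> float:
    """Standard deviation divided by the mean of repeated volume estimates.

    Assumption: sample standard deviation (ddof=1); the source does not
    say which estimator was used.
    """
    v = np.asarray(volumes, dtype=float)
    if v.size < 2:
        raise ValueError("need at least two volumes")
    return float(v.std(ddof=1) / v.mean())


# Hypothetical gray matter volumes (in ml) from repeated scans of one subject.
volumes_ml = [600.0, 610.0, 590.0, 605.0, 595.0]
cv = coefficient_of_variation(volumes_ml)
print(f"{100 * cv:.2f} %")
```

A lower coefficient of variation across repeated scans of the same subject indicates a more reliable segmentation.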