This paper describes the application of the alternating Kernel mixture (AKM) segmentation algorithm to high resolution MRI subvolumes acquired from a 1.5T scanner (hippocampus, n = 10 and prefrontal cortex, n = 9) and a 3T scanner (hippocampus, n = 10 and occipital lobe, n = 10). Segmentation of the subvolumes into cerebrospinal fluid, gray matter, and white matter tissue is validated by comparison with manual segmentation. When compared with other segmentation methods that use traditional Bayesian segmentation, AKM yields smaller errors (P < .005, exact Wilcoxon signed rank test) demonstrating the robustness and wide applicability of AKM across different structures. By generating multiple mixtures for each tissue compartment, AKM mimics the increased variation of manual segmentation in partial volumes due to the highly folded tissues. AKM's superior performance makes it useful for tissue segmentation of subcortical and cortical structures in large-scale neuroimaging studies.
This paper describes the application of the alternating Kernel mixture (AKM) segmentation algorithm to high resolution MRI subvolumes acquired from a 1.5T scanner (hippocampus, n = 10 and prefrontal cortex, n = 9) and a 3T scanner (hippocampus, n = 10 and occipital lobe, n = 10). Segmentation of the subvolumes into cerebrospinal fluid, gray matter, and white matter tissue is validated by comparison with manual segmentation. When compared with other segmentation methods that use traditional Bayesian segmentation, AKM yields smaller errors (P < .005, exact Wilcoxon signed rank test) demonstrating the robustness and wide applicability of AKM across different structures. By generating multiple mixtures for each tissue compartment, AKM mimics the increased variation of manual segmentation in partial volumes due to the highly folded tissues. AKM's superior performance makes it useful for tissue segmentation of subcortical and cortical structures in large-scale neuroimaging studies.
Current magnetic resonance image (MRI)
studies investigate abnormalities of cortical and subcortical structures in
neurodevelopmental and neurodegenerative disorders. These studies require a
delineation of a region of interest (ROI) by manual segmentation by an expert
rater. For example, studies on Alzheimer's disease and mild cognitive
impairment examine the hippocampus [1] while those in schizophrenia have
studied the occipital lobe and prefrontal cortex [2, 3]. Once the ROI is defined,
segmentation into tissue types such as gray matter (GM), white matter (WM), or
cerebrospinal fluid (CSF) can assess subtle volume changes caused by disease
[4, 5]. While manual segmentation would provide gold standard, it is labor
intensive limiting the number of subjects in any study [6]. Also, the rater
needs to be trained to ensure small inter- or intrarater variation. Therefore,
it is necessary to develop a method that allows for efficient processing of
large number of subjects with high inter- or intrarater reliability, thereby
increasing statistical power. Such a method will facilitate greater
understanding of shape change in networks of cortical structures implicated in
neuropsychiatric diseases [7, 8].A variety of methods have been proposed for the
segmentation of subcortical tissue such as the hippocampus [9-11] and cortical tissues
such as prefrontal cortex, cingulate cortex, and planum temporale [12-16]. However,
even though tissue classification methods have been improving in their
performance, relatively low accuracy (comparing with expert-rater standards)
has prevented accurate structural segmentation, for example, distinguishing the
hippocampus from surrounding structures in the medial temporal lobe.Partial volume voxels containing multiple tissue types
present challenges to traditional Bayesian tissue classification methods [17-24]
that model each tissue type as a fixed-bandwidth, single Gaussian in
mixture-of-Gaussian models. Priebe et al. [25] proposed an alternating Kernel mixture
(AKM) method which allowed for the flexibility of a Gaussian mixture model, with
bandwidth, and the number of Gaussians selected adaptively from the data for
each tissue type. The purpose of our study was to compare the performance of
AKM and traditional Bayesian methods. The two methods were compared by
determining which method was closer to the manual segmentation (ground truth) of
cortical and subcortical structures in MRI subvolumes acquired from 1.5T and 3T
scanners.The manuscript is organized as follows. Section 2 describes
the AKM mixture modeling methodology in detail and other Bayesian-based
segmentation methods. Section 3 describes the dataset being investigated and
image analysis employed. Section 4 reports the results.
2. METHOD
2.1. Alternating Kernel mixture method
Priebe and Marchette [26] and James et al. [27] introduced
a semiparametric solution to the problem of estimating the common probability
density function for multiple identically distributed random variables. Their solution is an iterative one that
combines parametric and nonparametric estimates with a resulting model that
incorporates both the complexity and the smoothness of the data.We applied this
method to the problem of MR segmentation.
Gaussian mixture modeling is a popular segmentation technique. The marginal probability density function for
the observations is where C:= {C, G, W} is the set of tissue
types (CSF, GM, and WM), f are the class-conditional marginals, and π are class-conditional mixing
coefficients. These coefficients are
nonnegative and sum to unity. Thus, the
image is the sum of the three tissue types.
Each class-conditional marginal is a mixture of normals given by where π are the strictly positive, class-specific
mixing coefficients, which sum to one, and φ are the Gaussian probabilities with a mean of μ and a variance of σ2. Combining these equations we see that the
marginals are given byThe method estimates the class-conditional mixture
complexities k, the mixing coefficients π,
and the mixture components φ.
The Expectation-Maximization (EM) algorithm is used to estimate the means and
variances of the components [18, 19]. The mixture complexities are estimated
from the data.The method alternates between parametric finite
mixture estimates and nonparametric Kernel estimates. Each estimate is based on the previous one of
the opposite type. The first step of the
algorithm is to find a parametric estimate and a nonparametric estimate of the
data. Then, at each iteration, a
parametric estimate that minimizes the distance between the two previous
estimates is computed. Using the
parameter estimates thus derived, a nonparametric estimate is found. This continues until the distance between two
consecutive parametric estimates is smaller than a desired constant.The filtered Kernel
estimate (i.e., the nonparametric estimate), with bandwidth b, is where X = {X1,…, X} is the subject's MR voxel observation, σ2 is the variance of the tth component of the mixture, and φ0 is the standard normal with zero mean and unit
variance. The nonparametric estimates
are each based on the parametric estimate from the previous iteration and are
given by where F is the class of k-component Gaussian mixtures, and is
the integrated squared error.To
actually classify voxels, the Bayes plug-in classifier is used: where x is the voxel to be labeled. The label is assigned to a class based on
which one maximizes posterior probability of class membership. This can also be seen as a likelihood ratio
test procedure given by Tissues are then classified according to Table 1.
Table 1
Voxel classification based on likelihood ratio test.
Case
Classification
r1(x) > 1
Voxel labeled C
r2(x) < 1
Voxel labeled W
r1(x) < 1 and r2(x) > 1
Voxel labeled G
r1(x) > 1 and r2(x) < 1
Should not occur
Tie
Determined arbitrarily
This method results in the voxels being classified into
three categories. Priebe et al. [25]
showed how a training set could be used to determine the number of components
for each tissue. However, the focus of this paper is on how this could be done
on a case-by-case basis using visual inspection. It was found that two or three
components of CSF, GM, and WM produced the best result; in a couple of cases
the complexity was better modeled with four components.
2.2. Bayesian segmentation
For comparison, voxels are classified into three tissue
types by Bayesian segmentation: where I is the image intensity, h is the anatomical label, μ is the mean, and σ2 is the variance of the Gaussian density. The algorithm is where π(h) is the prior distribution that represents the
relative amount of each of the tissue types and H := {C, G, W}. As with AKM,
the EM algorithm is used to estimate the means and variances of the three tissues
[18, 19].
2.3. Neyman-Pearson recalibration
Bayesian
segmentation can be extended to two additional classes for C/G and G/W partial
volumes which are optimally determined by [28] at
each voxel. Here, the four thresholds (θ1,…, θ4) are determined by the five
Gaussians. Thresholds are selected to
minimize the misclassification error (Section 3.5) such that θ = θ1 + t(θ2 − θ1) and θ = θ3 + t(θ4 − θ3), where t ∈ [0, 1] and t ∈ [0, 1].
The means then are used to recalibrate the segmentations yielding new
thresholds. This is referred to as Neyman-Pearson recalibration.
3. VALIDATION
3.1. Data acquisition
Four different sets of ROIs were extracted from subjects
scanned via the magnetization prepared rapid gradient echo sequence
on different scanners. Two came from a 3T scanner (10 hippocampi [29] and 5
pairs of left and right occipital lobes [30]); two came from a 1.5T scanner (10
hippocampi [31] and 9 prefrontal cortices [31]). Processed datasets were
reformatted to 8 bit and interpolated to 1 × 1 × 1 mm3 isotropic voxels except for the prefrontal set
with a resolution of 0.5 × 0.5 × 0.5 mm3, and are available at http://www.cis.jhu.edu/data.sets/index.html.
3.2. MRI subvolumes
To obtain a smaller ROI around a hippocampus, we manually
outlined hippocampus and dilated it by 3 × 3 × 3 mm3 cubes with three
iterations to generate a mask via BLOX (http://sourceforge.net/projects/blox).
Figure 1 shows an example of the mask generated for a left hippocampus. The
prefrontal cortex [31] and occipital lobe [30] subvolumes were defined by an
expert neuroanatomist.
Figure 1
Hippocampus ROI mask delineated in red.
3.3. Manual tissue segmentation
The 39 subvolumes were hand segmented into CSF, GM, and WM tissue
compartments by three different raters in independent studies (e.g., [30, 31]) and
blind to the autosegmentation. Segmentation was done by visual inspection on
contiguous sagittal slices on Analyze software [32] and saved as Analyze image data
with labels for CSF, GM, and WM.
3.4. Automated tissue segmentation
AKM and Bayesian segmentation were applied to the 39 subvolumes.
For comparison, FreeSurfer [33] and BrainVoyager [34] were used to segment the
hippocampi and occipital lobes, respectively. Neyman-Pearson segmentation was
applied to prefrontal cortex. The EM algorithm [18] ensured that computations
were done in real time.The 10 hippocampus subvolumes from the 1.5T scanner were
processed by FreeSurfer [33] to segment and label the volume by its anatomical
structure. Each voxel was classified by a given anatomical label (i.e.,
hippocampus, ventricles). Then we group the structures into WM, GM, and CSF to
create WM, GM, and CSF masks. Lateral ventricle and left inferior lateral
ventricle were categorized as CSF. Cerebellum-exterior, hippocampus and amygdala
were grouped as GM and cerebral white matter, thalamus proper, putamen, ventral
diencephalon, and WM hypointensities were grouped as WM.
3.5. Quantification of segmentation accuracy
Segmentations were compared via the L1 distance between two distributions as a measure of
misclassification error. A cost is assigned to each labeled voxel. If it was labeled correctly, that cost is 0,
and if labeled incorrectly, that cost is generally 1. This cost, called the L1 distance, is where p(h ∣ I) is the posteriori probability of hypothesis h at voxel n for the automated, p(h ∣ I) is the same for the manual segmentation, and m is the number of tissue types [18, 19, 28, 35].
The distance measures agreement between segmentations based on distance between
probability distributions [36]. The standard overlap measures penalize small
objects assuming that most of the error occurs at the boundary of objects thus L1 distance is more
appropriate for assessing 3D segmentation [37]. Another standard measure, the
Dice measure, was also used [38].
4. RESULTS
Tables 2, 3, 4, and 5 and Figures 2, 3, 4, and 5 show that L1 distances for AKM method are
lower than those for Bayesian and other segmentation methods (P < .005, Exact Wilcoxon signed rank
test). Lower L1
distances mean that AKM segmentation have more overlap with manual segmentation
than other methods. Dice measures for AKM were consistently smaller than other
methods.
Table 2
Classification error for Bayesian and AKM for hippocampi (3T).
Bayesian
AKM
1
0.106
0.099
2
0.124
0.107
3
0.234
0.183
4
0.133
0.120
5
0.283
0.114
6
0.186
0.103
7
0.153
0.131
8
0.133
0.126
9
0.273
0.172
10
0.163
0.153
Table 3
Classification error for Bayesian, AKM, and Neyman-Pearson for prefrontal cortices (1.5T).
Bayesian
Neyman-Pearson
AKM
1
0.138
0.144
0.093
2
0.103
0.103
0.101
3
0.087
0.088
0.081
4
0.091
0.097
0.081
5
0.135
0.135
0.097
6
0.093
0.098
0.088
7
0.095
0.103
0.093
8
0.127
0.127
0.106
9
0.091
0.091
0.085
Table 4
Classification error for Bayesian, BrainVoyager, and AKM for occipital lobes (3T).
Bayesian
BrainVoyager
AKM
1
0.170
0.199
0.119
2
0.149
0.202
0.093
3
0.243
0.221
0.099
4
0.210
0.202
0.096
5
0.236
0.244
0.112
6
0.224
0.237
0.128
7
0.165
0.111
0.099
8
0.224
0.117
0.104
9
0.157
0.248
0.119
10
0.146
0.237
0.121
Table 5
Classification error for Bayesian, FreeSurfer and AKM for ten hippocampi (1.5T).
Bayesian
FreeSurfer
AKM
1
0.121
0.145
0.113
2
0.162
0.225
0.161
3
0.110
0.144
0.096
4
0.131
0.178
0.109
5
0.175
0.191
0.146
6
0.129
0.169
0.119
7
0.121
0.190
0.121
8
0.121
0.165
0.121
9
0.155
0.196
0.139
10
0.128
0.143
0.120
Figure 2
Classification error for Bayesian (blue) and
AKM (red) for hippocampi (3T).
Figure 3
Classification error for Bayesian (blue), Neyman-Pearson
(green), and AKM (red) for prefrontal cortices (1.5T).
Figure 4
Classification error for Bayesian (blue),
BrainVoyager (green), and AKM (red) for occipital lobes (3T).
Figure 5
Classification error for Bayesian (blue),
FreeSurfer (green), and AKM (red) for ten hippocampi (1.5T).
Figures 6 and 7 explain the
reason for low classification errors of AKM. Green, red, and blue curves show the
intensity profile of voxels labeled as CSF, GM, and WM, respectively. The
figures show how intensity histograms for manual segmentation voxels are
similar to those for AKM segmentation. Vertical lines are threshold intensity
values calculated from AKM method. Manual segmentation histograms show that
each tissue type has wide range of intensities thus resulting in large overlaps
between tissue types due to partial volume problems where the boundaries
between tissue types are not obvious. Figure 7 shows how the large tails for
each tissue types is captured by AKM yielding more accurate threshold values compared
with the single Gaussian approach. Further, Table 3 and Figure 3 shows that AKM
models the partial volume better than Neyman-Pearson; note that Neyman-Pearson
yielded larger errors than Bayesian since the recalibration was based on the averaged
thresholds. Finally, Figures 8, 9, and 10 show views of
the AKM segmentation of hippocampus, prefrontal cortex, and occipital lobe
subvolumes.
Figure 6
Intensity histogram for hippocampus:
hand (a) and AKM (b) segmentation with vertical lines from AKM threshold. Green, blue, and red correspond to CSF, GM, and WM segmented
voxels, respectively.
Figure 7
Intensity histogram for occipital lobe: hand- (a), AKM (b), and Bayesian
segmentation (c) with vertical lines from AKM ((a) and (b)) and Bayesian
(c).
Figure 8
Sagittal view of left hippocampus. (a) MRI. (b) AKM
segmentation (blue-GM, white-WM, red-CSF).
Axial view of left occipital lobe. (a)
MRI. (b) AKM segmentation (blue-GM, white-WM, red-CSF).
5. CONCLUSION
This
paper describes an algorithm that models each tissue type in brain MRI subvolumes as a semiparametric mixture of
Gaussians. The classification method which uses this algorithm results in
better segmentation than a traditional, single-component Bayesian method
especially when there is not enough CSF or WM in the subvolume. Human raters
are good at segmenting partial volume voxels by adapting to the high variance
of intensities in these regions. AKM is also able to adaptively select the
bandwidth and the number of Gaussians for each tissue type. Thus, AKM
approximated the manual segmentation more closely compared to Bayesian methods.
AKM can automatically delineate cortical and subcortical structures which can
be distinguished by intensity information. However, there are structures that
cannot be segmented by intensity alone. For example, anterior boundary of the
hippocampus merges with the amygdala which has similar intensity [39] or the anatomical boundary of prefrontal
cortex and occipital lobe has to be
defined with spatial information. For these structures, AKM may be
useful when combined with mapping and image registration approach. Also, AKM
can be applied to other imaging modalities of other anatomical structures, such
as segmenting myocardium, blood, and bone in a cardiac CT scan.
Authors: Bruce Fischl; David H Salat; Evelina Busa; Marilyn Albert; Megan Dieterich; Christian Haselgrove; Andre van der Kouwe; Ron Killiany; David Kennedy; Shuna Klaveness; Albert Montillo; Nikos Makris; Bruce Rosen; Anders M Dale Journal: Neuron Date: 2002-01-31 Impact factor: 17.173
Authors: Y Hirayasu; S Tanaka; M E Shenton; D F Salisbury; M A DeSantis; J J Levitt; C Wible; D Yurgelun-Todd; R Kikinis; F A Jolesz; R W McCarley Journal: Cereb Cortex Date: 2001-04 Impact factor: 5.357
Authors: John G Csernansky; Lei Wang; Donald Jones; Devna Rastogi-Cruz; Joel A Posener; Gitry Heydebrand; J Philip Miller; Michael I Miller Journal: Am J Psychiatry Date: 2002-12 Impact factor: 18.112
Authors: Toshiaki Onitsuka; Robert W McCarley; Noriomi Kuroki; Chandlee C Dickey; Marek Kubicki; Susan S Demeo; Melissa Frumin; Ron Kikinis; Ferenc A Jolesz; Martha E Shenton Journal: Schizophr Res Date: 2007-03-09 Impact factor: 4.939
Authors: M I Miller; M Hosakere; A R Barker; C E Priebe; N Lee; J T Ratnanather; L Wang; M Gado; J C Morris; J G Csernansky Journal: Proc Natl Acad Sci U S A Date: 2003-12-01 Impact factor: 11.205
Authors: J Tilak Ratnanather; Clare B Poynton; Dominic V Pisano; Britni Crocker; Elizabeth Postell; Shannon Cebron; Elvan Ceyhan; Nancy A Honeycutt; Pamela B Mahon; Patrick E Barta Journal: Schizophr Res Date: 2013-09-06 Impact factor: 4.939
Authors: Michael P Harms; Lei Wang; Carolina Campanella; Kristina Aldridge; Amanda J Moffitt; John Kuelper; J Tilak Ratnanather; Michael I Miller; Deanna M Barch; John G Csernansky Journal: Br J Psychiatry Date: 2010-02 Impact factor: 9.319
Authors: Meghana S Karnik-Henry; Lei Wang; Deanna M Barch; Michael P Harms; Carolina Campanella; John G Csernansky Journal: Schizophr Res Date: 2012-04-26 Impact factor: 4.939