Literature DB >> 24783191

A gaussian mixture model approach for estimating and comparing the shapes of distributions of neuroimaging data: diffusion-measured aging effects in brain white matter.

Namhee Kim¹, Moonseong Heo², Roman Fleysher¹, Craig A Branch³, Michael L Lipton⁴.

Abstract

Neuroimaging signal intensity measures underlying physiology at each voxel unit. The brain-wide distribution of signal intensities may be used to assess gross brain abnormality. To compare distributions of brain image data between groups, t-tests are widely applied. This approach, however, only compares group means and fails to consider the shapes of the distributions. We propose a simple approach for estimating both subject- and group-level density functions based on the framework of Gaussian mixture modeling, with mixture probabilities that are testable between groups. We demonstrate this approach by application to the analysis of fractional anisotropy image data for assessment of aging effects in white matter.

Entities: Chemical Disease Gene Species

Keywords: Gaussian mixture model; aging effects; density function estimation; diffusion tensor imaging; fractional anisotropy

Year: 2014 PMID： 24783191 PMCID： PMC3995036 DOI： 10.3389/fpubh.2014.00032

Source DB: PubMed Journal: Front Public Health ISSN： 2296-2565

Introduction

Fractional anisotropy (FA), a scalar measure derived from diffusion tensor imaging (DTI), indexes the degree of anisotropy of water diffusion in brain tissue. In normal white matter (WM), water diffusion is highly constrained to predominantly move parallel to the long axis of axon bundles, or nerve fibers. High FA is thus characteristic of normal WM and may be an indicator of its health. Low FA, on the other hand, can reflect loss of WM microstructural elements in tissues, which normally have high FA and may be an indicator of disease. Low WM FA is associated with demyelinating disease, dementia, traumatic brain injury (TBI), and normal aging. Inter-subject variability of WM microstructure has been reported with normal subjects as well as patients with various neurodegenerative diseases (1–6). Neurodegeneration is also a feature of normal aging and WM effects of aging disrupt cerebral connectivity, leading to cognitive dysfunction (7, 8). Commonly applied statistical analyses, e.g., t-tests at each voxel, may be inherently insensitive to disease pathology due to inter-subject spatial variation (2–4, 9). A whole brain (or whole WM) histogram approach has been used in some studies to address this limitation (10, 11). For example, Benson et al. (10) estimated kurtosis, skewness, peak height, and mean from histograms of WM FA in TBI patients and normals. They aggregated these measures to test for group differences in the shapes of the individuals’ histograms. This analysis demonstrated that the FA distribution of TBI patients exhibited higher kurtosis, higher peak height, and greater skew toward higher FA, but lower mean, compared to those derived from controls. Although this approach used standard statistical summary measures, differences among the shapes of distributions between groups are not easily understood using these summary measures. In addition, these summary measures are not proper for description of multimodal distributions, a special case, which could result from a mixture of multiple distributions (12). Therefore, we propose that estimation and comparison of density functions between groups based on a mixture distribution approach will be more relevant to brain imaging data, which can exhibit unusual distributions. In this study, we propose a simple approach to estimate subject-level density functions. The technique is based on a Gaussian mixture model (GMM), which assigns subject-specific mixing probabilities to latent underlying Gaussian densities a posteriori to characterize an overall distribution of each subject. Estimation of group-level density functions will be based on the estimated subject-level density functions. Differences between groups therefore only depend on the composition of mixture probabilities to underlying Gaussian densities, which lead to an easy and intuitive comparison between groups. For instance, in a simple GMM assuming two Gaussian density components with equal variance constraint, a mixture density function with higher mixing probability to the Gaussian components with lower mean is characterized as a distribution with lower mean and positive skew, while the opposite is characterized as a distribution with higher mean and negative skew. Additionally, the mixture density function with mixing probabilities (1/2, 1/2) is a symmetric distribution with the highest variance in the data. We apply the proposed method to normal control subjects, to examine the effects of aging on the brain-wide distribution of FA.

General GMM

A general form of a GMM with m components can be expressed as follows: where Z is FA measurement of the j-th voxel (j = 1, …, N) observed from the i-th subject (i = 1, …, n). The density function f (.) is assumed to be a convex combination of m latent Gaussian densities with corresponding m mixing probabilities [τ(k), k = 1, …, m], where ϕ(.) is the standard normal density function, and θ = (μ, σ, and τ = 1, …, m). The likelihood function based on model (1) is expressed as follows: We assume that the variances are the same across the m-Gaussian densities, i.e., σ1 = …σ = σ, which reduces ϕ (k = 1, …, m) to in Eq. 1. The m latent Gaussian densities then differ only by their centers. We order the centers of the m-Gaussian densities as μ1 < … < μm. This parameterization gives easy interpretation of results for comparison of density functions across subgroups. For example, a density function with higher mixing probabilities for low order Gaussian densities will have a smaller mean than that with higher mixing probabilities for higher order Gaussian densities; a density with mixing probability of 1/2 to each of the two Gaussian densities (lowest, highest) will have the largest variance than any other combination of mixture probabilities. To estimate the parameters θ of the general GMM with equal variance constraint, we applied an expectation maximization (EM) algorithm (13); of note, a Bayesian mixture modeling approach (14, 15) is an alternative approach. Specifically, the EM algorithm treats the mixture model in Eq. (1) as an incomplete likelihood function with missing membership information for each observation. In each iteration of the EM algorithm, expectation-step (E-step) computes expectation of complete specification of log-likelihood function with respect to membership values with given data and estimated parameters, and maximization-step (M-step) computes maximum likelihood estimates of parameters specified in the likelihood function with the expected membership values (16). The applied EM algorithm is provided in Table 1. The total number of parameters with equal variance condition for m-Gaussian densities is 2m [=m (means) + m − 1 (mixing probabilities) + 1 (variance)]. To determine the optimal number of the latent Gaussian densities, ϕ’s (k = 1, …, m), we used the Akaike information criterion (AIC), which is expressed as . So that the optimal number of Gaussian densities (m) is associated with a minimum AIC value.

Table 1

EM algorithm adopted for this study.

Probability density function	f(z)=∑k=1mτ(k) ϕk(z)
E-step	τij(t)(k)= τ(t)(k) ϕk(t) (zij)∑l=1mτ(t)(l) ϕl(t) (zij)
M-step	μk(t+1)= ∑i=1n∑j=1Nzij τij(t)(k)∑i=1n∑j=1N τij(t)(k)
	σ2(t+1)= ∑i=1n∑j=1N∑k=1m(zij−μk(t+1))2 τij(t)(k)∑i=1n∑j=1N∑k=1mτij(t)(k)
	τ(t+1)(k)=1nN∑i=1n∑j=1Nτij(t)(k)

A variable noted with (.

EM algorithm adopted for this study. A variable noted with (.

Estimation of subject- and group-level density functions

In this section, we propose a method to estimate subject- and group-level density functions and their associated mixing probabilities. This approach fits the general GMM of Eq. 1 to the full dataset (Z = 1, …, n; j = 1, …, N), all voxels from all subjects, and estimates subject- and group-level density functions a posteriori. Under the general GMM, the membership probability of z to k-th Gaussian density, denoted by τ(k), is obtained based on Bayes theorem as follows: which results in . The estimated subject-level density function f for the i-th subject can be expressed as follows: with ϕ(z) (k = 1, …, m) estimated from the general GMM Eq. (1) using all voxel data from all subjects. The parameter τ(k) in Eq. 3 is the mixing probability of the k-th Gaussian density for the i-th subject, which we estimate as an average of membership probabilities of Z (j = 1, …, N) as follows: where τ(k) was estimated based on Eq. 2. Similarly, the estimated group-level density function for the g-th group can be expressed as follows: where in Eq. 4 is the mixing probability of the k-th Gaussian density for the g-th group, which we estimate as an average of τ(k) of the subjects in the g-th group as follows: where n is the number of subjects in the g-th group.

Example

Subjects

The Albert Einstein College of Medicine Institutional Review Board (IRB) approved and monitored this study. Twenty-eight normal subjects without history of head injury, cardiovascular or cerebrovascular disease, diabetes, or neurological or psychiatric disease were recruited between August 2006 and May 2010 through advertisements. Demographic data for the 28 normal subjects are summarized in Table 2.

Table 2

Subjects’ demographic characteristics.

	Gender	Number of subjects	Mean years of education (SD)	Minimum years of education	Maximum years of education
All ages	All genders	28	13.6 (1.6)	12	18
	Female	16	13.9 (1.8)	12	18
	Male	12	13.2 (1.3)	12	16
20–29	All	7	14.1 (1.7)	12	16
	Female	4	14.3 (1.7)	12	16
	Male	3	14.0 (2.0)	12	16
30–39	All	7	12.9 (1.0)	12	14
	Female	4	12.8 (1.0)	12	14
	Male	3	13.0 (1.0)	12	14
40–49	All	7	13.4 (1.6)	12	16
	Female	4	14.3 (1.7)	12	16
	Male	3	12.3 (0.6)	12	13
50–59	All	7	13.9 (2.0)	12	18
	Female	4	14.3 (2.6)	12	18
	Male	3	13.3 (1.2)	12	14

Subjects’ demographic characteristics.

Diffusion tensor image acquisition and image data preprocessing

Imaging was performed using a 3.0 T MRI scanner (Achieva; Philips Medical Systems, Best, The Netherlands) with an eight-channel head coil (Sense Head Coil; Philips Medical Systems). T1-weighted whole-head structural imaging was performed using sagittal three-dimensional magnetization-prepared rapid acquisition gradient echo (MP-RAGE; TR/TE = 9.9/4.6 ms; field of view, 240 mm2; matrix, 240 × 240; and section thickness, 1 mm). T2-weighted whole-head imaging was performed using axial two-dimensional turbo spin-echo (TR/TE = 4000/100 ms; field of view, 240 mm2; matrix, 384 × 512; and section thickness, 4.5 mm). DTI was performed using single-shot echo-planar imaging (TR/TE = 3800/88 ms; field of view, 240 mm2; matrix, 112 × 89; section thickness, 4.5 mm; independent diffusion sensitizing directions, 32; and b = 800 s/mm2). The images were preprocessed as described previously (4, 9).

Demonstration of the proposed methods using example FA image data

Fractional anisotropy datasets were normalized prior to fitting the proposed models. The normalization was necessary because it is well-known that FA is heterogeneous across brain regions; for example, higher FA is characteristic of deep WM such as the corpus callosum, and lower FA is typical of peripheral WM. Specifically, we normalized each FA measurement x by with mean and standard deviation (s) of n subjects (i = 1, …, n) at each voxel (j = 1, …, N). Subjects were divided into four age groups: 20–29, 30–39, 40–49, and 50–59 years. Each age group consisted of seven subjects (four women and three men). Age groups were matched for years of education, differing by no more than 2 years; no significant difference in years of education was detected among age groups (p = 0.76). Two Gaussian densities were required for the approach based on the AIC. In Figure 1, estimated density functions across the entire control group (n = 28) and each of the four age groups are presented . Inference on the difference in shapes of FA distribution across age groups was performed by estimating mixing probabilities from the proposed estimation method; these results are shown in Table 3 and Figure 2. We order estimated Gaussian densities (ϕ) by their centers, in ascending manner; thus, we identify ϕ1 as the Gaussian density with the lowest center.

Figure 1

Table 3

Estimated mixing probabilities for each age group.

Age	τg* (k = 1) (s.e.)	τg* (k = 2) (s.e.)
All	0.7488 (0.0076)	0.2512 (0.0076)
20–29	0.7277 (0.0086)	0.2723 (0.0086)
30–39	0.7337 (0.0075)	0.2663 (0.0075)
40–49	0.7481 (0.0150)	0.2519 (0.0150)
50–59	0.7873 (0.0184)^a	0.2127 (0.0184)^a

Figure 2

Box plots of subject-wise mixing probabilities by each age group (. Subject-level estimated mixing probability to the second Gaussian density [τi(k = 2), i = 1, …, n] is demonstrated for each of the age groups (G = 1, 2, 3, 4). Outliers in each box plot, marked with red + signs, are defined as values that are more than 1.5 times the interquartile range away from the top or bottom of the box.

Estimated age group-wise density functions. Estimated density function for all and each age group is demonstrated; (A) from all subjects, (B) from subjects aged 20–29, (C) from subjects aged 30–39, (D) from subjects aged 40–49, and (E) from subjects aged 50–59. Estimated mixing probabilities for each age group. . Box plots of subject-wise mixing probabilities by each age group (. Subject-level estimated mixing probability to the second Gaussian density [τi(k = 2), i = 1, …, n] is demonstrated for each of the age groups (G = 1, 2, 3, 4). Outliers in each box plot, marked with red + signs, are defined as values that are more than 1.5 times the interquartile range away from the top or bottom of the box. Mixing probabilities of the two Gaussian densities, k = 1 and k = 2, show opposite patterns of change across age groups. As age increases the mixing probability of the first Gaussian density (k = 1) increases while that of the second Gaussian density (k = 2) decreases. Box plots describing distributions of subject-level estimated mixing probabilities to the second Gaussian density [τi(k = 2), i = 1, …, n] are provided for all age groups in Figure 2. While mixing probabilities to the second Gaussian density (k = 2) were lower for older age group in that lower mean FA for older age group, higher between-subject variance is noted in Figure 2. A Kruskal–Wallis test was performed to compare subject-level mixing probabilities between Group = 1 and Group = g (g = 2, 3, 4). A significant difference in the mixing probability was found between Groups 1 (20–29 years) and 4 (50–59 years) with p = 0.048 with degrees of freedom (1, 12). This significantly lower mean mixing probability to the Gaussian density with a greater mean implies that FA declines significantly over the age of 50 years. This pattern also implies lower intra-subject variance in FA distribution for the age group (50–59 years) because of higher mixing probability to the first Gaussian density (k = 1). While all age groups showed positively skewed FA distributions (Table 3), the youngest group, aged 20–29 years, showed the greatest skewness to the right. However, the Kruskal–Wallis test, which compares all four groups was not significant (p = 0.141, df = 3, 24).

Discussion

The proposed approach for subgroup density estimation with GMM allows group comparison based on shape parameters represented by the relative mixing probabilities of the latent Gaussian densities. Since Gaussian densities are estimated based on the measurements from all voxels of all subjects, estimated individual densities will differ only by their own mixing probabilities. Individual densities are thus characterized by comparing the estimated mixing probabilities. Although kernel density estimation (KDE) (17) is a widely applied statistical approach for density estimation, a density function estimated by KDE does not provide shape parameters for comparison between groups because the parameter determining the shape of a density function by KDE is only the bandwidth for the chosen kernel function. In contrast, mixing probabilities estimated in the present study enable comparison of shapes between groups, which we have demonstrated with a real FA data set. The GMM approach proposed herein was designed to discriminate density functions across different subgroups. This approach implicitly assumes that the density function derived from all voxels will represent a mixture of at least two Gaussian components. Testing for differences in the shapes of the density functions from different groups is then possible by testing the means of mixing rates assigned to m-Gaussian densities (where m ≥ 2) between groups. However, if one Gaussian density function (i.e., m = 1) is sufficient to fit the entire distribution, discrimination of density functions between different subgroups is not available with the proposed approach. Additionally, the proposed GMM is a highly parsimonious approach in that it does not incorporate any subject- or group-specific shape parameters in the likelihood function; further improvement may be achieved by their inclusion. A GMM with unequal variance assumption resulted in little difference in fitting performance compared with the proposed approach with equal variance assumption (data not shown). Nevertheless, application of non-Gaussian mixtures could be employed, study of which is beyond the scope of the present paper and should serve as a future study. Initial application of the method to human FA datasets revealed that the distribution of FA differs significantly between subjects aged 20–29 years and those aged 50–59 years. Age-related change in FA distribution was found in mean, variance (intra-subject, between-subject), and skew. Since aging is a feature of neurodegeneration, this finding may have an implication to other neurodegenerative diseases, e.g., dementia and Alzheimer’s disease. Since the present demonstration is based on a small sample, further examinations with larger samples are warranted to fully characterize the utility of this approach and the age effects it has revealed.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

11 in total

Review 1. The role of advanced MR imaging findings as biomarkers of traumatic brain injury.

Authors: Zhifeng Kou; Zhen Wu; Karen A Tong; Barbara Holshouser; Randall R Benson; Jiani Hu; E Mark Haacke
Journal: J Head Trauma Rehabil Date: 2010 Jul-Aug Impact factor: 2.710

2. Effects of age on volumes of cortex, white matter and subcortical structures.

Authors: Kristine B Walhovd; Anders M Fjell; Ivar Reinvang; Arvid Lundervold; Anders M Dale; Dag E Eilertsen; Brian T Quinn; David Salat; Nikos Makris; Bruce Fischl
Journal: Neurobiol Aging Date: 2005-10 Impact factor: 4.673

3. Volumetric magnetic resonance imaging classification for Alzheimer's disease based on kernel density estimation of local features.

Authors: Hao Yan; Hu Wang; Yong-hui Wang; Yu-mei Zhang
Journal: Chin Med J (Engl) Date: 2013 Impact factor: 2.628

4. Atlasing location, asymmetry and inter-subject variability of white matter tracts in the human brain with MR diffusion tractography.

Authors: Michel Thiebaut de Schotten; Dominic H Ffytche; Alberto Bizzi; Flavio Dell'Acqua; Matthew Allin; Muriel Walshe; Robin Murray; Steven C Williams; Declan G M Murphy; Marco Catani
Journal: Neuroimage Date: 2010-08-02 Impact factor: 6.556

5. Robust detection of traumatic axonal injury in individual mild traumatic brain injury patients: intersubject variation, change over time and bidirectional changes in anisotropy.

Authors: Michael L Lipton; Namhee Kim; Young K Park; Miriam B Hulkower; Tova M Gardin; Keivan Shifteh; Mimi Kim; Molly E Zimmerman; Richard B Lipton; Craig A Branch
Journal: Brain Imaging Behav Date: 2012-06 Impact factor: 3.978

6. Global white matter analysis of diffusion tensor images is predictive of injury severity in traumatic brain injury.

Authors: Randall R Benson; Shashwath A Meda; Sriram Vasudevan; Zhifeng Kou; Koushik A Govindarajan; Robin A Hanks; Scott R Millis; Malek Makki; Zahid Latif; William Coplin; Jay Meythaler; E Mark Haacke
Journal: J Neurotrauma Date: 2007-03 Impact factor: 5.269

Review 7. Embracing chaos: the scope and importance of clinical and pathological heterogeneity in mTBI.

Authors: Sara B Rosenbaum; Michael L Lipton
Journal: Brain Imaging Behav Date: 2012-06 Impact factor: 3.978

8. Intersubject variability in the analysis of diffusion tensor images at the group level: fractional anisotropy mapping and fiber tracking techniques.

Authors: Hans-Peter Müller; Alexander Unrath; Axel Riecker; Elmar H Pinkhardt; Albert C Ludolph; Jan Kassubek
Journal: Magn Reson Imaging Date: 2008-08-12 Impact factor: 2.546

9. Multifocal white matter ultrastructural abnormalities in mild traumatic brain injury with cognitive disability: a voxel-wise analysis of diffusion tensor imaging.

Authors: Michael L Lipton; Erik Gellella; Calvin Lo; Tamar Gold; Babak A Ardekani; Keivan Shifteh; Jacqueline A Bello; Craig A Branch
Journal: J Neurotrauma Date: 2008-11 Impact factor: 5.269

10. Whole brain approaches for identification of microstructural abnormalities in individual patients: comparison of techniques applied to mild traumatic brain injury.

Authors: Namhee Kim; Craig A Branch; Mimi Kim; Michael L Lipton
Journal: PLoS One Date: 2013-03-26 Impact factor: 3.240

2 in total

1. Two step Gaussian mixture model approach to characterize white matter disease based on distributional changes.

Authors: Namhee Kim; Moonseong Heo; Roman Fleysher; Craig A Branch; Michael L Lipton
Journal: J Neurosci Methods Date: 2016-04-29 Impact factor: 2.390

2. Use of brain MRI atlases to determine boundaries of age-related pathology: the importance of statistical method.

Authors: David Alexander Dickie; Dominic E Job; David Rodriguez Gonzalez; Susan D Shenkin; Joanna M Wardlaw
Journal: PLoS One Date: 2015-05-29 Impact factor: 3.240

2 in total