Mário João Fartaria1, Alexandra Todea2, Tobias Kober3, Kieran O'brien4, Gunnar Krueger5, Reto Meuli6, Cristina Granziera7, Alexis Roche8, Meritxell Bach Cuadra9. 1. Advanced Clinical Imaging Technology (HC CMEA SUI DI PI), Siemens Healthcare AG, Lausanne, Switzerland; Department of Radiology, Lausanne University Hospital (CHUV), and University of Lausanne (UNIL), Lausanne, Switzerland; Signal Processing Laboratory (LTS 5), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. Electronic address: mario.fartaria_de_oliveira@siemens-healthineers.com. 2. Department of Radiology, Pourtalès Hospital, Neuchâtel, Switzerland. 3. Advanced Clinical Imaging Technology (HC CMEA SUI DI PI), Siemens Healthcare AG, Lausanne, Switzerland; Department of Radiology, Lausanne University Hospital (CHUV), and University of Lausanne (UNIL), Lausanne, Switzerland; Signal Processing Laboratory (LTS 5), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. 4. Centre for Advanced Imaging, University of Queensland, Queensland, Australia; Siemens Healthcare Pty. Ltd., Brisbane, Queensland, Australia. 5. Siemens Healthcare Ltd, Zürich, Switzerland. 6. Department of Radiology, Lausanne University Hospital (CHUV), and University of Lausanne (UNIL), Lausanne, Switzerland. 7. Neurologic Clinic and Policlinic, Departments of Medicine, Clinical Research and Biomedical Engineering, University Hospital Basel and University of Basel, Basel, Switzerland; Translational Imaging in Neurology (ThINK) Basel, Department of Medicine and Biomedical Engineering, University Hospital Basel and University of Basel, Basel, Switzerland. 8. Department of Radiology, Lausanne University Hospital (CHUV), and University of Lausanne (UNIL), Lausanne, Switzerland; Advanced Clinical Imaging Technology (HC CMEA SUI DI PI), Siemens Healthcare AG, Lausanne, Switzerland; Signal Processing Laboratory (LTS 5), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. 9. Department of Radiology, Lausanne University Hospital (CHUV), and University of Lausanne (UNIL), Lausanne, Switzerland; Medical Image Analysis Laboratory (MIAL), Centre d'Imagerie BioMédicale (CIBM), Lausanne, Switzerland; Signal Processing Laboratory (LTS 5), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
Abstract
White-matter lesion count and volume estimation are key to the diagnosis and monitoring of multiple sclerosis (MS). Automated MS lesion segmentation methods that have been proposed in the past 20 years reach their limits when applied to patients in early disease stages characterized by low lesion load and small lesions. We propose an algorithm to automatically assess MS lesion load (number and volume) while taking into account the mixing of healthy and lesional tissue in the image voxels due to partial volume effects. The proposed method works on 3D MPRAGE and 3D FLAIR images as obtained from current routine MS clinical protocols. The method was evaluated and compared with manual segmentation on a cohort of 39 early-stage MS patients with low disability, and showed higher Dice similarity coefficients (median DSC = 0.55) and higher detection rate (median DR = 61%) than two widely used methods (median DSC = 0.50, median DR < 45%) for automated MS lesion segmentation. We argue that this is due to the higher performance in segmentation of small lesions, which are inherently prone to partial volume effects.
White-matter lesion count and volume estimation are key to the diagnosis and monitoring of multiple sclerosis (MS). Automated MS lesion segmentation methods that have been proposed in the past 20 years reach their limits when applied to patients in early disease stages characterized by low lesion load and small lesions. We propose an algorithm to automatically assess MS lesion load (number and volume) while taking into account the mixing of healthy and lesional tissue in the image voxels due to partial volume effects. The proposed method works on 3D MPRAGE and 3D FLAIR images as obtained from current routine MS clinical protocols. The method was evaluated and compared with manual segmentation on a cohort of 39 early-stage MS patients with low disability, and showed higher Dice similarity coefficients (median DSC = 0.55) and higher detection rate (median DR = 61%) than two widely used methods (median DSC = 0.50, median DR < 45%) for automated MS lesion segmentation. We argue that this is due to the higher performance in segmentation of small lesions, which are inherently prone to partial volume effects.
Multiple sclerosis (MS) is a chronic autoimmune disease of the central nervous system, characterized by inflammation, demyelination, axonal loss, and gliosis (Noseworthy et al., 2000). Current MS diagnostic and follow-up criteria are exploiting magnetic resonance (MR) imaging metrics of lesion load, i.e. lesion count and location, as well as activity, i.e. gadolinium enhancement due to blood-brain barrier disruption (Rovira et al., 2015). The presence and spatial pattern of focal lesions in MR images (“dissemination in space”) and the appearance of new lesions (“dissemination in time”) are key components of current diagnosis criteria (Filippi et al., 2016; Polman et al., 2011). The identification of focal pathology and of new lesions as well as their size changes in follow-up scans are important to perform an early diagnosis, quantifying ongoing disease activity and monitor treatment effects (Filippi et al., 2016). Consequently, one current research focus has been the development of automated MS lesion segmentation (Garcia-Lorenzo et al., 2013; Lladó, Ganiler, et al., 2012; Lladó, Oliver, et al., 2012).Automated segmentation approaches are either unsupervised or supervised. Unsupervised approaches typically apply clustering algorithms that make use of both image intensity from one or several MR contrasts (T1-weighted, T2-weighted, proton-density-weighted, and/or 2D fluid-attenuated inversion recovery, FLAIR) and spatial information derived from probabilistic atlases of healthy tissues and/or topological constraints (Schmidt et al., 2012; Shiee et al., 2008; Souplet et al., 2008; Tomas-Fernandez and Warfield, 2015; Van Leemput et al., 2001). Supervised approaches, on the other hand, rely on manually annotated image sets used for training (Anbeek et al., 2004; Brosch et al., 2016; Fartaria, Bonnier, et al., 2016; Morra et al., 2008; Sweeney et al., 2014).We note that existing automated MS lesion segmentation methods were mostly evaluated on patients in advanced MS stages, who have a high disability, high numbers of lesions, and large lesion volumes (Datta and Narayana, 2013; Sajja et al., 2006; Steenwijk et al., 2013; Sweeney et al., 2013). They, however, showed substantially lower performance when applied to subjects with lower disease burden, i.e. lower lesion load and lesions of smaller size and volume (Anbeek et al., 2004; Cabezas et al., 2014; Fartaria, Bonnier, et al., 2016; Steenwijk et al., 2013; Sweeney et al., 2013). Automated MS lesion segmentation seems to be barely used in clinical practice, where detecting new lesions, particularly of small size, is of key importance for early diagnosis and follow-up of MS patients (Garcia-Lorenzo et al., 2013).We showed in previous work that small lesions are strongly affected by partial volume (PV) effects, rendering their detection, segmentation and volume estimation challenging (Fartaria, O'Brien, et al., 2017). Taking into account PV could consequently improve the detection of small lesions and the overall lesion volume estimation. This idea was investigated in the mid-90s by (Johnston et al., 1994), who proposed a semi-automated method that takes partial volumes into account using neighborhood and histogram analysis. The same group later added pre- and post-processing steps of image enhancement and mathematical morphology to improve the discrimination between healthy WM and lesions (Johnston et al., 1996). Recently, (Khademi and Moody, 2015) performed image classification using mixed tissue labels as in (Cuadra et al., 2005; Shattuck et al., 2001) and estimated the PV fraction in mixed classes using spatial image gradient analysis. Variants of this approach using hierarchical mixture models are employed in (Galimzianova et al., 2016; Sudre et al., 2015).Here, we propose a novel method for MS lesion segmentation that relies on a Bayesian PV estimation algorithm inspired by the “mixel” model originally proposed by (Choi et al., 1991), which leads to an ill-posed estimation problem for which (Roche and Forbes, 2014) proposed regularizing priors. We further included spatial constraints to estimate realistic concentration maps of healthy tissues (WM, GM, CSF), and pathological brain tissue. These concentration maps are used to directly compute lesion volumes rather than applying a correction of PV effects in initial hard tissue classification as in previous methods (Johnston et al., 1996; Khademi and Moody, 2015; Wu et al., 2006). Our approach does not rely on edge detection and therefore has the potential to assess PV effects in small lesions without clearly defined boundaries.
Method
Partial volume estimation
We consider a set of n images of a given subject acquired from different MR image sequences and previously submitted to various pre-processing steps including alignment, bias field correction and skull stripping. Consistent with (Choi et al., 1991; Pham and Prince, 2000; Roche and Forbes, 2014; Van Leemput et al., 2003), we assume that the vector of image intensities y at a voxel i in the total intra-cranial mask relates to an unknown vector of tissue concentrations q, with q ≥ 0 (the concentration of tissue t at voxel i is ≥0) and (the sum of the n tissue concentrations at a voxel i is equal to one), through the statistical relation:where M is an n × n matrix representing the mean tissue intensities for each channel (n is the number of distinct tissues and M represents the mean intensity of tissue t in channel c), and N represents a multi-variable Normal distribution with zero mean and covariance matrix V = diag (σ12, …, σ2). We assumed that, in each MR sequence, image intensities are corrupted with independent stationary Gaussian white noise, as a first-order approximation to the non-central chi noise distribution that takes into account the coil combination in MRI (Larsson et al., 2003). In this work, we consider n = 4 tissues: CSF, GM, WM and lesions, as well as n = 2 contrasts: magnetization-prepared rapid gradient echo (MPRAGE) and 3D FLAIR.It was recognized 25 years ago (Choi et al., 1991) that recovering the voxel-wise tissue concentrations q from the above multichannel image model leads to an ill-posed inverse problem if n > n + 1, as it is the case in our application. Recently, (Roche and Forbes, 2014) proposed a prior concentration model to regularize the problem when formulated via Bayesian maximum a posteriori (MAP) estimation:where n is the total number of intra-cranial voxels, A is a symmetric penalty matrix with zero diagonal and positive off-diagonal elements, β is a positive constant, and N is the neighborhood of voxel i according to the 6-topology (the 6 adjacent voxels connectedness in 3-dimensions). Both the elements of A and β are hyperparameters to be tuned in a learning phase. While β controls the amount of spatial smoothness of tissue concentration maps, the purpose of A is to disentangle intensity fluctuations due to noise from PV effects. Each non-diagonal element acts as a penalty on the mixing of distinct tissues in a voxel, hence limiting spurious concentration variations when a single tissue is present. For instance, the larger A12, the less likely voxels contain both CSF and GM.We propose to generalize the prior model of (Roche and Forbes, 2014) by allowing voxel-dependent penalty matrices A including non-zero diagonal elements in order to penalize tissues locally. This avoids confusing GM and lesions, which have similar intensity signatures in both MPRAGE and 3D FLAIR. Specifically, let π and π be a probablistic atlas-based prior probability map for the GM and WM, respectively. We set the diagonal elements of A corresponding to CSF, GM, WM and lesions, via:where the parameters a1 to a8 are pre-tuned with the smoothness parameter β, which are assumed voxel-independent in our particular implementation.Following (Roche and Forbes, 2014), we estimate the tissue concentrations by MAP, yielding a quadratic programming problem:where each q is searched in the multidimensional simplex. The solution is evaluated using an iterative scheme that loops over the intra-cranial voxels, and solves for the associated concentration vector q with all other concentration vectors held fixed using an active set algorithm (Nocedal and Wright, 2006). This method proves very robust in practice, and typically converges in <25 iterations.
Imaging parameters
The noise variance matrix V is initially assumed to be zero, and is iteratively re-estimated by MAP concurrently with the tissue concentrations (see Section 2.1), yielding the update rule:which is performed after a complete tissue concentration re-estimation loop over the intracranial voxels.Conversely, the matrix M of mean tissue intensities is held fixed during the estimation of tissue concentrations, having been determined beforehand by histogram matching (Nyul et al., 2000) with a reference patient. Histogram matching produces a piecewise linear intensity transformation f from the reference patient image to the input image. M is estimated as M = f(M), where M is the mean intensity matrix of the reference patient, which in this work was determined from manually drawn regions of interest containing no PV effects for CSF, GM, WM, and lesions.
Hyperparameter tuning
The same reference patient used to set the tissue mean intensities M (see Section 2.2) was used to tune the hyperparameters A and β so as to minimize the Hellinger distance H, between the manual lesion segmentation binary mask M, and the lesion concentration map output by the PV estimation algorithm Q:Two elements of A were fixed to very large values in order to proscribe mixing of CSF with WM and CSF with lesions (a2 = a3 = 1 × 1010). The other parameters were optimized using Powell's method, yielding a1 = 11.25, a4 = 14.33, a5 = 0.47, a6 = 12.21, a7 = 1.33, a8 = 16.93, and β = 0.54.
Experimental set
Data
The study was approved by the Ethics Committee of our institution and all subjects gave written informed consent prior to participation.Thirty-nine patients (14 males, 25 females, median age 34 years, age range: 20–60 years) with early relapsing-remitting MS, (disease duration < 5 years from diagnosis) and Expanded Disability Status Scale (EDSS) score between 1 and 2 (median EDSS = 1.5), were scanned on a 3 T MRI system (Magnetom Trio a Tim System, Siemens Healthcare GmbH, Erlangen, Germany) using a 32-channel head coil. The MRI protocol included: (i) high-resolution MPRAGE (TR/TI = 2300/900 ms, voxel size = 1 × 1 × 1.2 mm3), and (ii) 3D FLAIR (TR/TE/TI = 5000/394/1800 ms, voxel size = 1 × 1 × 1.2 mm3), (iii) magnetization-prepared 2 rapid acquisition gradient echo (MP2RAGE, TR/TE/TI1/TI2 = 5000/2.89/700/2500 ms, voxel size = 1 × 1 × 1.2 mm3), all acquired in the same session without patient repositioning. MS lesions were identified and marked by one radiologist and one neurologist (6 and 11 years of experience, respectively) by consensus using FLAIR and MP2RAGE images. A trained technician then delineated the lesion volumes in each image, which we considered to be the reference for lesion load and volume (Bonnier et al., 2015; Kober et al., 2012).
Pre-processing
A patient with relatively high lesion load was chosen as a reference to train the PV estimation algorithm (see 2.2, 2.3) and was subsequently excluded from the ensuing statistical analysis. ELASTIX (Klein et al., 2010) was used to confirm that the rigid transformation relating FLAIR and MPRAGE was below 1 voxel according to the resulted transformation parameters. All images were further skull-stripped using an in-house method (Schmitter et al., 2014), and corrected for intensity inhomogeneities using the N4 algorithm (Tustison et al., 2010). Fuzzy in-house templates of prior WM and GM probabilities were non-rigidly registered using ELASTIX onto each image volume to produce the prior maps π and π (see Section 2.1). Such priors were obtained using the DARTEL tool of SPM8 (Ashburner and Friston, 2009), from a data set of 136 MR scans of healthy subjects acquired at our institution.
Comparison with freely available software
We compared the results obtained using our method on the cohort of early MS patients with two widely used and freely available WM lesion segmentation methods. One method, the Lesion Segmentation Tool (LST), is based on unsupervised outlier rejection and region growing, and is distributed as an SPM toolbox (Schmidt et al., 2012). The other method, LesionTOADS (LTOADS), implements an unsupervised atlas-based fuzzy clustering and is distributed as a plug-in for MIPAV (Shiee et al., 2010). Default parameters were used in both methods, except for the LST where we set the initial threshold (κ = 0.05) based on the best Dice similarity coefficient (DSC) values obtained in the cohort (Schmidt et al., 2012).
Validation and results
Lesion detection and segmentation
In this section, we evaluate lesion detection and segmentation by the detection rate (DR), false positive rate (FPR), and Dice similarity coefficient (DSC). These metrics rely on the overlap between the automated segmentation and the reference. Fig. 1 illustrates the lesion maps obtained using the different considered methods.
Fig. 1
Axial, sagittal, and coronal views from three different patients showing the detection and segmentation results obtained from different methods. From top to bottom. 3D FLAIR slice, manual segmentation (reference), proposed partial volume (PV) method, and two freely available software: Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS).
Axial, sagittal, and coronal views from three different patients showing the detection and segmentation results obtained from different methods. From top to bottom. 3D FLAIR slice, manual segmentation (reference), proposed partial volume (PV) method, and two freely available software: Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS).Fig. 2 shows an example of a small and a large lesion, where the PV effect becomes apparent. The small lesion is substantially affected by partial volume, having a maximum voxel concentration of 55% lesional tissue (45% WM concentration for the same voxel) according to the PV-method. In the large lesion, PV is evident on the lesion border, where the voxel concentration of lesional tissue varies between 12% and 98%. The lesion centre has 100% of concentration of lesional tissue, according to the PV-method, indicating “pure” lesion signal.
Fig. 2
Partial volume effect in a small (left side), and large (right side) lesion. The small lesion is fully affected by partial volume, having a maximum concentration of 55% of lesional tissue in the path A to B, according to the proposed partial volume (PV) method. The big lesion is prone to partial volume effects in the border, detected as low lesion concentration by the proposed PV-method. In this case, effects of partial volume in the transitions between healthy and lesional tissue are clearly seen. Intensity profiles of the path A to B, for FLAIR intensities, and PV lesions maps where shown on the bottom. Smooth orange regions in the profile plots represent regions of partial volume (mix of lesional tissue and WM tissue).
Partial volume effect in a small (left side), and large (right side) lesion. The small lesion is fully affected by partial volume, having a maximum concentration of 55% of lesional tissue in the path A to B, according to the proposed partial volume (PV) method. The big lesion is prone to partial volume effects in the border, detected as low lesion concentration by the proposed PV-method. In this case, effects of partial volume in the transitions between healthy and lesional tissue are clearly seen. Intensity profiles of the path A to B, for FLAIR intensities, and PV lesions maps where shown on the bottom. Smooth orange regions in the profile plots represent regions of partial volume (mix of lesional tissue and WM tissue).Since our PV approach provides a concentration map of lesional tissue, a threshold needs to be applied for comparison with the binary masks provided by the other methods. Two lesion concentration map thresholds were used to match the median FPR (across the entire dataset) of the PV-method with LST and LTOADS, respectively. The FPR was obtained by dividing the number of lesions in the automated segmentation that do not overlap with any lesion in the reference with the number of overall lesions in the automated segmentation (Styner et al., 2008). An optimal concentration map threshold was also derived from a receiver operating characteristic (ROC) analysis based on the best trade-off between DR and FPR.Diagnostic criteria define a minimal diameter of 3 mm for MS lesion (Chris H. Polman et al., 2005). Recently, the meaningful of this diameter size was proved for 3D sequences at 3T (Grahl et al., 2017). If we approximate the lesion shape to a sphere, we end up to a minimum lesion volume of ≈15 μL. However, applying this threshold remove most of the lesions that are fully prone to PV effects. In this work we presented the results using two different definitions of minimum lesion volume: ≈3 μL and ≈15 μL. Our rational in applying a threshold of ≈3 μL was to validate the performance of the methods for small lesions fully prone to PV effects, that might be relevant for early diagnosis and follow-up.We evaluated lesion detection (knowing that lesion number is the main MRI biomarker for diagnosis in MS) through the DR, defined by dividing the number of lesions in the automated segmentation that overlap with a lesion in the reference, with the number of overall lesions in the reference (Styner et al., 2008). The lesion segmentation agreement with the reference was evaluated using DSC:where TP: number of true positives; FP: number of false positives; FN: number of false negatives; (“true” relates to marked voxels in the reference). The Wilcoxon signed-rank tests were used to compare the DSC results between methods.To match the FPR of LST (median FPR = 51%) and LTOADS (median FPR = 42%), thresholds of 32% and 38% were applied to the lesion concentration maps, respectively. We found that the optimal threshold for the generation of the binary lesion maps, according to the ROC analysis was 40%. PV-method presented the better detection performance for any lesion size category for all the applied thresholds to the concentration maps (Fig. 3). For a threshold of 32%, PV-method achieved better performance than LST in all lesion size categories (3–14 μL: 30.1%, 13.8%; 15–20 μL: 60.5%, 22.6%; 21–50 μL: 76.5%, 50.8%; 51–100 μL: 94.7%, 86.8%. DR values for PV-method and LST, respectively.), except for big lesions (lesion volume > 100 μL), where both methods detected 98.5% of the lesions. Overall, PV-method achieved a DR = 60.7%, against the 43.4% obtained from LST. For a threshold of 38%, PV-method achieved higher performance when compared to LTOADS in all lesion size categories (3–14 μL: 22.8%, 12.9%; 15–20 μL: 56.3%, 17.9%; 21–50 μL: 72.7%, 45.3%; 51–100 μL: 93%, 73.2%; >100 μL: 98.5%, 84.6%. DR values for PV-method and LTOADS, respectively.). Overall, PV-method achieved a DR = 56.2%, against the 37.8% obtained from LTOADS. Similar behaviour was found, when the optimal threshold of 40% was applied to the concentration maps, where PV-method reached a overall DR of 54.5%. Although the DR is lower compared to the values obtained with other thresholds, the PV-method still showed the best performance when compared to LST and LTOADS. For a threshold of 32% and considering a minimum lesion volume of ≈3.2 μL, PV-method achieved a median DSC of 0.55 significantly superior (P < .001) to LST (median DSC of 0.50, Fig. 4A). The same behaviour was observed when a threshold of 38% was applied to the PV map, the method presented the same median DSC of 0.55, which was significantly (P < .05) higher than LTOADS (median DSC of 0.51, Fig. 4A). When the optimal threshold of 40% was applied, PV-method achieved a median DSC of 0.53 significantly superior (P < .001) to LST but non-significantly different from LTOADS P = .06, Fig. 4A). Lastly, similar behaviour was found when the minimum lesion volume was set to ≈15 μL. However, median DSC values appeared to be higher in all the methods: 0.53 for PV-method with a threshold of 40%; 0.59 for PV-method with a threshold of 32%; 0.57 for PV-method with a threshold of 38%; 0.52 for LST; and 0.52 for LTOADS.
Fig. 3
Detection rate for different ranges of lesion size obtained using the proposed partial volume (PV) method (using three different thresholds, THLD), Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Numbers in parenthesis represent the number of lesions for the respective lesion size range according to the reference.
Fig. 4
Boxplots of Dice similarity coefficient between manual segmentation and automated lesion segmentation using the proposed partial volume (PV) method (using three different thresholds, THLD), Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Results were computed using two different definitions of minimum lesion volume: ≈3 μL (panel A) and ≈15 μL (panel B). The crosses in the plot represent outliers in the cohort.
Detection rate for different ranges of lesion size obtained using the proposed partial volume (PV) method (using three different thresholds, THLD), Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Numbers in parenthesis represent the number of lesions for the respective lesion size range according to the reference.Boxplots of Dice similarity coefficient between manual segmentation and automated lesion segmentation using the proposed partial volume (PV) method (using three different thresholds, THLD), Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Results were computed using two different definitions of minimum lesion volume: ≈3 μL (panel A) and ≈15 μL (panel B). The crosses in the plot represent outliers in the cohort.
Total lesion volume
The total lesion volume (TLV) per patient was estimated using each method, and compared with the reference using the two definitions of minimum lesion volume (≈3 μL and ≈15 μL, Fig. 5). The Wilcoxon signed-rank test was used to compare the results between methods. When the minimum lesion volume was set to ≈3 μL, the PV-method showed the best agreement in terms of TLV compared to the other methods (P < .001). Similar behaviour was found when the minimum lesion volume was set to 15 μL.
Fig. 5
Boxplots of total lesion volume (TLV) difference between reference and automated lesion segmentation using the proposed partial volume (PV) method, and two freely available software: Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Results were computed using two different definitions of minimum lesion volume: ≈3 μL (left panel) and ≈15 μL (right panel).The crosses in the plot represent outliers in the cohort.
Boxplots of total lesion volume (TLV) difference between reference and automated lesion segmentation using the proposed partial volume (PV) method, and two freely available software: Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Results were computed using two different definitions of minimum lesion volume: ≈3 μL (left panel) and ≈15 μL (right panel).The crosses in the plot represent outliers in the cohort.Correlations between the reference and each method, for both definitions of minimum lesion volume, were also studied using the Spearman's rank correlation coefficient (ρS) and the coefficient of determination (R-squared, R2). As shown in Fig. 6, the PV-method and LST methods were found to be more highly correlated with the manually determined TLVs than the LTOADS method. For a minimal lesion volume of ≈3 μL, the overall Spearman's rank correlation coefficient was ρ = 0.94 for the proposed method, ρ = 0.90 for LST, and ρ = 0.84 for LTOADS. Both LST and LTOADS presented a higher underestimation of the TLVs for patients with high/moderate lesion loads, as reflected by the coefficient of determination with respect to the identity line: R2 = 0.33 for LST and R2 = 0.27 for LTOADS, against R2 = 0.76 for the proposed PV-method. Similar results were observed when minimal lesion volume was set to ≈15 μL (see Fig. 6).
Fig. 6
Total lesion volume (TLV) correlation between the reference and automated lesion segmentation (from left to right) using the proposed partial volume (PV) method, and two freely available software: Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Results were computed using two different definitions of minimum lesion volume: ≈3 μL (first row) and ≈15 μL (second row). The respective Spearman's rank correlation coefficient (ρS), and coefficient of determination (R-squared, R2) are given. The dash line represents the identity line (TLVreference = TLVmethod), and the solid line represents the best linear fit to the points.
Total lesion volume (TLV) correlation between the reference and automated lesion segmentation (from left to right) using the proposed partial volume (PV) method, and two freely available software: Lesion Segmentation Tool (LST), and LesionTOADS (LTOADS). Results were computed using two different definitions of minimum lesion volume: ≈3 μL (first row) and ≈15 μL (second row). The respective Spearman's rank correlation coefficient (ρS), and coefficient of determination (R-squared, R2) are given. The dash line represents the identity line (TLVreference = TLVmethod), and the solid line represents the best linear fit to the points.
Discussion
We developed and validated a method to automatically detect and segment MS lesions in patients at an early disease stage and with low disability scores. Our method is based on a Bayesian PV estimation using the “mixel” model (Choi et al., 1991; Roche and Forbes, 2014), extended to lesion detection by including spatial constraints from atlas-based probability maps of GM and WM. Our approach was developed and optimized for images from the routine MS clinical protocol in our institution (3D MPRAGE and 3D FLAIR). Other clinical images, such as T2-weighted or proton density, may be also used in our multi-channel approach, although the hyper parameters require to be tuned accordingly. Most MS lesion segmentation methods reported in the literature show low performance when applied to patient data exhibiting low lesion load and small lesions (Anbeek et al., 2004; Cabezas et al., 2014; Galimzianova et al., 2016; Sajja et al., 2006; Schmidt et al., 2012; Steenwijk et al., 2013). However, accurate detection especially in the early stages of MS can be crucial for initial diagnosis and subsequent treatment monitoring (Filippi et al., 2016). Here, our goal was to improve the detection of very small lesions that are partially or totally affected by PV effects. Moreover, we aimed to better estimate lesion volume through better delineation of lesion borders, which are prone to PV effects.Our results showed the feasibility of the proposed method to detect and segment WM MS lesions, including lesions of small size, as well as the importance of taking into account PV for lesion detection, lesion segmentation, and lesion volume estimation. We considered two different definitions of minimal lesion size: 1) the one proposed by the MS diagnostic criteria (Polman et al., 2005) of 3 mm diameter and 2) a smaller size of ≈3 μL in volume in order to include small lesions fully prone to PV effects. PV-method showed the best results, when compared to LST and LTOADS, for both definitions of minimum lesion size. However, the improvements were more evident for a minimum lesion volume of ≈3 μL, where detection of small lesions by the proposed method significantly improved the total lesion volume estimation when compared to LST, and LTOADS.DSC is one of the most commonly-used metrics to measure the segmentation quality (Lladó, Oliver, et al., 2012); however, this metric is not appropriate to evaluate the segmentation of small objects (Schmidt et al., 2012). Despite this, DSC is widely used (Brosch et al., 2016; Galimzianova et al., 2016; Jain et al., 2015) even to this type of data characterized with low lesion load. Interestingly, the median of DSC for the PV-method was significantly higher than for the other two studied methods.Our study shows that modelling the PV effect improves volumetric measurements as well as the detection of small lesions, which is of particular importance for MS patients in an early disease phase. The clinical relevance of lesions with diameter smaller than 3 mm is not yet established. However, their counting can be important not only for early-diagnosis but also for follow-up assessment (new lesion count) and to monitor the treatment response.A limitation of our method is the detection of “black holes” which are hypointense in FLAIR images. Since the method relies on image intensity, hypointense voxels in FLAIR images from “black holes” will most likely be classified as CSF. In this particular data set of early MS stages, there were few “black holes”, however future developments based on CSF prior information should be done and tested in data sets of patients at later MS stages and with higher disability. Furthermore, the method should be tested on images from different scanners and protocols. This will enable us to test the robustness of the histogram matching approach (see Section 2.2) used to homogenise the intensity scale across images from different acquisitions. Lastly, it should be considered that the tuning of the hyper-parameters was performed using only one patient as a reference. This reference was found to be the most representative case in terms of lesion size and distribution. However, future validation should be performed in order to evaluate the influence of the reference patient on hyper-parameter tuning and consequently on lesion segmentation performance.Extensions of the presented method should focus on exploring mathematical morphology operations to better disentangle GM from lesions, and consequently improve the lesion detection. This could particularly improve the detection in regions where the method presents a relatively modest performance due to the lesion location and very similar intensities between lesional and normal-appearing GM tissue, as e.g. seen in juxtacortical lesions. Future work will also aim to extend the method to cortical lesion detection, by incorporating input from advanced sequences like double-inversion recovery (DIR), and magnetization-prepared 2 rapid acquisition with gradient echo (MP2RAGE), which has been recently shown to be more sensitive to cortical lesions (Fartaria, Bonnier, et al., 2016).
Authors: Richard McKinley; Rik Wepfer; Lorenz Grunder; Fabian Aschwanden; Tim Fischer; Christoph Friedli; Raphaela Muri; Christian Rummel; Rajeev Verma; Christian Weisstanner; Benedikt Wiestler; Christoph Berger; Paul Eichinger; Mark Muhlau; Mauricio Reyes; Anke Salmen; Andrew Chan; Roland Wiest; Franca Wagner Journal: Neuroimage Clin Date: 2019-12-09 Impact factor: 4.881