Bruno Montcel, David Rousseau, Pierre Leclerc, Cedric Ray, Laurent Mahieu-Williame, Laure Alston, Carole Frindel, Pierre-François Brevet, David Meyronet, Jacques Guyotat.
Abstract
Gliomas are infiltrative brain tumors whose margins are difficult to identify. 5-ALA-induced PpIX fluorescence measurements are a clinical standard, but expert-based classification models still lack sensitivity and specificity. Here, a fully automatic clustering method is proposed to discriminate the glioma margin from spectroscopic fluorescence measurements acquired with a recently introduced intraoperative setup. We describe a data-driven selection of the best spectral features and show how it improves the discrimination of margin from healthy tissue compared with the standard biomarker-based prediction. This pilot study, based on 10 patients and 50 samples, shows promising results, with a best accuracy of 77% in distinguishing healthy tissue from margin tissue.
Year: 2020 PMID: 31996727 PMCID: PMC6989497 DOI: 10.1038/s41598-020-58299-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Global view of the proposed machine learning-based prediction of glioma margin by PpIX fluorescence spectroscopic measurements. In this study, the data set is composed of 50 samples from 10 patients. From left to right: the optical spectrum of cells around a tumor is measured; the dimension of the spectral information is then reduced to lower the redundancy; supervised or unsupervised algorithms are finally used to classify the data and predict the tissue state from the PpIX fluorescence spectroscopic measurements.
Figure 2. Bayesian information criterion (BIC) (left) and gap criterion (right) as a function of the number of clusters for K-means (top row) and GMM (bottom row). The minimum of the BIC and the maximum of the gap criterion (highlighted by the red dash-dotted line) correspond to the optimal number of clusters in our data. Interestingly, K-means and GMM are both best described with 3 or 4 clusters, which matches the red dotted lines marking the number of classes in the clinical taxonomy.
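The BIC-based choice of cluster count described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses scikit-learn's `GaussianMixture` on synthetic 2-D blobs (not the study's spectra), covers only the GMM case, and omits the gap criterion. The helper name `bic_by_cluster_count` is hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def bic_by_cluster_count(X, k_values, seed=0):
    """Fit one GMM per candidate cluster count and return {k: BIC}."""
    bics = {}
    for k in k_values:
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        bics[k] = gmm.bic(X)  # lower BIC = better fit/complexity trade-off
    return bics


# Synthetic stand-in data: three well-separated Gaussian blobs (assumption).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(size=(100, 2)) for c in centers])

bics = bic_by_cluster_count(X, range(1, 7))
best_k = min(bics, key=bics.get)  # cluster count at the BIC minimum
print(best_k, {k: round(v, 1) for k, v in bics.items()})
```

The cluster count is read off at the minimum of the BIC curve, as in the left-hand panels of the figure. With real spectra the curve may be flatter, so neighbouring values of `k` are worth inspecting as well.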
Figure 3. Scree test (left) and cumulative variance (right) of the principal component analysis. A minimum of 5 principal components is required to describe the data variance, as can be inferred from the "elbow" of the scree test and the saturation of the cumulative variance around 95%, both highlighted by red dotted lines.
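The cumulative-variance rule in this caption (keep the smallest number of components that reaches about 95% of the variance) can be sketched with NumPy alone. This is an illustrative sketch on synthetic data built from two latent factors, not the study's spectra. The function name `components_for_variance` and the toy matrix are assumptions introduced here.

```python
import numpy as np


def components_for_variance(X, threshold=0.95):
    """Return (k, cumulative): k is the smallest number of principal
    components whose cumulative explained-variance ratio reaches threshold."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2  # proportional to the PCA eigenvalues
    cumulative = np.cumsum(variances) / variances.sum()
    # First index where the cumulative ratio is >= threshold, as a count.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative)), cumulative


# Synthetic stand-in: 50 samples, 30 "wavelengths", 2 latent factors + noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
loadings = np.zeros((2, 30))
loadings[0, :15] = 1.0
loadings[1, 15:] = 1.0
X = latent @ loadings + 0.01 * rng.normal(size=(50, 30))

k, cumulative = components_for_variance(X)
print(k, np.round(cumulative[:3], 3))
```

Plotting `variances` against the component index would give the scree plot, and `cumulative` the right-hand panel. The 0.95 threshold is a default chosen to mirror the caption and can be changed.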
Figure 4. Normalized principal components of the PCA (in blue) in the original feature space (i.e. the optical spectrum space). For each principal component, the reference spectrum of PpIX (for the state peaking at 634 nm) is also plotted (in red). For clarity, only one of the three fluorescence emission spectra from the original feature space is shown, as they are similar for all three excitation wavelengths. The first principal component is similar to the spectrum of PpIX, with a peak at 636 nm. The second principal component is best described as the autofluorescence of the measured tissue, i.e. the contribution of other fluorophores. The five following components all show a peak shifting between 620 nm and 636 nm.
Confusion matrix for K-means with 4 classes: tumor core, high density margin, low density margin and healthy tissue.
| | Predicted Core | Predicted HD Margin | Predicted LD Margin | Predicted Healthy |
|---|---|---|---|---|
| True Core (10) | 8 (80%) | 1.9 (19%) | 0 (0%) | 0.1 (1%) |
| True HD Margin (24) | 1.3 (5%) | 7.1 (30%) | 9.7 (40%) | 5.9 (25%) |
| True LD Margin (9) | 0 (0%) | 0.1 (1%) | 5.6 (62%) | 3.3 (37%) |
| True Healthy (7) | 0 (0%) | 0 (0%) | 1 (14%) | 6 (86%) |
Statistics (average, standard deviation) result from 20 predictions. In each cell, from left to right: number of instances, percentage of total class population and standard deviation.
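Each percentage in the table is the cell's average count divided by the size of its true class (the row total). A short check of that arithmetic, using only the counts printed in the table; the variable and function names are illustrative:

```python
classes = ["Core", "HD Margin", "LD Margin", "Healthy"]

# Average counts copied from the table: rows = true class, columns = predicted.
counts = [
    [8.0, 1.9, 0.0, 0.1],
    [1.3, 7.1, 9.7, 5.9],
    [0.0, 0.1, 5.6, 3.3],
    [0.0, 0.0, 1.0, 6.0],
]


def row_percentages(matrix):
    """Express each cell as a rounded percentage of its row (true-class) total."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([round(100 * value / total) for value in row])
    return result


percentages = row_percentages(counts)
for name, row, pct in zip(classes, counts, percentages):
    print(f"True {name} (n={sum(row):g}): {pct}")
```

The diagonal of `percentages` is the per-class recall: the share of each true class that was assigned to the matching cluster.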
Figure 5. Unsupervised classification in a t-SNE reduced space. From left to right: the histological truth as given by the anatomopathologist, K-means classification and GMM classification, both with 4 clusters.
Figure 6. Comparison of confusion matrices for K-means with 4 classes: tumor core, high- and low-density margin, and healthy tissue. Results are averaged over 20 predictions. ML model stands for machine learning-based model, HD for high density and LD for low density.