Niloufar Zarinabad, Martin Wilson, Simrandip K Gill, Karen A Manias, Nigel P Davies, Andrew C Peet.
Abstract
PURPOSE: Classification of pediatric brain tumors from 1H-magnetic resonance spectroscopy (MRS) can aid diagnosis and management of brain tumors. However, the varied incidence of the different tumor types leads to imbalanced class sizes and makes rare tumor groups difficult to classify. This study assessed different imbalanced multiclass learning techniques and compared the use of complete spectra and quantified metabolite profiles for classification of three main childhood brain tumor types.
Keywords: MR spectroscopy; brain tumors; classification; diagnosis; imbalanced learning
Year: 2016 PMID: 27404900 PMCID: PMC5484359 DOI: 10.1002/mrm.26318
Source DB: PubMed Journal: Magn Reson Med ISSN: 0740-3194 Impact factor: 4.668
Figure 1. Experiment overview. These sets have been applied to both complete spectra and quantified metabolite profiles.
Estimated Metabolite Concentration ± Standard Deviation of the Three Tumour Types as Calculated by TARQUIN.
| Metabolite | Ependymomas | Medulloblastoma | Pilocytic Astrocytoma | P value |
|---|---|---|---|---|
| Citrate | 0.85 ± 0.48 | 0.51 ± 0.36 | 0.30 ± 0.30 | <0.001 |
| Creatine | 2.04 ± 0.89 | 1.46 ± 1.34 | 0.73 ± 0.95 | 0.001 |
| Glycerophosphocholine | 1.56 ± 0.89 | 2.22 ± 1.16 | 0.77 ± 0.50 | <0.001 |
| Glucose | 2.04 ± 1.23 | 1.6 ± 1.85 | 2.61 ± 1.83 | 0.046 |
| Glutamine | 2.80 ± 1.69 | 2.34 ± 1.97 | 3.20 ± 2.05 | 0.137 |
| Glutathione | 0.60 ± 0.71 | 0.54 ± 0.54 | 0.19 ± 0.38 | 0.003 |
| Glutamate | 3.94 ± 1.87 | 3.73 ± 2.73 | 2.03 ± 1.12 | <0.001 |
| Glycine | 2.60 ± 4.20 | 3.21 ± 2.50 | 0.29 ± 0.60 | <0.001 |
| Myo‐inositol | 6.19 ± 4.78 | 1.37 ± 3.52 | 1.23 ± 1.92 | <0.001 |
| Lactate | 2.16 ± 1.64 | 2.30 ± 1.6 | 1.98 ± 1.05 | 0.557 |
| NAA | 0.24 ± 0.28 | 0.32 ± 0.40 | 0.56 ± 0.58 | 0.043 |
| NAAG | 0.70 ± 0.44 | 0.78 ± 0.58 | 0.95 ± 0.66 | 0.306 |
| Phosphocholine | 0.47 ± 0.40 | 1.14 ± 0.79 | 0.34 ± 0.33 | <0.001 |
| Phosphocreatine | 2.12 ± 1.66 | 1.71 ± 1.19 | 0.48 ± 0.55 | <0.001 |
| Scyllo‐inositol | 0.21 ± 0.30 | 0.32 ± 0.38 | 0.03 ± 0.12 | <0.001 |
| Taurine | 1.58 ± 1.49 | 3.46 ± 2.50 | 0.66 ± 0.68 | <0.001 |
| Total NAA (NAA + NAAG) | 0.94 ± 0.49 | 1.11 ± 0.77 | 1.50 ± 0.89 | 0.030 |
| Total choline (glycerophosphocholine + phosphocholine) | 2.03 ± 1.14 | 3.37 ± 1.59 | 1.12 ± 0.44 | <0.001 |
| Total creatine (creatine + phosphocreatine) | 4.17 ± 1.89 | 3.18 ± 1.85 | 1.22 ± 1.29 | <0.001 |
| Glx (glutamine + glutamate) | 6.75 ± 2.63 | 6.07 ± 3.32 | 5.27 ± 1.18 | 0.189 |
| TLM09 | 6.11 ± 2.27 | 6.90 ± 4.12 | 3.91 ± 1.83 | <0.001 |
| TLM13 | 20.85 ± 16.4 | 19.88 ± 14.2 | 7.24 ± 5.0 | <0.001 |
| TLM20 | 8.04 ± 3.85 | 9.40 ± 3.89 | 5.0 ± 2.21 | <0.001 |
Abbreviations: NAA, N‐acetylaspartate; NAAG, N‐acetylaspartylglutamate; TLM, total lipids and macromolecules.
Data are presented as the mean ± standard deviation.
P values were calculated using Kruskal‐Wallis analysis of variance with α = 0.05 for ependymomas (n = 10) versus medulloblastoma (n = 38) versus pilocytic astrocytoma (n = 42).
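The Kruskal‐Wallis comparison named in the table footnote can be sketched in plain Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the input values are made up, and the P value relies on the chi-square approximation, whose survival function for three groups (2 degrees of freedom) reduces to exp(-H/2).

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        tie_sizes.append(end - start + 1)
        start = end + 1
    return ranks, tie_sizes


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for exactly three groups (chi-square approximation, 2 df)."""
    groups = [a, b, c]
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)

    weighted = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        weighted += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Standard tie correction (undefined if every value is identical).
    h /= 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)

    p = math.exp(-h / 2)  # chi-square survival function for 2 degrees of freedom
    return h, p


# Made-up concentrations for three groups, for illustration only.
h_stat, p_value = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

With group sizes as small as n = 10, an exact or permutation-based P value may differ from this approximation.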
Figure 2. Mean ± standard deviation spectra for the original (n = 10) and synthetically generated (n = 20) ependymoma samples using bSMOTE and five nearest neighbors. Instances are created based on the original ependymoma (World Health Organization [WHO] grade II) and anaplastic ependymoma (WHO grade III) samples.
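To make the oversampling step concrete, the following is a minimal sketch of borderline SMOTE, assuming bSMOTE refers to the Borderline-SMOTE1 variant. It is not the authors' implementation: the function names, the toy two-dimensional points, and the choice of m = 5 neighbors for the borderline test are assumptions made for illustration. A minority sample counts as borderline ("danger") when at least half, but not all, of its m nearest neighbors belong to other classes; each synthetic sample is then placed at a random position on the line between a borderline sample and one of its k nearest minority neighbors.

```python
import math
import random


def k_nearest(point, pool, k):
    """Return the k (vector, is_minority) items in pool closest to point.

    The query object itself is skipped by identity.
    """
    others = (item for item in pool if item[0] is not point)
    return sorted(others, key=lambda item: math.dist(point, item[0]))[:k]


def borderline_smote(minority, majority, m=5, k=5, per_sample=1, seed=0):
    """Generate synthetic minority vectors from borderline minority samples."""
    rng = random.Random(seed)
    minority_pool = [(v, True) for v in minority]
    pool = minority_pool + [(v, False) for v in majority]
    synthetic = []
    for p in minority:
        neighbours = k_nearest(p, pool, m)
        n_major = sum(1 for _, is_min in neighbours if not is_min)
        # Keep only "danger" samples: at least half, but not all, of the
        # m nearest neighbours are majority samples.
        if not (m / 2 <= n_major < m):
            continue
        candidates = k_nearest(p, minority_pool, k)
        for _ in range(per_sample):
            q = rng.choice(candidates)[0]
            gap = rng.random()  # position along the segment from p to q
            synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])
    return synthetic


# Toy 2-D data (hypothetical): one minority point sits among majority points.
minority = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [1.0, 1.0]]
majority = [[1.1, 1.0], [0.9, 1.1], [1.0, 0.8], [1.2, 1.2]]
synthetic = borderline_smote(minority, majority, m=5, k=5, per_sample=2)
print(len(synthetic), "synthetic samples generated")
```

In this toy set only the minority point at (1.0, 1.0) is borderline, so all synthetic samples originate from it. For real spectra or metabolite profiles, each vector would hold one value per spectral point or metabolite.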
Figure 3. FreeViz two‐dimensional linear projection graphs visualizing the interactions among tumor groups in original and bSMOTE training data sets. Linear projection of the data provides new insight into the data through visualization of the space with reduced dimensionality. This method finds an optimal two‐dimensional linear projection of the given data, where the quality is defined by a separation of the data from different classes and the proximity of the instances from the same class. Base vectors of projection represent relevant metabolic features. Tumor types are identified by their color. Features with longer projection of the base vector are those with a higher impact on the placement of the instances in the two‐dimensional projection. Because this technique optimizes the projection with respect to classification of the groups, features with higher impact on classification outcome generally will have longer base vectors.
Number of Trees used in Random Forest Classifier at Different Oversampling Rates.
| bSMOTE % | 0 | 50 | 100 | 150 | 200 |
|---|---|---|---|---|---|
| Metabolite profile | 25 | 25 | 50 | 70 | 90 |
| Complete spectra | 78 | 110 | 165 | 170 | 170 |
Figure 4. Performance comparison among five classification methods at different bSMOTE oversampling rates for complete spectra and metabolite profiles using the 10‐fold cross‐validation evaluation technique.
BAR of the Pattern Recognition Techniques Obtained Using Metabolite Profiles and Complete Spectra as the Classifier Input at All bSMOTE Oversampling Rates.
| Classifier and input | Original | 0.5 | 1 | 1.5 | 2 |
|---|---|---|---|---|---|
| AdaBoostM1‐NB | | | | | |
| Spectra | 0.56 | 0.82 | 0.80 | 0.82 | 0.80 |
| Metabolite | 0.70 | 0.76 | 0.82 | 0.90 | 0.89 |
| AdaBoostM1‐SVM | | | | | |
| Spectra | 0.67 | 0.83 | 0.87 | 0.90 | 0.86 |
| Metabolite | 0.76 | 0.78 | 0.88 | 0.90 | 0.93 |
| AdaBoostM1‐NN | | | | | |
| Spectra | 0.63 | 0.84 | 0.79 | 0.83 | 0.82 |
| Metabolite | 0.78 | 0.83 | 0.93 | 0.92 | 0.92 |
| AdaBoostM1‐LDA | | | | | |
| Spectra | 0.58 | 0.70 | 0.76 | 0.80 | 0.80 |
| Metabolite | 0.82 | 0.79 | 0.86 | 0.93 | 0.91 |
| Random forests | | | | | |
| Spectra | 0.73 | 0.79 | 0.75 | 0.84 | 0.80 |
| Metabolite | 0.72 | 0.76 | 0.82 | 0.85 | 0.90 |
Figure 5. Box plots represent BARs (a) and ependymoma F‐measures (b) obtained at different sampling rates with all learning algorithms comparing complete spectra and metabolite profiles as classifier input.
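The two reported metrics can be computed from a confusion matrix as sketched below, assuming BAR denotes the balanced accuracy rate (the mean of the per-class recalls) and that the F‐measure is the harmonic mean of precision and recall for a single class. The function names and the confusion matrix are hypothetical: only the row totals (10, 38, 42) follow the class sizes given above, and the individual counts are invented for illustration.

```python
def balanced_accuracy_rate(cm):
    """Mean of per-class recalls; cm[i][j] = samples of true class i predicted as j."""
    recalls = [row[i] / sum(row) for i, row in enumerate(cm)]
    return sum(recalls) / len(recalls)


def f_measure(cm, cls):
    """Harmonic mean of precision and recall for class index cls."""
    true_pos = cm[cls][cls]
    predicted = sum(row[cls] for row in cm)  # column total
    actual = sum(cm[cls])                    # row total
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion matrix; rows and columns are ordered as
# ependymoma, medulloblastoma, pilocytic astrocytoma.
cm = [
    [7, 2, 1],
    [1, 35, 2],
    [0, 3, 39],
]
accuracy = sum(cm[i][i] for i in range(3)) / sum(map(sum, cm))
bar = balanced_accuracy_rate(cm)
f_epn = f_measure(cm, 0)
print(f"accuracy = {accuracy:.3f}, BAR = {bar:.3f}, ependymoma F = {f_epn:.3f}")
```

Because every class contributes equally to BAR, errors on the small ependymoma group lower it more than they lower overall accuracy, which is why it suits imbalanced class sizes.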