Yang Ding1,2, Sabrina Suffren1,2, Pierre Bellec2,3,4, Gregory A Lodygensky1,2,5.
Abstract
Quality control (QC) of brain magnetic resonance images (MRI) is an important process requiring a significant amount of manual inspection. Major artifacts, such as severe subject motion, are easy to identify to naïve observers but lack automated identification tools. Clinical trials involving motion-prone neonates typically pool data to obtain sufficient power, and automated quality control protocols are especially important to safeguard data quality. Current study tested an open source method to detect major artifacts among 2D neonatal MRI via supervised machine learning. A total of 1,020 two-dimensional transverse T2-weighted MRI images of preterm newborns were examined and classified as either QC Pass or QC Fail. Then 70 features across focus, texture, noise, and natural scene statistics categories were extracted from each image. Several different classifiers were trained and their performance was compared with subjective rating as the gold standard. We repeated the rating process again to examine the stability of the rating and classification. When tested via 10-fold cross validation, the random undersampling and adaboost ensemble (RUSBoost) method achieved the best overall performance for QC Fail images with 85% positive predictive value along with 75% sensitivity. Similar classification performance was observed in the analyses of the repeated subjective rating. Current results served as a proof of concept for predicting images that fail quality control using no-reference objective image features. We also highlighted the importance of evaluating results beyond mere accuracy as a performance measure for machine learning in imbalanced group settings due to larger proportion of QC Pass quality images.Entities:
Keywords: Canadian Neonatal Brain Platform; T2w; brain imaging; motion detection; neonatal; open source; quality control
Year: 2018 PMID: 30467922 PMCID: PMC6588009 DOI: 10.1002/hbm.24449
Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.038
Figure 1: Example of images that fail (left) and pass (right) subjective quality control
Figure 2: Graphical illustration of the analysis strategy and respective hypotheses
Similarity matrix between first and second classification of the same dataset
| | First rating: QC fail | First rating: QC pass | Total |
|---|---|---|---|
| Second rating: QC fail | 44 | 7 | 51 |
| Second rating: QC pass | 15 | 954 | 969 |
| Total | 59 | 961 | 1,020 |

QC = quality control.
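The four cells of this table summarize test–retest agreement between the two rating sessions. As an illustration (these statistics are not reported in the record itself), raw agreement and Cohen's kappa can be computed directly from the counts:

```python
# Confusion counts between first and second rating (from the table above):
#                  first: fail   first: pass
# second: fail          44            7
# second: pass          15          954
n = 44 + 7 + 15 + 954            # 1,020 images rated twice
p_observed = (44 + 954) / n      # proportion where both ratings agree

# Chance agreement from the marginal totals of each rating.
first_fail, first_pass = 44 + 15, 7 + 954        # 59, 961
second_fail, second_pass = 44 + 7, 15 + 954      # 51, 969
p_chance = (first_fail * second_fail + first_pass * second_pass) / n**2

# Cohen's kappa: agreement corrected for chance.
kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"agreement = {p_observed:.3f}, kappa = {kappa:.2f}")
```

A kappa near 0.79 would conventionally be read as substantial agreement, consistent with the paper's claim that the rating was stable enough to repeat the classification analyses.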
MANOVA Wilks' lambda tests measuring the amount of variance observed in the four image feature categories not accounted for by the variance in the subjective manual ratings

| Feature type | Wilks' lambda | F | Hypothesis DF | Error DF | p | Partial eta squared |
|---|---|---|---|---|---|---|
| Focus and blur | 0.11 | 342.5 | 24 | 995 | <.001 | 0.89 |
| Signal and noise | 0.87 | 10.3 | 14 | 1,005 | <.001 | 0.13 |
| Texture | 0.54 | 44.4 | 19 | 1,000 | <.001 | 0.46 |
| Natural scene statistics | 0.56 | 115.5 | 7 | 1,012 | <.001 | 0.44 |
| Combination of all feature types | 0.08 | 201.2 | 56 | 963 | <.001 | 0.92 |

DF = degrees of freedom.
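With a two-level grouping factor (QC Pass vs. QC Fail) the MANOVA effect has a single discriminant dimension, in which case partial eta squared reduces to 1 − Λ. The table's reported values are consistent with that relationship, as this quick sanity check (an illustration, not an analysis from the paper) shows:

```python
# Wilks' lambda and reported partial eta squared per feature type (from the table).
rows = {
    "Focus and blur": (0.11, 0.89),
    "Signal and noise": (0.87, 0.13),
    "Texture": (0.54, 0.46),
    "Natural scene statistics": (0.56, 0.44),
    "Combination of all feature types": (0.08, 0.92),
}

# For a two-level factor: partial eta^2 = 1 - Wilks' lambda.
consistent = all(
    abs((1 - wilks_lambda) - reported_eta2) < 1e-9
    for wilks_lambda, reported_eta2 in rows.values()
)
print("all rows consistent with eta^2 = 1 - lambda:", consistent)
```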
Classification performance of various classifiers tested
| Performance (10‐fold cross validation) | Precision (%) | Recall (%) | F1 score | Negative predictive value (%) | Specificity (%) | Overall prediction accuracy (%) |
|---|---|---|---|---|---|---|
| Linear discriminant analysis | 60.7 | 62.7 | 61.67 | 97.7 | 97.5 | 95.5 |
| Quadratic discriminant analysis | 46.2 | 81.4 | 58.90 | 98.8 | 94.2 | 93.4 |
| Naïve Bayes classifier | 78.4 | 49.2 | 60.42 | 96.9 | 99.2 | 96.3 |
| Decision tree | 66.1 | 69.5 | 67.77 | 98.1 | 97.8 | 96.2 |
| RUSBoost (min false negative) | 65.8 | 84.7 | 74.07 | 99.0 | 97.3 | 96.6 |
| RUSBoost (min false positive) | 84.6 | 74.6 | 79.28 | 98.5 | 99.2 | 97.7 |

Min = minimizing; RUSBoost = random undersampling and AdaBoost.
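The F1 score is the harmonic mean of precision and recall, so the reported F1 column can be reproduced from the precision and recall columns. A short consistency check against the two RUSBoost rows:

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    return 2 * precision_pct * recall_pct / (precision_pct + recall_pct)

# RUSBoost rows from the table: (precision %, recall %, reported F1)
rusboost_rows = [
    (65.8, 84.7, 74.07),  # min false negative
    (84.6, 74.6, 79.28),  # min false positive
]

# Reported F1 values agree with 2PR/(P+R) to rounding precision.
matches = all(
    abs(f1_score(p, r) - reported) < 0.01
    for p, r, reported in rusboost_rows
)
print("reported F1 matches 2PR/(P+R):", matches)
```

The same check holds for the other classifier rows, which is a useful sanity test when transcribing performance tables.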