Elaine Zaunseder, Saskia Haupt, Ulrike Mütze, Sven F Garbade, Stefan Kölker, Vincent Heuveline.
Abstract
The development and continuous optimization of newborn screening (NBS) programs remains an important and challenging task due to the low prevalence of screened diseases and the high sensitivity requirements for screening methods. Recently, different machine learning (ML) methods have been applied to support NBS. However, most studies focus only on single diseases or specific ML techniques, making it difficult to draw conclusions on which methods are best to implement. Therefore, we performed a systematic literature review of peer-reviewed publications on ML-based NBS methods. Overall, 125 related papers, published in the past two decades, were collected for the study, and 17 met the inclusion criteria. We analyzed the opportunities and challenges of ML methods for NBS, including data preprocessing, classification models and pattern recognition methods, based on their underlying approaches, data requirements, interpretability on a modular level, and performance. In general, ML methods have the potential to reduce the false positive rate and to identify previously unknown metabolic patterns within NBS data. Our analysis revealed that, among the methods presented, logistic regression analysis and support vector machines seem to be valuable candidates for NBS. However, due to the variety of diseases and methods, a general recommendation for a single method in NBS is not possible. Instead, these methods should be further investigated and compared to other approaches in comprehensive studies, as they show promising results in NBS applications.
Keywords: data mining; data preprocessing; data science; deep learning; machine learning; modeling; neonatal screening; pattern recognition
Year: 2022 PMID: 35433168 PMCID: PMC8995842 DOI: 10.1002/jmd2.12285
Source DB: PubMed Journal: JIMD Rep ISSN: 2192-8304
FIGURE 1 Illustration of the machine learning pipeline in NBS. The classification model is the essential part of the ML pipeline in NBS, including the interpretable and noninterpretable classification methods and their performance optimization. Data preprocessing is an optional module applied before the classification model. It can include data sampling, feature selection and feature construction methods. Pattern recognition is applied after the classification method, evaluating feature importance for biomarker discovery. ML, machine learning; NBS, newborn screening
Summary of all reviewed studies on applied data imbalance, feature construction, feature selection and ML classification methods
| Author | Disease | Data imbalance | Feature construction | Feature selection | ML classification |
|---|---|---|---|---|---|
| Baumgartner et al. | PKU | Random sampling | | Information gain | DT, LRA |
| Baumgartner et al. | MCADD, PKU | Random sampling | | Information gain, relief‐based | LDA, DT, KNN, LRA, NN, SVM |
| Baumgartner et al. | 3‐MCCD*, MCADD, PKU | Random sampling | Diagnostic flag | | DT, LRA |
| Baumgartner et al. | 3‐MCCD*, PKU, GA1, MMA, PA, MCADD, LCHADD | Random sampling | | Discriminatory threshold | KNN, LRA, Naive Bayes, NN, SVM |
| Ho et al. | MCADD | Informed sampling | Arithmetic ratio | | Rule learner |
| Hsieh et al. | MMA | | | Pearson coefficient | SVM |
| Hsieh et al. | MMA | Random sampling | | Pearson coefficient | SVM |
| Van den Bulcke et al. | MCADD | Oversampling | Arithmetic ratio | Variable set optimization | DT, LRA, Ridge‐LRA |
| Chen et al. | PKU | | | Fisher score | SVM |
| Chen et al. | 3‐MCCD*, PKU, MET | | Arithmetic ratio | Fisher score, Variable set optimization | SVM |
| Lin et al. | CIT1, CIT2, CPT1D, GA1, IBDD, IVA, MADD, MET, MMA, MSUD, PA, PKU, PTPSD, SCADD*, VLCADD | Random sampling, oversampling, informed sampling | | | Bagging, Boosting, DT, KNN, LDA, LRA, RF, SVM |
| Peng et al. | MMA | Oversampling | | | RF |
| Wang et al. | SCADD*, MCADD, VLCADD | | Arithmetic ratio | Discriminatory threshold | LRA |
| Zarin Mousavi et al. | CH | | | | Bagging, Boosting, DT, NN, SVM |
| Peng et al. | GA1, MMA, OTCD, VLCADD | Second tier | | | RF |
| Zhu et al. | PKU | | Arithmetic ratio | Pearson coefficient, LVQ | LRA |
| Lasarev et al. | CAH | Informed sampling | PCA | | DT |
Note: Diseases with * are biochemical variations nowadays known as nondiseases.
Abbreviations: CAH, congenital adrenal hyperplasia; CH, congenital hypothyroidism; CIT1, citrullinemia type I; CIT2, citrullinemia type II; CPT1D, carnitine palmitoyltransferase I deficiency; DT, decision tree; GA1, glutaric aciduria type I; IBDD, isobutyryl‐CoA dehydrogenase deficiency; IVA, isovaleric aciduria; KNN, K‐nearest neighbors; LCHADD, long‐chain 3‐hydroxyacyl‐CoA dehydrogenase deficiency; LDA, linear discriminant analysis; LRA, logistic regression analysis; LVQ, learning vector quantization; MADD, multiple acyl‐CoA dehydrogenase deficiency; MCADD, medium‐chain acyl‐CoA dehydrogenase deficiency; 3‐MCCD, 3‐methylcrotonyl‐CoA carboxylase deficiency; MET, hypermethioninemia; MMA, methylmalonic aciduria; MSUD, maple syrup urine disease; NN, neural network; OTCD, ornithine transcarbamylase deficiency; PA, propionic aciduria; PCA, principal component analysis; PKU, phenylketonuria; PTPSD, 6‐pyruvoyl‐tetrahydrobiopterin synthetase deficiency; RF, random forest; Ridge‐LRA, logistic ridge regression; SCADD, short‐chain acyl‐CoA dehydrogenase deficiency; SVM, support vector machine; VLCADD, very long‐chain acyl‐CoA dehydrogenase deficiency.
Sensitivity, specificity and positive predictive value (PPV) of considered ML classification methods
| Disease | ML classification | Sensitivity (%) | Specificity (%) | PPV (%) | Author |
|---|---|---|---|---|---|
| (A) Comparative ML classification studies | | | | | |
| PKU | LRA | 100 | 99.793 | 17.41 | Baumgartner et al. |
| | LRA | 98.0 | 99.9 | – | Baumgartner et al. |
| | LRA | 96.809 | 99.905 | 49.46 | Baumgartner et al. |
| MMA | NN | 98.0 | – | 98.0 | Baumgartner et al. |
| MCADD | Ridge‐LRA | 100 | 99.987 | 33.90 | Van den Bulcke et al. |
| | LRA | 96.83 | 99.992 | 88.41 | Baumgartner et al. |
| | LRA | 95.238 | 99.992 | 88.24 | Baumgartner et al. |
| 3‐MCCD* | LRA | 95.455 | 99.957 | 33.33 | Baumgartner et al. |
| CH | Bagging‐SVM | 73.33 | 100 | – | Zarin Mousavi et al. |
| CIT2, MET, MMA, PKU, SCADD* | SVM | 91.30 | 36.36 | 19.29 | Lin et al. |
| (B) Single ML classification studies | | | | | |
| PKU | SVM | 100 | 99.997 (99.971) | – | Chen et al. |
| | SVM | 100 (100) | 99.98 (99.96) | – | Chen et al. |
| | LRA | 97.66 | 31.61 | 24.59 | Zhu et al. |
| MMA | SVM | 100 (100) | 100 (99.79) | – | Hsieh et al. |
| | RF | 100 (100) | 89.678 (81.226) | 26.40 (16.40) | Peng et al. |
| | RF | 96.117 (96.117) | | 28.9 (16.5) | Peng et al. |
| | SVM | 95.9 (81.4) | 95.6 (76.2) | – | Hsieh et al. |
| MCADD | LRA | | | 18.2 (3.4) | Wang et al. |
| | RL | 100 (100) | 99.901 (98.463) | 93.75 (49.18) | Ho et al. |
| GA1 | RF | | | 22.30 (3.10) | Peng et al. |
| 3‐MCCD* | SVM | 100 | 99.936 (99.711) | – | Chen et al. |
| MET | SVM | 100 | 99.986 (99.958) | – | Chen et al. |
| VLCADD | LRA | | | 100 (100) | Wang et al. |
| | RF | | | 23.40 (23.10) | Peng et al. |
| OTCD | RF | | | 62.10 (3.50) | Peng et al. |
| SCADD* | LRA | | | 73.3 (22.0) | Wang et al. |
| CAH | DT | | | 66.7 (20) | Lasarev et al. |
Note: (A) Values of the best performing ML classification methods, with highest sensitivity and specificity, in comparative studies. If presented in the study, these are the results from the largest or unknown validation datasets. (B) All results of studies applying a single classification method. If sensitivity and specificity were not stated in the study, the results are calculated based on the published contingency table and given in italics. Results in brackets show the comparison to traditional NBS, where given. Diseases with * are biochemical variations nowadays known as nondiseases. The results from Lin et al. are presented in a separate row, since they only report average evaluation results for groups of diseases. Most studies applied sampling algorithms, which change the sick‐to‐control ratio, and reduced datasets, such as datasets including only false positive patients from traditional screening. Hence, the performance results and reference values of Table 2 have to be evaluated and compared carefully.
Abbreviations: CAH, congenital adrenal hyperplasia; CH, congenital hypothyroidism; CIT2, citrullinemia type II; DT, decision tree; GA1, glutaric aciduria type I; LRA, logistic regression analysis; MCADD, medium‐chain acyl‐CoA dehydrogenase deficiency; 3‐MCCD, 3‐methylcrotonyl‐CoA carboxylase deficiency; MET, hypermethioninemia; MMA, methylmalonic aciduria; NN, neural network; OTCD, ornithine transcarbamylase deficiency; PKU, phenylketonuria; RF, random forest; RL, rule learner; Ridge‐LRA, logistic ridge regression; SCADD, short‐chain acyl‐CoA dehydrogenase deficiency; SVM, support vector machine; VLCADD, very long‐chain acyl‐CoA dehydrogenase deficiency.
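The table note mentions that some values were calculated from published contingency tables. A minimal sketch of that calculation is given below, together with the standard Bayes relation between prevalence and PPV, which illustrates why PPV can stay low at very high specificity. The function names, counts and the prevalence of 1 in 10 000 are hypothetical illustrations and are not taken from any reviewed study.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and PPV (in percent) from a 2x2 contingency table."""
    def pct(num: int, den: int) -> float:
        return 100.0 * num / den if den else float("nan")

    return {
        "sensitivity": pct(tp, tp + fn),  # share of sick newborns flagged
        "specificity": pct(tn, tn + fp),  # share of healthy newborns cleared
        "ppv": pct(tp, tp + fp),          # share of flagged newborns who are sick
    }


def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV as a fraction, via Bayes' rule; all inputs are fractions in [0, 1]."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts (assumption, for illustration only)
metrics = screening_metrics(tp=19, fp=38, fn=1, tn=99942)

# Hypothetical rare disease: perfect sensitivity, 99.9% specificity, prevalence 1 in 10 000
rare_ppv = ppv_from_prevalence(sensitivity=1.0, specificity=0.999, prevalence=1e-4)
```

Under these assumed inputs, fewer than one in ten positive screens would be a true case, which is consistent with the caution in the note: sampling that changes the sick‐to‐control ratio also changes the PPV, so PPV values from resampled datasets are not directly comparable.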
FIGURE 2 PRISMA flow diagram describing the two‐stage search procedure for studies identified, screened, included, and excluded for this review
FIGURE 3 Applied sampling methods. Imbalanced datasets consist of healthy patients, special subsets of healthy patients, and sick patients. Oversampling adds synthetically created sick patients to the dataset. Undersampling methods reduce the number of data points: random sampling randomly excludes healthy patients, whereas informed sampling excludes only specific subsets of healthy patients. In each box the studies applying the respective method are given
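The three sampling ideas named in the caption can be sketched as follows. This is a simplified illustration using only the Python standard library, not the procedure of any reviewed study: records are assumed to be lists of numeric features, the `keep` predicate is a hypothetical stand-in for an informed selection rule, and the oversampler uses plain linear interpolation between two sick records as one possible way to create synthetic cases.

```python
import random


def random_undersample(healthy, sick, ratio, seed=0):
    """Randomly keep at most `ratio` healthy records per sick record."""
    rng = random.Random(seed)
    n_keep = min(len(healthy), ratio * len(sick))
    return rng.sample(healthy, n_keep), list(sick)


def informed_undersample(healthy, sick, keep):
    """Keep only the healthy records selected by the predicate `keep`."""
    return [h for h in healthy if keep(h)], list(sick)


def oversample(healthy, sick, n_new, seed=0):
    """Add synthetic sick records by interpolating between two random sick records.

    Requires at least two sick records.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(sick, 2)
        t = rng.random()
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return list(healthy), list(sick) + synthetic


# Hypothetical toy data: 1000 healthy and 3 sick records with two features each
healthy = [[float(i), 1.0] for i in range(1000)]
sick = [[5000.0, 9.0], [5200.0, 8.0], [5100.0, 9.5]]

h_random, _ = random_undersample(healthy, sick, ratio=10)
h_informed, _ = informed_undersample(healthy, sick, keep=lambda r: r[0] >= 990.0)
_, s_over = oversample(healthy, sick, n_new=7)
```

In this toy setup, random undersampling leaves 30 healthy records, the informed rule leaves the 10 healthy records closest to the sick group, and oversampling raises the sick group from 3 to 10 records.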
FIGURE 4 Applied feature selection strategies. Prealgorithm strategies (left) work independently of the ML classification method and are either filter methods, using statistical properties, or informed methods, using clinical knowledge. Postalgorithm methods (right) are directly embedded within the classification method or wrapped around it via an iterative loop. In each box the studies applying the respective method are given. ML, machine learning
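A filter method of the prealgorithm kind can be sketched as a ranking of features by a statistic computed independently of any classifier. The example below uses the absolute Pearson correlation between each feature and the binary label; the marker names and values are hypothetical toy data, and the reviewed studies may have computed their scores differently.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(rows, labels, names):
    """Rank features by |Pearson r| with the binary label, highest first."""
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, labels))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: three markers, four healthy (0) and two sick (1) samples
names = ["marker_a", "marker_b", "marker_c"]
rows = [
    [1.0, 5.0, 3.0],
    [1.2, 4.0, 3.0],
    [0.9, 6.0, 3.0],
    [1.1, 5.5, 3.0],
    [4.0, 5.2, 3.0],
    [4.5, 4.8, 3.0],
]
labels = [0, 0, 0, 0, 1, 1]

ranking = rank_features(rows, labels, names)
top_k = [name for name, _ in ranking[:1]]  # keep the single best-scoring feature
```

Because the score never consults a classifier, the same ranking can be reused with any downstream model; wrapper and embedded methods, by contrast, would re-evaluate the classifier for each candidate feature set.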
FIGURE 5 ML classification methods applied in NBS. The methods are distinguished according to their interpretability and functionality. Methods that are interpretable on a modular level are shown on the left; methods that are noninterpretable on a modular level (right) can be split into ensemble and deep learners or other methods. In each box the studies applying the respective method are given. ML, machine learning; NBS, newborn screening