Hanneke A Haijes, Maria van der Ham, Hubertus C M T Prinsen, Melissa H Broeks, Peter M van Hasselt, Monique G M de Sain-van der Velden, Nanda M Verhoeven-Duif, Judith J M Jans.
Abstract
Untargeted metabolomics may become a standard approach to address diagnostic requests, but, at present, data interpretation is very labor-intensive. To facilitate its implementation in metabolic diagnostic screening, we developed a method for automated data interpretation that preselects the most likely inborn errors of metabolism (IEM). The input parameters of the knowledge-based algorithm were (1) weight scores assigned to 268 unique metabolites for 119 different IEM based on literature and expert opinion, and (2) metabolite Z-scores and ranks based on direct-infusion high-resolution mass spectrometry. The output was a ranked list of differential diagnoses (DD) per sample. The algorithm was first optimized using a training set of 110 dried blood spots (DBS) comprising 23 different IEM and 86 plasma samples comprising 21 different IEM. Further optimization was performed using a set of 96 DBS consisting of 53 different IEM. The diagnostic value was validated in a set of 115 plasma samples, which included 58 different IEM and resulted in the correct diagnosis being included in the DD of 72% of the samples, comprising 44 different IEM. The median length of the DD was 10 IEM, and the correct diagnosis ranked first in 37% of the samples. Here, we demonstrate the accuracy of the diagnostic algorithm in preselecting the most likely IEM, based on the untargeted metabolomics of a single sample. We show, as a proof of principle, that automated data interpretation has the potential to facilitate the implementation of untargeted metabolomics for metabolic diagnostic screening, and we provide suggestions for further optimization of the algorithm to improve diagnostic accuracy.
Keywords: IEM; automated data interpretation; diagnostics; direct-infusion high-resolution mass spectrometry; inborn errors of metabolism; next generation metabolic screening; untargeted metabolomics
Year: 2020 PMID: 32024143 PMCID: PMC7037085 DOI: 10.3390/ijms21030979
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Performance of automated data interpretation for patient sample sets.
| | Training Sets | Training Sets | Optimization Set | Validation Set |
|---|---|---|---|---|
| Matrix | DBS | Plasma | DBS | Plasma |
| Samples | 110 | 86 | 96 | 115 |
| Patients | 42 | 38 | 96 | 115 |
| IEM | 23 | 21 | 53 | 58 |
| Correct IEM in DD (n; %) | 86/110; 78% | 68/86; 79% | 68/96; 71% | 83/115; 72% |
| Correct IEM in top 3 of DD (n; %) | 74/110; 67% | 36/86; 42% | 60/96; 63% | 65/115; 57% |
| Correct IEM ranked first (n; %) | 46/110; 42% | 28/86; 33% | 38/96; 40% | 43/115; 37% |
| Length DD (median; (5th–95th)) | 8; (2–14) | 12; (3–25) | 8; (1–23) | 10; (3–22) |
DBS: dried blood spots; DD: differential diagnosis; IEM: inborn error of metabolism; n: total number; 5th: fifth percentile; 95th: ninety-fifth percentile.
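
The percentages in the patient-sample table follow directly from the counts. As a quick arithmetic check, the sketch below recomputes them. The names `COUNTS` and `percentage` are illustrative only, and rounding halves up is an assumption; it is the convention under which 60/96 (62.5%) gives the 63% shown.

```python
# Counts copied from the patient-sample table as (correct, total).
# Column order: training DBS, training plasma, optimization DBS, validation plasma.
COUNTS = {
    "Correct IEM in DD": [(86, 110), (68, 86), (68, 96), (83, 115)],
    "Correct IEM in top 3 of DD": [(74, 110), (36, 86), (60, 96), (65, 115)],
    "Correct IEM ranked first": [(46, 110), (28, 86), (38, 96), (43, 115)],
}


def percentage(correct: int, total: int) -> int:
    """Whole-number percentage, rounding halves up (assumed convention)."""
    # Integer arithmetic avoids Python's round(), which rounds halves to even.
    return (200 * correct + total) // (2 * total)


if __name__ == "__main__":
    for row, cells in COUNTS.items():
        formatted = " | ".join(f"{c}/{t}; {percentage(c, t)}%" for c, t in cells)
        print(f"{row}: {formatted}")
```

Integer arithmetic is used on purpose: Python's built-in `round(62.5)` returns 62, which would not match the table.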
Performance of automated data interpretation for control sample sets.
| | Training Sets | Training Sets | Optimization Set | Validation Set |
|---|---|---|---|---|
| Matrix | DBS | Plasma | DBS | Plasma |
| Samples | 105 | 84 | 66 | 83 |
| Individuals | 30 | 28 | 48 | 28 |
| Length DD (median; (5th–95th)) | 2; (0–12) | 3; (0–11) | 2; (0–8) | 3; (0–10) |
DBS: dried blood spots; DD: differential diagnosis; 5th: fifth percentile; 95th: ninety-fifth percentile.
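
The "median; (5th–95th)" summary reported for DD length can be reproduced from per-sample DD lengths along the following lines. This is a minimal sketch: the function name `summarize_dd_lengths` is hypothetical, the example lengths are invented for illustration and are not study data, and the interpolation method (`"inclusive"`) is an assumption, since the record does not state how the percentiles were computed.

```python
import statistics


def summarize_dd_lengths(lengths):
    """Return (median, 5th percentile, 95th percentile) of DD lengths."""
    # quantiles(n=20) yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(lengths, n=20, method="inclusive")
    return statistics.median(lengths), cuts[0], cuts[-1]


if __name__ == "__main__":
    # Hypothetical DD lengths for eight control samples, NOT data from the study.
    example_lengths = [0, 1, 2, 2, 3, 3, 5, 11]
    median, p5, p95 = summarize_dd_lengths(example_lengths)
    print(f"{median}; ({p5} to {p95})")
```

Different percentile conventions can shift the bounds slightly for small sample sets, so values computed this way need not match the table exactly.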
Figure 1. Overview of the diagnostic algorithm. CSF: cerebrospinal fluid; DBS: dried blood spots; IEM: inborn error of metabolism; m/z: mass to charge ratio. The * indicates multiplication. The Rank (Ranked 1-…) is calculated as follows: positive Z-scores are ranked from the maximum Z-score down to zero (the highest observed Z-score takes position 1), and negative Z-scores are ranked from the minimum Z-score (position 1) up to zero.
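
The ranking rule in the Figure 1 caption can be sketched as follows. The function name `rank_z_scores` and the metabolite labels are hypothetical. Two details are assumptions because the caption does not address them: a Z-score of exactly zero is left unranked, and ties keep their input order.

```python
def rank_z_scores(z_scores):
    """Rank positive and negative Z-scores separately, each starting at 1.

    z_scores maps a metabolite label to its Z-score.
    """
    # Positive Z-scores: the largest value takes rank 1.
    positives = sorted(
        (m for m, z in z_scores.items() if z > 0),
        key=lambda m: z_scores[m],
        reverse=True,
    )
    # Negative Z-scores: the most negative value takes rank 1.
    negatives = sorted(
        (m for m, z in z_scores.items() if z < 0),
        key=lambda m: z_scores[m],
    )
    ranks = {}
    for position, metabolite in enumerate(positives, start=1):
        ranks[metabolite] = position
    for position, metabolite in enumerate(negatives, start=1):
        ranks[metabolite] = position
    # Assumption: metabolites with a Z-score of exactly zero receive no rank.
    return ranks


if __name__ == "__main__":
    # Hypothetical Z-scores for illustration only.
    example = {"A": 5.2, "B": 1.1, "C": -3.0, "D": -0.4, "E": 9.7}
    print(rank_z_scores(example))
```

Because the two directions are ranked independently, a strongly increased and a strongly decreased metabolite can both hold rank 1 in the same sample.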
Figure 2. Flowchart of the six phases of the development and optimization of the diagnostic algorithm, with intermediate assessments of the performance of the algorithm in the training sets, DBS optimization set and plasma validation set. DBS: dried blood spots.