Zhidai Liu, Tingting Zhou, Xing Han, Tingyuan Lang, Shan Liu, Penghui Zhang, Haiyan Liu, Kexing Wan, Jie Yu, Liang Zhang, Liyan Chen, Roger W Beuerman, Bin Peng, Lei Zhou, Lin Zou.
Abstract
BACKGROUND: Altered concentrations of amino acids have been found in the bone marrow or blood of leukemia patients. Metabolomics combined with a mathematical model of biomarkers could be used to assist the diagnosis of pediatric acute leukemia (AL).
Keywords: Acute leukemia; Amino acid panel; Mass spectrometry; Mathematical model
Year: 2019 PMID: 30674317 PMCID: PMC6343345 DOI: 10.1186/s12967-019-1783-9
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Fig. 1 Overview of the study design. In the model establishment phase, 240 children with newly diagnosed AL (ALL/AML = 174/66), 284 children with non-neoplastic hematological diseases and 220 healthy children were recruited for amino acid quantification by LC–MS/MS (red part). Based on the concentrations of 17 amino acids in all patients and controls, we evaluated the differences among groups (red part) and established the best model with Python-sklearn (green part). The model was then improved and verified by parameter tuning and cross-validation (green part). Finally, another prospective independent cohort consisting of 280 children with newly diagnosed AL (ALL/AML = 184/96) and 308 children with non-neoplastic hematological diseases was used for further clinical verification as the Out-Sample Test (purple part)
Concentrations of amino acids among children in Group A
| Amino acid | AL children | Ctrl | Healthy children | P value |
|---|---|---|---|---|
| Ala | 134.59 ± 49.41 | 148.99 ± 47.90 | 144.87 ± 47.93 | 0.084 |
| Asp | 17.59 ± 8.58 | 14.17 ± 3.39 | 13.63 ± 2.77 | < 0.001 |
| Glu | 27.94 ± 14.69 | 20.84 ± 4.95 | 29.75 ± 6.49 | < 0.001 |
| Met | 18.35 ± 11.18 | 21.88 ± 10.17 | 16.24 ± 6.61 | 0.002 |
| Phe | 59.81 ± 23.23 | 35.96 ± 7.77 | 49.99 ± 28.98 | < 0.001 |
| Tyr | 35.85 ± 14.88 | 31.17 ± 10.62 | 39.87 ± 18.55 | 0.001 |
| Leu | 55.81 ± 16.56 | 66.57 ± 15.64 | 62.82 ± 15.46 | < 0.001 |
| Trp | 20.28 ± 13.15 | 14.77 ± 4.06 | 17.27 ± 5.18 | < 0.001 |
| Val | 95.41 ± 28.42 | 103.12 ± 23.40 | 112.37 ± 29.90 | 0.001 |
| Arg | 60.68 ± 20.43 | 60.83 ± 15.51 | 66.12 ± 19.70 | 0.177 |
| Cit | 11.90 ± 4.44 | 16.45 ± 4.25 | 15.40 ± 5.93 | < 0.001 |
| Gly | 80.60 ± 33.77 | 66.36 ± 14.91 | 69.04 ± 19.80 | < 0.001 |
| Orn | 21.61 ± 4.88 | 24.23 ± 2.62 | 33.01 ± 3.08 | < 0.001 |
| Gln | 16.45 ± 5.88 | 17.06 ± 2.80 | 6.47 ± 2.39 | < 0.001 |
| His | 84.98 ± 79.31 | 73.50 ± 61.55 | 73.38 ± 22.13 | 0.357 |
| Ser | 9.99 ± 4.31 | 8.73 ± 1.67 | 11.10 ± 2.99 | < 0.001 |
| Thr | 14.92 ± 7.72 | 14.52 ± 3.94 | 16.41 ± 6.75 | 0.215 |
Ala: alanine; Asp: aspartic acid; Glu: glutamic acid; Met: methionine; Phe: phenylalanine; Tyr: tyrosine; Leu: leucine; Trp: tryptophan; Val: valine; Arg: arginine; Cit: citrulline; Gly: glycine; Orn: ornithine; Gln: glutamine; His: histidine; Ser: serine; Thr: threonine
Fig. 2 Heatmap of the Pearson correlation coefficients among the amino acids and Group. Ala: alanine; Asp: aspartic acid; Glu: glutamic acid; Met: methionine; Phe: phenylalanine; Tyr: tyrosine; Leu: leucine; Trp: tryptophan; Val: valine; Arg: arginine; Cit: citrulline; Gly: glycine; Orn: ornithine; Gln: glutamine; His: histidine; Ser: serine; Thr: threonine; Group: the classification of the children (all children were divided into three groups: AL children, controls and healthy children, so each child had a label. Because the model was established under a supervised learning protocol, we needed to evaluate the correlation between every amino acid and the label. The higher the value of an amino acid against Group, the closer the correlation between that amino acid and the diagnosis of AL.)
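The correlation of an amino acid against Group can be illustrated with a small standard-library sketch. It computes the Pearson coefficient between one amino acid's concentrations and a numeric group label. The integer encoding of the groups and the sample values below are made up for illustration; the source does not state the encoding that was used.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical concentrations of one amino acid for six children.
concentration = [59.8, 62.1, 36.0, 35.2, 50.0, 48.7]
# Hypothetical label encoding (an assumption): 1 = AL, 0 = not AL.
group = [1, 1, 0, 0, 0, 0]

print(f"r = {pearson_r(concentration, group):.3f}")
```

Repeating this for every pair of columns would give the matrix that a heatmap of this kind displays.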
Performance of the models on AL diagnosis for the In-Sample Test
| | SVM | RF | XGBoost |
|---|---|---|---|
| Sensitivity (%) | 92.23 ± 4.32 | 94.44 ± 5.27 | 95.86 ± 4.21 |
| Specificity (%) | 94.43 ± 3.77 | 91.76 ± 4.85 | 94.21 ± 4.96 |
| Accuracy (%) | 87.24 ± 4.23 | 88.76 ± 5.11 | 90.23 ± 4.89 |
| AUCa | 0.812 ± 0.036 | 0.821 ± 0.032 | 0.828 ± 0.035 |
SVM: support vector machine; RF: random forest; XGBoost: eXtreme Gradient Boosting; AUC: area under the curve
a ROC analysis
Cross-validation of the best model for each algorithm on AL diagnosis for the In-Sample Test
| | SVM | RF | XGBoost |
|---|---|---|---|
| Mean of accuracy (%) (95% CI) | 89.84 (84.72, 94.96) | 90.12 (84.67, 95.57) | 91.35 (87.05, 95.65) |
| Mean of AUC (95% CI) | 0.848 (0.819, 0.877) | 0.834 (0.811, 0.857) | 0.856 (0.809, 0.923) |
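The table above reports cross-validation scores as a mean with a 95% confidence interval. A minimal sketch of that kind of summary follows, assuming a shuffled k-fold split and a normal-approximation interval on the fold scores. The number of folds, the interval method and the fold accuracies are all assumptions for illustration; the source does not report them. The sample count of 744 is the sum of the three Group A cohorts given in Fig. 1 (240 + 284 + 220).

```python
import math
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def mean_with_ci(scores, z=1.96):
    """Mean of the fold scores with a normal-approximation 95% interval."""
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - half_width, mean + half_width

# k = 10 is a hypothetical choice, not taken from the source.
folds = k_fold_indices(n_samples=744, k=10)

# Hypothetical per-fold accuracies; a real run would score a model on each fold.
fold_accuracy = [0.90, 0.92, 0.88, 0.91, 0.89]
mean, low, high = mean_with_ci(fold_accuracy)
print(f"{100 * mean:.2f} ({100 * low:.2f}, {100 * high:.2f})")
```

Each fold serves once as the held-out set while the remaining folds are used for training, so every child contributes to exactly one test score.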
Fig. 3 Learning curves for the three algorithms. a Learning curve for SVM; b learning curve for RF; c learning curve for XGBoost. The red curve represents the training set and the green curve represents the testing set
Validation of the models on AL diagnosis for the Out-Sample Test
| Diagnosis (model/clinical diagnosis) | +/+ a | ∓b | ±c | −/−d | χ2 | Kappa | P | AUCi |
|---|---|---|---|---|---|---|---|---|
| Result-SVMh | 237 | 43 | 46 | 262 | 0.1011 | 0.697 | 0.751 | 0.788 |
| Sensitivitye (%) | 84.64 | | | | | | | |
| Specificityf (%) | 85.06 | | | | | | | |
| Accuracyg (%) | 84.86 | | | | | | | |
| Result-RFh | 231 | 49 | 38 | 270 | 1.3908 | 0.703 | 0.238 | 0.803 |
| Sensitivitye (%) | 82.50 | | | | | | | |
| Specificityf (%) | 87.66 | | | | | | | |
| Accuracyg (%) | 85.20 | | | | | | | |
| Result-XGBh | 252 | 28 | 34 | 274 | 0.2903 | 0.789 | 0.446 | 0.830 |
| Sensitivitye (%) | 90.00 | | | | | | | |
| Specificityf (%) | 88.96 | | | | | | | |
| Accuracyg (%) | 89.46 | | | | | | | |
SVM: support vector machine; RF: random forest; XGB: XGBoost; FN: false negative; FP: false positive; AUC: area under the curve
a Both our model and the clinical diagnosis were positive: the children had leukemia
b Our model classified the children as normal, but their clinical diagnosis was leukemia
c Our model classified the children as having leukemia, but their clinical diagnosis was normal
d Both our model and the clinical diagnosis were negative: the children were normal
e Number of +/+ for each model/(number of +/+ for each model plus number of ∓ for each model) × 100%
f Number of −/− for each model/(number of −/− for each model plus number of ± for each model) × 100%
g (Number of −/− for each model plus number of +/+ for each model)/588 × 100%
h McNemar's test
i ROC analysis
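Footnotes e, f and g define sensitivity, specificity and accuracy directly from the four counts, so they can be checked with a few lines of Python. The sketch below also computes a McNemar statistic from the two discordant counts. It assumes the uncorrected form of the test, (∓ − ±)² / (∓ + ±) with one degree of freedom, because that form matches the χ2 printed for the SVM and RF rows; the source does not say which variant was used. The function and variable names are illustrative.

```python
import math

def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy in percent (footnotes e, f, g)."""
    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    accuracy = 100 * (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

def mcnemar_uncorrected(fn, fp):
    """Uncorrected McNemar chi-square (1 d.f.) and its two-sided p-value."""
    chi2 = (fn - fp) ** 2 / (fn + fp)
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# SVM row of the Out-Sample Test: +/+ = 237, -/+ = 43, +/- = 46, -/- = 262.
sens, spec, acc = diagnostic_metrics(tp=237, fn=43, fp=46, tn=262)
chi2, p_value = mcnemar_uncorrected(fn=43, fp=46)

print(f"sensitivity {sens:.2f}%, specificity {spec:.2f}%, accuracy {acc:.2f}%")
print(f"chi2 {chi2:.4f}")
```

For the SVM row this gives 84.64%, 85.06% and 84.86%, with χ2 = 0.1011 and a p-value of about 0.75, in line with the tabulated values.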
True positive and true negative prediction performance of morphology and the XGBoost model in Group B
| Diagnosis (model/clinical diagnosis) | +/+a | ∓b | ±c | −/−d | Kappa | P | AUCe |
|---|---|---|---|---|---|---|---|
| M | 268 | 12 | 86 | 222 | 0.670 | < 0.001 | 0.742 |
| X | 252 | 28 | 34 | 274 | 0.789 | 0.720 | 0.830 |
| M + X | 262 | 18 | 26 | 282 | 0.850 | 0.523 | 0.872 |
McNemar’s test
M: morphology; X: XGBoost model; AUC: area under the curve
a Both our model and the clinical diagnosis were positive: the children had leukemia
b Our model classified the children as normal, but their clinical diagnosis was leukemia
c Our model classified the children as having leukemia, but their clinical diagnosis was normal
d Both our model and the clinical diagnosis were negative: the children were normal
e ROC analysis
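The agreement values listed after the −/− counts (0.670, 0.789, 0.850) match Cohen's kappa computed from the four counts, which is how they are read in this sketch. That reading is an inference from the arithmetic rather than a statement in the table, and the function name is illustrative.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a 2x2 table of method result versus clinical diagnosis."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of both raters.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (observed - expected) / (1 - expected)

# Counts in the order +/+, -/+, +/-, -/- for the three rows of the Group B table.
rows = {
    "M": (268, 12, 86, 222),
    "X": (252, 28, 34, 274),
    "M + X": (262, 18, 26, 282),
}

for name, counts in rows.items():
    print(f"{name}: kappa = {cohen_kappa(*counts):.3f}")
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why morphology alone scores lower here despite its high count of +/+ results.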
Comparison between the new strategy and conventional methods
| | New strategy | Conventional methods |
|---|---|---|
| Time required | 4–6 h | 3 days |
| Expense | $20 per child | $250 per child |
| Sample collection | Peripheral blood (easy to collect) | Bone marrow (hard to collect) |