Paul R West, David G Amaral, Preeti Bais, Alan M Smith, Laura A Egnash, Mark E Ross, Jessica A Palmer, Burr R Fontaine, Kevin R Conard, Blythe A Corbett, Gabriela G Cezar, Elizabeth L R Donley, Robert E Burrier.
Abstract
BACKGROUND: The diagnosis of autism spectrum disorder (ASD) at the earliest age possible is important for initiating optimally effective intervention. In the United States, the average age of diagnosis is 4 years. Identifying metabolic biomarker signatures of ASD from blood samples offers an opportunity to develop diagnostic tests that detect ASD at an early age.
Year: 2014 PMID: 25380056 PMCID: PMC4224480 DOI: 10.1371/journal.pone.0112445
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Patient demographic information.
| Demographic | Statistic | TD | ASD | Overall |
| Group Size | | 30 | 52 | 82 |
| Sex (male %) | | 86.67 | 78.85 | 81.7 |
| Age (Years) | Range | 4.17–6.92 | 4–6.92 | 4–6.92 |
| Age (Years) | Average | 5.6 | 5.37 | 5.46 |
| Age (Years) | Std. Dev. | 0.95 | 0.81 | 0.87 |
| IQ | Range | 88–137 | 40–110 | 40–137 |
| IQ | Average | 114.3 | 67.48 | 80 |
| IQ | Std. Dev. | 10.78 | 17.69 | 27.47 |
Figure 1. Classification modeling process.
A three-layer nested cross-validation approach was applied using both PLS-DA and SVM modeling methods to determine significant features capable of distinguishing children with ASD from typically developing (TD) children. The 179 features of the training set were analyzed using a leave-one-group-out cross-validation loop as described. The results of this cross-validation process were used to estimate model performance and to create a robust feature VIP score index ranking each of the 179 features by its importance for ASD vs. TD classification. These feature ranks were then used to evaluate the performance of the molecular signature on an independent validation set.
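The leave-one-group-out loop and the per-fold rank aggregation described in this caption can be illustrated with a minimal sketch. It covers a single cross-validation layer only, not the full three-layer nesting, and it makes several assumptions: the importance score is a simple stand-in (absolute difference between class means) rather than a PLS-DA VIP score or an SVM weight, and the data, group labels, and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def leave_one_group_out(groups):
    """Yield (train_idx, test_idx) pairs, holding out one group at a time."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield train, test

def feature_scores(X, y, idx):
    """Stand-in importance: absolute difference between class means.
    (Assumption: the study used PLS-DA and SVM based scores instead.)"""
    scores = []
    for j in range(len(X[0])):
        asd = [X[i][j] for i in idx if y[i] == 1]
        td = [X[i][j] for i in idx if y[i] == 0]
        scores.append(abs(mean(asd) - mean(td)))
    return scores

def aggregate_ranks(X, y, groups):
    """Average each feature's rank (1 = most important) over all folds."""
    rank_sums = defaultdict(float)
    n_folds = 0
    for train, _test in leave_one_group_out(groups):
        scores = feature_scores(X, y, train)
        order = sorted(range(len(scores)), key=lambda j: -scores[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
        n_folds += 1
    return {j: rank_sums[j] / n_folds for j in rank_sums}

# Hypothetical toy data: 6 samples, 3 features, label 1 = ASD, 0 = TD.
X = [[5.0, 1.0, 0.2], [5.2, 1.1, 0.1], [4.9, 0.9, 0.3],
     [1.0, 1.0, 0.2], [1.1, 1.2, 0.4], [0.9, 0.8, 0.1]]
y = [1, 1, 1, 0, 0, 0]
groups = ["a", "b", "c", "a", "b", "c"]

mean_ranks = aggregate_ranks(X, y, groups)
best_feature = min(mean_ranks, key=mean_ranks.get)
```

Averaging ranks over folds, rather than scoring once on the full training set, is what keeps the resulting index from depending on any single partition of the samples.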
Figure 2. Feature Importance Rankings.
The top 179 features were compared for rank between SVM and PLS modeling methods. The lowest rank scores represent the most important features.
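One way to quantify how closely the two rankings agree is a Spearman rank correlation. The sketch below is an illustration, not the authors' analysis: it uses only the SVM and PLS ranks of the 21 entries listed in the confirmed metabolites table, re-ranked within that subset, and the function name is hypothetical.

```python
def spearman_rho(a, b):
    """Spearman rank correlation for two equal-length lists without ties."""
    def to_ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        ranks = [0] * len(values)
        for r, i in enumerate(order, start=1):
            ranks[i] = r
        return ranks

    ra, rb = to_ranks(a), to_ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

# SVM and PLS ranks copied row by row from the confirmed metabolites table.
svm_rank = [1, 33, 87, 34, 60, 57, 137, 47, 84, 15, 55,
            11, 27, 177, 135, 127, 175, 24, 179, 160, 176]
pls_rank = [1, 26, 121, 14, 69, 75, 118, 11, 16, 47, 52,
            67, 15, 163, 110, 62, 164, 27, 171, 120, 176]

rho = spearman_rho(svm_rank, pls_rank)
print(f"Spearman rho over {len(svm_rank)} confirmed metabolites: {rho:.2f}")
```

A value near 1 would indicate that the two modeling methods order these features similarly. Because only a subset of the 179 features is included, the result says nothing about agreement across the full feature list.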
A breakdown of the numbers of features resulting from filtering and annotation processes, based on molecular formula.
| Platform | Raw Features | Annotated Features | Unique Formula within a Platform | Features Passing Preprocessing Filters | Features Passing Univariate Filter |
| HILIC+ | 3207 | 1985 | 146 | 1527 | 40 |
| HILIC− | 1865 | 1061 | 140 | 950 | 35 |
| C8+ | 3062 | 1902 | 140 | 1096 | 42 |
| C8− | 1568 | 847 | 77 | 514 | 23 |
| GC-MS | 485 | 178* | 142* | 485 | 39 |
This table also helps to illustrate the orthogonality and contribution of each of the 5 analytical platforms. Molecular formulae are used here only to approximate method orthogonality, since any given molecular formula may be associated with multiple chemical structures. *These annotations were confirmed in the GC-MS platform, and the formulae were confirmed using the KEGG database instead of the FBF procedure used in the 4 LC-MS platforms.
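The per-platform counts can be tallied with a short script. The numbers below are copied from the table; the variable names are illustrative. The total of features passing the univariate filter comes to 179, which appears to correspond to the 179 features referred to in the figure captions.

```python
# Columns: raw, annotated, unique formulae, passing preprocessing, passing univariate
platforms = {
    "HILIC+": (3207, 1985, 146, 1527, 40),
    "HILIC-": (1865, 1061, 140, 950, 35),
    "C8+":    (3062, 1902, 140, 1096, 42),
    "C8-":    (1568, 847, 77, 514, 23),
    "GC-MS":  (485, 178, 142, 485, 39),
}

# Total number of features surviving the univariate filter across platforms.
total_univariate = sum(row[4] for row in platforms.values())

# Fraction of preprocessed features that pass the univariate filter, per platform.
pass_rate = {name: row[4] / row[3] for name, row in platforms.items()}

print("Features passing univariate filter:", total_univariate)
for name, rate in pass_rate.items():
    print(f"{name}: {rate:.1%} of preprocessed features retained")
```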
Figure 3. Performance of the SVM and PLS models.
Average AUC and accuracy of the (a) SVM and (b) PLS models containing different numbers of features. The bar graphs show, for each indicated number of features, how many optimal models were derived from the recursive feature elimination step that was included in the resampling process.
Figure 4. ROC curve performance of the classification models from the training and validation sets.
The black and grey lines show the average of 100 iterations of the classifier for the best-performing feature sets following recursive feature elimination, comparing ASD vs. TD samples. The blue (PLS) and red (SVM) lines are ROC curves of the best-performing validation feature subsets. Vertical bars represent the standard error of the mean.
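For readers unfamiliar with how a ROC curve and its AUC are derived from classifier scores, the following is a minimal sketch. The labels and scores are hypothetical and do not come from the study; the AUC is computed through its Mann-Whitney interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative).

```python
def roc_points(labels, scores):
    """(FPR, TPR) points obtained by sweeping the threshold from high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical scores: label 1 = ASD, 0 = TD.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
auc = roc_auc(labels, scores)
```

Averaging such curves over repeated resampling iterations, as the caption describes, would additionally require interpolating each curve onto a common grid of false positive rates.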
Classifier performance metrics based on predictions on the independent 21-sample validation set, showing the feature sets with the highest accuracy.
| Model | Feature No. | Accuracy | Sensitivity | Specificity | AUC |
| SVM | 80 | 0.81 | 0.85 | 0.75 | 0.84 |
| PLS | 160 | 0.81 | 0.92 | 0.63 | 0.81 |
Feature No. corresponds to the number of the ordered, ranked VIP features that were evaluated. Table S3 shows the results for all feature sets.
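The accuracy, sensitivity, and specificity columns follow from a confusion matrix. The sketch below shows one set of counts that reproduces the reported values for a 21-sample set. These counts are an inference: the excerpt gives neither the confusion matrices nor the ASD/TD split of the validation set, so the assumed 13 ASD and 8 TD samples should be treated as hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Assumed counts (not given in the source): 13 ASD and 8 TD validation samples.
svm = classification_metrics(tp=11, fn=2, tn=6, fp=2)
pls = classification_metrics(tp=12, fn=1, tn=5, fp=3)

for name, m in (("SVM", svm), ("PLS", pls)):
    print(name, {k: f"{v:.3f}" for k, v in m.items()})
```

Under these assumed counts both models classify 17 of 21 samples correctly, which matches the shared accuracy of 0.81. The PLS model trades specificity (5 of 8 TD) for sensitivity (12 of 13 ASD).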
Confirmed metabolites.
| Analytical Platform | Metabolite | Feature ID | HMDB ID | Fold Change (ASD/TD) | p-value (ASD vs. TD) | FDR | SVM Rank | PLS Rank |
| HILICpos | homocitrulline | M190T512 | HMDB00679 | 0.67 | <0.001 | 0.059 | 1 | 1 |
| C8neg | 2-hydroxyvaleric acid | M117T127 | HMDB01863 | 0.80 | 0.0289 | 0.53 | 33 | 26 |
| HILICpos | cystine | M241T774 | HMDB00192 | 0.91 | 0.0277 | 0.532 | 87 | 121 |
| GC-MS | aspartic acid | GCMS_aspartic.acid | HMDB00191 | 1.33 | <0.001 | 0.086 | 34 | 14 |
| HILICpos | isoleucine | M132T248 | HMDB00172 | 0.76 | 0.0351 | 0.541 | 60 | 69 |
| HILICpos | creatinine | M114T262 | HMDB00562 | 0.88 | 0.0471 | 0.576 | 57 | 75 |
| GC-MS | serine | GCMS_serine | HMDB00187 | 1.16 | 0.00275 | 0.267 | 137 | 118 |
| HILICneg | 4-hydroxyphenyllactic acid | M181T66 | HMDB00755 | 0.84 | 0.0344 | 0.541 | 47 | 11 |
| GC-MS | citric acid | GCMS_citric.acid | HMDB00094 | 0.91 | 0.0492 | 0.580 | 84 | 16 |
| GC-MS | glutamic acid | GCMS_glutamic.acid | HMDB00148 | 1.28 | 0.00144 | 0.188 | 15 | 47 |
| GC-MS | lactic acid | GCMS_indol.3.lactate | HMDB00671 | 0.87 | 0.0181 | 0.457 | 55 | 52 |
| C8neg | DHEA sulfate | M367T736 | HMDB01032 | 2.55 | 0.00152 | 0.188 | 11 | 67 |
| GC-MS | glutaric acid | GCMS_glutaric.acid | HMDB00661 | 1.36 | 0.00492 | 0.322 | 27 | 15 |
| GC-MS | 5-hydroxynorvaline | GCMS_X5. Hydroxy norvaline.NIST | HMDB31658 | 1.27 | 0.0457 | 0.576 | 177 | 163 |
| GC-MS | heptadecanoic acid | GCMS_heptadecanoic.acid.NIST | HMDB02259 | 0.81 | 0.0270 | 0.527 | 135 | 110 |
| GC-MS | 5-aminovaleric acid lactam | GCMS_X5.aminovaleric.acid.lactame | HMDB11749 | 2.43 | 0.00211 | 0.22 | 127 | 62 |
| GC-MS | succinic acid | GCMS_succinic.acid | HMDB00254 | 1.11 | 0.0457 | 0.576 | 175 | 164 |
| GC-MS | myristic acid | GCMS_myristic.acid | HMDB00806 | 0.76 | 0.00892 | 0.371 | 24 | 27 |
| GC-MS | 2-hydroxyvaleric acid | GCMS_X2.hydroxyvaleric.acid | HMDB01863 | 1.41 | 0.0406 | 0.564 | 179 | 171 |
| GC-MS | methylhexadecanoic acid | GCMS_methylhexadecanoic.acid | NA | 0.82 | 0.0399 | 0.564 | 160 | 120 |
| GC-MS | 3-aminoisobutyric acid | GCMS_X3.aminoisobutyric.acid | HMDB02166 | 1.19 | 0.0473 | 0.576 | 176 | 176 |
Statistically significant metabolites from the 61-sample training set with chemical structures confirmed by LC-HRMS-MS or GC-MS.
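The FDR column reports p-values adjusted for multiple testing. The excerpt does not state which procedure was used; the sketch below assumes the widely used Benjamini-Hochberg adjustment and applies it to hypothetical p-values. It cannot reproduce the table's FDR values, since those depend on the p-values of all features tested, not only the confirmed metabolites.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values, not taken from the study.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042]
fdr = benjamini_hochberg(pvals)

for p, q in zip(pvals, fdr):
    print(f"p = {p:.3f} -> adjusted = {q:.3f}")
```

Because the adjustment scales each p-value by the number of tests divided by its rank, a nominal p-value just under 0.05 can map to a much larger adjusted value when many features are tested.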