Peter Washington, Kelley Marie Paskov, Haik Kalantarian, Nathaniel Stockham, Catalin Voss, Aaron Kline, Ritik Patnaik, Brianna Chrisman, Maya Varma, Qandeel Tariq, Kaitlyn Dunlap, Jessey Schwartz, Nick Haber, Dennis P Wall.
Abstract
Autism Spectrum Disorder (ASD) is a complex neuropsychiatric condition with a highly heterogeneous phenotype. Following the work of Duda et al., which uses a reduced feature set from the Social Responsiveness Scale, Second Edition (SRS) to distinguish ASD from ADHD, we performed item-level question selection on answers to the SRS to determine whether ASD can be distinguished from non-ASD using a similarly small subset of questions. To explore feature redundancies between the SRS questions, we performed filter, wrapper, and embedded feature selection analyses. To explore the linearity of the SRS-related ASD phenotype, we then compressed the 65-question SRS into low-dimensional representations using PCA, t-SNE, and a denoising autoencoder. We measured the performance of a multilayer perceptron (MLP) classifier with the top-ranking questions as input. Classification using only the top-rated question resulted in an AUC of over 92% for SRS-derived diagnoses and an AUC of over 83% for dataset-specific diagnoses. High redundancy of features has implications for replacing the social behaviors that are targeted in behavioral diagnostics and interventions, where digital quantification of certain features may be obfuscated due to privacy concerns. We similarly evaluated the performance of an MLP classifier trained on the low-dimensional representations of the SRS, finding that the denoising autoencoder achieved slightly higher performance than the PCA and t-SNE representations.
Year: 2020 PMID: 31797640 PMCID: PMC6927820
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928
The SRS questions with the highest feature importances for predicting the SRS-derived ASD diagnosis. Scores are shown with their rank in parentheses. Because Recursive Feature Elimination (RFE) does not weight the selected features, we display the values of N for which the question appears in the top-N for values of N up to 6.
| SRS Question | Mutual Information | RFE Features | Decision Tree |
|---|---|---|---|
| Relating to peers (Q37) | 0.383 (1) | 1, 4, 5, 6 | 0.604 (1) |
| Trouble keeping up with conversation flow (Q35) | 0.355 (2) | N/A | 0.005 (13) |
| Regarded by other children as odd (Q29) | 0.339 (3) | 2, 6 | 0.002 (47) |
| Socially awkward, even when trying to be polite (Q33) | 0.333 (4) | 3 | 0.006 (11) |
| Bizarre mannerisms (Q8) | 0.332 (5) | 3, 4, 5, 6 | 0.030314 (4) |
| Trouble understanding cause and effect (Q44) | 0.324 (6) | 2, 4, 5, 6 | 0.099 (2) |
| Difficulty with changes in routine (Q24) | 0.292 (9) | 3, 4, 5, 6 | 0.021 (5) |
| Communication of feelings to others (Q12) | 0.134 (47) | 5 | 0.005 (14) |
| Focuses on details rather than the big picture (Q58) | 0.216 (23) | 6 | 0.004 (22) |
| Either avoids or has unusual eye contact (Q16) | 0.287 (20) | N/A | 0.035 (3) |
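The Mutual Information column in the table above is a filter-style score: each question is scored against the diagnosis label on its own, without training a classifier. The source does not say which estimator was used, so the following is only a minimal sketch that assumes the simple plug-in (count-based) estimator for discrete answers. The question names `Q_a` and `Q_b` and the toy responses are hypothetical and are not taken from the study data.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


def rank_questions(answers, labels):
    """Return (question, score) pairs sorted from most to least informative."""
    scores = {q: mutual_information(v, labels) for q, v in answers.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: 8 respondents, answers coded 1-4, label 1 = ASD.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
answers = {
    "Q_a": [4, 4, 3, 3, 1, 1, 2, 2],  # answer determines the label
    "Q_b": [1, 2, 3, 4, 1, 2, 3, 4],  # answer is independent of the label
}
ranking = rank_questions(answers, labels)
```

In this toy case `Q_a` ranks first with a score of ln 2 (the full entropy of a balanced binary label) and `Q_b` scores zero, which mirrors how the table orders questions by how much each one alone reveals about the diagnosis.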
The SRS questions with the highest feature importances across selection methods for predicting the dataset-specific ASD diagnosis. Scores are shown with their rank in parentheses. Because Recursive Feature Elimination (RFE) does not weight the selected features, we display the values of N for which the question appears in the top-N for values of N up to 6.
| SRS Question | Mutual Information | RFE Features | Decision Tree |
|---|---|---|---|
| Trouble keeping up with conversation flow (Q35) | 0.224 (1) | 1, 2, 3, 4, 5, 6 | 0.391 (1) |
| Relating to peers (Q37) | 0.205 (2) | 6 | 0.007 (51) |
| Regarded by other children as odd (Q29) | 0.204 (3) | 2, 3, 4, 5, 6 | 0.057 (2) |
| Trouble understanding cause and effect (Q44) | 0.203 (4) | 4, 5, 6 | 0.010 (16) |
| Trouble with conversational turn taking (Q13) | 0.179 (5) | N/A | 0.010 (20) |
| Either avoids or has unusual eye contact (Q16) | 0.172 (8) | 3, 4, 5, 6 | 0.024 (3) |
| Bizarre mannerisms (Q8) | 0.178 (6) | N/A | 0.006 (56) |
| Is overly suspicious (Q59) | 0.002 (65) | 5, 6 | 0.005 (63) |
| Repetitive behaviors (Q50) | 0.126 (19) | N/A | 0.019 (4) |
| Repetitive behaviors (Q57) | 0.045 (56) | N/A | 0.012 (5) |
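The RFE Features column lists, for each question, the subset sizes N at which RFE kept that question. As an illustration of how such a column can be derived, the sketch below inverts a mapping from N to the selected questions. The `selected` dictionary is reconstructed from the RFE column of the table directly above (the six questions with non-empty entries); the function name `top_n_membership` is a hypothetical helper, not something from the source.

```python
def top_n_membership(selected):
    """Map each question to the sorted list of N for which it was selected."""
    membership = {}
    for n in sorted(selected):
        for question in selected[n]:
            membership.setdefault(question, []).append(n)
    return membership


# Selected subsets for N = 1..6, read back from the RFE column above.
selected = {
    1: ["Q35"],
    2: ["Q35", "Q29"],
    3: ["Q35", "Q29", "Q16"],
    4: ["Q35", "Q29", "Q16", "Q44"],
    5: ["Q35", "Q29", "Q16", "Q44", "Q59"],
    6: ["Q35", "Q29", "Q16", "Q44", "Q59", "Q37"],
}
membership = top_n_membership(selected)

for question, ns in membership.items():
    print(question, ", ".join(str(n) for n in ns))
```

Questions that never appear in any subset simply have no entry, which corresponds to the N/A cells in the table.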
The AUC, precision (prec.), and recall (rec.) of a dense neural network predicting the SRS-derived ASD diagnosis trained on the top-ranking features of the 65-item SRS questionnaire using each feature selection technique.
| Number of features | Mutual Information | RFE | Decision Tree |
|---|---|---|---|
| 1 | 0.928 / 0.900 / 0.928 | 0.928 / 0.900 / 0.928 | 0.928 / 0.900 / 0.928 |
| 2 | 0.961 / 0.947 / 0.906 | 0.955 / 0.912 / 0.919 | 0.962 / 0.943 / 0.906 |
| 3 | 0.971 / 0.919 / 0.953 | 0.975 / 0.932 / 0.938 | 0.973 / 0.939 / 0.933 |
| 4 | 0.974 / 0.937 / 0.938 | 0.979 / 0.941 / 0.944 | 0.980 / 0.944 / 0.943 |
| 5 | 0.980 / 0.941 / 0.941 | 0.982 / 0.939 / 0.948 | 0.984 / 0.950 / 0.951 |
| 6 | 0.983 / 0.944 / 0.949 | 0.985 / 0.949 / 0.951 | 0.987 / 0.950 / 0.961 |
| 65 (all questions, no selection) | 0.997 / 0.972 / 0.979 | | |
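Each cell in the table above reports AUC / precision / recall. For readers who want to see how these three numbers relate to a classifier's outputs, here is a small sketch that computes them from predicted scores. It assumes a decision threshold of 0.5 for precision and recall, which the source does not specify, and the labels and scores are hypothetical toy values.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for six respondents.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

auc = roc_auc(y_true, scores)
precision, recall = precision_recall(y_true, y_pred)
```

AUC depends only on how the scores order cases relative to controls, while precision and recall change with the chosen threshold, which is one reason the three values in a cell can move in different directions as features are added.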
The AUC, precision (prec.), and recall (rec.) of a dense neural network predicting the dataset-provided ASD diagnosis trained on the top-ranking features of the 65-item SRS questionnaire using each feature selection technique.
| Number of features | Mutual Information | RFE | Decision Tree |
|---|---|---|---|
| 1 | 0.836 / 0.750 / 0.774 | 0.836 / 0.727 / 0.843 | 0.836 / 0.724 / 0.836 |
| 2 | 0.866 / 0.730 / 0.882 | 0.870 / 0.734 / 0.907 | 0.870 / 0.735 / 0.899 |
| 3 | 0.874 / 0.735 / 0.902 | 0.876 / 0.738 / 0.905 | 0.876 / 0.736 / 0.909 |
| 4 | 0.879 / 0.739 / 0.912 | 0.881 / 0.740 / 0.917 | 0.880 / 0.741 / 0.907 |
| 5 | 0.880 / 0.739 / 0.916 | 0.880 / 0.740 / 0.911 | 0.882 / 0.737 / 0.920 |
| 6 | 0.881 / 0.736 / 0.924 | 0.884 / 0.742 / 0.923 | 0.886 / 0.745 / 0.914 |
| 65 (all questions, no selection) | 0.900 / 0.754 / 0.921 | | |
Fig. 1. (a and b) Principal Component Analysis (PCA), (c and d) t-Distributed Stochastic Neighbor Embedding (t-SNE), and (e and f) a 2-dimensional encoding using a denoising autoencoder with a middle layer of size 2, each applied to the answers to the 65 questions of the Social Responsiveness Scale (SRS). (b, d, and f) A clear but noisier separation between cases and controls remains when coloring by dataset-provided diagnosis.
The AUC, precision (prec.), and recall (rec.) of a dense neural network predicting the SRS-derived ASD diagnosis and trained on lower dimensional representations of the 65-item SRS questionnaire via PCA, t-SNE, and the middle encoded layer of a denoising autoencoder.
| Dimension | PCA | t-SNE | Autoencoder |
|---|---|---|---|
| 1 | 0.9975 / 0.9778 / 0.9719 | 0.9871 / 0.9702 / 0.9356 | 0.9975 / 0.9828 / 0.9740 |
| 2 | 0.9975 / 0.9769 / 0.9759 | 0.9934 / 0.9766 / 0.9514 | 0.9974 / 0.9503 / 0.9950 |
| 3 | 0.9979 / 0.9739 / 0.9739 | 0.9920 / 0.9734 / 0.9415 | 0.9975 / 0.9818 / 0.9730 |
| 65 (uncompressed) | 0.9979 / 0.9799 / 0.9884 | | |
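The Dimension = 1 row for PCA corresponds to replacing each respondent's answers with a single number, the projection onto the first principal component. The sketch below shows that compression step only, using power iteration on centered data; it is an illustrative stand-in rather than the authors' pipeline, it does not cover t-SNE or the denoising autoencoder, and the three-question toy matrix is hypothetical.

```python
import random


def first_principal_component(rows, iters=200, seed=0):
    """Return (unit direction, 1-D scores) for the first principal component,
    found by power iteration on the centered data matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        # w = X^T (X v), which is proportional to the covariance matrix times v
        proj = [sum(x[j] * v[j] for j in range(d)) for x in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            break  # degenerate case: no variance along the current direction
        v = [c / norm for c in w]

    scores = [sum(x[j] * v[j] for j in range(d)) for x in centered]
    return v, scores


# Hypothetical toy answers: questions 1 and 2 move together, question 3 is constant.
rows = [[1, 1, 3], [2, 2, 3], [3, 3, 3], [4, 4, 3]]
direction, scores = first_principal_component(rows)
```

The sign of a principal component is arbitrary, so only the magnitudes and relative ordering of `scores` are meaningful. A downstream classifier would take these 1-D scores as its only input feature.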
The AUC, precision (prec.), and recall (rec.) of a dense neural network predicting the dataset-provided ASD diagnosis and trained on lower dimensional representations of the 65-item SRS questionnaire via PCA, t-SNE, and the middle encoded layer of a denoising autoencoder.
| Dimension | PCA | t-SNE | Autoencoder |
|---|---|---|---|
| 1 | 0.8717 / 0.7272 / 0.9097 | 0.8448 / 0.7246 / 0.9356 | 0.9017 / 0.7304 / 0.9373 |
| 2 | 0.8727 / 0.7206 / 0.9110 | 0.8821 / 0.7650 / 0.9157 | 0.9016 / 0.7193 / 0.9564 |
| 3 | 0.8813 / 0.7306 / 0.9150 | 0.8788 / 0.7542 / 0.8934 | 0.9021 / 0.7304 / 0.9373 |
| 65 (uncompressed) | 0.9034 / 0.7673 / 0.9161 | | |