Alexandra C Salem, Heather MacFarlane, Joel R Adams, Grace O Lawley, Jill K Dolata, Steven Bedrick, Eric Fombonne.
Abstract
Measurement of language atypicalities in Autism Spectrum Disorder (ASD) is cumbersome and costly, and better language outcome measures are needed. Using language transcripts, we generated Automated Language Measures (ALMs) and tested their validity. A total of 169 participants (96 ASD, 28 TD, 45 ADHD) aged 7 to 17 were evaluated with the Autism Diagnostic Observation Schedule. Transcripts of one task were analyzed to generate seven ALMs: mean length of utterance in morphemes (MLUM), number of different word roots (NDWR), um proportion, content maze proportion, unintelligible proportion, c-units per minute (CPM), and repetition proportion. With the exception of repetition proportion (p = 0.08), nonparametric ANOVAs showed significant group differences (p < 0.01). The TD and ADHD groups did not differ from each other in post-hoc analyses. With the exception of NDWR, the ASD group showed significantly lower scores than both comparison groups. The ALMs were correlated with standardized clinical and language evaluations of ASD. In age- and IQ-adjusted logistic regression analyses, four ALMs significantly predicted ASD status with satisfactory accuracy (67.9-75.5%). When the ALMs were combined, accuracy improved to 82.4%. These ALMs offer a promising approach for generating novel outcome measures.
Year: 2021 PMID: 34040042 PMCID: PMC8155086 DOI: 10.1038/s41598-021-90304-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Sample characteristics.
| | ASD | TD | ADHD | p | post-hoc |
|---|---|---|---|---|---|
| Male sex, N (%) | 80 (83.3) | 12 (42.9) | 31 (68.9) | | |
| Age in years, mean (SD) | 11.36 (2.21) | 11.61 (1.73) | 11.46 (1.61) | .84 | |
| Hispanic, N (%) | 14 (14.6) | 5 (17.9) | 3 (6.7) | .53 | |
| Race white, N (%) | 77 (82.8) | 24 (85.7) | 37 (84.1) | .93 | |
| WISC full-scale IQ, mean (SD) | 99.0 (19.7) | 113.4 (12.3) | 111.6 (13.8) | | |
| ADOS SA score, mean (SD) | 9.48 (3.52) | 1.04 (1.86) | 1.29 (1.44) | | |
| ADOS RRB score, mean (SD) | 3.47 (1.56) | .52 (.71) | .42 (.58) | | |
| ADOS total score, mean (SD) | 12.95 (3.43) | 1.56 (2.29) | 1.71 (1.67) | | |
| SRS total t-score, mean (SD) | 77.27 (10.60) | 43.96 (4.14) | 53.89 (8.62) | | |
| CCC2 GCC, mean (SD) | 73.32 (11.79) | 111.96 (8.31) | 96.91 (12.78) | | |
| CCC2 structural score, mean (SD) | 6.50 (2.41) | 11.13 (1.12) | 9.63 (2.15) | | |
| CCC2 pragmatic score, mean (SD) | 4.89 (1.82) | 11.85 (1.27) | 9.17 (1.95) | | |
Post-hoc comparisons used Tukey's test. SD: standard deviation. Full ranges of clinical measures can be found in Supplementary Table S1.
ALM calculation.
| Language construct | Literature source for construct | ALM | Calculation method |
|---|---|---|---|
| Utterance length | Gorman et al. (2015) | MLUM | Mean length of utterance in morphemes in all complete, fluent, and intelligible c-units |
| Total words | Gorman et al. (2015) | NDWR | Total number of different word roots in all complete, fluent, and intelligible c-units |
| Uh versus um | Gorman et al. (2016) | Um Proportion | |
| Filler versus content mazes | MacFarlane et al. (2017) | Content Maze Proportion | |
| Intelligibility | Abbeduto et al. (2020) | Unintelligible Proportion | |
| C-units Per Minute | Abbeduto et al. (2020) | CPM | |
| Repetition of others | van Santen et al. (2013) | Repetition proportion | |
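The two calculation methods that survive in the table above (MLUM and NDWR) can be illustrated with a minimal sketch. The data layout here is an assumption made for illustration only: each c-unit is a list of hypothetical `(word_root, morpheme_count)` pairs, and the list is taken to already contain only complete, fluent, and intelligible c-units. The function names and the toy transcript are invented and are not taken from the study's pipeline.

```python
def mlum(c_units):
    """Mean length of utterance in morphemes: total morphemes / number of c-units."""
    if not c_units:
        raise ValueError("at least one c-unit is required")
    total_morphemes = sum(count for unit in c_units for _root, count in unit)
    return total_morphemes / len(c_units)


def ndwr(c_units):
    """Number of different word roots across all c-units (case-insensitive)."""
    return len({root.lower() for unit in c_units for root, _count in unit})


# Invented toy transcript: "dogs run" and "I like dog".
# "dogs" is one root ("dog") carrying two morphemes (dog + plural -s).
toy_c_units = [
    [("dog", 2), ("run", 1)],
    [("i", 1), ("like", 1), ("dog", 1)],
]

toy_mlum = mlum(toy_c_units)  # (3 + 3) morphemes over 2 c-units
toy_ndwr = ndwr(toy_c_units)  # roots: dog, run, i, like
print(f"MLUM = {toy_mlum:.2f}, NDWR = {toy_ndwr}")
```

In practice, the morpheme segmentation and the filtering of incomplete or unintelligible c-units would come from the transcription conventions, which this sketch does not model.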
Diagnostic group differences for ALMs.
| ALM | ASD mean | ASD SD | TD mean | TD SD | ADHD mean | ADHD SD | p-value | η² | Post-hoc |
|---|---|---|---|---|---|---|---|---|---|
| MLUM | 5.808 | 1.858 | 6.772 | 1.462 | 6.522 | 1.242 | 0.002253 | 0.0614 | |
| NDWR | 150.802 | 73.175 | 162.500 | 41.320 | 186.222 | 56.754 | 0.001798 | 0.0641 | |
| Um prop | 0.455 | 0.367 | 0.714 | 0.351 | 0.691 | 0.277 | 0.000154 | 0.0937 | |
| Content maze prop | 0.593 | 0.224 | 0.352 | 0.141 | 0.369 | 0.204 | 6.424e-10 | 0.2430 | |
| Unintell prop | 0.026 | 0.034 | 0.008 | 0.014 | 0.010 | 0.016 | 0.001021 | 0.0709 | |
| CPM | 11.211 | 2.492 | 12.978 | 2.321 | 14.107 | 3.268 | 1.295e-06 | 0.1513 | |
| Repetition prop | 0.037 | 0.034 | 0.025 | 0.017 | 0.024 | 0.020 | 0.07831 | 0.0186 | |
P-values were determined by the Kruskal-Wallis test. Eta-squared (η²) effect sizes were calculated for each Kruskal-Wallis result. Post-hoc analysis was performed with the Games-Howell test.
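The relationship between the Kruskal-Wallis p-values and the eta-squared effect sizes in the table can be checked with a short sketch. It rests on assumptions the source does not state: that eta-squared was computed with the commonly used formula (H − k + 1) / (n − k), that all n = 169 participants contributed to each test, and that k = 3 groups gives a chi-squared reference distribution with 2 degrees of freedom (for which p = exp(−H / 2), so H can be recovered from p). Under these assumptions the computed values agree with the tabulated effect sizes to four decimals.

```python
import math


def kruskal_eta_squared(h_stat, n_obs, n_groups):
    """Eta-squared for a Kruskal-Wallis H statistic: (H - k + 1) / (n - k)."""
    return (h_stat - n_groups + 1) / (n_obs - n_groups)


def h_from_p_three_groups(p_value):
    """Recover H from p when df = 2, where the chi-squared tail is exp(-H / 2)."""
    return -2.0 * math.log(p_value)


N_OBS, N_GROUPS = 169, 3  # assumed: full sample, three diagnostic groups

# (p-value, reported effect size) pairs copied from the table above.
reported = {
    "MLUM": (0.002253, 0.0614),
    "NDWR": (0.001798, 0.0641),
    "Um prop": (0.000154, 0.0937),
    "Content maze prop": (6.424e-10, 0.2430),
    "Unintell prop": (0.001021, 0.0709),
    "CPM": (1.295e-06, 0.1513),
    "Repetition prop": (0.07831, 0.0186),
}

computed = {
    name: kruskal_eta_squared(h_from_p_three_groups(p), N_OBS, N_GROUPS)
    for name, (p, _eta) in reported.items()
}

for name, (_p, eta_reported) in reported.items():
    print(f"{name}: computed {computed[name]:.4f}, reported {eta_reported:.4f}")
```

Because the p-values in the table are rounded, small discrepancies in the last digit would not be surprising.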
Figure 1. Relative frequency distribution of each ALM by ASD status.
Relationship of ALMs to clinical scores using Spearman correlations.
| | MLUM | NDWR | Um prop | Content maze prop | Unintell prop | CPM | Repetition prop | CCC2 GCC | CCC2 Structural | CCC2 Pragmatic | ADOS RRB | ADOS SA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MLUM | – | |||||||||||
| NDWR | – | |||||||||||
| Um prop | 0.14 | 0.04 | – | |||||||||
| Content maze prop | − 0.08 | − 0.05 | – | |||||||||
| Unintell prop | − 0.06 | – | ||||||||||
| CPM | − 0.01 | − 0.08 | − 0.06 | – | ||||||||
| Repetition prop | – | |||||||||||
| CCC2 GCC | − 0.10 | – | ||||||||||
| CCC2 Structural | − 0.11 | – | ||||||||||
| CCC2 Pragmatic | − 0.09 | – | ||||||||||
| ADOS RRB | − 0.11 | 0.05 | – | |||||||||
| ADOS SA | – |
Italics and boldface indicate statistical significance at two different thresholds.
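As a reminder of what a Spearman correlation measures, the following minimal sketch ranks each variable (averaging tied ranks) and then computes the Pearson correlation of the ranks. The function names and the toy values are invented for illustration; this is not the software used in the study, and it does not compute significance levels.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented toy values standing in for one ALM and one clinical score.
toy_alm = [5.1, 6.3, 4.8, 7.0, 6.1]
toy_score = [70, 95, 72, 110, 90]
print(f"rho = {spearman_rho(toy_alm, toy_score):.3f}")
```

Working on ranks rather than raw values makes the coefficient insensitive to monotone rescaling, which suits measures with skewed distributions such as proportions.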
Logistic regression models for ALMs adjusted by age and IQ.
| Model | Variable | -2 Log Like | Nagelkerke R² | Wald | p-value | Accuracy | Specificity | Sensitivity | AUC |
|---|---|---|---|---|---|---|---|---|---|
| Model 0 (Baseline) | IQ and Age | 196.5606 | 0.1701 | | | 0.6289 | 0.7191 | 0.5143 | 0.6868 |
| | IQ | | | 16.683 | 4.417e-0 | | | | |
| | Age | | | 0.006 | 0.9382 | | | | |
| Model 1 | MLUM | 185.2495 | 0.2504 | 0.4865 | 0.9219 | 0.6415 | 0.6404 | 0.6429 | 0.7120 |
| Model 2 | NDWR | 193.7276 | 0.1907 | 2.5024 | 0.4749 | 0.6352 | 0.6517 | 0.6143 | 0.6937 |
| Model 3 | Um prop | 179.1942 | 0.2911 | 14.8426 | 0.0020 | 0.6792 | 0.7191 | 0.6286 | 0.7600 |
| Model 4 | Content maze prop | 163.5166 | 0.3896 | 15.5078 | 0.0014 | 0.7547 | 0.7640 | 0.7429 | 0.8149 |
| Model 5 | Unintell prop | 181.0161 | 0.2790 | 12.2279 | 0.0066 | 0.6792 | 0.6742 | 0.6857 | 0.7518 |
| Model 6 | CPM | 176.5453 | 0.3084 | 15.6164 | 0.0014 | 0.6918 | 0.7191 | 0.6571 | 0.7722 |
| Model 7 | Repetition prop | 188.4896 | 0.2280 | 7.6232 | 0.0545 | 0.6792 | 0.7528 | 0.5857 | 0.7222 |
| Model 8 | All 7 | 105.2476 | 0.6767 | | | 0.8239 | 0.8286 | 0.8202 | 0.9223 |
| | MLUM | | | 2.0553 | 0.5610 | | | | |
| | NDWR | | | 4.1463 | 0.2461 | | | | |
| | Um prop | | | 3.2571 | 0.3537 | | | | |
| | Content maze prop | | | 10.3446 | 0.0159 | | | | |
| | Unintell prop | | | 10.1375 | 0.0174 | | | | |
| | CPM | | | 14.1226 | 0.0027 | | | | |
| | Repetition prop | | | 6.9012 | 0.0751 | | | | |
Model 0 uses only IQ and age. Models 1-7 each use a single ordinal recoded ALM and include IQ and age. Model 8 uses all seven recoded ALMs and includes IQ and age. -2 Log Like is −2 times the log-likelihood of the model (low values reflect better fit). Nagelkerke R² is a pseudo-R² measure (high values reflect better fit). Wald is the chi-squared value of the Wald test statistic, and p-value is the p-value of the Wald result for the listed variable. The degrees of freedom for the Wald test were 1 for IQ and age and 3 for all other variables. Accuracy, specificity (true negative rate), sensitivity (true positive rate), and AUC are classification results for predicting ASD diagnosis.
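The link between the Wald chi-squared values and their p-values, given the stated degrees of freedom, can be sketched with closed-form chi-squared tail probabilities. The two helper functions below are hypothetical names introduced for illustration; they use the standard identities for 1 and 3 degrees of freedom and rely only on Python's `math` module. This is a consistency check on the tabulated numbers, not a reconstruction of the authors' analysis code.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# (label, Wald statistic, degrees of freedom, reported p-value) from the table above.
rows = [
    ("Model 0, Age", 0.006, 1, 0.9382),
    ("Model 1, MLUM", 0.4865, 3, 0.9219),
    ("Model 3, Um prop", 14.8426, 3, 0.0020),
    ("Model 6, CPM", 15.6164, 3, 0.0014),
]

for label, wald, df, p_reported in rows:
    p = chi2_sf_df1(wald) if df == 1 else chi2_sf_df3(wald)
    print(f"{label}: computed p = {p:.4f}, reported p = {p_reported:.4f}")
```

The 3 degrees of freedom for each ALM are consistent with the ordinal recoding described above, in which a recoded variable with four levels would enter the model as three contrasts; the number of levels is an inference and is not stated in this section.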
Figure 2. ROC curve for logistic regression models, evaluated on class probabilities.
AUC is the area under the curve. The baseline model uses only IQ and age as independent variables. All other models are adjusted for IQ and age.