Maria Eleonora Minissi, Irene Alice Chicchi Giglioli, Fabrizia Mantovani, Mariano Alcañiz Raya.
Abstract
The assessment of autism spectrum disorder (ASD) is based on semi-structured procedures addressed to children and caregivers. Such methods rely on the evaluation of behavioural symptoms rather than on the objective evaluation of psychophysiological underpinnings. Advances in research have provided evidence for modern procedures for the early assessment of ASD that combine machine learning (ML) techniques with biomarkers, such as eye movements (EM) towards social stimuli. This systematic review provides a comprehensive discussion of 11 papers on the early assessment of ASD based on ML techniques and children's social visual attention (SVA). The evidence suggests that ML is a relevant technique for the early assessment of ASD and might represent a valid biomarker-based procedure for making the diagnosis objectively. Limitations and future directions are discussed.
Keywords: Assessment; Autism spectrum disorder; Classification; Eye tracking; Machine learning; Social visual attention
Year: 2021 PMID: 34101081 PMCID: PMC9021060 DOI: 10.1007/s10803-021-05106-5
Source DB: PubMed Journal: J Autism Dev Disord ISSN: 0162-3257
Fig. 1 Flow diagram of study selection
Selected studies
| Authors | Mean age (SD), ASD | Mean age (SD), TD | Sample size (M), ASD | Sample size (M), TD | Aim of the study | ASD assessment | Stimuli | EM measures | Data reduction | Selected features | ML model | ML findings | Conclusions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Carette et al., | NR, age range of entire sample: 8–10 | N = 17 | N = 15 | To detect ASD with the help of eye-tracking data and ML | NR | Dynamic | Duration, amplitude, acceleration, deceleration, and speed of saccades | No data reduction | EM measures of each participant | LSTM on EM measures | LSTM distinguished ASD from TD in 83% of tested patients | RNN can distinguish ASD from TD | |
| M NR | M NR | SV of a JA offer | |||||||||||
| Carette et al., | Entire sample: 7.88 (SD NR) | N = 29 | N = 30 | To help with ASD diagnosis | CARS | Dynamic | Fixation, saccade, blink, and EGC | Image downscaling, greyscale format conversion, and eventually PCA | Eye gaze scanpaths | Non-neural network approaches (1); | (1) Generally, non-neural network approaches achieved AUC ≈ 0.7; (2) all ANNs provided an A greater than 90% | ANN 1-Layer (200) achieved the best performance (A of 92%), and there was no substantial improvement from increasing the ML model complexity | |
| M NR | M NR | SVs and NSVs including NS elements and a presenter attempting a JA offer | Neural network approaches (2) | ||||||||||
| Elbattah et al., | Entire sample: 7.88 (SD NR) | N = 29 | N = 30 | To apply unsupervised ML to discover clusters in ASD EM | CARS | Dynamic | Fixation, saccade, blink, and EGC | Greyscale format conversion (1); PCA (2); t-SNE (3); Autoencoder (4) | Eye gaze scanpaths | 12 k-means algorithms based on (1), (2), (3), and (4) and with different | Poor separation of clusters in | There was a tendency towards a clustering structure in the dataset, with faster eye gaze related to higher ASD severity, that best emerged with (4) + | |
| M NR | M NR | Same SS of Carette et al. ( | |||||||||||
| Kang et al., | 4.29 (1.07) | 4.26 (1.00) | N = 49 | N = 48 | To identify ASD using features from EEG and eye-tracking | Psychiatrists checking for DSM-V diagnostic criteria | Static | FDT | MRMR | Proportioned FDT in each AOI | SVM on selected features according to SS: (1), (2), and both types of faces (3) | (1) A 72.33%, AUC 0.8269 | Among ML models on EM, the best model was (3), achieving an A of 75.89% |
| 39 M | 36 M | SIs of girl face pictures of own-race (1) and other race (2) | (2) A 66.67%, AUC 0.7460 | ||||||||||
| (3) A 75.89%, AUC 0.8652 | |||||||||||||
| Li et al., | NR, age range: 4–7 | NR, age range: 6–8 | N = 53 | N = 136 | To automatically recognize ASD in raw video data | Psychiatrists checking for DSM-IV diagnostic criteria | Static | EGT | No reduction (1) | Holistic Acc H of EGT | SVM based on (1), (2), or (3) on different numbers of video frames | Best model was (3) + SVM on 40 video frames with an A of 93.7% | (3) + SVM on raw video data is a promising method for classifying ASD | |
| M NR | M NR | SIs of participant's mother | PCA (2) | ||||||||||
| KPCA (3) | |||||||||||||
| Li et al., | Datasets 1 and 2: NR, age range: 4–7 | Datasets 1 and 2: NR, TD age range: 6–8 | Dataset 1: N = 53 M NR | Dataset 1: N = 136 M NR | To help early ASD assessment using deep learning on raw video data | Psychiatrists checking for DSM-V diagnostic criteria | Static | Angle and length of EGT | KPCA for SVM | Acc H and nAcc H of angle | SVM and LSTM models on both datasets using Acc H and nAcc H | Methods using Acc H outperformed nAcc H methods, and LSTM outperformed SVM. LSTM with fused Acc H on dataset 2 achieved the best A (92.6%) | LSTM outperformed SVM in ASD discrimination | |
| Dataset 2: Dataset 1 + 83 ASD (136) M NR | Dataset 2: Dataset 1 + 0 TD (136) | Same SS of Li et al. ( | Acc H and nAcc H of length | ||||||||||
| Combined angle and length Acc H and nAcc H | |||||||||||||
| Liu et al., | 7.85 (1.59) | TD-age = 7.73 (1.51) | N = 21 17 M | TD-age: N = 21 18 M | To propose an ASD prediction system based on ML techniques | AQ-Child | Static | EGC, EGT | (1) Sequence of EGC | RBF kernel SVM on (1), (2), (3) | (1) AUC 0.5561, A of 72.13% | The two features are complementary to each other, and the ML model with fused features outperformed the other ML models | |
| 12 SIs depicting Chinese adult female faces | (2) BoW on EGC, BoW on eye motion, BoW on combined EGC and motion | In SVM on (2), EGC (4), eye motion (5) and both EGC and motion (6) were separately tested | (3) AUC 0.8208, A of 78.68% | ||||||||||
| TD-IQ = 5.69 (0.83) | TD-IQ: N = 20 | (3) Face, nose, mouth, left eye and right eye | (4) AUC 0.8902, A of 81.97% | ||||||||||
| 18 M | (5) AUC 0.9061, A of 85.25% | ||||||||||||
| (6) AUC 0.9207, A of 86.89% | |||||||||||||
| Liu et al., | 7.90 (1.45) | TD-age = 7.86 (1.38) | N = 29 | TD-age: N = 29 25 M | To examine whether face scanning patterns could be useful in ML-based ASD identification | AQ-Child | Static | Frequency distribution of face scanning coordinates without temporal information | Same-race faces: 16 | RBF kernel SVM on data from (1), (2), and all faces (3) | (1) A of 81.61%, AUC 82.40% | ML model on all faces achieved the best A in ASD discrimination from TD-age and TD-IQ | |
| 25 M | TD-IQ: N = 29 | 6 SIs of faces of the same race (1) or other race (2) | Other-race faces: 64 | (2) A of 90.80%, AUC 94.41% | |||||||||
| TD-IQ = 5.74 (1.01) | 25 M | All faces: 96 | (3) A of 88.51%, AUC 89.63% | ||||||||||
| Tao et al., | 8 (SD NR) | 8 (SD NR) | N = 14 | N = 14 | To test whether combined CNN and LSTM can classify ASD | Psychiatrists checking for DSM-V diagnostic criteria | Static | Mean fixation duration, fixation count, and EGC | SalGAN and data pre-processing | Image patches of predicted saliency map based on individual scanpath | 2 SP-ASDNet with different layer sizes: with batch normalization (1), and without (2) | Best model was (1) and it achieved an A of 74.22% | CNN-LSTM architecture can discriminate ASD from TD children |
| M NR | M NR | 300 SIs and NSIs | |||||||||||
| Vu et al., | NR, age range of entire sample: 2–10 | N = 16 | N = 16 | To examine the impact of different SIs and exposure time on the screening A for ASD | ADOS | Static | Fixation maps | No data reduction | Gaze points in fixation maps | Best models were: for (1) social scenes (A of 98.24%), for (2) 5 s (A of 95.24%), and for (3) social scenes for 5 s (A of 98.24%) | Social scenes with full-duration exposure (5 s) yielded the best result, with an A of nearly 100% | ||
| M NR | M NR | 12 SIs and NSIs related to social scenes, human faces, and objects. SS had different exposure times (1, 3, 5 s) | |||||||||||
| Wan et al., | 4.6 (0.7) | 4.8 (0.4) | N = 37 | N = 37 | To develop an EM-based early diagnostic tool for ASD | Psychiatrists checking for DSM-V diagnostic criteria and CARS administration | Dynamic | FDT in each AOI | Permutation tests | Body and mouth AOIs | SVM on FDT in body and mouth AOIs | SVM achieved a classification A of 85.1% | A simple SVM model achieved the same ASD classification A as more complex ET paradigms |
| 33 M | 27 M | Short SV of a young Asian female mouthing the alphabet | |||||||||||
A accuracy, Acc H accumulative histograms method, ADOS Autism Diagnostic Observation Schedule, AQ-Child Autism Spectrum Quotient: Children’s Version, BoW “Bag Of Words” features histogram representation, CARS Childhood Autism Rating Scale, CNN convolutional neural network, EGC eye gaze coordinates, EGT eye gaze trajectories, FDT fixation duration total time, kNN k-nearest neighbours algorithm, KPCA kernel principal component analysis, LSTM long short-term memory network, MRMR minimum redundancy maximum relevance method, nAcc H non-accumulative histograms method, NR not reported, NS non-social, NSI non-social image, NSV non-social video, RBF radial basis function, SI social image, SS social stimuli, SV social video, TD-age typical developmental group matched for chronological age, TD-IQ typical developmental group matched for IQ, t-SNE t-Distributed Stochastic Neighbor Embedding technique
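
The ML findings in the table are reported mainly as accuracy (A) and, for some studies, area under the ROC curve (AUC). As a reading aid, the following is a minimal sketch of how these two metrics are conventionally computed for a binary ASD vs. TD classifier. It is not taken from any of the reviewed studies: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of participants whose predicted label matches the reference label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive (ASD) case
    receives a higher score than a randomly chosen negative (TD) case.
    Tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ASD, 0 = TD; scores are made-up classifier outputs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]

# Assumed decision threshold of 0.5 to turn scores into predicted labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"A   = {accuracy(y_true, y_pred):.4f}")
print(f"AUC = {roc_auc(y_true, scores):.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the two figures reported for a given model in the table can differ noticeably.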