Kaushik Vakadkar, Diya Purkayastha, Deepa Krishnan.
Abstract
Autism Spectrum Disorder (ASD) is a neurological disorder that can have a lifelong impact on an individual's language learning, speech, cognitive, and social skills. Its symptoms usually appear during the developmental stages, i.e., within the first two years after birth, and it affects around 1% of the population globally [https://www.autism-society.org/whatis/facts-and-statistics/. Accessed 25 Dec 2019]. ASD is mainly caused by genetic or environmental factors; however, outcomes can be improved by detecting and treating it at an early stage. At present, standardized clinical tests are the only methods used to diagnose ASD. These tests not only require prolonged diagnostic time but also lead to a steep increase in medical costs. To improve the precision of diagnosis and reduce the time it requires, machine learning techniques are being used to complement the conventional methods. We applied Support Vector Machines (SVM), Random Forest Classifier (RFC), Naïve Bayes (NB), Logistic Regression (LR), and K-Nearest Neighbors (KNN) to our dataset and constructed predictive models based on the outcome. The main objective of our paper is thus to determine whether a child is susceptible to ASD in its nascent stages, which would help streamline the diagnosis process. Based on our results, Logistic Regression gives the highest accuracy for our selected dataset.
Keywords: Accuracy; Autism spectrum disorder; Confusion matrix; Dataset; Encoding; F1 score; KNN; Logistic regression; Machine learning; Precision; Preprocessing; Random forest; Recall; SVM
Year: 2021 PMID: 34316724 PMCID: PMC8296830 DOI: 10.1007/s42979-021-00776-5
Source DB: PubMed Journal: SN Comput Sci ISSN: 2661-8907
Summary of literature review
| Paper | Key findings | Limitations |
|---|---|---|
| [ | Used forward feature selection and undersampling; trained and tested six ML models on score sheets of the 65-item Social Responsiveness Scale from 2925 individuals with ASD or ADHD; found that 5 of the 65 behaviors were sufficient to distinguish ASD from ADHD with an accuracy of 96.4% | The dataset was compiled primarily from autism-based collections, resulting in a significant imbalance in favor of the ASD class |
| [ | Used metrics based on brain activity to predict ASD; used SVM to obtain an accuracy of 95.9% with 2 clusters and 19 features | Constrained sample size/dataset |
| [ | Uses SVM; integrates an ML algorithm inside an ASD screening tool; accuracy of 97.6% | Imbalanced datasets; small dataset, with 612 autism and 11 non-autism cases |
| [ | Derived a novel algorithm that combines structural and functional features; mapped out various representations of the functional connectivity of the brain; results indicate that combining multimodal features gives the highest accuracy for distinguishing cases | The ML models used increase prediction accuracy for autism by only 4.2% compared with previous work; datasets suffer significantly from variations |
| [ | Makes use of SVM, Naïve Bayes and Random Forest classification methods; 95,577 records of children with 367 variables, of which 256 were found to be sufficient; clearly delineates the different attributes used; created a dataset with 4 classes (ASD: None, Mild, Moderate, Severe); highest accuracy of 87.1% (2 class) and 54.1% (4 class) achieved with the J48 algorithm (decision tree) | Does not predict the severity of ASD; cursory set of attributes (conditions) used for identification of ASD, which might not always translate to a case of ASD |
| [ | Automated optimal feature selection using the Binary Firefly algorithm (selected 10 out of 21 features as optimum); no class imbalance problem (among 292 instances in the ASD children dataset, 151 have class 'Yes' and 141 have class 'No'); makes use of NB, J48, SVM and KNN models; highest accuracy of 97.95% achieved with SVM | Missing instances in the ASD child dataset; the small number of instances in the dataset creates a risk of model overfitting; certain disadvantages of swarm intelligence wrappers (Binary Firefly algorithm) |
| [ | Makes use of an AI system with sensor data to analyze the patient's condition using facial expressions and emotions; sends regular alerts to the parents, thus helping the patient cope with ASD during COVID-19; the system consists of a smart wristband with an interactive monitor and camera, connected to a mobile application; detects ASD using real-time grayscale images from a Kaggle dataset containing 35,887 images; of all models, the Inception-ResNetV2 architecture achieved the highest accuracy, 78.56% | Low accuracy compared to other approaches; research work is in nascent stages |
| [ | Makes use of an ML model based on rule induction called Rules-Machine Learning (RML); generates non-redundant rules in a straightforward manner using Covering learning; uses tenfold cross-validation to partition the dataset into 10 subsets; RML offers classifiers with higher predictive accuracy than standard approaches such as Boosting, Bagging and decision trees | RML proves ineffective in handling datasets that are imbalanced with respect to class labels; the article does not include instances related to toddlers |
Fig. 1 Architecture of proposed system
A comparison of the applied ML models
| | LR | NB | SVM | KNN | RFC |
|---|---|---|---|---|---|
| Accuracy | 97.15% | 94.79% | 93.84% | 90.52% | 81.52% |
| Confusion matrix | | | | | |
| F1 score | 0.98 | 0.96 | 0.95 | 0.93 | 0.88 |
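The comparison above can be checked mechanically. The following is a minimal sketch that copies the accuracy and F1 values from the table into a dictionary and ranks the models on each metric; the names `results` and `rank_models` are illustrative and do not come from the paper.

```python
# Accuracy (%) and F1 values copied from the comparison table above.
results = {
    "LR": {"accuracy": 97.15, "f1": 0.98},
    "NB": {"accuracy": 94.79, "f1": 0.96},
    "SVM": {"accuracy": 93.84, "f1": 0.95},
    "KNN": {"accuracy": 90.52, "f1": 0.93},
    "RFC": {"accuracy": 81.52, "f1": 0.88},
}


def rank_models(table, metric):
    """Return model names sorted from best to worst on the given metric."""
    return sorted(table, key=lambda name: table[name][metric], reverse=True)


by_accuracy = rank_models(results, "accuracy")
by_f1 = rank_models(results, "f1")

# Gap in accuracy (percentage points) between the best and second-best model.
best, runner_up = by_accuracy[0], by_accuracy[1]
gap = round(results[best]["accuracy"] - results[runner_up]["accuracy"], 2)

print("Ranked by accuracy:", by_accuracy)
print("Ranked by F1 score:", by_f1)
print(f"{best} leads {runner_up} by {gap} percentage points")
```

Both metrics give the same ordering for these five models, which is consistent with the abstract's statement that Logistic Regression performs best on the selected dataset.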
Features mapping with Q-CHAT-10 screening method
| Dataset variable | Description |
|---|---|
| A1 | Child responding to you calling his/her name |
| A2 | Ease of getting eye contact from child |
| A3 | Child pointing to objects he/she wants |
| A4 | Child pointing to draw your attention to his/her interests |
| A5 | If the child shows pretense |
| A6 | Ease of child to follow where you point/look |
| A7 | If the child wants to comfort someone who is upset |
| A8 | Child’s first words |
| A9 | If the child uses basic gestures |
| A10 | If the child daydreams/stares at nothing |
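As a rough illustration of how the ten variables above could feed a screening score, the sketch below assumes each of A1 to A10 is stored as 0 or 1, with 1 marking a response that counts toward the score. This binary coding and the default cutoff of 3 are assumptions made here for illustration; neither is stated in the table, and the function names are hypothetical.

```python
ITEMS = [f"A{i}" for i in range(1, 11)]  # A1 ... A10, as in the table above


def qchat10_score(answers):
    """Sum the ten item flags.

    Assumption: each answer is already coded as 0 or 1, where 1 marks a
    response that counts toward the screening score.
    """
    missing = [item for item in ITEMS if item not in answers]
    if missing:
        raise ValueError(f"missing items: {missing}")
    for item in ITEMS:
        if answers[item] not in (0, 1):
            raise ValueError(f"{item} must be 0 or 1, got {answers[item]!r}")
    return sum(answers[item] for item in ITEMS)


def flag_for_followup(answers, threshold=3):
    """Return True when the score exceeds the cutoff.

    The default threshold of 3 is an assumption, not a value from the table.
    """
    return qchat10_score(answers) > threshold


# Hypothetical record: four of the ten items are flagged.
example = {item: 0 for item in ITEMS}
example.update({"A1": 1, "A2": 1, "A5": 1, "A9": 1})

print(qchat10_score(example), flag_for_followup(example))
```

Keeping the threshold as a parameter makes it easy to swap in whatever cutoff the screening protocol in use actually specifies.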
Fig. 2 ASD positive toddlers born with jaundice based on gender
Fig. 3 Age distribution of ASD positive
Fig. 4 Gender distribution of ASD traits
Fig. 5 Ethnicity distribution of ASD traits
Confusion matrix for ASD prediction
| Predicted | Individual has ASD | Individual does not have ASD |
|---|---|---|
| ASD is predicted | True positive | False positive |
| ASD is not predicted | False negative | True negative |
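The four cells of this confusion matrix are enough to derive the accuracy, precision, recall and F1 score listed among the paper's keywords. The sketch below uses the standard definitions of these metrics; the counts in the example are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: ASD predicted, individual has ASD          (true positive)
    fp: ASD predicted, individual does not         (false positive)
    fn: ASD not predicted, individual has ASD      (false negative)
    tn: ASD not predicted, individual does not     (true negative)
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a screening setting, recall is often the metric to watch, since a false negative corresponds to a child with ASD who is not flagged.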
Fig. 6 Precision/recall curve for LR
Fig. 7 Precision/recall curve for NB
Fig. 8 Precision/recall curves for SVM
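A precision/recall curve such as those in the figures above is obtained by sweeping a decision threshold over a classifier's predicted scores and recomputing precision and recall at each step. The following is a minimal pure-Python sketch of that sweep; the labels and scores are toy values, not data from the paper, and `precision_recall_points` is a hypothetical helper name.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score.

    A sample is predicted positive when its score is >= the threshold.
    Thresholds are visited from highest to lowest, so recall never decreases.
    """
    total_pos = sum(y_true)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        # tp + fp >= 1 here, because at least one score equals the threshold.
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        points.append((threshold, precision, recall))
    return points


# Toy labels (1 = ASD traits present) and classifier scores, for illustration.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = precision_recall_points(y_true, scores)
for threshold, precision, recall in points:
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

Lowering the threshold trades precision for recall, which is the trade-off the curves visualize for each model.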