Fourier Transform Infrared Spectroscopy Based Complementary Diagnosis Tool for Autism Spectrum Disorder in Children and Adolescents.

Gulce Ogruc Ildiz1,2, Sevgi Bayari3, Ahmet Karadag1, Ersin Kaygisiz4, Rui Fausto2,5.   

Abstract

Autism spectrum disorder (ASD) is a neurodevelopmental disorder that begins early in life and persists throughout life, with strong personal and societal implications. It affects about 1%-2% of children worldwide. The absence of auxiliary methods that can complement the clinical evaluation of ASD increases the probability of false identification of the disorder, especially in the case of very young children. In this study, analytical models for auxiliary diagnosis of ASD in children and adolescents were developed, based on the analysis of patients' blood serum ATR-FTIR (Attenuated Total Reflectance-Fourier Transform Infrared) spectra. The models use chemometric methods (either Principal Component Analysis (PCA) or Partial Least Squares Discriminant Analysis (PLS-DA)), with the infrared spectra being the X-predictor variables. The two developed models exhibit excellent classification performance for samples of ASD individuals vs. healthy controls. Interestingly, the simpler, unsupervised PCA-based model turns out to have a global performance identical to that of the more demanding, supervised PLS-DA-based model. The developed PCA-based model thus appears to be the more economical alternative for use in the clinical environment. Hierarchical clustering analysis performed on the full set of samples was also successful in discriminating the two groups.

Keywords:  FTIR spectroscopy; autism spectrum disorder; chemometrics

Year:  2020        PMID: 32365644      PMCID: PMC7249117          DOI: 10.3390/molecules25092079

Source DB:  PubMed          Journal:  Molecules        ISSN: 1420-3049            Impact factor:   4.411


1. Introduction

Autism spectrum disorder (ASD) is a neurodevelopmental disorder that begins early in life and persists throughout life. It affects about 1%–2% of children worldwide. Metabolic diseases, as well as genetic, toxic and environmental factors, are recognized as causes of ASD. Symptoms mainly appear as difficulties in social interaction and communication, as well as limited and repetitive patterns of behavior [1,2,3,4,5,6]. Current diagnosis of ASD is based only on the clinical evaluation of behavioral signs and symptoms. The absence of auxiliary methods that can complement the clinical evaluation increases the probability of false identification of the disorder, especially in the case of very young children. On the other hand, studies have shown that ASD should be detected at the earliest possible age for treatment to be effective [7,8]. There have been many attempts to find ASD biomarkers in genetics, neuroimaging, gene expression, and measures of the body’s metabolism. Body fluids are easy to collect, and their analysis is less expensive than neuroimaging and genome studies. Recent studies show that blood samples are one of the most promising targets in the search for characteristic biomarkers of ASD, not only because they are easily accessible but also because of the important biological information they carry on the health status of children with this disorder [1,4,9,10,11,12,13,14,15,16,17]. Many studies have aimed to contribute to a better understanding of the underlying causes of ASD. Nevertheless, no biomarkers have yet been identified that can support the clinical diagnosis. Spectroscopic methods, in particular FTIR (Fourier Transform Infrared) spectroscopy, complemented with multivariate methods such as PCA (Principal Component Analysis) and PLS-DA (Partial Least Squares Discriminant Analysis), are becoming powerful tools for the analysis of biological samples, especially body fluids [18,19,20,21,22,23,24].
This approach combines the molecular information contained in the spectral data with the analytical efficiency of multivariate statistical methods for processing that information. As a whole, the method is cheap, essentially non-destructive, and can be easily implemented in a clinical environment. FTIR spectroscopy of blood samples has been used to investigate various diseases, including Parkinson’s and Alzheimer’s, as well as different types of cancer and infections, among many others [19,21,23,25,26,27,28]. In the present study, we developed chemometric models based on FTIR data, which can be used as complementary diagnostic tools for ASD in children and adolescents. In addition, specific spectral regions were identified that can act as biomarkers to help distinguish autistic from healthy individuals.

2. Materials and Methods

2.1. Clinical Stage

2.1.1. Patients and Control Group Selection

A total of 60 children and adolescents (30 confirmed ASD cases and 30 controls; Table 1) participated in this study, after consent was obtained from their parents. The ASD group members were chosen among patients under treatment in the Child and Adolescent Psychiatry Outpatient Clinic of Pamukkale University (Denizli, Turkey). The group consisted of 23 boys and 7 girls aged 4–17, diagnosed with ASD according to the Diagnostic and Statistical Manual of Mental Disorders [2] criteria. In addition to the diagnostic evaluation, the Childhood Autism Rating Scale (CARS) [29] was applied to evaluate the severity level of the disorder. Individuals with other psychiatric disorders and those with a chronic comorbid medical condition were excluded from the study. The healthy control group included 22 boys and 8 girls, aged 6 to 16, without any medical or psychiatric history. The study was approved by the Ethics Committee of Pamukkale University, Faculty of Medicine (date: 12/05/2015; number: 07).
Table 1

Age and gender distribution in the samples, according to group (ASD or control).

ASD   Age   Sex   ASD   Age   Sex   Control   Age   Sex   Control   Age   Sex
A1    10    B     A16   9     B     C1        13    B     C16       10    B
A2    9     B     A17   7     B     C2        8     G     C17       11    B
A3    6     B     A18   4     G     C3        16    B     C18       10    B
A4    7     G     A19   8     B     C4        9     B     C19       8     B
A5    12    B     A20   5     B     C5        7     G     C20       8     B
A6    14    B     A21   5     B     C6        11    B     C21       16    B
A7    4     G     A22   5     B     C7        12    B     C22       10    B
A8    10    B     A23   7     B     C8        9     G     C23       10    B
A9    14    G     A24   17    B     C9        6     G     C24       11    B
A10   5     B     A25   7     B     C10       14    B     C25       8     B
A11   7     B     A26   7     B     C11       9     G     C26       12    B
A12   6     B     A27   13    G     C12       13    G     C27       8     G
A13   17    B     A28   8     G     C13       16    B     C28       8     B
A14   14    B     A29   4     B     C14       12    B     C29       10    B
A15   5     B     A30   13    G     C15       8     B     C30       7     G

B, Boy; G, Girl. The first 15 members of the ASD group (A1-A15) and the first 14 members of the control group (C1-C14) were used to develop the models (calibration set); A16-A30 and C15-C29 (15 members of each group) were used to test the models; C30 was initially included in the calibration set, but was removed since it appeared as an outlier (see below). The numbering in the table was defined after the randomized split of the members of each group between the two sets (calibration and testing sets) and the preliminary investigation to exclude possible outliers.

2.1.2. Samples Preparation

Five milliliters of venous blood was taken from the antecubital vein of all participants in the study. The collected blood samples were allowed to clot and then centrifuged for 15 min at 1000 rpm in order to separate the serum from cellular material. The obtained serum samples were aliquoted and stored at −20 °C until the analysis.

2.2. Spectroscopic Stage

2.2.1. Sample Measurements

ATR-FTIR spectra were recorded on a Perkin Elmer Spectrum One spectrometer equipped with a KBr beam splitter and a deuterated triglycine sulfate (DTGS) detector, combined with a diamond GladiATR accessory (Pike Technologies). Sixty-four scans, covering the 4000–450 cm−1 wavenumber range, were co-added to produce each spectrum. A spectral resolution of 4 cm−1 was used. For each blood serum sample, 5 spectra were obtained. Before collecting each spectrum, the ATR crystal was cleaned using sterile phosphate buffer followed by ethanol. A background spectrum was collected prior to each sample measurement. For the spectra collection, 1 µL of each unfrozen blood serum sample was placed on the crystal surface and allowed to air dry (~12 min) at room temperature.

2.2.2. Data Pre-processing

Before analysis, the FTIR spectra were pre-processed by baseline correction and area normalization. No smoothing or any other additional pre-processing of the spectra was performed. For the analyses, the 3700–2400 and 1800–900 cm−1 spectral regions were chosen. The full set of spectra belonging to all samples of the control (C) and ASD (A) groups (5 × 30 spectra for each group) was then subjected to PCA, using the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm [30], in order to detect outliers. This procedure resulted in the elimination of 5 replicate spectra in total, all belonging to the same sample of the control group (C30), which was therefore excluded from the dataset. The average spectrum for each sample was then obtained, as well as the global mean spectrum for each group (C and A). All data pre-processing was undertaken with the Unscrambler™ CAMO software (Version 10.5) [31].
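
The pre-processing steps described in this section can be sketched in a few lines of Python. This is only an illustrative sketch and not the Unscrambler implementation: the baseline correction is assumed to be a simple offset subtraction, area normalization is assumed to mean dividing each spectrum by the sum of its absorbance values (proportional to the area on an evenly spaced wavenumber grid), and the order of the steps as well as all function names are hypothetical.

```python
# Spectral regions used for the analyses (cm-1), taken from the text.
REGIONS = ((2400.0, 3700.0), (900.0, 1800.0))


def subtract_offset(absorbances):
    """Assumed baseline correction: shift the spectrum so its minimum is zero."""
    floor = min(absorbances)
    return [a - floor for a in absorbances]


def select_regions(wavenumbers, absorbances, regions=REGIONS):
    """Keep only the points whose wavenumber lies inside one of the regions."""
    kept = [(w, a) for w, a in zip(wavenumbers, absorbances)
            if any(low <= w <= high for low, high in regions)]
    return [w for w, _ in kept], [a for _, a in kept]


def area_normalize(absorbances):
    """Assumed area normalization: scale the spectrum so its values sum to 1."""
    total = sum(absorbances)
    if total == 0:
        raise ValueError("cannot normalize a spectrum with zero area")
    return [a / total for a in absorbances]


def preprocess(wavenumbers, absorbances):
    """Offset correction, region selection, then area normalization."""
    corrected = subtract_offset(absorbances)
    kept_w, kept_a = select_regions(wavenumbers, corrected)
    return kept_w, area_normalize(kept_a)


def mean_spectrum(replicates):
    """Point-by-point average of the replicate spectra of one sample."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]
```

In this sketch, each of the 5 replicate spectra of a sample would go through `preprocess` before being averaged with `mean_spectrum`.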

2.2.3. Classification Models Development and Testing

The dataset used to develop and test the classification models included a total of 59 samples, 30 belonging to the ASD group (A) and 29 to the control group (C). The calibration set comprised 29 samples (15 from the A group and 14 from the C group), while the test set comprised 15 samples from each group, for a total of 30 samples. The samples used in the calibration and test sets were chosen randomly. Two models were built for classification of the samples, one using the PCA method and the other the PLS-DA method [32]. For both models, internal full cross-validation was used during calibration. For predictions, all samples in the test set were used with the two developed models. Hierarchical clustering was also applied to the full set of samples, as a preliminary unsupervised test to check the similarity of the samples within each group and the dissimilarity between the two groups. The cluster analysis used Ward’s method with squared Euclidean distances [33,34]. All chemometric analyses were done using the Unscrambler™ CAMO software (Version 10.5) [31]. The prediction performance of the models was checked by calculating their sensitivity, specificity, precision, accuracy, and efficiency [35,36]. Sensitivity and specificity measure the ability of the model to correctly classify the samples of the modelled class and to correctly identify the samples that do not belong to it, respectively, and are calculated according to: Sensitivity (%) = 100 × tp / (tp + fn); Specificity (%) = 100 × tn / (tn + fp), where tp and tn stand for true positive and true negative samples, and fp and fn for false positive and false negative samples, respectively. Precision (%) = 100 × tp / (tp + fp) measures the quality of the positive predictions of the model.
Efficiency and accuracy each provide a single measure of the model performance, with efficiency combining the information given by the sensitivity and specificity [efficiency (%) = (sensitivity (%) + specificity (%)) / 2], and accuracy measuring the proportion of correct classifications independent of the class [accuracy (%) = 100 × (correct classifications) / (total samples)].
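
As a worked illustration of these performance measures, the following minimal Python sketch computes them from the four confusion-matrix counts. The function name is hypothetical, and efficiency is taken here as the mean of the percentage sensitivity and specificity, so that all five values share the 0–100 scale.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return the five performance measures, all expressed in percent."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    precision = 100.0 * tp / (tp + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    # Efficiency is the average of the two percentage values above.
    efficiency = (sensitivity + specificity) / 2.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "efficiency": efficiency,
    }


# Test set of this study: 15 ASD and 15 control samples, all classified correctly.
perfect = classification_metrics(tp=15, tn=15, fp=0, fn=0)
```

With no false positives or negatives, every entry of `perfect` equals 100.0, which matches the situation reported for the test set in Section 3.3.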

3. Results and Discussion

3.1. Preliminary Data Analysis

Figure 1 shows the average IR spectra (area normalized) of the blood serum of the ASD and control groups. Table 2 presents the assignment of the bands, according to the literature [37,38,39,40,41,42,43]. The data seem to indicate that the blood serum of the ASD patients has an increased total protein content and a slightly decreased tyrosine content compared to the control group, while the total lipid content seems to be nearly identical. Though only indicative, these results agree with the conclusions of Croonenberghs et al. [44], who reported an increased total protein content in the blood serum of children with ASD; of Elbaz and co-workers [10], Tu, Chen and He [45] and Tirouvanziam et al. [46], who found a significant reduction in tyrosine; and of Wiest and co-workers [47], who concluded that the lipid contents in the blood plasma of ASD children and the general population are identical. In addition, the dietary study by Levy et al. [48] indicated that protein intake is increased in children with ASD.
Figure 1

Average IR spectra of ASD (A-AVERAGE; thick blue line) and control (C-AVERAGE; thick red line) groups’ blood serum samples (3700–2400 and 1800–900 cm−1 regions). Thin lines account for the standard deviations.

Table 2

Assignments for the major bands in the FTIR spectrum of blood serum.

The assignments are according to the literature [37,38,39,40,41,42,43]. Wavenumbers in cm−1. AA, amino acids; ν, stretching; δ, bending; w, wagging; γ, rocking; s, symmetric; as, anti-symmetric. Bold style in the assignment columns indicates the expected major contributor to the band. In the case of Amide II and III, the main coordinates contributing to the mode are indicated; the first mode corresponds to the anti-phase combination of these coordinates, while the second corresponds to the in-phase combination. The high-wavenumber wing of the Amide A band overlaps with the band originating from OH stretching vibrations, including those due to traces of water still present in the sample. The Amide B band is partially due to N-H stretching vibrations of amide groups involved in strong intramolecular H-bonds, and partially a result of a Fermi resonance interaction between νNH and the first overtone of the Amide II vibration.

Figure 2 presents the results of the hierarchical clustering analysis performed on the full set of samples, which was undertaken as a preliminary unsupervised similarity test. The samples belonging to the two groups appear clearly discriminated. Notably, the dendrogram also clearly shows that the homogeneity within the control group is significantly higher than within the ASD group, as expected considering that ASD represents a range of neurodevelopmental disorders with different levels of severity (these include disorders previously classified separately and designated as autism and childhood disintegrative disorders, Asperger’s syndrome, and pervasive developmental disorder not otherwise specified (PDD-NOS) [2]).
Figure 2

Cluster analysis of ASD (A; overlined in blue) and control (C; highlighted using the red color) groups’ blood serum IR spectra, according to the Ward’s method, using squared Euclidean distances.
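
A clustering of this kind can be approximated with SciPy, as in the hedged sketch below. It is not the Unscrambler procedure used in the study: SciPy's `linkage` with `method="ward"` works on Euclidean distances between the rows and merges the clusters that give the smallest increase in within-cluster variance, so the merge heights may be scaled differently from software that reports squared Euclidean distances. The function name and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def ward_two_groups(spectra):
    """Ward clustering of row-wise spectra, cut into two clusters.

    Returns the linkage matrix (usable for drawing a dendrogram) and the
    cluster label (1 or 2) assigned to each row.
    """
    data = np.asarray(spectra, dtype=float)
    merges = linkage(data, method="ward")
    labels = fcluster(merges, t=2, criterion="maxclust")
    return merges, labels


# Synthetic stand-in for two groups of spectra (5 samples x 4 variables each).
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 0.1, size=(5, 4))
group_c = rng.normal(3.0, 0.1, size=(5, 4))
merges, labels = ward_two_groups(np.vstack([group_a, group_c]))
```

The linkage matrix can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a tree comparable in layout to Figure 2.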

3.2. Classification Models Development

As mentioned before, two models were built for classification of the samples, one based on the PCA method (PCAModel) and the other on the PLS-DA method [32] (PLSModel). For both models the calibration set included the same 29 samples (15 for the A group and 14 for the C group), and internal full cross-validation was used. Figure 3 shows the PCAModel 2D-scores plot (PC-2 vs. PC-1), where it can be seen that the ASD samples are well discriminated from those belonging to the control group along PC-1. Together, PC-1 and PC-2 explain 98% of the data variation for the training set (92% and 6% variance for PC-1 and PC-2 respectively), with the same numbers for validation. The model was developed using five principal components, accounting for a total variance of 99% for the training set (validation: 98%).
Figure 3

PCA scores plot (PC-2 vs. PC-1) for the PCAModel.
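
The decomposition behind a scores plot and its explained-variance percentages can be sketched with the NIPALS algorithm named in Section 2.2.2. The NumPy code below is a minimal illustration on mean-centred data, not the Unscrambler implementation; the function name, the convergence settings, and the choice of starting vector are assumptions.

```python
import numpy as np


def nipals_pca(X, n_components=2, tol=1e-10, max_iter=1000):
    """PCA by NIPALS; returns scores, loadings and explained variance (%)."""
    residual = np.asarray(X, dtype=float)
    residual = residual - residual.mean(axis=0)  # mean-centre each variable
    total_ss = np.sum(residual ** 2)
    scores, loadings, explained = [], [], []
    for _ in range(n_components):
        # Start from the column with the largest remaining variance.
        t = residual[:, np.argmax(residual.var(axis=0))].copy()
        for _ in range(max_iter):
            p = residual.T @ t / (t @ t)   # regress the columns on the scores
            p /= np.linalg.norm(p)         # unit-length loading vector
            t_new = residual @ p           # updated scores
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        residual = residual - np.outer(t, p)  # deflate before the next component
        scores.append(t)
        loadings.append(p)
        explained.append(100.0 * (t @ t) / total_ss)
    return np.column_stack(scores), np.column_stack(loadings), np.array(explained)
```

Plotting the second column of the returned scores against the first gives a PC-2 vs. PC-1 scores plot, and the third return value holds the percentage of variance explained by each component.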

The loadings for PC-1 are given in Figure 4, where they are compared with the difference spectrum between the average spectra of the ASD and control groups (see Figure 1 for the original spectra). The similarity between the two allows a clear meaning to be assigned to PC-1. This similarity is also consistent with the fact that a large amount of the variation in the dataset (92%) is explained by the first principal component. The fact that the PC-1 loadings are very similar to the difference between the average spectra of the two groups (control and ASD) is also relevant because it demonstrates that the achieved discrimination is related to the different nature of the samples. Furthermore, it also validates our main approach to the problem under study, namely that the whole IR spectrum acts as a holistic fingerprint (or spectroscopic biomarker) of the disease, since the PC-1 loadings have significant values for practically all variables (frequency values).
Figure 4

Left panel: Difference IR spectrum (average spectrum of ASD group (A) blood serum minus average spectrum of the control group (C) blood serum). Right panel: PC-1 loadings of PCAModel.

It is also interesting to note that the distribution of samples within each group along PC-1 is substantially different, with the samples belonging to the control group spanning a small range of values and those belonging to the ASD group (A) spreading over a much wider range. This result reflects the greater homogeneity of the samples in the control group (C) compared to the ASD group, thus closely following the trend observed in the hierarchical clustering analysis discussed in the previous section. Along PC-2 the distribution of samples does not differ very much from one group to the other, indicating that the variance explained by this principal component reflects general small differences in the spectra that are not related to the different nature of the two groups (A vs. C). The 2D-scores plot (Factor-2 vs. Factor-1) for the PLSModel is shown in Figure 5. As could be expected considering the results obtained using the unsupervised PCA-based model (PCAModel) described above, the supervised PLSModel discriminates the ASD samples well from the control ones. The results obtained with the two models are in fact very similar. In the PLSModel, in parallel to what was found for the PCAModel, the groups are discriminated along the axis explaining the largest fraction of variance (Factor-1, whose loadings are also identical to the difference between the average IR spectra of the control and ASD groups, like the PC-1 loadings in the PCAModel), while along Factor-2 the distribution of samples is nearly identical for the two groups. Additionally, as was found for the PCAModel, along Factor-1 the samples belonging to the ASD group are dispersed over a much wider range of score values than those belonging to the control group. The reasons for these last two trends, common to the two developed models, have already been provided above.
Figure 5

Scores plot (Factor-2 vs. Factor-1) for the PLSModel.

In the PLSModel, the two latent variables explaining the largest amounts of variation (Factor-1 and Factor-2) account for 97% of the variation in X and 90% in Y for the training set (91% and 80% variance in X and Y, respectively, for Factor-1, and 6% and 10% for Factor-2), with similar numbers observed for validation (91% and 5% variance in X, and 78% and 12% variance in Y for Factor-1 and -2 respectively; the totals were 96% and 90% for X and Y variance, respectively). The model was developed using five latent variables, accounting for total X and Y variances of 99% and 98% for the training set (validation: 98% and 93%). The root-mean-square errors (RMSE) for training and validation are 0.10 and 0.14, respectively, which demonstrate the good quality of the regressions.

3.3. Predictions

Fifteen samples of each group (A and C) not used for calibration of the models were used as test set for predictions. The IR spectra of the test samples were pre-processed following the same steps as for the samples used in the calibration of the models. The results obtained for the predictions done using the two models are summarized in Figure 6, Figure 7 and Figure 8.
Figure 6

Projection scores plot (PC-2 vs. PC-1) for the PCAModel.

Figure 7

Projection scores plot (Factor-2 vs. Factor-1) for the PLSModel.

Figure 8

PLSModel predicted Y values for ASD (A group) and control (C group) test samples. The predicted values are indicated by the horizontal red lines, and the deviations by the blue boxes. In the model, samples belonging to control group define class 1 (value 0 for Y) and samples belonging to ASD patients define class 2 (value 1 for Y).

Both models were able to classify correctly all samples included in the test set, with no false positives or negatives (global accuracy, 100%), so that the calculated values for all parameters chosen to measure the prediction performance of the models (sensitivity, specificity, precision, accuracy, and efficiency [35,36], whose meaning was given in Section 2.2.3) are maximal (100%). Figure 6 and Figure 7 show the projections of the test samples on the scores plots of the models, while Figure 8 gives the PLSModel predicted Y values for the samples and their deviations, the latter being an indicator of how reliable the predicted values are [49]. For the PCAModel, the predicted class for each sample was assigned based on the inspection of its projection on the model scores plot (Figure 6). Each sample was ascribed to the group to which its three nearest neighbor points belong. In the case of the PLSModel, the class assignments were done by considering the predicted Y values for the samples, using as threshold value for class separation the midpoint between the reference Y values (0.5). It should be noted that the predicted Y values for the ASD test samples show a larger dispersion around the reference value compared to the control test samples. This result is consistent with the already mentioned lower homogeneity of the samples belonging to the ASD patients in comparison with those of the control group. The fact that the two models show a similar prediction performance is striking. In fact, the performance of the PCA-based model is so good that the model derived using the supervised PLS-DA method, which could a priori be expected to outperform the unsupervised PCA-based model, does not noticeably improve on it. Under these circumstances, the simpler PCAModel appears to be the most convenient model for practical use.
The robustness of the PCAModel was also checked. For that, we performed an additional PCA calculation using all 59 samples in the training set. The idea was to verify whether including a larger number of samples in the training set would visibly change the description of the data by the model. The obtained scores plot is presented in Figure 9. In this PCA, PC-1 and PC-2 explain 95% of the data variation for the training set (89% and 6% variance for PC-1 and PC-2 respectively), with equal numbers for validation. The five principal components used in the PCA account for a total variance of 98% for both the training set and validation. Notably, the scores plot of this PCA (Figure 9), with all 59 samples in the calibration set, closely matches the projection graph obtained for the model developed with only 29 samples in the calibration set (see Figure 6), a result that clearly shows the robustness of the PCAModel.
Figure 9

Scores plot (PC-2 vs. PC-1) for the PCA done using all 59 samples.
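
The two class-assignment rules described in Section 3.3 can be illustrated with a short Python sketch. It rests on stated assumptions: the nearest-neighbor rule is implemented as a majority vote among the three closest calibration points in the scores plane (the text reports the case where all three agree), a predicted Y exactly at the 0.5 threshold is arbitrarily assigned to the control class, and the function names and score coordinates are made up for illustration.

```python
import math
from collections import Counter


def assign_by_neighbors(point, calib_scores, calib_labels, k=3):
    """Assign the class most common among the k nearest calibration points."""
    ranked = sorted(zip(calib_scores, calib_labels),
                    key=lambda pair: math.dist(point, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def assign_by_threshold(y_pred, threshold=0.5):
    """PLS-DA style rule: control is coded 0, ASD is coded 1."""
    return "ASD" if y_pred > threshold else "control"


# Hypothetical (PC-1, PC-2) scores for six calibration samples.
calib_scores = [(-1.0, 0.0), (-1.2, 0.1), (-0.9, -0.1),
                (2.0, 0.0), (3.5, 0.2), (1.5, -0.3)]
calib_labels = ["control"] * 3 + ["ASD"] * 3

projected_class = assign_by_neighbors((2.5, 0.0), calib_scores, calib_labels)
predicted_class = assign_by_threshold(0.93)
```

In practice the first rule would be applied to the projected scores of each test sample and the second to its predicted Y value from the PLS-DA model.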

4. Conclusions

In this study we developed analytical models for auxiliary diagnosis of ASD in children and adolescents, based on the analysis of patients’ blood serum infrared spectra. The models use chemometric (either PCA or PLS-DA) methods, with the infrared spectra acting as the X-predictor variables. The two developed models exhibit excellent classification/prediction performance. Remarkably, the simpler, unsupervised PCA-based model turns out to have a performance identical to that of the more expensive, supervised PLS-DA-based model. The developed PCA-based model thus appears to be the better, more economical alternative for use in the clinical environment. It can be concluded that the infrared spectrum of blood serum can be used to discriminate ASD patients from healthy controls. Since it considers the whole spectroscopic information to achieve classification, this approach is conceptually more consistent than putative alternatives that aim to use spectroscopic data of complex biological materials to find specific molecular biomarkers for the disease. The results regarding the relative similarity of the samples within each of the two studied groups (ASD and control), which clearly show a much greater dissimilarity among the samples belonging to the ASD group, are in accordance with the modern psychiatric concept of ASD as a general type of disorder that includes a spectrum of clinical manifestations previously considered different psychiatric illnesses. The methodology proposed herein is reliable, fast, cheap, essentially non-invasive, and might be implemented easily in the clinical environment in order to help psychiatrists establish the ASD diagnosis with increased certainty at an early stage of development of the illness.
  32 in total

1.  Determination of glucose in dried serum samples by Fourier-transform infrared spectroscopy.

Authors:  C Petibois; V Rigalleau; A M Melin; A Perromat; G Cazorla; H Gin; G Déléris
Journal:  Clin Chem       Date:  1999-09       Impact factor: 8.327

2.  Meta-analysis of Early Intensive Behavioral Intervention for children with autism.

Authors:  Sigmund Eldevik; Richard P Hastings; J Carl Hughes; Erik Jahr; Svein Eikeseth; Scott Cross
Journal:  J Clin Child Adolesc Psychol       Date:  2009-05

Review 3.  Oxidative stress as an etiological factor and a potential treatment target of psychiatric disorders. Part 2. Depression, anxiety, schizophrenia and autism.

Authors:  Irena Smaga; Ewa Niedzielska; Maciej Gawlik; Andrzej Moniczewski; Jan Krzek; Edmund Przegaliński; Joanna Pera; Małgorzata Filip
Journal:  Pharmacol Rep       Date:  2015-01-05       Impact factor: 3.024

4.  Abnormal fatty acids in Canadian children with autism.

Authors:  Joan Jory
Journal:  Nutrition       Date:  2015-12-02       Impact factor: 4.008

5.  Serum levels of SOD and risk of autism spectrum disorder: A case-control study.

Authors:  Lixuan Wang; Jianpu Jia; Junling Zhang; Kuo Li
Journal:  Int J Dev Neurosci       Date:  2016-04-16       Impact factor: 2.457

6.  Plasma fatty acid profiles in autism: a case-control study.

Authors:  M M Wiest; J B German; D J Harvey; S M Watkins; I Hertz-Picciotto
Journal:  Prostaglandins Leukot Essent Fatty Acids       Date:  2009-04       Impact factor: 4.006

7.  Metabolomics as a tool for discovery of biomarkers of autism spectrum disorder in the blood plasma of children.

Authors:  Paul R West; David G Amaral; Preeti Bais; Alan M Smith; Laura A Egnash; Mark E Ross; Jessica A Palmer; Burr R Fontaine; Kevin R Conard; Blythe A Corbett; Gabriela G Cezar; Elizabeth L R Donley; Robert E Burrier
Journal:  PLoS One       Date:  2014-11-07       Impact factor: 3.240

8.  Evaluation of metformin hydrochloride in Wistar rats by FTIR-ATR spectroscopy: A convenient tool in the clinical study of diabetes.

Authors:  P Ramalingam; Y Padmanabha Reddy; K Vinod Kumar; Babu Rao Chandu; K Rajendran
Journal:  J Nat Sci Biol Med       Date:  2014-07

Review 9.  Biomarkers in autism.

Authors:  Andre A S Goldani; Susan R Downs; Felicia Widjaja; Brittany Lawton; Robert L Hendren
Journal:  Front Psychiatry       Date:  2014-08-12       Impact factor: 4.157

10.  A Search for Blood Biomarkers for Autism: Peptoids.

Authors:  Sayed Zaman; Umar Yazdani; Yan Deng; Wenhao Li; Bharathi S Gadad; Linda Hynan; David Karp; Nichole Roatch; Claire Schutte; C Nathan Marti; Laura Hewitson; Dwight C German
Journal:  Sci Rep       Date:  2016-01-14       Impact factor: 4.379
