Vanessa Aguiar-Pulido, José A Seoane, Juan R Rabuñal, Julián Dorado, Alejandro Pazos, Cristian R Munteanu.
Abstract
Single nucleotide polymorphisms (SNPs) can be used as inputs in computational studies of disease, such as pattern searching and classification models. Schizophrenia is an example of a complex disease with an important social impact. The multiple causes of this disease create the need for new genetic or proteomic patterns that can diagnose patients using biological information. This work presents a computational study of machine learning classification models for the disease that use only single nucleotide polymorphisms at the HTR2A and DRD3 genes from Galician (Northwest Spain) schizophrenic patients. These classification models establish for the first time, to the best of the authors' knowledge, a relationship between the sequence of the nucleic acid molecule and schizophrenia (Quantitative Genotype-Disease Relationships, QGDR) that can automatically recognize schizophrenia DNA sequences and correctly classify 78.3% to 93.8% of schizophrenia subjects when using datasets that include simulated negative subjects and a linear artificial neural network.
Year: 2010 PMID: 20657396 PMCID: PMC6257637 DOI: 10.3390/molecules15074875
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Flow chart of the QGDR model classification between the DNA structure (SNPs) and schizophrenia.
The classification models obtained for the evaluated schizophrenia patients using the SNP information at DRD3 and HTR2A.
| Data set | Gene | LNN | MLP | RBF | EC | MDR | Bayes Nets | Naïve Bayes | SVM | Decis. Tb. | DTNB | BFTree | AdaBoost |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  |  | 62.9% | 59.5% | 58.9% | 56.6% | 60.0% | 62.5% | 61.6% | 64.8% | 62.2% | 59.5% | 61.3% | 63.4% |
|  |  | 62.4% | 62.9% | 63.7% | 57.5% | 64.0% | 61.9% | 66.6% | 65.2% | 61.0% | 62.3% | 62.8% | 63.5% |
|  |  | 64.5% | 64.7% | 62.5% | 58.7% | 64.0% | 61.2% | 64.8% | 64.9% | 61.5% | 66.2% | 62.9% | 65.9% |
|  |  | 74.6% | 72.9% | 71.5% | 71.0% | 60.5% | 71.3% | 71.0% | 75.4% | 73.5% | 70.4% | 73.7% | 71.3% |
|  |  | 75.9% | 75.5% | 73.6% | 71.7% | 74.2% | 62.2% | 62.9% | 77.4% | 73.2% | 70.9% | 74.5% | 71.4% |
|  |  | 78.2% | 76.8% | 74.4% | 71.5% | 70.7% | 62.9% | 63.3% | 76.8% | 73.1% | 73.2% | 75.0% | 71.4% |
|  |  | 80.5% | 79.5% | 78.5% | 78.2% | 69.8% | 77.9% | 76.2% | 81.4% | 79.6% | 77.1% | 79.4% | 78.6% |
|  |  | 80.7% | 81.7% | 80.2% | 78.5% | 71.0% | 71.9% | 72.3% | 83.0% | 79.8% | 76.8% | 81.2% | 78.8% |
|  |  | 81.4% | 82.2% | 80.2% | 78.6% | 71.3% | 71.7% | 72.0% | 82.6% | 79.4% | 78.5% | 81.2% | 78.8% |
|  |  | 87.0% | 86.1% | 85.8% | 85.4% | 79.4% | 84.8% | 83.2% | 87.7% | 86.6% | 80.4% | 86.1% | 85.2% |
|  |  | 88.0% | 88.1% | 86.3% | 85.9% | 81.4% | 81.3% | 81.6% | 88.8% | 86.5% | 76.2% | 87.6% | 86.1% |
|  |  | 87.8% | 88.4% | 86.5% | 85.8% | 81.4% | 81.3% | 81.3% | 88.5% | 86.7% | 79.2% | 87.9% | 86.1% |
|  |  | 89.9% | 89.5% | 88.9% | 88.4% | 84.8% | 89.4% | 86.9% | 90.6% | 89.5% | 87.6% | 89.5% | 88.7% |
|  |  | 90.4% | 90.7% | 89.3% | 89.1% | 85.9% | 85.7% | 85.9% | 91.4% | 89.7% | 86.5% | 90.3% | 89.4% |
|  |  | 91.5% | 91.3% | 89.3% | 89.1% | 86.1% | 85.7% | 85.6% | 91.2% | 89.5% | 89.1% | 90.9% | 89.4% |
|  |  | 91.9% | 91.7% | 91.3% | 90.9% | 87.4% | 91.5% | 89.2% | 92.5% | 91.6% | 90.3% | 91.5% | 90.7% |
|  |  | 92.6% | 92.7% | 91.8% | 91.2% | 88.5% | 88.6% | 88.6% | 93.2% | 91.7% | 88.5% | 92.4% | 91.5% |
|  |  | 92.6% | 93.0% | 91.6% | 91.2% | 89.3% | 88.5% | 88.5% | 93.0% | 91.6% | 91.1% | 92.5% | 91.5% |
|  |  | 93.9% | 93.1% | 93.0% | 92.1% | 88.4% | 92.9% | 90.8% | 93.6% | 93.1% | 91.8% | 92.9% | 92.2% |
|  |  | 93.2% | 93.9% | 92.9% | 92.6% | 91.2% | 90.5% | 90.5% | 94.3% | 93.1% | 90.0% | 93.5% | 92.9% |
|  |  | 93.9% | 94.2% | 93.1% | 92.6% | 91.2% | 90.4% | 90.4% | 94.2% | 93.1% | 92.6% | 93.8% | 92.9% |
Notes: LNN = Linear Neural Networks; MLP = Multilayer Perceptron; RBF = Radial Basis Functions; EC = Evolutionary Computation; MDR = Multifactor Dimensionality Reduction; Bayes Nets = Bayesian Networks; SVM = Support Vector Machines; Decis. Tb. = Decision Tables; DTNB = Decision Table Naïve Bayes Hybrid Classifier; BFTree = Best-First decision Tree classifier; AdaBoost = Adaptive Boosting.
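The LNN column reports the percentage of correctly classified subjects for a linear neural network. As a rough illustration of what such a model involves, the sketch below trains a single linear unit with the delta rule on encoded SNP genotypes and reports the fraction of subjects classified correctly. The genotype strings, the minor-allele-count encoding, the learning rate, the number of epochs, and the 0.5 decision threshold are all assumptions made for illustration; none of them is taken from the study, and the toy subjects are invented.

```python
# Minimal sketch of a linear unit on SNP inputs (pure Python, no dependencies).
# ASSUMPTIONS: toy genotypes, additive encoding (count of an arbitrarily chosen
# minor allele "G"), and a 0.5 threshold. None of this comes from the paper.

def encode(genotypes, minor="G"):
    """Map genotype strings such as "AG" to minor-allele counts (0, 1 or 2)."""
    return [g.count(minor) for g in genotypes]

def linear_output(w, b, x):
    """Output of a single linear unit: bias plus weighted sum of the inputs."""
    return b + sum(wj * xj for wj, xj in zip(w, x))

def train_linear_unit(X, y, lr=0.05, epochs=500):
    """Fit weights with the delta (least-mean-squares) rule, one subject at a time."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - linear_output(w, b, xi)
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
            b += lr * err
    return w, b

def accuracy(w, b, X, y, threshold=0.5):
    """Fraction of subjects whose thresholded output matches the label."""
    hits = sum(
        1 for xi, yi in zip(X, y)
        if (1 if linear_output(w, b, xi) >= threshold else 0) == yi
    )
    return hits / len(y)

# Hypothetical subjects: three SNP genotypes each; label 1 = case, 0 = control.
subjects = [
    (["GG", "AA", "AG"], 1),
    (["GG", "AG", "AA"], 1),
    (["GG", "AA", "AA"], 1),
    (["AA", "AG", "AG"], 0),
    (["AA", "AA", "AG"], 0),
    (["AA", "GG", "AA"], 0),
]
X = [encode(genotypes) for genotypes, _ in subjects]
y = [label for _, label in subjects]

w, b = train_linear_unit(X, y)
print(f"correctly classified: {100 * accuracy(w, b, X, y):.1f}%")
```

The accuracy here is measured on the training subjects only, so it says nothing about generalization; a held-out or cross-validated estimate would be needed for a figure comparable to those in the table.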
Figure 2. Correctly classified subjects depending on the simulated negative data for both genes; the dataset labels represent the proportion between real subjects (positive and negative = case and control) and simulated negative subjects.
Figure 3. Area under the receiver operating characteristic curve (AUC-ROC) for LNN 40:152-1:1 (Model 1).
Figure 4. Area under the receiver operating characteristic curve (AUC-ROC) for LNN 2:8-1:1 (Model 2).
Figure 5. The general structure of an ANN for schizophrenia classification based on SNP inputs.
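Figures 3 and 4 report the area under the ROC curve for two LNN models. For readers unfamiliar with the metric, the sketch below computes AUC-ROC through its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The labels and scores are hypothetical and are not the model outputs from the study.

```python
# Minimal sketch of AUC-ROC via pairwise comparison (Mann-Whitney formulation).
# ASSUMPTION: the labels and scores below are invented for illustration only.

def auc_roc(labels, scores):
    """Return the area under the ROC curve for binary labels (1 = case, 0 = control)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs for three cases and three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(f"AUC-ROC = {auc_roc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to scores that carry no information about the class, and 1.0 to scores that separate cases from controls perfectly. The pairwise loop is quadratic in the number of subjects, which is acceptable for small cohorts; a rank-based computation would scale better.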