Eugene Lin, Chieh-Hsin Lin, Yi-Lun Lai, Chiung-Hsien Huang, Yu-Jhen Huang, Hsien-Yuan Lane.
Abstract
The D-amino acid oxidase activator (DAOA, also known as G72) gene is a strong schizophrenia susceptibility gene, and higher G72 protein levels have been reported in patients with schizophrenia. The current study aimed to differentiate patients with schizophrenia from healthy individuals using G72 single nucleotide polymorphisms (SNPs) and G72 protein levels, applying artificial intelligence and machine learning tools. A total of 149 subjects were recruited: 89 patients with schizophrenia and 60 healthy controls. Two G72 SNPs (rs1421292 and rs2391191) were genotyped, and G72 protein levels were measured in peripheral blood. We used three machine learning algorithms (logistic regression, naive Bayes, and C4.5 decision tree) to build the optimal predictive model for distinguishing schizophrenia patients from healthy controls. The naive Bayes model using two factors, G72 rs1421292 and G72 protein, appeared to be the best model for disease susceptibility (sensitivity = 0.7969, specificity = 0.9372, area under the receiver operating characteristic curve (AUC) = 0.9356). However, adding G72 rs1421292 increased the discriminative power only slightly over the naive Bayes model with G72 protein alone (sensitivity = 0.7941, specificity = 0.9503, AUC = 0.9324). Among the three models with G72 protein alone, naive Bayes had the best specificity (0.9503), while logistic regression was the most sensitive (0.8765). The findings remained similar after adjusting for age and gender. This study suggests that G72 protein alone, without the two G72 SNPs, may be sufficient to identify schizophrenia patients. We also recommend applying both naive Bayes and logistic regression models for the best specificity and sensitivity, respectively. Larger-scale studies are warranted to confirm the findings.
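The three performance measures quoted in the abstract can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not study data, and the function names are illustrative; AUC is computed here as the probability that a randomly chosen patient receives a higher score than a randomly chosen control, which is one standard way to define it.

```python
def sensitivity_specificity(labels, predictions):
    """labels and predictions use 1 = patient, 0 = healthy control."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Probability that a random patient outscores a random control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.2, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
print(sens, spec, area)
```

A classifier that labels every subject as a patient scores a sensitivity of 1.0 and a specificity of 0.0 under these definitions, which is the pattern the SNP-only models in the tables approach.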
Keywords: D-amino acid oxidase activator; G72; artificial intelligence; machine learning algorithm; schizophrenia; single nucleotide polymorphism
Year: 2018 PMID: 30459659 PMCID: PMC6232512 DOI: 10.3389/fpsyt.2018.00566
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
Demographic characteristics of schizophrenia patients and unmatched healthy individuals.
| Characteristic | Healthy controls | Schizophrenia patients | P value |
|---|---|---|---|
| N | 60 | 89 | |
| Gender | | | 0.825 |
| Male | 36 (61.9%) | 55 (70.4%) | |
| Female | 24 (38.1%) | 34 (29.6%) | |
| Age (year), mean (SD) | 32.8 ± 9.9 | 37.8 ± 10.5 | 0.004 |
| Education (year) | 15.1 ± 2.2 | 11.5 ± 2.0 | <0.0001 |
| Age at onset (year) | | 22.9 ± 6.1 | |
| Illness duration (months) | | 169.3 ± 109.3 | |
| PANSS total score | | 94.8 ± 18.6 | |
| G72 level (ng/μL) | 1.147 ± 0.574 | 4.057 ± 2.594 | <0.0001 |
PANSS: Positive and Negative Syndrome Scale.
Chi-square test for the categorical data; Student's t-test for continuous variables.
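As a worked illustration of the chi-square test named in the footnote, the sketch below applies a Pearson chi-square test to the gender counts in the table (36 male and 24 female controls, 55 male and 34 female patients). It assumes no continuity correction, which the source does not specify; under that assumption the result is consistent with the reported P value of 0.825. The function name is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: healthy controls, schizophrenia patients. Columns: male, female.
stat, p_value = chi_square_2x2(36, 24, 55, 34)
print(round(stat, 4), round(p_value, 3))
```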
Five naive Bayes models for differentiating schizophrenia patients from unmatched healthy individuals.
| Model | AUC | Sensitivity | Specificity | Number of factors |
|---|---|---|---|---|
| (1) Using G72 protein, rs1421292, rs2391191 | 0.9280 | 0.7945 | 0.9213 | 3 |
| (2) Using G72 protein, rs1421292 | 0.9356 | 0.7969 | 0.9372 | 2 |
| (3) Using G72 protein, rs2391191 | 0.9244 | 0.7924 | 0.9320 | 2 |
| (4) Using rs1421292, rs2391191 | 0.4612 | 0.9704 | 0.0070 | 2 |
| (5) Using G72 protein | 0.9324 | 0.7941 | 0.9503 | 1 |
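To give a feel for how a single-factor naive Bayes model such as model (5) can work, here is a minimal sketch that plugs the G72 summary statistics from the unmatched demographic table into a Gaussian naive Bayes rule. The Gaussian class-conditional density and the use of sample proportions as priors are assumptions made for illustration; the source does not describe the authors' implementation, and this sketch will not reproduce the cross-validated figures in the table.

```python
import math


def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


# G72 level (ng/uL) summary statistics from the unmatched demographic table.
CONTROL = {"n": 60, "mean": 1.147, "sd": 0.574}
PATIENT = {"n": 89, "mean": 4.057, "sd": 2.594}


def posterior_patient(g72_level):
    """P(patient | G72 level) under an assumed Gaussian naive Bayes model."""
    total = CONTROL["n"] + PATIENT["n"]
    joint_control = (CONTROL["n"] / total) * gaussian_pdf(
        g72_level, CONTROL["mean"], CONTROL["sd"]
    )
    joint_patient = (PATIENT["n"] / total) * gaussian_pdf(
        g72_level, PATIENT["mean"], PATIENT["sd"]
    )
    return joint_patient / (joint_control + joint_patient)


for level in (1.0, 2.5, 4.0):
    print(level, round(posterior_patient(level), 3))
```

Under these assumptions a level near the control mean yields a posterior well below 0.5, while a level near the patient mean yields a posterior close to 1.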
Five C4.5 decision tree models for differentiating schizophrenia patients from unmatched healthy individuals.
| Model | AUC | Sensitivity | Specificity | Number of factors |
|---|---|---|---|---|
| (6) Using G72 protein, rs1421292, rs2391191 | 0.8515 | 0.8236 | 0.8772 | 3 |
| (7) Using G72 protein, rs1421292 | 0.8525 | 0.8202 | 0.8843 | 2 |
| (8) Using G72 protein, rs2391191 | 0.8504 | 0.8275 | 0.8725 | 2 |
| (9) Using rs1421292, rs2391191 | 0.5000 | 1.0000 | 0.0000 | 2 |
| (10) Using G72 protein | 0.8506 | 0.8274 | 0.8725 | 1 |
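For a continuous predictor such as G72 protein level, a C4.5-style tree has to pick a cut-point. The following simplified sketch evaluates the midpoints between adjacent sorted values, keeps the one with the highest information gain, and reports its gain ratio. It omits pruning and the other refinements of the full algorithm, the data are hypothetical toy values, and the function names are illustrative.

```python
import math


def entropy(labels):
    n = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result


def best_threshold(values, labels):
    """Return (threshold, information gain, gain ratio) for the best split value <= threshold."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([y for _, y in pairs])
    best = (None, 0.0, 0.0)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut-point between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        # Split information penalises very unbalanced partitions.
        split_info = entropy(["left"] * len(left) + ["right"] * len(right))
        if gain > best[1]:
            best = (threshold, gain, gain / split_info)
    return best


# Hypothetical G72 levels, not study data.
values = [0.8, 1.0, 1.3, 1.6, 2.9, 3.5, 4.2, 6.0]
labels = ["control"] * 4 + ["patient"] * 4
threshold, gain, gain_ratio = best_threshold(values, labels)
print(threshold, gain, gain_ratio)
```

When no candidate split yields any gain, as can happen with weakly informative predictors, a tree of this kind collapses to a single leaf that predicts the majority class, which would be consistent with the sensitivity of 1.0000 and specificity of 0.0000 shown for model (9).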
Five logistic regression models for differentiating schizophrenia patients from unmatched healthy individuals.
| Model | AUC | Sensitivity | Specificity | Number of factors |
|---|---|---|---|---|
| (11) Using G72 protein, rs1421292, rs2391191 | 0.9200 | 0.8567 | 0.8400 | 3 |
| (12) Using G72 protein, rs1421292 | 0.9272 | 0.8576 | 0.8923 | 2 |
| (13) Using G72 protein, rs2391191 | 0.9107 | 0.8713 | 0.8607 | 2 |
| (14) Using rs1421292, rs2391191 | 0.4533 | 0.9619 | 0.0088 | 2 |
| (15) Using G72 protein | 0.9175 | 0.8765 | 0.8577 | 1 |
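A one-predictor logistic regression like model (15) can be sketched in a few lines. The example below fits the model by plain gradient descent on hypothetical G72 values (not study data); the learning rate, iteration count, and function names are illustrative assumptions, and the authors' actual fitting procedure is not described in the source.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, iterations=5000):
    """Fit P(patient | x) = sigmoid(w * x + b) by gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical G72 levels (not study data): 0 = control, 1 = patient.
xs = [0.6, 0.9, 1.1, 1.4, 2.1, 1.2, 2.8, 3.6, 4.5, 6.3]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
prob_low = sigmoid(w * 0.5 + b)   # predicted probability at a low G72 level
prob_high = sigmoid(w * 5.0 + b)  # predicted probability at a high G72 level
print(round(w, 3), round(b, 3), round(prob_low, 3), round(prob_high, 3))
```

Classifying a subject as a patient when the predicted probability exceeds a cut-off turns the fitted model into a classifier; lowering the cut-off raises sensitivity at the cost of specificity.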
Relationship between G72 genotypes and G72 protein level in schizophrenia patients and unmatched healthy individuals.
| rs1421292 | AA | TT | TA | TT + TA | P value (a) | P value (b) | P value (c) |
|---|---|---|---|---|---|---|---|
| N | 27 | 57 | 65 | 122 | | | |
| G72 protein level, mean (SD) | 2.137 ± 1.819 | 3.219 ± 3.032 | 2.903 ± 2.139 | 3.051 ± 2.588 | 0.09 | 0.11 | 0.084 |

| rs2391191 | AA | GG | AG | GG + AG | P value (d) | P value (e) | P value (f) |
|---|---|---|---|---|---|---|---|
| N | 63 | 22 | 64 | 86 | | | |
| G72 protein level, mean (SD) | 2.586 ± 2.083 | 3.570 ± 3.356 | 2.943 ± 2.499 | 3.104 ± 2.736 | 0.11 | 0.38 | 0.21 |

(a) P value for comparing the subjects of the AA genotype with those of the TT genotype.
(b) P value for comparing the subjects of the AA genotype with those of the TA genotype.
(c) P value for comparing the subjects of the AA genotype with those of the TT or TA genotype.
(d) P value for comparing the subjects of the AA genotype with those of the GG genotype.
(e) P value for comparing the subjects of the AA genotype with those of the AG genotype.
(f) P value for comparing the subjects of the AA genotype with those of the GG or AG genotype.
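The genotype comparisons can be checked approximately from the summary statistics alone. The sketch below runs a pooled-variance Student's t-test on the AA versus TT comparison (2.137 ± 1.819 with N = 27 against 3.219 ± 3.032 with N = 57), assuming those are the two groups behind the first reported P value. The two-sided P value is obtained by numerically integrating the t density with Simpson's rule, since the Python standard library has no t distribution; the result is close to the reported 0.09. Function names are illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic (equal-variance form) and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df


def t_pdf(x, df):
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """1 - 2 * integral of the t density from 0 to |t|, by Simpson's rule (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * total * h / 3


# AA genotype versus TT genotype, values taken from the table above.
t_stat, df = pooled_t(2.137, 1.819, 27, 3.219, 3.032, 57)
p_value = two_sided_p(t_stat, df)
print(round(t_stat, 3), df, round(p_value, 3))
```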
Demographic characteristics of schizophrenia patients and matched healthy individuals.
| Characteristic | Healthy controls | Schizophrenia patients | P value |
|---|---|---|---|
| N | 60 | 66 | |
| Gender | | | 0.783 |
| Male | 36 (61.9%) | 38 (57.6%) | |
| Female | 24 (38.1%) | 28 (42.4%) | |
| Age (year), mean (SD) | 32.8 ± 9.9 | 33.2 ± 7.2 | 0.820 |
| Education (year) | 15.1 ± 2.2 | 11.6 ± 2.0 | <0.0001 |
| Age at onset (year) | | 21.3 ± 5.5 | |
| Illness duration (months) | | 141.1 ± 95.7 | |
| PANSS total score | | 95.9 ± 19.3 | |
| G72 level (ng/μL) | 1.147 ± 0.574 | 4.188 ± 2.772 | <0.0001 |
PANSS: Positive and Negative Syndrome Scale.
Chi-square test for the categorical data; Student's t-test for continuous variables.
Naive Bayes, C4.5 decision tree, and logistic regression models for differentiating schizophrenia patients from matched healthy individuals using G72 protein.
| Model | AUC | Sensitivity | Specificity | Number of factors |
|---|---|---|---|---|
| (16) Naive Bayes with G72 protein | 0.9396 | 0.7914 | 0.9660 | 1 |
| (17) C4.5 decision tree with G72 protein | 0.8510 | 0.7871 | 0.9152 | 1 |
| (18) Logistic regression with G72 protein | 0.9099 | 0.8483 | 0.9072 | 1 |