Salma Jamal1,2, Sukriti Goyal1,2, Asheesh Shanker3, Abhinav Grover4.
Abstract
BACKGROUND: Alzheimer's disease (AD) is a complex, progressive neurodegenerative disorder commonly characterized by short-term memory loss. At present, no therapeutic treatment can completely cure the disease. The cause of Alzheimer's is still unclear; however, genetic factors are among the major contributors to AD pathogenesis, and around 70% of the disease risk is assumed to be attributable to the large number of genes involved. Although genetic association studies have revealed a number of potential AD susceptibility genes, there remains a need to identify as-yet-unknown AD-associated genes and therapeutic targets, both to better understand the disease-causing mechanisms of Alzheimer's and to develop effective AD therapeutics.
Keywords: Alzheimer-associated genes; Functional annotations; Interaction networks; Machine learning; Molecular docking; Molecular dynamics; Sequence features
Year: 2016 PMID: 27756223 PMCID: PMC5070370 DOI: 10.1186/s12864-016-3108-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Medians of the network features, with p-values for the comparison between the Alz gene and NonAlz gene sets
| Network feature | Alz genes | NonAlz genes | p-value |
|---|---|---|---|
| Average shortest path length | 4.10 | 4.19 | 6.79E-05 |
| Closeness centrality | 0.24 | 0.23 | 1.88E-04 |
| Clustering coefficient | 0.03 | 0.06 | 1.91E-08 |
| Degree | 19 | 13 | 2.29E-05 |
| Eccentricity | 18 | 18 | 0 |
| Neighborhood connectivity | 88.4 | 108.7 | 1.18E-05 |
| Topological coefficient | 0.07 | 0.08 | 9.17E-02 |
| Radiality | 0.87 | 0.86 | 6.37E-05 |
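As a minimal sketch of how several of these per-node network features can be computed, the example below uses a small hypothetical undirected graph (the node labels `A` to `E` are placeholders, not genes from the study). It assumes closeness centrality is taken as the reciprocal of the average shortest path length, which is consistent with the medians in the table (1/4.10 ≈ 0.24); the tool the authors actually used is not stated in this excerpt.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def node_features(adj, node):
    """Degree, clustering coefficient, path-based features for one node."""
    neighbors = adj[node]
    degree = len(neighbors)

    # Clustering coefficient: fraction of neighbor pairs that are themselves linked.
    linked_pairs = sum(
        1 for a in neighbors for b in neighbors if a < b and b in adj[a]
    )
    possible_pairs = degree * (degree - 1) / 2
    clustering = linked_pairs / possible_pairs if possible_pairs else 0.0

    # Path-based features over all nodes reachable from this one.
    lengths = [d for n, d in bfs_distances(adj, node).items() if n != node]
    avg_path = sum(lengths) / len(lengths) if lengths else 0.0
    closeness = 1.0 / avg_path if avg_path else 0.0  # assumed definition
    eccentricity = max(lengths) if lengths else 0

    return {
        "degree": degree,
        "clustering_coefficient": clustering,
        "average_shortest_path_length": avg_path,
        "closeness_centrality": closeness,
        "eccentricity": eccentricity,
    }


# Hypothetical toy network, stored as an adjacency dictionary.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

features_a = node_features(toy_network, "A")
print(features_a)
```

Computing such features for every gene in an interaction network and comparing their distributions between two gene sets would yield medians of the kind tabulated above.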
Medians of the sequence features, with p-values for the comparison between the Alz protein and NonAlz protein sets
| Sequence feature | Alz genes | NonAlz genes | p-value |
|---|---|---|---|
| Molecular weight | 54349.54 | 49547.60 | 1.61E-02 |
| Residues | 491 | 443 | 1.49E-02 |
| Average residue weight | 111.83 | 111.90 | 3.09E-01 |
| Charge | 1 | 4 | 1.64E-07 |
| Isoelectric Point | 6.60 | 7.22 | 3.06E-08 |
| A280 Molar Extinction Coefficients | 50880 | 44380 | 7.66E-05 |
| A = Ala | 6.81 | 6.85 | 7.98E-01 |
| F = Phe | 3.77 | 3.56 | 1.48E-02 |
| L = Leu | 9.38 | 9.81 | 2.01E-02 |
| N = Asn | 3.78 | 3.46 | 1.22E-04 |
| P = Pro | 5.33 | 5.52 | 5.42E-02 |
| R = Arg | 5.09 | 5.55 | 4.89E-06 |
| S = Ser | 7.53 | 7.59 | 2.97E-01 |
| T = Thr | 5.31 | 5.04 | 6.63E-04 |
| Aliphatic | 27.7 | 27.6 | 6.34E-01 |
| Polar | 47.0 | 47.2 | 5.28E-01 |
| Non-polar | 52.9 | 52.7 | 5.28E-01 |
| Small | 50 | 49.3 | 3.80E-02 |
| Basic | 13.46 | 13.99 | 1.82E-04 |
| Aromatic | 10.63 | 10.15 | 4.97E-02 |
| Acidic | 11.94 | 11.73 | 3.64E-02 |
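The composition-type features in this table (per-residue percentages and residue-class percentages) can be illustrated with a short sketch. The residue-class groupings below are an assumption modeled on commonly used pepstats-style classes; the source does not state which definitions were applied, and the sequence `MARKDELL` is a made-up example, not a protein from the study.

```python
from collections import Counter

# Assumed residue classes (standard 20 amino acids only); not taken from the source.
RESIDUE_CLASSES = {
    "aliphatic": set("AILV"),
    "aromatic": set("FHWY"),
    "basic": set("HKR"),
    "acidic": set("DE"),
}


def composition_percent(sequence):
    """Percentage of each residue type in a protein sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    return {residue: 100.0 * n / len(sequence) for residue, n in counts.items()}


def class_percent(sequence, residue_class):
    """Percentage of residues that fall in one of the assumed classes."""
    sequence = sequence.upper()
    members = RESIDUE_CLASSES[residue_class]
    hits = sum(1 for residue in sequence if residue in members)
    return 100.0 * hits / len(sequence)


toy_sequence = "MARKDELL"  # hypothetical 8-residue sequence

print("Residues:", len(toy_sequence))
print("L = Leu:", composition_percent(toy_sequence)["L"])
for name in RESIDUE_CLASSES:
    print(name, class_percent(toy_sequence, name))
```

Features such as molecular weight, charge and isoelectric point need residue mass and pKa tables and are deliberately left out of this sketch.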
Selected features obtained after applying feature selection techniques, grouped by feature category
| Network features | Sequence features | Functional features |
|---|---|---|
| Clustering Coefficient | Charge | GO:0006916 ~ anti-apoptosis |
| Degree | Isoelectric Point | GO:0010942 ~ positive regulation of cell death |
| Average Shortest Path Length | R = Arg | GO:0043068 ~ positive regulation of programmed cell death |
| Closeness Centrality | Acidic | GO:0043066 ~ negative regulation of apoptosis |
| Neighborhood Connectivity | | GO:0009725 ~ response to hormone stimulus |
| | | GO:0009719 ~ response to endogenous stimulus |
| | | GO:0043005 ~ neuron projection |
| | | GO:0010941 ~ regulation of cell death |
| | | GO:0010033 ~ response to organic substance |
| | | GO:0032268 ~ regulation of cellular protein metabolic process |
| | | GO:0019899 ~ enzyme binding |
| | | Mutagenesis site |
| | | GO:0044093 ~ positive regulation of molecular function |
| | | GO:0008219 ~ cell death |
| | | Transmembrane protein |
| | | Lipoprotein |
| | | Active site: Proton acceptor |
| | | GO:0016023 ~ cytoplasmic membrane-bounded vesicle |
| | | GO:0042802 ~ identical protein binding |
| | | GO:0031982 ~ vesicle |
| | | Disease mutation |
| | | GO:0042127 ~ regulation of cell proliferation |
| | | GO:0000267 ~ cell fraction |
| | | GO:0005624 ~ membrane fraction |
Confusion matrix: predictions by the cost-sensitive classifier algorithms on the Entrez Gene dataset
| Classifier algorithms | True positives (TP) | True negatives (TN) | False positives (FP) | False negatives (FN) |
|---|---|---|---|---|
| Bayes Net | 47 | 2110 | 574 | 24 |
| Decision Table | 19 | 2032 | 652 | 52 |
| DTNB | 21 | 2133 | 551 | 50 |
| Functional Tree | 46 | 2004 | 680 | 25 |
| J48 | 44 | 2117 | 567 | 27 |
| Logistic Regression | 49 | 2148 | 536 | 22 |
| LWL (J48 + KNN) | 48 | 2111 | 573 | 23 |
| Naive Bayes | 51 | 2151 | 533 | 20 |
| NB Tree | 35 | 2070 | 614 | 36 |
| Random Forest | 42 | 2158 | 526 | 29 |
| SVM | 56 | 2058 | 626 | 15 |
Performance of the cost-sensitive classifier algorithms on the Entrez Gene dataset
| Classifier algorithms | TP rate/Recall | FP rate | Accuracy | Precision | F-measure | MCC |
|---|---|---|---|---|---|---|
| Bayes Net | 0.662 | 0.214 | 0.782 | 0.076 | 0.136 | 0.169 |
| Decision Table | 0.268 | 0.243 | 0.744 | 0.028 | 0.051 | 0.009 |
| DTNB | 0.296 | 0.205 | 0.781 | 0.037 | 0.065 | 0.035 |
| Functional Tree | 0.648 | 0.253 | 0.744 | 0.063 | 0.115 | 0.141 |
| J48 | 0.620 | 0.211 | 0.784 | 0.072 | 0.129 | 0.155 |
| Logistic Regression | | 0.20 | | | | |
| LWL (J48 + KNN) | 0.676 | 0.213 | 0.783 | 0.077 | 0.139 | 0.175 |
| Naive Bayes | 0.718 | 0.199 | | | | |
| NB Tree | 0.493 | 0.229 | 0.764 | 0.054 | 0.097 | 0.098 |
| Random Forest | 0.592 | 0.196 | | 0.074 | 0.131 | 0.154 |
| SVM | | 0.233 | 0.767 | | | |
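The performance measures in this table follow directly from the confusion-matrix counts (TP, TN, FP, FN) reported for each classifier. The sketch below uses the standard textbook definitions; the function name `classifier_metrics` is illustrative, and it assumes all denominators are non-zero, which holds for the counts in the confusion matrix.

```python
import math


def classifier_metrics(tp, tn, fp, fn):
    """Derive summary metrics from binary confusion-matrix counts.

    Assumes every denominator is non-zero (no empty class, no empty prediction set).
    """
    recall = tp / (tp + fn)                      # TP rate
    fp_rate = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "recall": recall,
        "fp_rate": fp_rate,
        "accuracy": accuracy,
        "precision": precision,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Counts for Bayes Net, taken from the confusion matrix.
bayes_net = classifier_metrics(tp=47, tn=2110, fp=574, fn=24)
for name, value in bayes_net.items():
    print(f"{name}: {value:.3f}")
```

For the Bayes Net counts, the computed values agree with the tabulated row to within a unit in the third decimal place. The very low precision alongside moderate recall reflects the class imbalance: 71 positive genes against 2684 negatives.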
List of the candidate genes predicted to be Alzheimer’s associated by all the classifier algorithms
| Entrez ID | Official gene symbol | Official gene name |
|---|---|---|
| 999 | CDH1 | Cadherin 1, type 1 |
| 22900 | CARD8 | Caspase recruitment domain family, member 8 |
| 2155 | F7 | Coagulation factor VII (serum prothrombin conversion accelerator) |
| 6453 | ITSN1 | Intersectin 1 (SH3 domain protein) |
| 3717 | JAK2 | Janus kinase 2 |
| 4792 | NFKBIA | Nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, alpha |
| 5336 | PLCG2 | Phospholipase C, gamma 2 (phosphatidylinositol-specific) |
| 5925 | RB1 | Retinoblastoma 1 |
| 387 | RHOA | Ras homolog family member A |
| 11035 | RIPK3 | Receptor-interacting serine-threonine kinase 3 |
| 6776 | STAT5A | Signal transducer and activator of transcription 5A |
| 203068 | TUBB | Tubulin, beta class I |
| 7414 | VCL | Vinculin |
Fig. 1 Interaction networks between the already established Alzheimer genes and the 13 novel genes predicted in the present study. (a) CDH1 (b) CARD8 (c) F7 (d) ITSN1 (e) JAK2 (f) STAT5 (g) NFKBIA (h) PLCG2 (i) Rb1 (j) RHOA (k) RIPK3 (l) TUBB (m) VCL
Docking scores and MMGBSA energy values for the top scoring compounds against seven novel candidate Alz-associated genes
| Candidate Alzheimer target | Docked compound | Glide score (kcal/mol) | ΔG (binding) (kcal/mol) |
|---|---|---|---|
| Cadherin 1 | AL-108 | –8.34 | –58.92 |
| Caspase recruitment domain family, member 8 | AL-108 | –6.90 | –36.50 |
| Janus kinase 2 | AL-108 | –10.87 | –74.34 |
| Nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, alpha | PPI-1019 | –6.41 | –13.66 |
| Retinoblastoma 1 | AL-108 | –7.07 | –12.09 |
| Ras homolog family member A | AL-108 | –8.68 | –49.84 |
| Receptor-interacting serine-threonine kinase 3 | AL-108 | –8.99 | –77.07 |
Fig. 2 RMSD plot of RIPK3, RhoA and NFKBIA
Fig. 3 RMSD plot of JAK2, Rb1, Cadherin and Card8