Tianyi Zhao, Yang Hu, Tianyi Zang, Liang Cheng.
Abstract
BACKGROUND: Alzheimer's disease (AD) imposes a heavy burden on society and on families. Diagnosing AD early and discovering new drug targets are therefore crucial, and both can be advanced by identifying AD-related proteins. Because biological experiments are time-consuming and costly, researchers have turned to developing more advanced algorithms to identify AD-related proteins.
Keywords: Alzheimer’s disease; Gradient descent; Logistic regression; Proteins; Similarity of diseases
Year: 2019 PMID: 31760934 PMCID: PMC6876080 DOI: 10.1186/s12859-019-3124-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Probability distribution of disease similarity
Similarities between AD and other diseases by five different methods
| DOID | SemFunSim | Wang | Lin | PSB | Resnik | Total |
|---|---|---|---|---|---|---|
| 0050784 | 0.02 | 0.48 | 0.40 | 0.06 | 2.55 | 3.52 |
| 0060368 | 0.01 | 0.48 | 0.42 | 0.06 | 2.55 | 3.52 |
| 0050765 | 0.00 | 0.63 | 0.33 | 0.00 | 2.55 | 3.52 |
| 14784 | 0.00 | 0.63 | 0.33 | 0.00 | 2.55 | 3.52 |
| 1440 | 0.02 | 0.48 | 0.40 | 0.08 | 2.55 | 3.52 |
| 12705 | 0.01 | 0.48 | 0.39 | 0.10 | 2.55 | 3.53 |
| 936 | 0.23 | 0.53 | 0.62 | 0.09 | 2.09 | 3.56 |
| 13548 | 0.00 | 0.63 | 0.35 | 0.02 | 2.55 | 3.57 |
| 3981 | 0.00 | 0.63 | 0.35 | 0.03 | 2.55 | 3.57 |
| 4873 | 0.01 | 0.48 | 0.39 | 0.14 | 2.55 | 3.57 |
| 9277 | 0.01 | 0.63 | 0.38 | 0.00 | 2.55 | 3.58 |
| 0060264 | 0.01 | 0.63 | 0.39 | 0.00 | 2.55 | 3.59 |
| 12704 | 0.03 | 0.48 | 0.44 | 0.08 | 2.55 | 3.59 |
| 1441 | 0.04 | 0.54 | 0.48 | 0.00 | 2.55 | 3.63 |
| 12377 | 0.04 | 0.54 | 0.47 | 0.02 | 2.55 | 3.63 |
| 0050950 | 0.05 | 0.54 | 0.48 | 0.00 | 2.55 | 3.63 |
| 14332 | 0.00 | 0.54 | 0.30 | 0.28 | 2.55 | 3.68 |
| 4752 | 0.03 | 0.54 | 0.44 | 0.12 | 2.55 | 3.69 |
| 2378 | 0.04 | 0.48 | 0.45 | 0.22 | 2.55 | 3.75 |
| 0050968 | 0.00 | 0.48 | 0.30 | 0.44 | 2.55 | 3.77 |
| 0050951 | 0.08 | 0.63 | 0.53 | 0.00 | 2.55 | 3.80 |
| 12217 | 0.06 | 0.44 | 0.47 | 0.31 | 2.55 | 3.84 |
| 230 | 0.11 | 0.54 | 0.55 | 0.15 | 2.55 | 3.91 |
| 12858 | 0.09 | 0.63 | 0.52 | 0.12 | 2.55 | 3.91 |
| 331 | 0.46 | 0.65 | 0.73 | 0.02 | 2.09 | 3.95 |
| 332 | 0.14 | 0.54 | 0.58 | 0.13 | 2.55 | 3.95 |
| 11870 | 0.03 | 0.63 | 0.41 | 0.38 | 2.55 | 4.00 |
| 0050890 | 0.21 | 0.63 | 0.63 | 0.00 | 2.55 | 4.03 |
| 3213 | 0.19 | 0.63 | 0.65 | 0.10 | 2.55 | 4.12 |
| 2377 | 0.19 | 0.54 | 0.64 | 0.19 | 2.55 | 4.12 |
| 231 | 0.15 | 0.63 | 0.60 | 0.19 | 2.55 | 4.13 |
| 14330 | 0.19 | 0.54 | 0.62 | 0.24 | 2.55 | 4.16 |
| 1289 | 0.57 | 0.75 | 0.83 | 0.27 | 2.55 | 4.98 |
| 680 | 1.00 | 0.87 | 1.00 | 0.00 | 3.60 | 6.47 |
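The Total column appears to be the sum of the five method scores, up to rounding of the printed values. This reading is an inference from the numbers in the table, not something the record states. A minimal sketch that checks it on four rows copied from the table and ranks them by the summed score (the names `rows`, `total_similarity`, and `ranked` are illustrative):

```python
# Rows copied from the table: DOID -> (SemFunSim, Wang, Lin, PSB, Resnik, Total)
rows = {
    "936": (0.23, 0.53, 0.62, 0.09, 2.09, 3.56),
    "331": (0.46, 0.65, 0.73, 0.02, 2.09, 3.95),
    "1289": (0.57, 0.75, 0.83, 0.27, 2.55, 4.98),
    "680": (1.00, 0.87, 1.00, 0.00, 3.60, 6.47),
}


def total_similarity(scores):
    """Sum the five per-method similarity scores (assumed meaning of Total)."""
    return sum(scores)


# Printed values are rounded to two decimals, so allow a small tolerance.
for doid, (*scores, reported) in rows.items():
    diff = abs(total_similarity(scores) - reported)
    print(doid, round(total_similarity(scores), 2), reported, diff <= 0.03)

# Rank diseases from most to least similar by the summed score.
ranked = sorted(rows, key=lambda d: total_similarity(rows[d][:5]), reverse=True)
```

A tolerance is used because each printed score is rounded, so a recomputed sum can differ from the printed Total by about 0.01.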
Fig. 2 Number of proteins corresponding to each disease
Fig. 3 The work flow of selecting features and building the model
Work flow of LR
| Step | Description |
|---|---|
| Step 1 | Construct a prediction function |
| Step 2 | Construct a loss function, where y is the true similarity and m is the number of samples |
| Step 3 | Apply Newton's method to find the minimum |
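The three LR steps can be illustrated with a small pure-Python sketch. The equations themselves are not preserved in this record, so the sketch assumes the standard logistic regression forms: a sigmoid prediction function, a mean cross-entropy loss over m samples, and the Newton update that subtracts the inverse Hessian times the gradient. The function names, the toy data, and the small `ridge` damping term are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(theta, x):
    # Step 1 (assumed form): h(x) = sigmoid(theta . x)
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def loss(theta, X, y):
    # Step 2 (assumed form): mean cross-entropy over the m samples.
    m, eps, total = len(X), 1e-12, 0.0
    for xi, yi in zip(X, y):
        h = min(max(predict(theta, xi), eps), 1.0 - eps)
        total -= yi * math.log(h) + (1.0 - yi) * math.log(1.0 - h)
    return total / m


def solve(A, b):
    # Solve A s = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    s = [0.0] * n
    for r in range(n - 1, -1, -1):
        s[r] = (M[r][n] - sum(M[r][k] * s[k] for k in range(r + 1, n))) / M[r][r]
    return s


def newton_fit(X, y, iters=10, ridge=1e-6):
    # Step 3: Newton updates theta <- theta - H^{-1} g.
    # `ridge` is a tiny diagonal damping added only for numerical stability.
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        g = [0.0] * n
        H = [[ridge if i == j else 0.0 for j in range(n)] for i in range(n)]
        for xi, yi in zip(X, y):
            h = predict(theta, xi)
            for i in range(n):
                g[i] += (h - yi) * xi[i] / m
                for j in range(n):
                    H[i][j] += h * (1.0 - h) * xi[i] * xi[j] / m
        theta = [t - s for t, s in zip(theta, solve(H, g))]
    return theta


# Toy data (hypothetical): a bias column plus one feature, with soft labels.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.1, 0.3, 0.7, 0.9]
theta = newton_fit(X, y)
```

Soft labels in (0, 1) are used here because the table describes y as a similarity. They also keep the minimum finite, which perfectly separable 0/1 labels would not.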
Work flow of GD
| Step | Description |
|---|---|
| Step 1 | Find the descent direction |
| Step 2 | Move x along the descent direction, where k is the descent rate |
| Step 3 | Repeat step 2 until the stopping condition is satisfied |
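The GD steps can be sketched in a few lines of Python. The stopping equation is not preserved in this record, so the criterion used here (the largest coordinate change falling below a tolerance) is an assumption, as are the function name `gradient_descent`, its parameters, and the toy objective.

```python
def gradient_descent(grad, x0, k=0.1, tol=1e-8, max_iter=10000):
    """Minimize a function given its gradient `grad`, starting from `x0`.

    k is the descent rate. The stopping rule is an assumed one:
    stop when no coordinate moves by more than `tol` in one step.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)  # Step 1: the descent direction is the negative gradient
        x_new = [xi - k * gi for xi, gi in zip(x, g)]  # Step 2: move x
        # Step 3: repeat until the (assumed) stopping condition holds
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


# Toy objective (hypothetical): f(x) = (x0 - 3)^2 + (x1 + 1)^2
def grad_f(x):
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)]


x_min = gradient_descent(grad_f, [0.0, 0.0])
```

On this quadratic the iterates contract toward (3, -1) by a constant factor each step, so a fixed descent rate is sufficient. Other objectives may need a smaller k or a line search.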
Fig. 4 The work flow of selecting features and building models
Fig. 5 Times that proteins are thought to be related to AD
Fig. 6 The proportion of known AD-related proteins to novel AD-related proteins