Xianchun Zou, Guijun Wang, Guoxian Yu.
Abstract
Accurately annotating the biological functions of proteins is one of the key tasks in the postgenomic era. Many machine learning based methods have been applied to predict functional annotations of proteins, but deep learning techniques have rarely been used for this task. Deep learning has recently been applied successfully to a wide range of problems, such as video, image, and natural language processing. Inspired by these successes, we investigate deep restricted Boltzmann machines (DRBM), a representative deep learning technique, to predict the missing functional annotations of partially annotated proteins. Experimental results on Homo sapiens, Saccharomyces cerevisiae, Mus musculus, and Drosophila show that DRBM achieves better performance than other related methods across different evaluation metrics, and that it also runs faster than the compared methods.
Year: 2017 PMID: 28744460 PMCID: PMC5506480 DOI: 10.1155/2017/1729301
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1: An RBM with binary hidden units (h) representing latent features and visible units (v) encoding observed data.
Figure 2: Network architecture of DRBM.
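The RBM in Figure 1 can be illustrated with a minimal NumPy sketch of a binary RBM trained with one step of contrastive divergence (CD-1). This is the textbook formulation, not necessarily the authors' configuration: the class name `RBM`, the layer sizes, the learning rate, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary RBM with visible units v and hidden units h (hypothetical sketch)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # P(h_j = 1 | v)
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        # P(v_i = 1 | h)
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        # Mean squared reconstruction error, useful only as a rough progress signal.
        return float(np.mean((v0 - pv1) ** 2))

# Toy binary "annotation" matrix: 20 proteins x 6 terms (made-up data).
rng = np.random.default_rng(1)
V = (rng.random((20, 6)) < 0.3).astype(float)

rbm = RBM(n_visible=6, n_hidden=4)
errors = [rbm.cd1_step(V) for _ in range(50)]
```

A deep variant would stack several such layers, feeding the hidden probabilities of one RBM to the next as its visible input; how the layers are sized and fine-tuned in DRBM is not stated in this record.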
Statistics of experimental datasets. The data in the third column (N) is the number of proteins annotated with at least 1 term for a particular subontology. C is the number of involved GO terms; Avg ± Std is the average number of annotations of a protein and its standard deviation.
| Dataset | Subontology | N | C | Avg ± Std |
|---|---|---|---|---|
| | BP | 11628 | 12514 | 60.24 ± 60.83 |
| | CC | 12523 | 1574 | 20.17 ± 12.28 |
| | MF | 11628 | 3724 | 10.97 ± 8.81 |
| | | | | |
| | BP | 10990 | 13500 | 56.26 ± 61.08 |
| | CC | 10549 | 1592 | 15.73 ± 10.25 |
| | MF | 9906 | 3775 | 9.59 ± 7.30 |
| | | | | |
| | BP | 4671 | 4909 | 44.13 ± 31.41 |
| | CC | 4128 | 970 | 20.67 ± 10.30 |
| | MF | 4291 | 2203 | 9.60 ± 6.60 |
| | | | | |
| | BP | 6188 | 6645 | 48.53 ± 48.97 |
| | CC | 4851 | 1097 | 15.10 ± 10.27 |
| | MF | 4489 | 2255 | 9.05 ± 5.75 |
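The quantities N, C, and Avg ± Std can be computed from a binary protein-by-term annotation matrix, as in the sketch below. The function name `annotation_stats` and the toy matrix are hypothetical. Counting a GO term as "involved" when it annotates at least one retained protein, and using the population standard deviation, are assumptions; the source does not specify either.

```python
import numpy as np

def annotation_stats(Y):
    """Return (N, C, average, std) for a binary protein x term matrix Y."""
    Y = np.asarray(Y)
    # N: proteins annotated with at least one term.
    annotated = Y[Y.sum(axis=1) > 0]
    n = annotated.shape[0]
    # C: terms that annotate at least one retained protein (assumption).
    c = int((annotated.sum(axis=0) > 0).sum())
    # Number of annotations per protein, then mean and population std (assumption).
    per_protein = annotated.sum(axis=1)
    return n, c, float(per_protein.mean()), float(per_protein.std())

# Toy matrix: 4 proteins x 4 terms; the third protein has no annotation.
Y = [[1, 1, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0],
     [1, 1, 1, 0]]
n, c, avg, std = annotation_stats(Y)
print(f"N={n}, C={c}, Avg ± Std = {avg:.2f} ± {std:.2f}")
```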
Experimental results on Homo sapiens.
| Subontology | Method | MacroAvg | AvgROC | 1 − RankLoss | |
|---|---|---|---|---|---|
| BP | NtN | 0.0107 | 0.7498 | 0.6920 | 0.1712 |
| | dRW | 0.6902 | 0.9044 | 0.8737 | |
| | SVD | 0.7313 | 0.9053 | 0.9349 | 0.9206 |
| | AE | 0.5341 | 0.9049 | 0.8495 | 0.5617 |
| | DRBM | | | | 0.9217 |
| CC | NtN | 0.0036 | 0.6569 | 0.6641 | 0.1063 |
| | dRW | 0.6806 | 0.8999 | 0.9186 | |
| | SVD | 0.7139 | 0.8942 | 0.9592 | 0.9157 |
| | AE | | 0.8932 | 0.9629 | 0.8819 |
| | DRBM | 0.7982 | | | 0.9437 |
| MF | NtN | 0.3891 | 0.7767 | 0.8450 | 0.0121 |
| | dRW | 0.7909 | | 0.9208 | |
| | SVD | 0.8022 | 0.8022 | 0.9526 | 0.9480 |
| | AE | 0.7683 | 0.9047 | 0.8186 | 0.5604 |
| | DRBM | | 0.9085 | | 0.9470 |
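The 1 − RankLoss column can be read against the standard multi-label ranking loss: for each protein, the fraction of (relevant term, irrelevant term) pairs in which the irrelevant term is scored at least as high as the relevant one, averaged over proteins. The sketch below follows that common definition. The function name `one_minus_rank_loss`, the treatment of ties as errors, and the toy scores are assumptions; the paper's exact formula is not given in this record.

```python
import numpy as np

def one_minus_rank_loss(Y_true, scores):
    """1 - ranking loss for binary labels Y_true and real-valued scores."""
    Y_true = np.asarray(Y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    losses = []
    for y, s in zip(Y_true, scores):
        pos, neg = s[y], s[~y]
        if pos.size == 0 or neg.size == 0:
            continue  # the loss is undefined without both kinds of terms
        # Pairs where an irrelevant term is ranked at or above a relevant one.
        misordered = (pos[:, None] <= neg[None, :]).sum()
        losses.append(misordered / (pos.size * neg.size))
    if not losses:
        raise ValueError("no protein has both relevant and irrelevant terms")
    return 1.0 - float(np.mean(losses))

# Toy example: the first protein is ranked perfectly, the second is fully misordered.
Y = [[1, 0, 0],
     [0, 1, 1]]
S = [[0.9, 0.2, 0.1],
     [0.8, 0.7, 0.3]]
result = one_minus_rank_loss(Y, S)
print(result)  # 0.5
```

Higher values are better under this convention, which matches the way the column is reported alongside the other metrics.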
Experimental results on Mus musculus.
| Subontology | Method | MacroAvg | AvgROC | 1 − RankLoss | |
|---|---|---|---|---|---|
| BP | NtN | 0.0154 | 0.6950 | 0.7055 | 0.1542 |
| | dRW | 0.5666 | 0.8155 | 0.8296 | |
| | SVD | 0.6169 | 0.8220 | 0.9130 | 0.8914 |
| | AE | 0.4573 | 0.8139 | 0.8219 | 0.5340 |
| | DRBM | | | | 0.8962 |
| CC | NtN | 0.0055 | 0.6244 | 0.6436 | 0.1062 |
| | dRW | 0.4913 | 0.8001 | 0.7857 | |
| | SVD | 0.5415 | 0.7847 | 0.8856 | 0.8539 |
| | AE | 0.6548 | 0.7933 | 0.9139 | |
| | DRBM | | | | 0.8644 |
| MF | NtN | 0.7338 | 0.9135 | 0.9401 | 0.0111 |
| | dRW | 0.8742 | 0.9493 | 0.9474 | |
| | SVD | 0.7408 | 0.9466 | 0.9703 | 0.9188 |
| | AE | 0.9035 | 0.9461 | 0.9724 | 0.7044 |
| | DRBM | | | | 0.9652 |
Experimental results on Saccharomyces cerevisiae.
| Subontology | Method | MacroAvg | AvgROC | 1 − RankLoss | |
|---|---|---|---|---|---|
| BP | NtN | 0.0072 | 0.7026 | 0.7027 | 0.1172 |
| | dRW | 0.8042 | | 0.9337 | |
| | SVD | 0.7794 | 0.9199 | 0.9659 | 0.9440 |
| | AE | 0.6990 | 0.9179 | 0.9252 | 0.5032 |
| | DRBM | | 0.9256 | | 0.9555 |
| CC | NtN | 0.0072 | 0.7026 | 0.7027 | 0.1172 |
| | dRW | 0.8112 | 0.9264 | 0.9612 | |
| | SVD | 0.7408 | 0.9274 | 0.9767 | 0.9198 |
| | AE | 0.8595 | 0.9262 | 0.9851 | |
| | DRBM | | | | 0.9744 |
| MF | NtN | 0.7338 | 0.9135 | 0.9401 | 0.0111 |
| | dRW | 0.8742 | | 0.9474 | |
| | SVD | 0.7408 | 0.9466 | 0.9703 | 0.9188 |
| | AE | 0.9035 | 0.9461 | 0.9724 | 0.7044 |
| | DRBM | | 0.9492 | | 0.9652 |
Experimental results on Drosophila.
| Subontology | Method | MacroAvg | AvgROC | 1 − RankLoss | |
|---|---|---|---|---|---|
| BP | NtN | | 0.8450 | 0.8958 | 0.9416 |
| | dRW | 0.6875 | 0.8525 | 0.9011 | |
| | SVD | 0.6852 | 0.8516 | 0.9479 | 0.9371 |
| | AE | 0.5882 | 0.8486 | 0.9049 | 0.5772 |
| | DRBM | 0.7699 | | | 0.9382 |
| CC | NtN | 0.0101 | 0.6475 | 0.7808 | 0.1957 |
| | dRW | 0.6599 | 0.8425 | 0.9210 | |
| | SVD | 0.6446 | 0.8222 | 0.9585 | 0.9156 |
| | AE | 0.7331 | 0.8251 | 0.9678 | |
| | DRBM | | | | 0.9448 |
| MF | NtN | 0.5071 | 0.7640 | 0.9065 | 0.0700 |
| | dRW | 0.7346 | | 0.9309 | |
| | SVD | 0.7131 | 0.8125 | 0.9631 | 0.9549 |
| | AE | 0.7558 | 0.8133 | 0.9639 | 0.6429 |
| | DRBM | | 0.8187 | | 0.9499 |
Runtime cost (seconds) on Homo sapiens and Mus musculus in BP subontology.
| | NtN | dRW | SVD | AE | DRBM |
|---|---|---|---|---|---|
| | 30180 | 27660 | 1200 | 15840 | 6180 |
| | 24180 | 28020 | 1260 | 33780 | 7500 |