Nguyen Quoc Khanh Le, Edward Kien Yee Yapp, Hui-Yuan Yeh.
Abstract
BACKGROUND: The electron transport chain is a series of protein complexes embedded in the process of cellular respiration, an important process that transfers electrons and other macromolecules throughout the cell. It is also the major process for extracting energy via redox reactions during the oxidation of sugars. Many studies have implicated electron transport proteins in a variety of human diseases, e.g. diabetes, Parkinson's disease, and Alzheimer's disease. A few bioinformatics studies have attempted to identify electron transport proteins with high accuracy; however, their performance leaves considerable room for improvement. Here, we present a novel deep neural network architecture to address this problem.
Keywords: Cellular respiration; Convolutional neural network; Deep learning; Electron transport chain; Gated recurrent units; Position specific scoring matrix; Protein function prediction; Recurrent neural network; Transport protein; Web server
Year: 2019; PMID: 31277574; PMCID: PMC6612191; DOI: 10.1186/s12859-019-2972-5
Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169
Fig. 1 The process of cellular respiration. The goal of cellular respiration is to accumulate electrons from organic compounds to create ATP, which is used to provide energy for most cellular reactions
Fig. 2 The flowchart for identifying electron transport proteins using 1D RNN, GRU, and PSSM profiles. It includes four subprocesses: data collection, feature set generation, neural network implementation, and model evaluation
Statistics of all retrieved electron transport proteins and general transport proteins
| | Original | Similarity < 30% | CV | IND |
|---|---|---|---|---|
| Electron transport | 12,832 | 1,324 | 1,116 | 208 |
| General transport | 10,814 | 4,569 | 3,856 | 713 |
CV: cross-validation; IND: independent
Fig. 3 Amino acid composition and variance of amino acid composition in electron transport and general transport proteins. The amino acid frequencies differ in numerous ways between electron transport proteins and general transport proteins. For instance, the amino acids E, I, F, or R could be adopted for classifying electron transport proteins
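The amino acid composition described in the caption above can be illustrated with a minimal sketch. The function names `amino_acid_composition` and `mean_composition`, and the toy sequences, are hypothetical and not taken from the paper; the sketch assumes the standard 20-letter amino acid alphabet and simply ignores any other character.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes (an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid in one sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(sequences):
    """Average the per-sequence compositions over a group of proteins."""
    compositions = [amino_acid_composition(seq) for seq in sequences]
    return {
        aa: sum(comp[aa] for comp in compositions) / len(compositions)
        for aa in AMINO_ACIDS
    }


# Toy, made-up sequences used only to show the call pattern.
group_a = ["MEEIFR", "EERIFA"]
group_b = ["MKKLLA", "GGSLKA"]
difference = {
    aa: mean_composition(group_a)[aa] - mean_composition(group_b)[aa]
    for aa in AMINO_ACIDS
}
```

Comparing the two mean compositions residue by residue is one simple way to look for amino acids whose frequency differs between two protein groups.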
All layers with weights and trainable parameters in the proposed method
| Layer | Weights | Parameters |
|---|---|---|
| Conv1d (20, 200, 3) | ((200, 20, 3), (200,)) | 12,200 |
| AvgPool1d (3) | 0 | 0 |
| Conv1d (200, 200, 3) | ((200, 200, 3), (200,)) | 120,200 |
| AvgPool1d (3) | 0 | 0 |
| GRU (200, 200, 1) | ((600, 200), (600, 200), (600,), (600,)) | 241,200 |
| Linear (200, 32) | ((32, 200), (32,)) | 6,432 |
| Dropout (0.5) | 0 | 0 |
| Linear (32, 1) | ((1, 32), (1,)) | 33 |
| Sigmoid () | 0 | 0 |
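The parameter counts in the table above follow directly from the listed weight shapes. The sketch below reproduces that arithmetic in plain Python; the helper names are hypothetical, and the GRU formula assumes three gates with separate input and hidden bias vectors, which is what the shapes `(600, 200), (600, 200), (600,), (600,)` suggest.

```python
def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights of shape (out, in, kernel) plus one bias per output channel."""
    return out_channels * in_channels * kernel_size + out_channels


def gru_params(input_size, hidden_size, num_layers=1):
    """Three gates per layer: input weights, hidden weights, and two bias vectors."""
    total = 0
    for layer in range(num_layers):
        layer_input = input_size if layer == 0 else hidden_size
        total += 3 * hidden_size * layer_input   # input-to-hidden weights
        total += 3 * hidden_size * hidden_size   # hidden-to-hidden weights
        total += 2 * 3 * hidden_size             # input and hidden biases
    return total


def linear_params(in_features, out_features):
    """Weights of shape (out, in) plus one bias per output feature."""
    return out_features * in_features + out_features


# Layer sizes copied from the table; pooling, dropout, and sigmoid have no parameters.
layer_params = {
    "Conv1d (20, 200, 3)": conv1d_params(20, 200, 3),
    "Conv1d (200, 200, 3)": conv1d_params(200, 200, 3),
    "GRU (200, 200, 1)": gru_params(200, 200, 1),
    "Linear (200, 32)": linear_params(200, 32),
    "Linear (32, 1)": linear_params(32, 1),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count:,}")
print(f"Total trainable parameters: {total_params:,}")
```

Each computed value matches the corresponding row of the table, and the rows sum to 380,065 trainable parameters.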
Predictive performance of classifying electron transport proteins using different neural networks
| | CV | | | | Independent | | | |
|---|---|---|---|---|---|---|---|---|
| Method | Sen | Spe | Acc | MCC | Sen | Spe | Acc | MCC |
| kNN | 37.7(−) | 98.9(+) | 85.2(−) | 0.53(−) | 32.7(−) | 96.5(+) | 82.1(−) | 0.41(−) |
| RF | 64.8(−) | 97.1(+) | 89.8(−) | 0.69(−) | 56.3(−) | 96.4(+) | 87.3(−) | 0.61(−) |
| SVM | 74(−) | 96.2(+) | 91.2(−) | 0.74(−) | 74(−) | 91.7(−) | 87.7(−) | 0.65(−) |
| CNN | 73.8(−) | 95(−) | 90.3(−) | 0.71(−) | 78.2(+) | 92.5(−) | 89.5(−) | 0.69(−) |
| GRU | 83.7 | 96.3 | 93.5 | 0.81 | 79.8 | 95.9 | 92.3 | 0.77 |
Note: kNN: k = 10; RF: n_estimators = 500; SVM: c = 32, g = 0.125; CNN: 128 filters; GRU: 32 filters. (+) indicates significantly better than GRU and (−) significantly worse than GRU in a two-proportion z-test.
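The (+) and (−) markers come from a two-proportion z-test. The sketch below shows one common form of that test, under assumptions the source does not state: a pooled standard error and a two-sided p-value. The function name is hypothetical, and the hit counts are reconstructed by rounding the reported independent-set sensitivities against the 208 electron transport proteins in that set, so they are approximations rather than figures from the paper.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts: reported sensitivity times 208 positives, rounded to integers.
n_positive = 208
knn_hits = round(0.327 * n_positive)   # kNN, Sen = 32.7
gru_hits = round(0.798 * n_positive)   # GRU, Sen = 79.8

z_score, p_value = two_proportion_z_test(knn_hits, n_positive, gru_hits, n_positive)
print(f"z = {z_score:.2f}, p = {p_value:.3g}")
```

A negative z with a small p-value corresponds to a (−) marker for the method being compared against GRU; the significance threshold used in the paper is not given in this record.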
Fig. 4 ROC curves for predicting electron transport proteins using GRU and PSSM profiles. (a) cross-validation testing, (b) independent testing
Predictive performance of classifying electron transport proteins using different sensitive multiple alignments
| Profile | Sen | Spe | Acc | MCC |
|---|---|---|---|---|
| PSI-BLAST | 75.5 | 83.6 | 81.8 | 0.54 |
| HMM | 76 | 95 | 90.8 | 0.73 |
| PSSM | 79.8 | 95.9 | 92.3 | 0.77 |
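The Sen, Spe, Acc, and MCC columns used throughout these tables are the standard sensitivity, specificity, accuracy, and Matthews correlation coefficient of a binary confusion matrix. The sketch below computes them; the function name is hypothetical, and the confusion-matrix counts are an assumption, reconstructed from the reported independent-set percentages and the class sizes (208 electron transport, 713 general transport), not values published in the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is empty.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"Sen": sensitivity, "Spe": specificity, "Acc": accuracy, "MCC": mcc}


# Assumed counts, back-calculated from Sen = 79.8 and Spe = 95.9 on the independent set.
metrics = classification_metrics(tp=166, fp=29, tn=684, fn=42)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the counts are rounded reconstructions, the computed values agree with the reported ones only to within rounding.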
Comparison with state-of-the-art predictions on the cross-validation dataset and independent dataset
| Method | Sen | Spe | Acc | MCC |
|---|---|---|---|---|
| Cross-validation | | | | |
| TrSSP [ | 85 | 80 | 81.43 | 0.6 |
| Chen et al. [ | 71.6 | 93.5 | 90.1 | 0.62 |
| Le et al. [ | 74.6 | 95.8 | 92.9 | 0.7 |
| ET-CNN [ | 51.1 | 96.1 | 89.4 | 0.54 |
| ET-GRU | 83.7 | 96.3 | 93.5 | 0.81 |
| Independent | | | | |
| ET-CNN [ | 52.9(−) | 96.6(+) | 86.8(−) | 0.59(−) |
| ET-GRU | 79.8 | 95.9 | 92.3 | 0.77 |
With ET-GRU as the base case, (+) and (−) indicate whether ET-CNN is significantly better or worse, respectively.