Farman Ali, Omar Barukab, Ajay B Gadicha, Shruti Patil, Omar Alghushairy, Akram Y Sarhan.
Abstract
DNA-binding proteins (DBPs) play crucial roles in biological activities, including DNA replication, recombination, and transcription. DBPs are closely associated with chronic diseases and are used in the manufacture of antibiotics and steroids. A series of predictors has been developed to identify DBPs, but researchers are still working to improve identification accuracy. This study designs a novel predictor, DBP-iDWT, to identify DBPs more accurately. Features are extracted from the sequences using F-PSSM (filtered position-specific scoring matrix), PSSM-DPC (position-specific scoring matrix-dipeptide composition), and R-PSSM (reduced position-specific scoring matrix). To eliminate noisy attributes, we applied DWT (discrete wavelet transform) to F-PSSM, PSSM-DPC, and R-PSSM, introducing three novel descriptors: F-PSSM-DWT, PSSM-DPC-DWT, and R-PSSM-DWT. Four models were then trained: LiXGB (light eXtreme gradient boosting), XGB (eXtreme gradient boosting), ERT (extremely randomized trees), and Adaboost. LiXGB with R-PSSM-DWT attained 6.55% higher accuracy on the training dataset and 5.93% higher accuracy on the testing dataset than the best existing predictors. The results show that the novel predictor outperforms past studies. DBP-iDWT could help in establishing more effective therapeutic strategies for the treatment of fatal diseases.
Year: 2022 PMID: 36211019 PMCID: PMC9534628 DOI: 10.1155/2022/2987407
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Architecture of the proposed model.
Figure 2. 2-level structure of DWT.
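
The 2-level decomposition named in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the Haar wavelet, since the text here does not say which wavelet was used, and it works on a single made-up column of scores rather than a real PSSM. The function names `haar_step` and `dwt_two_level` are hypothetical. At each level the signal is split into approximation (low-pass) and detail (high-pass) coefficients, and the approximation is decomposed once more.

```python
import math

def haar_step(signal):
    """One DWT level with the Haar wavelet: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        # Pad odd-length input by repeating the last value (a simple assumption).
        signal.append(signal[-1])
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def dwt_two_level(signal):
    """Two-level decomposition: the level-1 approximation is decomposed again."""
    a1, d1 = haar_step(signal)
    a2, d2 = haar_step(a1)
    return a2, d2, d1

# Illustrative values only, not data from the paper.
column = [1.0, 3.0, -2.0, 0.0, 4.0, 4.0, -1.0, 1.0]
a2, d2, d1 = dwt_two_level(column)
```

For an even-length input the Haar transform is orthonormal, so the sum of squared coefficients across `a2`, `d2`, and `d1` equals that of the input column.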
Applied parameters with values.
| Parameter | Value |
|---|---|
| Eta | 0.1 |
| No. of estimators | 500 |
| Alpha | 1 |
| Lambda | 1 |
| Max depth | 8 |
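
As a rough sketch, the table values could be written as a keyword dictionary for a gradient-boosting classifier. The key names are an assumption: they follow the scikit-learn style keywords accepted by `xgboost.XGBClassifier` and `lightgbm.LGBMClassifier`, and the first table row is read here as the learning rate. The table does not say which library was used or which of the four models these values apply to.

```python
# Values copied from the parameter table; the key names are an assumption.
PARAMS = {
    "learning_rate": 0.1,  # first table row, read here as the learning rate (assumption)
    "n_estimators": 500,   # "No. of estimator"
    "reg_alpha": 1,        # "Alpha", taken as the L1 regularization term
    "reg_lambda": 1,       # "Lambda", taken as the L2 regularization term
    "max_depth": 8,        # "Max depth"
}
```

With one of those libraries installed, the dictionary could be unpacked into the constructor, for example `LGBMClassifier(**PARAMS)`.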
Results of encoders before DWT.
| Model | Encoder | Acc (%) | Sn (%) | Sp (%) | MCC (%) |
|---|---|---|---|---|---|
| Adaboost | F-PSSM | 71.52 | 80.42 | 62.54 | 43.67 |
| Adaboost | PSSM-DPC | 80.05 | 78.44 | 81.69 | 60.15 |
| Adaboost | R-PSSM | 80.07 | 76.15 | 84.02 | 60.35 |
| ERT | F-PSSM | 75.18 | 84.74 | 58.97 | 44.56 |
| ERT | PSSM-DPC | 79.22 | 73.18 | 85.31 | 58.42 |
| ERT | R-PSSM | 79.56 | 74.99 | 84.18 | 59.40 |
| XGB | F-PSSM | 74.57 | 82.17 | 66.90 | 49.67 |
| XGB | PSSM-DPC | 81.53 | 76.15 | 86.97 | 63.47 |
| XGB | R-PSSM | 81.63 | 76.48 | 86.84 | 63.64 |
| LiXGB | F-PSSM | 76.60 | 82.47 | 66.01 | 48.75 |
| LiXGB | PSSM-DPC | 83.54 | 84.61 | 82.46 | 67.10 |
| LiXGB | R-PSSM | 83.62 | 82.30 | 84.96 | 67.27 |
Results of feature encoders after DWT.
| Model | Encoder | Acc (%) | Sn (%) | Sp (%) | MCC (%) |
|---|---|---|---|---|---|
| Adaboost | F-PSSM-DWT | 73.20 | 82.35 | 56.67 | 40.21 |
| Adaboost | PSSM-DPC-DWT | 81.81 | 80.45 | 83.19 | 63.66 |
| Adaboost | R-PSSM-DWT | 82.23 | 77.68 | 86.81 | 64.77 |
| ERT | F-PSSM-DWT | 77.26 | 79.91 | 74.59 | 54.58 |
| ERT | PSSM-DPC-DWT | 81.53 | 76.15 | 86.97 | 63.47 |
| ERT | R-PSSM-DWT | 83.05 | 81.30 | 84.82 | 66.15 |
| XGB | F-PSSM-DWT | 75.37 | 83.43 | 60.81 | 45.31 |
| XGB | PSSM-DPC-DWT | 82.45 | 83.65 | 81.25 | 64.91 |
| XGB | R-PSSM-DWT | 83.61 | 82.66 | 84.56 | 67.23 |
| LiXGB | F-PSSM-DWT | 79.40 | 83.11 | 75.65 | 58.94 |
| LiXGB | PSSM-DPC-DWT | 84.74 | 84.30 | 85.19 | 69.49 |
| LiXGB | R-PSSM-DWT | 86.84 | 86.60 | 87.08 | 73.69 |
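
The Acc, Sn, Sp, and MCC columns are not defined in this record. The sketch below uses the standard confusion-matrix definitions of accuracy, sensitivity, specificity, and Matthews correlation coefficient, which is an assumption about what the authors computed. The helper name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fn + tn + fp)   # accuracy
    sn = tp / (tp + fn)                     # sensitivity (recall on positives)
    sp = tn / (tn + fp)                     # specificity (recall on negatives)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; 0.0 is returned in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Acc, Sn, and Sp as percentages; MCC on its native -1..1 scale.
    return {"Acc": 100 * acc, "Sn": 100 * sn, "Sp": 100 * sp, "MCC": mcc}

# Made-up counts for illustration only.
m = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

The encoder tables report MCC as a percentage, which corresponds to multiplying the value returned here by 100.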
Comparative analysis with past work on the training set.
| Predictor | Acc (%) | Sn (%) | Sp (%) | MCC |
|---|---|---|---|---|
| iDNA-prot | 75.40 | 83.81 | 64.73 | 0.50 |
| iDNA-prot\|dis | 77.30 | 79.40 | 75.27 | 0.54 |
| TargetDBP | 79.71 | 79.56 | 79.85 | 0.59 |
| MsDBP | 80.29 | 80.87 | 79.72 | 0.60 |
| PDBP-CNN | 82.02 | 87.49 | 76.50 | 0.64 |
| XGBoost | 81.42 | 84.11 | 78.43 | 0.62 |
| DBP-iDWT | 86.84 | 86.60 | 87.08 | 0.73 |
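
The accuracy margins behind the training-set comparison can be recomputed directly from the table. The sketch below copies the Acc column and subtracts each baseline from the DBP-iDWT value; the variable names are hypothetical.

```python
# Accuracy (%) of prior predictors on the training set, copied from the table.
TRAIN_ACC = {
    "iDNA-prot": 75.40,
    "iDNA-prot|dis": 77.30,
    "TargetDBP": 79.71,
    "MsDBP": 80.29,
    "PDBP-CNN": 82.02,
    "XGBoost": 81.42,
}
PROPOSED_ACC = 86.84  # DBP-iDWT row

# Margin of DBP-iDWT over each baseline, in percentage points.
margins = {name: round(PROPOSED_ACC - acc, 2) for name, acc in TRAIN_ACC.items()}

for name, gap in sorted(margins.items(), key=lambda kv: kv[1]):
    print(f"{name}: +{gap:.2f}")
```

With these values, the margin over MsDBP is 6.55 points, and the margin over PDBP-CNN, the highest-accuracy baseline listed, is 4.82 points.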
Comparative analysis with past work using testing dataset.
| Predictor | Acc (%) | Sn (%) | Sp (%) | MCC |
|---|---|---|---|---|
| PseDNA-pro | 67.23 | 78.38 | 56.08 | 0.35 |
| iDNAPro-PseAAC | 66.22 | 78.37 | 54.05 | 0.33 |
| iDNAProt-ES | 68.58 | 95.95 | 41.22 | 0.44 |
| DPP-PseAAC | 61.15 | 55.41 | 66.89 | 0.22 |
| TargetDBP | 76.69 | 76.35 | 77.03 | 0.53 |
| MsDBP | 66.99 | 70.69 | 63.18 | 0.33 |
| PDBP-fusion | 77.77 | 73.31 | 66.85 | 0.56 |
| DBP-iDWT | 82.83 | 90.37 | 75.07 | 0.66 |