Ronesh Sharma, Alok Sharma, Ashwini Patil, Tatsuhiko Tsunoda.
Abstract
BACKGROUND: Molecular Recognition Features (MoRFs) are short protein regions present in intrinsically disordered protein (IDP) sequences. MoRFs interact with structured partner proteins and, upon interaction, undergo a disorder-to-order transition to perform various biological functions. Analyzing MoRFs is important for understanding their function.
Year: 2019 PMID: 30717652 PMCID: PMC7653905 DOI: 10.1186/s12859-018-2396-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Datasets used to train and test a MoRF predictor
| Category | Data set | No. of sequences | Total residues | No. of MoRF residues | No. of non-MoRF residues |
|---|---|---|---|---|---|
| training set | TRAIN | 421 | 245,984 | 5,396 | 240,588 |
| test sets | TEST | 419 | 258,829 | 5,153 | 253,676 |
| test sets | NEW | 45 | 37,533 | 626 | 36,907 |
| test sets | TEST464 | 464 | 296,362 | 5,779 | 290,583 |
| test sets | TEST266 | 266 | 154,399 | 3,305 | 151,094 |
| validation set | EXP53 | 53 | 25,186 | 2,432 | 22,754 |
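The residue counts in the table above can be checked with a few lines of arithmetic. The sketch below copies the counts from the table and confirms that MoRF and non-MoRF residues sum to the total in every row; it also reports the MoRF fraction, which is about 2% in TRAIN and just under 10% in EXP53. The names `DATASETS`, `is_consistent`, `morf_fraction`, and `combined_counts` are illustrative and not from the paper. The last check only observes that the TEST464 counts equal the sums of the TEST and NEW rows; the table itself does not state how TEST464 was built.

```python
# Residue counts copied from the dataset table: (total, MoRF, non-MoRF).
DATASETS = {
    "TRAIN": (245_984, 5_396, 240_588),
    "TEST": (258_829, 5_153, 253_676),
    "NEW": (37_533, 626, 36_907),
    "TEST464": (296_362, 5_779, 290_583),
    "TEST266": (154_399, 3_305, 151_094),
    "EXP53": (25_186, 2_432, 22_754),
}


def is_consistent(name):
    """True if MoRF + non-MoRF residues equal the total for this data set."""
    total, morf, non_morf = DATASETS[name]
    return morf + non_morf == total


def morf_fraction(name):
    """Fraction of residues in this data set that are MoRF residues."""
    total, morf, _ = DATASETS[name]
    return morf / total


def combined_counts(first, second):
    """Element-wise sum of the residue counts of two data sets."""
    return tuple(a + b for a, b in zip(DATASETS[first], DATASETS[second]))


if __name__ == "__main__":
    for name in DATASETS:
        print(f"{name:8s} consistent={is_consistent(name)} "
              f"MoRF fraction={morf_fraction(name):.2%}")
    print("TEST + NEW == TEST464:",
          combined_counts("TEST", "NEW") == DATASETS["TEST464"])
```

The low MoRF fractions show how imbalanced the two classes are, which is one reason a threshold-free measure such as AUC is a natural choice for comparing predictors on these sets.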
Fig. 1. Overview of the proposed method. "Fuse score" means that the model scores are combined to provide the whole-sequence scores
Fig. 2. Schematic illustration of extracting samples to score a query sequence. The symbols in the figure denote the j-th amino acid in the query sequence and the length of the query protein sequence
Fig. 3. Combined model. MoRFpred-plus and PROMIS are our own predictors; the MoRFchibi predictor was downloaded and integrated with our proposed model
Fig. 4. AUCs for the proposed model with varying window flank sizes used to process the output scores
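The Fig. 4 caption refers to processing output scores with a window defined by a flank size. As a rough illustration of what such a step could look like, the sketch below replaces each residue score with the mean over a window of up to `2 * flank + 1` residues, truncated at the sequence termini. The use of a plain mean, the truncation at the ends, and the name `smooth_scores` are all assumptions; the caption does not say which operation the authors apply.

```python
def smooth_scores(scores, flank):
    """Average each per-residue score over a window of up to 2*flank + 1 residues.

    Assumption: a simple mean with the window truncated at the sequence ends.
    """
    if flank < 0:
        raise ValueError("flank must be non-negative")
    n = len(scores)
    smoothed = []
    for j in range(n):
        lo = max(0, j - flank)        # first index inside the window
        hi = min(n, j + flank + 1)    # one past the last index inside the window
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    raw = [0.1, 0.9, 0.2, 0.8, 0.1]
    for flank in (0, 1, 2):
        print(flank, [round(s, 3) for s in smooth_scores(raw, flank)])
```

With `flank = 0` the scores pass through unchanged, so sweeping the flank size as in Fig. 4 compares unsmoothed scores against progressively wider averaging.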
AUCs using the test sets
| Predictors/models | TEST | TEST464 | TEST266 | EXP53 ALL | EXP53 LONG | EXP53 SHORT |
|---|---|---|---|---|---|---|
| ANCHOR | 0.600 | 0.605 | 0.599 | 0.615 | 0.586 | 0.683 |
| MoRFpred | 0.673 | 0.675 | 0.651 | 0.620 | 0.598 | 0.673 |
| MoRFchibi | 0.740 | 0.743 | 0.709 | 0.712 | 0.679 | 0.790 |
| MoRFpred-plus | 0.755 | 0.724 | 0.740 | 0.712 | 0.670 | 0.821 |
| MoRFchibi-light | 0.775 | 0.777 | 0.762 | 0.799 | 0.770 | 0.869 |
| PROMIS | 0.791 | 0.788 | 0.770 | 0.818 | 0.815 | 0.823 |
| MoRFchibi-web | 0.800 | 0.805 | 0.785 | 0.797 | 0.758 | 0.886 |
| OPAL | 0.815 | 0.816 | 0.795 | 0.836 | 0.823 | 0.870 |
| Proposed Model | 0.760 | 0.757 | 0.729 | 0.787 | 0.754 | 0.864 |
| Combined Model | 0.819 | 0.818 | 0.797 | 0.838 | 0.819 | 0.881 |
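The AUC values in the table above are areas under the ROC curve for per-residue scores. As a reminder of what the number measures, the sketch below computes AUC as the probability that a randomly chosen MoRF residue scores higher than a randomly chosen non-MoRF residue, counting ties as one half. The function name `roc_auc` and the toy labels and scores are illustrative; this is not the evaluation code used in the paper, and the pairwise loop is quadratic, so it suits small examples only.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for MoRF residues, 0 for non-MoRF residues.
    scores: predicted propensity per residue, higher means more MoRF-like.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative residue")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the differences between rows in the table.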
Fig. 5. Percentage of MoRFs present in terminal and middle regions
Fig. 6. Percentage of MoRFs per respective length for the TRAIN, TEST464 and EXP53 sets
FPR for a given TPR value for the combined model and OPAL using EXP53 SHORT
| TPR | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|
| OPAL | 0.0113 | 0.0158 | 0.0414 | 0.0691 | 0.0902 | 0.1144 | 0.2160 | 0.3340 |
| Combined model | 0.0118 | 0.0175 | 0.0323 | 0.0593 | 0.0889 | 0.1150 | 0.1852 | 0.2913 |
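Each entry in the table above is a false positive rate (FPR) read off the ROC curve at a fixed true positive rate (TPR). A minimal sketch of that lookup follows: it lowers the decision threshold one distinct score at a time and returns the FPR at the first threshold whose TPR reaches the target. The function name `fpr_at_tpr` and the toy data are assumptions for illustration; the paper's own evaluation procedure (for example, whether it interpolates between ROC points) is not described here.

```python
def fpr_at_tpr(labels, scores, target_tpr):
    """Smallest FPR at which the TPR reaches target_tpr.

    labels: 1 for MoRF residues, 0 for non-MoRF residues.
    scores: predicted propensity per residue, higher means more MoRF-like.
    """
    if not 0.0 < target_tpr <= 1.0:
        raise ValueError("target_tpr must be in (0, 1]")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative residue")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Residues sharing the same score are accepted together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp / n_pos >= target_tpr:
            return fp / n_neg
    return 1.0  # defensive fallback; the loop returns once every residue is accepted


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.3, 0.1]
    for tpr in (0.5, 1.0):
        print(tpr, fpr_at_tpr(labels, scores, tpr))
```

Lower FPR at the same TPR is better. In the table, the combined model has the lower FPR at TPR values 0.4, 0.5, 0.6, 0.8 and 0.9, while OPAL is lower at 0.2, 0.3 and 0.7.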