Selvaraj Muthukrishnan, Munish Puri.
Abstract
OBJECTIVES: With the arrival of free oxygen on Earth, aerobic life became possible. It has since become clear that oxygen-binding proteins are widespread in the biosphere and are found in all groups of organisms, including prokaryotes and eukaryotes such as fungi, plants, and animals. The exponential growth and availability of newly annotated protein sequences in the databases motivated us to develop an improved version of "Oxypred" for identifying oxygen-binding proteins.
Keywords: Confusion matrix; Erythrocruorin; Hemerythrin; Hemocyanin; Hemoglobin; Leghemoglobin; Myoglobin; Oxygen binding proteins; ROC Analysis; Support Vector Machines
Year: 2018 PMID: 29751818 PMCID: PMC5948687 DOI: 10.1186/s13104-018-3383-9
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Fig. 1 Amino acid distribution differences between oxy and non-oxy sequences, calculated from median scores. a Difference between oxy-50 and non-50. b Difference between oxy-90 and non-90. c Differences within the oxy sub-classes of the oxy-50 datasets. d Differences within the oxy sub-classes of the oxy-90 datasets
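The caption above describes per-residue differences based on median scores. A minimal sketch of one way such a comparison could be computed is shown here; the function names, the handling of non-standard residues, and the toy sequences are assumptions made for illustration and are not taken from the paper or its datasets.

```python
from statistics import median

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Percentage of each standard amino acid in a single sequence."""
    # Assumption: non-standard symbols (X, B, Z, gaps) are simply ignored.
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return {aa: 100.0 * residues.count(aa) / len(residues) for aa in AMINO_ACIDS}


def median_composition(seqs):
    """Median percentage of each residue across a set of sequences."""
    comps = [aa_composition(s) for s in seqs]
    return {aa: median(c[aa] for c in comps) for aa in AMINO_ACIDS}


def median_difference(oxy_seqs, non_oxy_seqs):
    """Per-residue difference of medians (oxy minus non-oxy)."""
    oxy = median_composition(oxy_seqs)
    non = median_composition(non_oxy_seqs)
    return {aa: oxy[aa] - non[aa] for aa in AMINO_ACIDS}


# Toy sequences invented for illustration only (not real proteins).
toy_oxy = ["AAHH", "AHHH", "AAAH"]
toy_non = ["AAGG", "AGGG", "GGGG"]
diff = median_difference(toy_oxy, toy_non)
```

A positive value in `diff` marks a residue that is more abundant (by median) in the first set, and a negative value marks one that is more abundant in the second.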
Performance of the oxy-protein sub-class SVM models (Ery, Hcy, Heme, Hemo, Leg and Myo) under the different approaches, comparing results on the oxy-50 and oxy-90 datasets
| Sub-class / approach | ACC (50%) | ACC (90%) | Sen (50%) | Sen (90%) | Sep (50%) | Sep (90%) | MCC (50%) | MCC (90%) |
|---|---|---|---|---|---|---|---|---|
| Ery | ||||||||
| AC | 95.65 | 97.14 | 34.03 | 67.33 | 96.79 | 97.75 | 0.53 | 0.80 |
| DC | 90.29 | 93.26 | 55.56 | 94.32 | 90.93 | 93.24 | 0.65 | 0.93 |
| PSSM | 94.15 | 93.56 | 64.58 | 90.91 | 94.69 | 93.61 | 0.76 | 0.92 |
| AC–DC | 90.43 | 89.17 | 61.11 | 94.03 | 90.97 | 89.07 | 0.70 | 0.91 |
| Hcy | ||||||||
| AC | 97.18 | 98.06 | 100.00 | 92.92 | 97.14 | 98.20 | 0.99 | 0.95 |
| DC | 93.36 | 95.09 | 98.44 | 94.17 | 93.28 | 95.12 | 0.96 | 0.94 |
| PSSM | 94.40 | 90.41 | 100.00 | 95.63 | 94.31 | 90.27 | 0.97 | 0.92 |
| AC–DC | 93.19 | 94.02 | 100.00 | 94.38 | 93.08 | 94.01 | 0.97 | 0.94 |
| Heme | ||||||||
| AC | 86.25 | 91.73 | 92.57 | 94.90 | 78.49 | 88.89 | 0.79 | 0.90 |
| DC | 89.57 | 93.21 | 98.98 | 99.26 | 78.01 | 87.79 | 0.87 | 0.93 |
| PSSM | 90.09 | 89.00 | 99.07 | 99.55 | 79.07 | 79.56 | 0.88 | 0.89 |
| AC–DC | 87.41 | 90.55 | 98.82 | 99.31 | 73.41 | 82.69 | 0.84 | 0.90 |
| Hemo | ||||||||
| AC | 82.99 | 87.43 | 87.80 | 94.87 | 80.01 | 81.34 | 0.77 | 0.85 |
| DC | 84.95 | 89.85 | 93.65 | 98.64 | 79.55 | 82.67 | 0.89 | 0.89 |
| PSSM | 87.26 | 88.17 | 97.78 | 99.24 | 80.74 | 79.12 | 0.88 | 0.88 |
| AC–DC | 83.49 | 87.08 | 96.92 | 99.09 | 75.16 | 77.26 | 0.84 | 0.87 |
| Leg | ||||||||
| AC | 98.76 | 99.13 | 97.92 | 100.00 | 98.77 | 99.12 | 0.99 | 0.99 |
| DC | 94.50 | 93.78 | 100.00 | 100.00 | 94.44 | 93.75 | 0.97 | 0.97 |
| PSSM | 98.25 | 97.43 | 100.00 | 100.00 | 98.23 | 97.42 | 0.99 | 0.99 |
| AC–DC | 96.84 | 94.53 | 100.00 | 100.00 | 96.81 | 94.50 | 0.98 | 0.97 |
| Myo | ||||||||
| AC | 92.92 | 96.62 | 59.38 | 86.50 | 93.47 | 96.86 | 0.71 | 0.91 |
| DC | 89.60 | 92.19 | 62.50 | 92.25 | 90.05 | 92.19 | 0.70 | 0.92 |
| PSSM | 93.06 | 91.02 | 76.56 | 90.75 | 93.33 | 91.03 | 0.83 | 0.90 |
| AC–DC | 86.60 | 85.95 | 67.19 | 91.50 | 86.91 | 85.82 | 0.71 | 0.87 |
AC: amino acid composition; DC: dipeptide composition; PSSM: position-specific scoring matrix; AC–DC: hybrid profile; ACC: accuracy; Sen: sensitivity; Sep: specificity; MCC: Matthews correlation coefficient
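For readers who want to relate the four reported measures to a confusion matrix, the sketch below uses the standard textbook definitions of accuracy, sensitivity, specificity and MCC. The source does not spell out its formulas, so treating these definitions as the ones used is an assumption; the function name and the counts in the example are made up.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """ACC, Sen, Sep (percent) and MCC from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    acc = 100.0 * (tp + tn) / total
    sen = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    sep = 100.0 * tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sen": sen, "Sep": sep, "MCC": mcc}


# Made-up counts for illustration; not taken from the table above.
m = binary_metrics(tp=40, fp=10, tn=90, fn=10)
```

With these illustrative counts, sensitivity is 80%, specificity is 90% and MCC is 0.7. MCC uses all four cells of the matrix, which makes it more informative than accuracy when one class is much smaller than the other.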
Fig. 2 ROC curves for oxy-classification under all approaches. The performance of the models developed for oxypred2 is shown as ROC plots for all oxy sub-classes. The area under the curve was measured for the models from every approach
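The area under a ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way; it is an illustrative calculation, not the procedure the authors used, and the labels and scores are invented.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Invented labels (1 = oxy, 0 = non-oxy) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; a rank-based or library implementation would be preferable for large datasets.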