Alok Sharma, Kuldip K Paliwal, Abdollah Dehzangi, James Lyons, Seiya Imoto, Satoru Miyano.
Abstract
BACKGROUND: Assigning a protein to one of its folds is a transitional step in discovering three-dimensional protein structure, which is a challenging task in biomolecular (biological) science. The present research focuses on 1) the development of classifiers and 2) the development of feature extraction techniques based on syntactic and/or physicochemical properties.
Year: 2013 PMID: 23879571 PMCID: PMC3724710 DOI: 10.1186/1471-2105-14-233
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Multi-dimensional successive feature selection: backward elimination scheme.
Figure 2. Multi-dimensional successive feature selection: forward selection scheme.
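The two schemes named in these captions are variants of successive (sequential) feature selection. The following is a minimal generic sketch of greedy forward selection and backward elimination over attribute symbols. It assumes a hypothetical scoring function `score(subset)` (for example, a cross-validated recognition accuracy) and toy weights invented for illustration; it does not reproduce the paper's multi-dimensional procedure or its brute-5 and MA-based criteria.

```python
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical scoring interface: higher is better.
Score = Callable[[FrozenSet[str]], float]


def forward_selection(attributes: Iterable[str], score: Score, k: int) -> List[str]:
    """Start empty; repeatedly add the attribute that yields the best score."""
    remaining = list(attributes)
    chosen: List[str] = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda a: score(frozenset(chosen + [a])))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def backward_elimination(attributes: Iterable[str], score: Score, k: int) -> List[str]:
    """Start with everything; repeatedly drop the attribute whose removal
    leaves the best-scoring subset, until k attributes remain."""
    chosen = list(attributes)
    while len(chosen) > k:
        drop = max(
            chosen,
            key=lambda a: score(frozenset(x for x in chosen if x != a)),
        )
        chosen.remove(drop)
    return chosen


# Toy additive score with made-up weights (illustration only, not paper data).
TOY_WEIGHTS = {"B": 3.0, "P": 2.0, "C": 1.5, "H": 0.5, "Z": 0.2}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(TOY_WEIGHTS[a] for a in subset)


if __name__ == "__main__":
    print(forward_selection("HPZBC", toy_score, 3))
    print(backward_elimination("HPZBC", toy_score, 3))
```

With an additive toy score both directions end on the same three attributes; with a real classifier-based score the two schemes can disagree, which is one reason to report both.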
Table 1. Physicochemical attributes used in the study
| No. | Attribute |
|---|---|
| 1 | Hydrophobicity (membrane buried helix) |
| 2 | Polarity |
| 3 | Polarizability parameter |
| 4 | Normalized frequency of alpha-helix |
| 5 | Normalized van der Waals volume |
| 6 | alpha-NH chemical shifts |
| 7 | A parameter of charge transfer capability |
| 8 | The Kerr-constant increments |
| 9 | Normalized hydrophobicity scales for beta-proteins |
| 10 | Normalized frequency of beta-sheet |
| 11 | Normalized frequency of beta-turn |
| 12 | Normalized frequency of reverse turn, with weights |
| 13 | Size |
| 14 | Amino acid composition |
| 15 | Frequency of the 1st residue in turn |
| 16 | Spin-spin coupling constants 3JHalpha-NH |
| 17 | Relative mutability |
| 18 | Direction of hydrophobic moment |
| 19 | Molecular weight |
| 20 | Optical rotation |
| 21 | Aperiodic indices for alpha-proteins |
| 22 | Aperiodic indices for beta-proteins |
| 23 | Aperiodic indices for alpha/beta-proteins |
| 24 | Volume |
| 25 | Partition energy |
| 26 | Heat capacity |
| 27 | Absolute entropy |
| 28 | Average accessible surface area |
| 29 | Percentage of buried residues |
| 30 | Percentage of exposed residues |
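Residues of amino acids of the 30 attributes
| Attribute no. | a | c | d | e | f | g | h | i | k | l | m | n | p | q | r | s | t | v | w | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|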
| 1 | 0.61 | 1.07 | 0.46 | 0.47 | 2.02 | 0.07 | 0.61 | 2.22 | 1.15 | 1.53 | 1.18 | 0.06 | 1.95 | 0 | 0.6 | 0.05 | 0.05 | 1.32 | 2.65 | 1.88 |
| 2 | 0 | 1.48 | 49.7 | 49.9 | 0.35 | 0 | 51.6 | 0.13 | 49.5 | 0.13 | 1.43 | 3.38 | 1.58 | 3.53 | 52 | 1.67 | 1.66 | 0.13 | 2.1 | 1.61 |
| 3 | 0.046 | 0.128 | 0.105 | 0.151 | 0.29 | 0 | 0.23 | 0.186 | 0.219 | 0.186 | 0.221 | 0.134 | 0.131 | 0.18 | 0.291 | 0.062 | 0.108 | 0.14 | 0.409 | 0.298 |
| 4 | 0.486 | 0.2 | 0.288 | 0.538 | 0.318 | 0.12 | 0.4 | 0.37 | 0.402 | 0.42 | 0.417 | 0.193 | 0.208 | 0.418 | 0.262 | 0.2 | 0.272 | 0.379 | 0.462 | 0.161 |
| 5 | 1 | 2.43 | 2.78 | 3.78 | 5.89 | 0 | 4.66 | 4 | 4.77 | 4 | 4.43 | 2.95 | 2.72 | 3.95 | 6.13 | 1.6 | 2.6 | 3 | 8.08 | 6.47 |
| 6 | 8.249 | 8.312 | 8.41 | 8.368 | 8.228 | 8.391 | 8.415 | 8.195 | 8.408 | 8.423 | 8.418 | 8.747 | 0 | 8.411 | 8.274 | 8.38 | 8.236 | 8.436 | 8.094 | 8.183 |
| 7 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 49.1 | 0 | 0 | 0 | 54.7 | 64.6 | 75.7 | 18.9 | 0 | 15.6 | 6.8 | −3.6 | 43.8 | 20 | 133 | 44.4 | 31 | 29.5 | 70.5 | 0 |
| 9 | −0.08 | 0.76 | −0.71 | −1.31 | 1.53 | −0.84 | 0.43 | 1.39 | −0.09 | 1.24 | 1.27 | −0.7 | −0.01 | −0.4 | −0.09 | −0.93 | −0.59 | 1.09 | 2.25 | 1.53 |
| 10 | 0.83 | 1.19 | 0.54 | 0.37 | 1.38 | 0.75 | 0.87 | 1.6 | 0.74 | 1.3 | 1.05 | 0.89 | 0.55 | 1.1 | 0.93 | 0.75 | 1.19 | 1.7 | 1.37 | 1.47 |
| 11 | 0.74 | 0.96 | 1.52 | 0.95 | 0.66 | 1.56 | 0.95 | 0.47 | 1.19 | 0.5 | 0.6 | 1.46 | 1.56 | 0.96 | 1.01 | 1.43 | 0.98 | 0.59 | 0.6 | 1.14 |
| 12 | 0.77 | 0.81 | 1.41 | 0.99 | 0.59 | 1.64 | 0.68 | 0.51 | 0.96 | 0.58 | 0.41 | 1.28 | 1.91 | 0.98 | 0.88 | 1.32 | 1.04 | 0.47 | 0.76 | 1.05 |
| 13 | 2.5 | 3 | 2.5 | 5 | 6.5 | 0.5 | 6 | 5.5 | 7 | 5.5 | 6 | 5 | 5.5 | 6 | 7.5 | 3 | 5 | 5 | 7 | 7 |
| 14 | 8.6 | 2.9 | 5.5 | 6 | 3.6 | 8.4 | 2 | 4.5 | 6.6 | 7.4 | 1.7 | 4.3 | 5.2 | 3.9 | 4.9 | 7 | 6.1 | 6.6 | 1.3 | 3.4 |
| 15 | 0.06 | 0.149 | 0.147 | 0.056 | 0.059 | 0.102 | 0.14 | 0.043 | 0.055 | 0.061 | 0.068 | 0.161 | 0.102 | 0.074 | 0.07 | 0.12 | 0.086 | 0.062 | 0.077 | 0.082 |
| 16 | 6.5 | 7.7 | 7 | 7 | 9.4 | 5.6 | 8 | 7 | 6.5 | 6.5 | 0 | 7.5 | 0 | 6 | 6.9 | 6.5 | 6.9 | 7 | 0 | 6.8 |
| 17 | 100 | 20 | 106 | 102 | 41 | 49 | 66 | 96 | 56 | 40 | 94 | 134 | 56 | 93 | 65 | 120 | 97 | 74 | 18 | 41 |
| 18 | 0 | 0.76 | −0.98 | −0.89 | 0.92 | 0 | −0.75 | 0.99 | −0.99 | 0.89 | 0.94 | −0.86 | 0.22 | −1 | −0.96 | −0.67 | 0.09 | 0.84 | 0.67 | −0.93 |
| 19 | 89.09 | 121.15 | 133.1 | 147.13 | 165.19 | 75.07 | 155.16 | 131.17 | 146.19 | 131.2 | 149.21 | 132.12 | 115.13 | 146.15 | 174.2 | 105.09 | 119.12 | 117.15 | 204.24 | 181.19 |
| 20 | 1.8 | −16.5 | 5.05 | 12 | −34.5 | 0 | −38.5 | 12.4 | 14.6 | −11 | −10 | −5.6 | −86.2 | 6.3 | 12.5 | −7.5 | −28 | 5.63 | −33.7 | −10 |
| 21 | 0.8 | 0 | 1.6 | 0.4 | 1.2 | 2 | 0.96 | 0.85 | 0.94 | 0.8 | 0.39 | 1.1 | 2.1 | 1.6 | 0.96 | 1.3 | 0.6 | 0.8 | 0 | 1.8 |
| 22 | 1.1 | 1.05 | 1.41 | 1.4 | 0.6 | 1.3 | 0.85 | 0.67 | 0.94 | 0.52 | 0.69 | 1.57 | 1.77 | 0.81 | 0.93 | 1.13 | 0.88 | 0.58 | 0.62 | 0.41 |
| 23 | 0.93 | 0.92 | 1.22 | 1.05 | 0.71 | 1.45 | 0.96 | 0.58 | 0.91 | 0.59 | 0.6 | 1.36 | 1.67 | 0.83 | 1.01 | 1.25 | 1.08 | 0.62 | 0.68 | 0.98 |
| 24 | 31 | 55 | 54 | 83 | 132 | 3 | 96 | 111 | 119 | 111 | 105 | 56 | 32.5 | 85 | 124 | 32 | 61 | 84 | 170 | 136 |
| 25 | 0.1 | −1.42 | 0.78 | 0.83 | −2.12 | 0.33 | −0.5 | −1.13 | 1.4 | −1.18 | −1.59 | 0.48 | 0.73 | 0.95 | 1.91 | 0.52 | 0.07 | −1.27 | −0.51 | −0.21 |
| 26 | 29.22 | 50.7 | 37.09 | 41.84 | 48.52 | 23.71 | 59.64 | 45 | 57.1 | 48.03 | 69.32 | 38.3 | 36.13 | 44.02 | 26.37 | 32.4 | 35.2 | 40.35 | 56.92 | 51.73 |
| 27 | 30.88 | 53.83 | 40.66 | 44.98 | 51.06 | 24.74 | 65.99 | 49.71 | 63.21 | 50.62 | 55.32 | 41.7 | 39.21 | 46.62 | 68.43 | 35.65 | 36.5 | 42.75 | 60 | 51.15 |
| 28 | 27.8 | 15.5 | 60.6 | 68.2 | 25.5 | 24.5 | 50.7 | 22.8 | 103 | 27.6 | 33.5 | 60.1 | 51.5 | 68.7 | 94.7 | 42 | 45 | 23.7 | 34.7 | 55.2 |
| 29 | 51 | 74 | 19 | 16 | 58 | 52 | 34 | 66 | 3 | 60 | 52 | 22 | 25 | 16 | 5 | 35 | 30 | 64 | 49 | 24 |
| 30 | 15 | 5 | 50 | 55 | 10 | 10 | 34 | 13 | 85 | 16 | 20 | 49 | 45 | 56 | 67 | 32 | 32 | 14 | 17 | 41 |
The first row, from column 2 to the last column, gives the amino acid symbols ('a, c, d, …, w, y'). The values for each attribute in Table 1 are given from the second row to the last row.
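As an illustration of how such a table can be used, a protein sequence can be mapped residue by residue to the value of one attribute. The sketch below uses row 19 of the table (molecular weight), copied as listed, and assumes the columns follow the alphabetical one-letter amino acid codes 'a, c, d, …, w, y' described in the note above. The helper names `build_lookup` and `encode` are hypothetical, and this is not the paper's feature extraction method.

```python
from typing import Dict, List, Sequence

# Assumed column order: one-letter amino acid codes in alphabetical order.
AMINO_ACIDS = "acdefghiklmnpqrstvwy"

# Row 19 of the table (molecular weight), in the column order above.
MOLECULAR_WEIGHT = [
    89.09, 121.15, 133.1, 147.13, 165.19, 75.07, 155.16, 131.17, 146.19, 131.2,
    149.21, 132.12, 115.13, 146.15, 174.2, 105.09, 119.12, 117.15, 204.24, 181.19,
]


def build_lookup(values: Sequence[float]) -> Dict[str, float]:
    """Pair each amino acid symbol with its attribute value."""
    if len(values) != len(AMINO_ACIDS):
        raise ValueError("expected one value per amino acid (20 values)")
    return dict(zip(AMINO_ACIDS, values))


def encode(sequence: str, lookup: Dict[str, float]) -> List[float]:
    """Replace each residue with its attribute value.

    Raises KeyError for symbols outside the 20 standard residues.
    """
    return [lookup[residue] for residue in sequence.lower()]


if __name__ == "__main__":
    weight = build_lookup(MOLECULAR_WEIGHT)
    print(encode("ACDG", weight))
```

The same two helpers work for any other row of the table; only the list of 20 values changes.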
Protein fold recognition (shown in percentage) on all the datasets using HPZXV attributes used by Ding and Dubchak [25]
| Dataset | Attributes | LDA | SVM | NB |
|---|---|---|---|---|
| DD | HPZXV | 23.1% | 29.5% | 32.8% |
| TG | HPZXV | 20.5% | 23.5% | 28.8% |
| EDD | HPZXV | 27.5% | 31.7% | 38.4% |
MD-SFS backward elimination approach on DD-dataset using brute-5 criterion
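| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 23.1% | HPZXV | 29.5% | HPZXV | 32.8% |
| 1-10 | BPCVS | 30.2% | BZKSH | 31.6% | BZCKP | 40.5% |
| 1-15 | BZFTP | 32.9% | BPKZF | 33.3% | BVCKS | 38.8% |
| All | BPEVO | 39.7% | BPDFM | 35.2% | IUKaP | 44.0% |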
MD-SFS backward elimination approach on DD-dataset using MA-based criterion
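| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 23.1% | HPZXV | 29.5% | HPZXV | 32.8% |
| 1-10 | BPCVS,KXFH | 35.0% | BZKSH,XCFV | 37.6% | BZCKP,SFH | 44.1% |
| 1-15 | BZFTP,CEXSK | 39.1% | BPKZF,XSCHE,f | 40.2% | BVCKS,FPAZR | 45.3% |
| All | *LDA-Atr | 39.7% | *SVM-Atr | 43.6% | IUKaP,MBbNO | 50.9% |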
*LDA-Atr: BPEVO, XaJRW, AUIQ.
*SVM-Atr: BPDFM, WSHbf, IcXCT, EZaJK, ON.
MD-SFS backward elimination approach on TG-dataset using brute-5 criterion
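| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 20.5% | HPZXV | 23.5% | HPZXV | 28.8% |
| 1-10 | FXBPC | 25.4% | FXPVB | 29.8% | FXVPH | 34.2% |
| 1-15 | FBPZV | 25.9% | BTAXP | 30.4% | BPRfX | 37.3% |
| All | FJBaf | 28.3% | JTFQB | 31.0% | JbXMK | 39.5% |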
MD-SFS backward elimination approach on TG-dataset using MA-based criterion
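| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 20.5% | HPZXV | 23.5% | HPZXV | 28.8% |
| 1-10 | FXBPC,ZVH | 29.8% | FXPVB,CHKS | 30.7% | FXVPH,CKSBZ | 37.6% |
| 1-15 | FBPZV,TCfAE,KSR | 32.7% | BTAXP,Zf | 33.0% | BPRfX,EKFSA,HCV | 41.5% |
| All | *LDA-Atr | 38.6% | *SVM-Atr | 36.1% | *NB-Atr | 45.3% |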
*LDA-Atr: FJBaf, ZIEVU, YXPAb, LCQGM, ROW.
*SVM-Atr: JTFQB, AEXCS, IfMLK.
*NB-Atr: JbXMK, aHEPC, YOBVf.
MD-SFS backward elimination approach on EDD-dataset using brute-5 criterion
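| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 27.5% | HPZXV | 31.7% | HPZXV | 38.4% |
| 1-10 | FXPHC | 32.6% | BPCZX | 36.5% | BXPVF | 44.1% |
| 1-15 | BTXVZ | 33.5% | BTPVC | 37.5% | BXPAf | 45.7% |
| All | TJXbV | 36.3% | JTFOH | 38.2% | IXMEb | 46.6% |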
MD-SFS backward elimination approach on EDD-dataset using MA-based criterion
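| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 27.5% | HPZXV | 31.7% | HPZXV | 38.4% |
| 1-10 | FXPHC,BVSKZ | 38.8% | BPCZX,FHKV | 39.4% | BXPVF,SKHC | 47.7% |
| 1-15 | BTXVZ,fFAEP,HCRSK | 45.5% | BTPVC,fKXFA,HS | 43.3% | BXPAf,KFSZH,CT | 51.3% |
| All | *LDA-Atr | 51.8% | *SVM-Atr | 47.4% | *NB-Atr | 53.9% |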
*LDA-Atr: TJXbV,GZfUP,BcLHa,YSIQW,DAFEK.
*SVM-Atr: JTFOH,fMCDb,LEaXG,KQBWA,UIS.
*NB-Atr: IXMEb,KGHfC,PABSJ,FWaQc.
MD-SFS forward selection approach on DD-dataset using brute-5 criterion
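| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 23.1% | HPZXV | 29.5% | HPZXV | 32.8% |
| 1-10 | BPCFK | 31.9% | BVPKF | 32.9% | BKPVC | 39.3% |
| 1-15 | BPCTV | 32.8% | BVPKf | 33.1% | BefKC | 40.3% |
| All | BDEFa | 35.3% | JBPKG | 34.0% | BUDOG | 44.1% |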
MD-SFS forward selection approach on DD-dataset using MA-based criterion
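| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 23.1% | HPZXV | 29.5% | HPZXV | 32.8% |
| 1-10 | BPCFK,ZX | 34.7% | BVPKF,HXCZS | 37.9% | BKPVC,SFHZ | 43.8% |
| 1-15 | BPCTV,FKHE | 37.4% | BVPKf,XAHTF,SCEZ | 39.1% | BEFKC,VPFHT,ASR | 44.7% |
| All | BDEFa,ZPcCQ | 40.2% | *SVM-Atr | 42.8% | *NB-Atr | 50.5% |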
*SVM-Atr: JBPKG,FUfSC,XEDHa,NTIbZ.
*NB-Atr: BUDOG,baQZI,TMPKN,C.
MD-SFS forward selection approach on TG-dataset using brute-5 criterion
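| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 20.5% | HPZXV | 23.5% | HPZXV | 28.8% |
| 1-10 | FXBPC | 25.4% | BVFXP | 29.9% | BPXVF | 34.2% |
| 1-15 | FXTBV | 26.6% | BTEPX | 31.8% | BPEXf | 36.6% |
| All | FJTaB | 30.1% | JTFWD | 31.6% | JTMWO | 39.2% |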
MD-SFS forward selection approach on TG-dataset using MA-based criterion
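| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 20.5% | HPZXV | 23.5% | HPZXV | 28.8% |
| 1-10 | FXBPC,ZVH | 29.8% | BVFXP,CH | 30.7% | BPXVF,SKCHZ | 37.6% |
| 1-15 | FXTBV,ACPZE,fH | 33.4% | BTEPX,AfFVS | 33.4% | BPEXf,RAKFS,HCV | 41.5% |
| All | *LDA-Atr | 38.0% | *SVM-Atr | 35.9% | *NB-Atr | 45.3% |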
*LDA-Atr: FJTaB,IUCPG,cLEMO,bHAW.
*SVM-Atr: JTFWD,XBQSI,afcRO,bMAPN,Z.
*NB-Atr: JTMWO,CXBAK,RPUHG,aEQFf,IYb.
MD-SFS forward selection approach on EDD-dataset using brute-5 criterion
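| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 27.5% | HPZXV | 31.7% | HPZXV | 38.4% |
| 1-10 | BXCVF | 32.5% | BPXFC | 36.2% | BXPZF | 44.0% |
| 1-15 | BTFPE | 36.0% | BTPZA | 38.0% | BXPAf | 45.7% |
| All | ITXJc | 36.2% | ITMJB | 39.1% | JTMWF | 46.8% |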
MD-SFS forward selection approach on EDD-dataset using MA-based criterion
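| | LDA attributes | LDA accuracy | SVM attributes | SVM accuracy | NB attributes | NB accuracy |
|---|---|---|---|---|---|---|
| | HPZXV | 27.5% | HPZXV | 31.7% | HPZXV | 38.4% |
| 1-10 | BXCVF,PHSKZ | 29.8% | BPXFC,ZHKV | 39.6% | BXPZF,HKSC | 47.6% |
| 1-15 | BTFPE,CXZAH,fVKRS | 33.4% | BTPZA,CXfFK,ERSHV | 42.8% | BXPAf,KEFST,H | 51.3% |
| All | *LDA-Atr | 38.0% | *SVM-Atr | 46.9% | *NB-Atr | 54.6% |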
*LDA-Atr: ITXJc,EGaLB,KbVYC,SDFPO,ZHUfW,AQ.
*SVM-Atr: ITMJB,LOHAa,EXFKC,YDZbR,NfVUG,Q.
*NB-Atr: JTMWF,XKaHP,AOCfD.
Statistical analysis using DD-dataset
| Approach | LDA | SVM | NB |
|---|---|---|---|
| Random selection | 9.6% | 17.3% | 14.7% |
| MD-SFS forward selection approach | 35.3% | 34.0% | 44.1% |
| MD-SFS backward elimination approach | 39.7% | 35.2% | 44.0% |
Statistical analysis using TG-dataset
| Approach | LDA | SVM | NB |
|---|---|---|---|
| Random selection | 21.2% | 25.9% | 30.7% |
| MD-SFS forward selection approach | 30.1% | 31.6% | 39.2% |
| MD-SFS backward elimination approach | 28.3% | 31.0% | 39.5% |
Statistical analysis using EDD-dataset
| Approach | LDA | SVM | NB |
|---|---|---|---|
| Random selection | 29.9% | 32.8% | 39.8% |
| MD-SFS forward selection approach | 36.2% | 39.1% | 46.8% |
| MD-SFS backward elimination approach | 36.3% | 38.2% | 46.6% |
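To make the comparison in the three statistical-analysis tables easier to read, the gain of each MD-SFS approach over random selection can be computed in percentage points. The sketch below copies the values from those tables; the column labels LDA, SVM and NB are an assumption inferred from the order of the `*LDA-Atr`, `*SVM-Atr` and `*NB-Atr` notes, and the helper name `gain_over_random` is hypothetical.

```python
from typing import Dict, List

CLASSIFIERS = ["LDA", "SVM", "NB"]  # assumed column order

# Recognition rates (%) copied from the statistical-analysis tables.
RESULTS: Dict[str, Dict[str, List[float]]] = {
    "DD": {
        "random": [9.6, 17.3, 14.7],
        "forward": [35.3, 34.0, 44.1],
        "backward": [39.7, 35.2, 44.0],
    },
    "TG": {
        "random": [21.2, 25.9, 30.7],
        "forward": [30.1, 31.6, 39.2],
        "backward": [28.3, 31.0, 39.5],
    },
    "EDD": {
        "random": [29.9, 32.8, 39.8],
        "forward": [36.2, 39.1, 46.8],
        "backward": [36.3, 38.2, 46.6],
    },
}


def gain_over_random(dataset: str, approach: str) -> Dict[str, float]:
    """Percentage-point gain of an approach over random selection."""
    table = RESULTS[dataset]
    return {
        name: round(value - baseline, 1)
        for name, value, baseline in zip(
            CLASSIFIERS, table[approach], table["random"]
        )
    }


if __name__ == "__main__":
    for dataset in RESULTS:
        for approach in ("forward", "backward"):
            print(dataset, approach, gain_over_random(dataset, approach))
```

Every gain computed this way is positive, which matches the tables: both MD-SFS approaches score above random selection on all three datasets.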