Abstract
Prediction of transmembrane helices (TMHs) in alpha-helical membrane proteins provides valuable information about protein topology when high-resolution structures are not available. Many predictors have been developed based on either amino acid hydrophobicity scales or purely statistical approaches. While these predictors perform reasonably well in identifying the number of TMHs in a protein, they are generally inaccurate in predicting the ends of TMHs or TMHs of unusual length. To improve the accuracy of TMH detection, we developed a machine-learning-based predictor, MemBrain, which integrates a number of modern bioinformatics approaches, including sequence representation by a multiple sequence alignment matrix, the optimized evidence-theoretic K-nearest neighbor prediction algorithm, fusion of multiple prediction window sizes, and classification by dynamic threshold. MemBrain demonstrates an overall improvement of about 20% in prediction accuracy, particularly in predicting the ends of TMHs and TMHs shorter than 15 residues. It can also detect N-terminal signal peptides. The MemBrain predictor is a useful sequence-based analysis tool for functional and structural characterization of helical membrane proteins; it is freely available at http://chou.med.harvard.edu/bioinf/MemBrain/.
Year: 2008 PMID: 18545655 PMCID: PMC2396505 DOI: 10.1371/journal.pone.0002399
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. A flowchart diagram of the MemBrain protocol.
Performance comparison of various TMH predictors (a).
| Predictor | VTMH | VP | N-score | C-score | RMSD |
|---|---|---|---|---|---|
| THUMBUP | 85.5% | 47.1% | 6.9±4.9 | 6.7±4.9 | 0.58±0.19 |
| SOSUI | 89.1% | 57.1% | 5.0±4.1 | 5.0±4.2 | 0.44±0.21 |
| DAS-TMfilter | 90.7% | 64.3% | 6.5±5.0 | 5.5±5.3 | 0.58±0.16 |
| TOP-PRED | 92.6% | 60.0% | 4.5±3.8 | 4.6±3.9 | 0.45±0.15 |
| TMHMM | 91.0% | 65.7% | 4.5±3.8 | 4.5±3.9 | 0.44±0.15 |
| Phobius | 91.8% | 71.4% | 4.6±4.0 | 4.4±4.1 | 0.44±0.19 |

(a) The testing dataset consists of 378 TMH segments from 70 proteins (see Supplementary Table S2).
Predictor web servers:
THUMBUP: http://sparks.informatics.iupui.edu/Softwares-Services_files/thumbup.htm [16].
SOSUI: http://bp.nuap.nagoya-u.ac.jp/sosui/ [11].
DAS-TMfilter: http://mendel.imp.ac.at/sat/DAS/DAS.html [20].
TOP-PRED: http://bioweb.pasteur.fr/seqanal/interfaces/toppred.html [1].
TMHMM: http://www.cbs.dtu.dk/services/TMHMM/ [6].
Phobius: http://phobius.cgb.ki.se/ [7].
MemBrain: http://chou.med.harvard.edu/bioinf/MemBrain/.
Figure 2. TMH length distribution in (a) 70 known membrane protein structures in the testing dataset, (b) TMHs predicted by TMHMM [6], (c) TMHs predicted by Phobius [7], and (d) TMHs predicted by MemBrain.
Figure 3. The residue-specific TMH propensity of lactose permease of Escherichia coli (PDB code: 1PV7) [26], illustrating the method of assignment of TMHs by dynamic threshold segmentation.
The observed TMHs, assigned in ref. [26], are shown as gray boxes.
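
To make the idea of threshold-based segmentation of a per-residue propensity profile more concrete, the following is a minimal Python sketch. It is not MemBrain's actual procedure, which is not specified in this record. The function names (`runs_above`, `segment_tmh`), the base threshold of 0.4, the step of 0.05, and the length limits are all hypothetical choices made for illustration. The assumed rule is: residues whose propensity is at or above a base threshold are grouped into candidate helices, and any candidate that is implausibly long is re-segmented at a progressively higher threshold until it splits.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (start, end), 0-based, end exclusive


def runs_above(scores: Sequence[float], threshold: float, offset: int = 0) -> List[Segment]:
    """Return maximal runs of consecutive positions with score >= threshold."""
    runs: List[Segment] = []
    start = None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            runs.append((offset + start, offset + i))
            start = None
    if start is not None:
        runs.append((offset + start, offset + len(scores)))
    return runs


def segment_tmh(
    scores: Sequence[float],
    base_threshold: float = 0.4,  # hypothetical starting cutoff
    step: float = 0.05,           # hypothetical threshold increment
    min_len: int = 5,             # hypothetical: drop runs shorter than this
    max_len: int = 35,            # hypothetical: re-segment runs longer than this
) -> List[Segment]:
    """Segment a propensity profile, raising the threshold inside over-long runs."""
    segments: List[Segment] = []
    stack = [(0, len(scores), base_threshold)]
    while stack:
        lo, hi, thr = stack.pop()
        for start, end in runs_above(scores[lo:hi], thr, offset=lo):
            length = end - start
            if length > max_len and thr + step <= 1.0:
                # Too long for one helix: retry this region with a stricter cutoff.
                stack.append((start, end, thr + step))
            elif length >= min_len:
                segments.append((start, end))
    return sorted(segments)


# Toy profile: two high-propensity stretches joined by a shallow dip.
profile = [0.1] * 3 + [0.9] * 20 + [0.5] * 3 + [0.9] * 20 + [0.1] * 4
print(segment_tmh(profile))
```

In this sketch a fixed cutoff of 0.4 would merge the two stretches into a single 43-residue segment; raising the cutoff only inside that over-long region separates them while leaving shorter segments elsewhere untouched. One simplification to note is that a stricter cutoff can also trim the ends of the resulting segments.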