Martin Klammer, David N Messina, Thomas Schmitt, Erik L L Sonnhammer.
Abstract
BACKGROUND: Transmembrane (TM) proteins are proteins that span a biological membrane one or more times. As their 3-D structures are hard to determine, experiments focus on identifying their topology (i.e. which parts of the amino acid sequence are buried in the membrane and which are located on either side of the membrane), but only a few topologies are known. Consequently, various computational TM topology predictors have been developed, but their accuracies are far from perfect. The prediction quality can be improved by applying a consensus approach, which combines the results of several predictors to yield a more reliable result.
Year: 2009 PMID: 19785723 PMCID: PMC2761906 DOI: 10.1186/1471-2105-10-314
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
TM topology predictors
| Predictor | Version | TM | SP | Homology |
| TopPred | 1.0 | Yes | No | No |
| PHDhtm | 2.1 | Yes | No | Yes |
| HMMTOP | 2.1 | Yes | No | Yes |
| TMHMM | 2.0 | Yes | No | No |
| PolyPhobius | 1.0 | Yes | Yes | Yes |
| Memsat | 3.0 | Yes | Yes | Yes |
| SignalP | 3.0 | No | Yes | No |
Features of the six TM topology predictors incorporated in MetaTM, plus the SP-only predictor SignalP.
Figure 1. Segments consensus workflow. For clarity, only three predictors are drawn. The orange elements represent predicted TM segments. (A) Scanning the results for the first segment. (B) Detecting overlapping segments and voting on whether the group of overlapping segments should be added to the consensus result. (C) If the vote is positive (i.e. the SVM model predicts a TM segment, which is assumed to be the case here), a segment with averaged start and end positions is added to the consensus result (blue segment). (D) Masking the used segments and scanning for the next one.
Figure 2. The masking procedure. For clarity, only four predictors are drawn. The orange elements represent predicted TM segments. This example assumes that three overlapping TM segments must intersect with the voting window for a positive consensus vote (i.e. the SVM model predicts a TM segment). (A) Only two segments are in the voting window, so the vote is negative. (B) Because of the negative vote, the first segment in the window is masked and no longer used for prediction. The newly applied window contains three overlapping segments. (C) The consensus is reached in favor of the TM segment, and therefore all overlapping segments are masked. Note that if both segments overlapping with the first window (the one displayed in (A)) had been excluded, there would be no resulting consensus TM segment.
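The scan, vote, average, and mask steps described in the two figure captions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MetaTM implementation: the function names, the `(start, end)` tuple representation, and the rounding of averaged positions are hypothetical, and a plain vote count (`min_votes`) stands in for the SVM model that the captions say makes the actual decision.

```python
def overlaps(a, b):
    """True if two (start, end) segments share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def consensus_segments(predictions, min_votes=3):
    """Build consensus TM segments from several predictors.

    predictions: dict mapping a predictor name to a list of (start, end) tuples.
    min_votes:   hypothetical stand-in for the SVM vote; a group is accepted
                 when at least this many segments intersect the voting window.
    """
    # Pool all predicted segments and scan them from the N-terminus onwards.
    pending = sorted(seg for segs in predictions.values() for seg in segs)
    consensus = []
    while pending:
        window = pending[0]  # the first unmasked segment defines the window
        group = [seg for seg in pending if overlaps(seg, window)]
        if len(group) >= min_votes:
            # Positive vote: add a segment with averaged start and end ...
            start = round(sum(s for s, _ in group) / len(group))
            end = round(sum(e for _, e in group) / len(group))
            consensus.append((start, end))
            # ... and mask every segment that took part in the vote.
            pending = [seg for seg in pending if seg not in group]
        else:
            # Negative vote: mask only the first segment and rescan.
            pending = pending[1:]
    return consensus


# Toy input loosely following Figure 2 (made-up coordinates, four predictors).
example = {
    "P1": [(10, 30)],
    "P2": [(25, 45)],
    "P3": [(35, 55)],
    "P4": [(38, 58)],
}
print(consensus_segments(example))  # [(33, 53)]
```

In the toy input the first window holds only two segments, so the vote fails and only `(10, 30)` is masked. The second window, anchored at `(25, 45)`, holds three segments and yields the consensus segment. Masking both segments of the first window instead would leave only two segments, and no consensus would be reached, which is the situation the caption warns about.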
Data set categories
| Category | Proteins |
| TMsingleAndSP | 282 |
| TMmultiAndSP | 63 |
| TMsingleNoSP | 237 |
| TMmultiNoSP | 878 |
| GLBandSP | 1275 |
| GLBnoSP | 1087 |
The six categories of the data set.
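The category names appear to encode two properties: whether a protein is globular (GLB), single-spanning, or multi-spanning, and whether it carries a signal peptide (SP). Assuming that reading of the names, a hypothetical helper that assigns a protein to one of the six categories might look like this:

```python
def data_set_category(n_tm_segments, has_signal_peptide):
    """Return the data set category name for one protein.

    Assumes the category names encode the number of TM segments
    (0 = globular, 1 = single, more = multi) and the presence of an SP.
    """
    if n_tm_segments < 0:
        raise ValueError("number of TM segments cannot be negative")
    if n_tm_segments == 0:
        return "GLBandSP" if has_signal_peptide else "GLBnoSP"
    prefix = "TMsingle" if n_tm_segments == 1 else "TMmulti"
    return prefix + ("AndSP" if has_signal_peptide else "NoSP")


print(data_set_category(7, False))  # TMmultiNoSP
```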
N-terminal location prediction results
| TMsingleAndSP | 97.9% | 85.5% | 67.4% | 80.5% | 81.9% | 19.2% | |
| TMmultiAndSP | 73.0% | 65.1% | 66.7% | 81.0% | 30.2% | ||
| TMsingleNoSP | 84.8% | 69.2% | 72.2% | 79.8% | 78.9% | 74.3% | |
| TMmultiNoSP | 86.7% | 79.5% | 78.1% | 86.2% | 72.6% | 71.8% | |
| Average | 86.7% | 77.2% | 74.6% | 74.7% | 84.0% | 48.8% | |
| Average Rank | 3.3 | 4.5 | 4.5 | 5.0 | 2.3 | 6.5 |
MT: MetaTM, PP: PolyPhobius, TH: TMHMM, HT: HMMTOP, PH: PHDhtm, MS: Memsat, TP: TopPred. Average Rank is the average of the ranks achieved by each predictor in each category.
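The Average Rank row can be reproduced with a short calculation: rank the predictors within each category, then average each predictor's ranks across categories. The sketch below uses made-up accuracies and single-letter predictor names rather than values from the table. How ties are ranked is not stated in the caption, so giving tied predictors their mean rank is an assumption here.

```python
from statistics import mean


def rank_row(scores):
    """Rank predictors in one category (1 = best); ties share the mean rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over all predictors tied with the one at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2  # mean of the rank positions i+1 .. j+1
        for name in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


def average_rank(table):
    """table: dict mapping a category to a dict of predictor -> accuracy."""
    per_category = [rank_row(row) for row in table.values()]
    return {p: mean(r[p] for r in per_category) for p in per_category[0]}


# Toy accuracies, not taken from the paper.
toy = {
    "cat1": {"A": 0.9, "B": 0.8, "C": 0.7},
    "cat2": {"A": 0.6, "B": 0.8, "C": 0.8},
}
print(average_rank(toy))  # {'A': 2.0, 'B': 1.75, 'C': 2.25}
```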
Number of TM segments prediction results
| TMsingleAndSP | 94.0% | 79.8% | 50.7% | 82.6% | 76.6% | 9.9% | |
| TMmultiAndSP | 88.9% | 63.5% | 63.5% | 61.9% | 71.4% | 15.9% | |
| TMsingleNoSP | 87.3% | 87.8% | 85.7% | 89.5% | 72.2% | ||
| TMmultiNoSP | 72.0% | 70.6% | 62.0% | 54.1% | 72.0% | 45.8% | |
| GLBandSP | 93.4% | 74.4% | 37.5% | 64.6% | 69.7% | 2.4% | |
| GLBnoSP | 97.0% | 96.0% | 86.6% | 71.6% | 90.4% | 49.5% | |
| Average | 88.4% | 77.7% | 67.1% | 70.1% | 78.3% | 32.6% | |
| Average Rank | 3.0 | 3.5 | 3.8 | 5.3 | 3.5 | 7.0 |
The abbreviations are the same as described in Table 3.
Entire topology prediction results
| TMsingleAndSP | 94.0% | 75.5% | 47.5% | 79.1% | 72.7% | 7.1% | |
| TMmultiAndSP | 81.0% | 52.4% | 57.1% | 47.6% | 65.1% | 3.2% | |
| TMsingleNoSP | 65.8% | 68.8% | 73.4% | 69.6% | 79.3% | 53.6% | |
| TMmultiNoSP | 66.1% | 63.7% | 51.4% | 65.7% | 45.7% | 37.2% | |
| GLBandSP | 93.4% | 74.4% | 37.5% | 64.6% | 69.7% | 2.4% | |
| GLBnoSP | 97.0% | 96.0% | 86.6% | 71.6% | 90.4% | 49.5% | |
| Average | 82.3% | 70.2% | 61.3% | 63.0% | 74.1% | 25.5% | |
| Average Rank | 3.2 | 3.8 | 4.5 | 5.0 | 3.2 | 7.0 |
The abbreviations are the same as described in Table 3. The prediction for globular proteins was counted as correct if no TM segment had been predicted, regardless of the N-terminal location prediction.
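The scoring rule for globular proteins can be made explicit with a small check. The sketch below is a simplification: the `Topology` type and the function name are hypothetical, and for TM proteins it assumes that a correct topology means both the N-terminal location and the number of TM segments match, since the exact criterion for TM proteins (for example, any required overlap between segments) is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class Topology:
    n_term: str                                    # "in" or "out"
    segments: list = field(default_factory=list)   # list of (start, end) tuples


def topology_correct(predicted, reference):
    """Score one prediction against the reference topology."""
    if not reference.segments:
        # Globular protein: correct if no TM segment was predicted,
        # regardless of the predicted N-terminal location.
        return not predicted.segments
    # TM protein (assumed criterion): N-terminus and segment count must match.
    return (predicted.n_term == reference.n_term
            and len(predicted.segments) == len(reference.segments))


globular = Topology("out")
print(topology_correct(Topology("in"), globular))  # True
```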
Signal peptide prediction results
| missed | 3.4% | 5.1% | |
| over-predicted | 6.7% | 16.3% | |
| average error | 5.9% | 9.2% |
The error rate of the signal peptide prediction.
ConPred II data set prediction results
| N-terminus | 83.1% | 78.4% | 73.2% | 74.5% | 66.7% | 77.5% | 68.8% | |
| # Segments | 76.6% | 74.0% | 59.3% | 66.7% | 60.2% | 71.4% | 50.2% | |
| Topology | 62.8% | 63.2% | 47.6% | 51.9% | 47.2% | 58.4% | 35.9% |
The abbreviations are the same as described in Table 3, except CP: ConPred II.