Amrita Roy Choudhury, Marjana Novič.
Abstract
Predicting the transmembrane regions is an important aspect of understanding the structure and architecture of different β-barrel membrane proteins. Despite significant efforts, currently available β-transmembrane region predictors are still limited in prediction accuracy, especially in precision. Here, we describe PredβTM, a transmembrane region prediction algorithm for β-barrel proteins. Using amino acid pair frequency information from known β-transmembrane protein sequences, we have trained a support vector machine classifier to predict β-transmembrane segments. Position-specific amino acid preference data are incorporated in the final prediction. The predictor does not incorporate evolutionary profile information explicitly, but is based on sequence patterns generated implicitly by encoding the protein segments using an amino acid adjacency matrix. On a benchmark set of 35 β-transmembrane proteins, PredβTM shows a sensitivity of 83.71% and a precision of 72.98%. The segment overlap score is 82.19%. In comparison with other state-of-the-art methods, PredβTM provides higher precision and segment overlap without compromising sensitivity. Further, we applied PredβTM to analyze the β-barrel membrane proteins without defined transmembrane regions and the uncharacterized protein sequences in eight bacterial genomes, and to predict possible β-transmembrane proteins. PredβTM can be freely accessed on the web at http://transpred.ki.si/.
Year: 2015 PMID: 26694538 PMCID: PMC4687927 DOI: 10.1371/journal.pone.0145564
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Amino acid adjacency matrix.
The 20×20 amino acid adjacency matrix of the given βTM protein segment is shown. The matrix elements representing the frequency of the corresponding amino acid pairs in the segment are highlighted.
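As a rough illustration of the encoding in Fig 1, the sketch below counts residue pairs in a segment and stores them in a 20×20 matrix. It assumes that "pairs" means ordered, immediately adjacent residues and that raw counts are used; the residue ordering, the function name `adjacency_matrix`, and the example segment are hypothetical choices for illustration, not details taken from the paper.

```python
# Assumption: standard 20 amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def adjacency_matrix(segment):
    """Count ordered adjacent residue pairs of a segment in a 20x20 matrix.

    Row = first residue of the pair, column = second residue.
    Pairs containing a non-standard residue are skipped.
    """
    matrix = [[0] * 20 for _ in range(20)]
    for left, right in zip(segment, segment[1:]):
        if left in INDEX and right in INDEX:
            matrix[INDEX[left]][INDEX[right]] += 1
    return matrix


# Hypothetical 10-residue segment, used only to show the shape of the output.
example = adjacency_matrix("AVGAVGLYTW")
print(example[INDEX["A"]][INDEX["V"]])   # count of the pair A followed by V
print(sum(sum(row) for row in example))  # total number of adjacent pairs
```

A segment of length n yields n-1 adjacent pairs, so most of the 400 matrix elements are zero for short segments.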
Fig 2. Prediction of transmembrane regions by the PredβTM algorithm.
(a) The vertical colored bars represent the first residues of the 162 segments of the protein 1QJP. Segments in green are predicted as tm, and those in red as ntm, by the βTM SVM classifier. Two stretches of consecutive tm segments are highlighted (S31:S53 and S117:S124). (b) The tm stretch R covering residues 117–133 is enlarged. The constituent segments (S117:S124) are shown at their full lengths to illustrate the overlap. The reported transmembrane region (TM = 122–129) is highlighted.
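The grouping of consecutive tm segments into a stretch can be sketched as follows. The window length of 10 residues is an inference from the caption (segments S117:S124 together cover residues 117–133), not a value stated in this excerpt, and the function name `tm_stretches` and the label vector are hypothetical. The sketch stops at the stretch level; it does not reproduce the later refinement to the final transmembrane region.

```python
def tm_stretches(labels, window=10):
    """Group consecutive tm-labelled segments into stretches.

    labels[i] is True if the segment starting at residue i+1 is predicted tm.
    Returns 1-based tuples:
    (first_segment, last_segment, first_residue, last_residue).
    """
    stretches = []
    start = None
    for seg, is_tm in enumerate(labels, start=1):
        if is_tm and start is None:
            start = seg                      # a new stretch begins
        elif not is_tm and start is not None:
            last = seg - 1                   # stretch ended at previous segment
            stretches.append((start, last, start, last + window - 1))
            start = None
    if start is not None:                    # stretch runs to the final segment
        last = len(labels)
        stretches.append((start, last, start, last + window - 1))
    return stretches


# Hypothetical labels for 162 segments: only the two highlighted stretches
# from the caption are marked tm here.
labels = [False] * 162
for seg in list(range(31, 54)) + list(range(117, 125)):
    labels[seg - 1] = True

print(tm_stretches(labels))
```

With these labels the second stretch spans segments 117 to 124 and residues 117 to 133, matching the stretch R described in panel (b).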
Comparative performance analysis of the developed PredβTM algorithm and five other available algorithms.
| Algorithm | Known TM β-strands | Predicted TM β-strands | True positives | %Sensitivity | %Precision | %Segment Overlap |
|---|---|---|---|---|---|---|
| B2TMpred | 442 | 872 | 367 | 83.03 | 42.09 | 73.16 |
| TBBpred | 442 | 807 | 328 | 74.21 | 40.64 | 79.57 |
| ConBBPred | 442 | 288 | 249 | 56.33 | 86.46 | 70.75 |
| TMBETA-NET | 442 | 690 | 317 | 71.72 | 45.94 | 65.05 |
| TMBpro | 442 | 471 | 330 | 74.66 | 70.06 | 73.31 |
Sensitivity: TP/(TP+FN), the percentage of all observed transmembrane β-strands that are predicted correctly by the model.
Precision (positive predictive value): TP/(TP+FP), the percentage of all predicted transmembrane β-strands that are correctly predicted.
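The two definitions above can be checked against the counts in the table. The sketch below assumes that the "Known TM β-strands" column equals TP+FN and the "Predicted TM β-strands" column equals TP+FP; the function name is hypothetical.

```python
def sensitivity_precision(true_positives, known, predicted):
    """Return (%sensitivity, %precision), rounded to two decimals.

    known     = TP + FN (all observed transmembrane beta-strands)
    predicted = TP + FP (all predicted transmembrane beta-strands)
    """
    sensitivity = 100.0 * true_positives / known
    precision = 100.0 * true_positives / predicted
    return round(sensitivity, 2), round(precision, 2)


# Counts from the TMBpro row of the table: 442 known, 471 predicted, 330 TP.
print(sensitivity_precision(330, 442, 471))
```

The same call with the ConBBPred counts (249, 442, 288) shows the trade-off visible in the table: few predictions give high precision but low sensitivity.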
Transmembrane β-strand prediction for the proteins annotated as outer-membrane in eight bacterial genomes.
| Strains | No. of proteins | βTM proteins: total | βTM proteins: predicted | βTM proteins with available TM data: total | TM regions | Predicted TM regions | True positives | %Sensitivity | %Precision | βTM proteins with no TM data |
|---|---|---|---|---|---|---|---|---|---|---|
| O157:H7 ( | 5285 | 49 | 49 | 17 | 202 | 203 | 180 | 89.11 | 88.67 | 32 |
| K12 ( | 4140 | 62 | 61 | 23 | 306 | 302 | 280 | 91.50 | 92.72 | 39 |
| PA01 ( | 5571 | 29 | 29 | 6 | 98 | 87 | 80 | 81.63 | 91.95 | 23 |
| CO92 ( | 3797 | 29 | 29 | 10 | 104 | 126 | 104 | 100 | 82.54 | 19 |
| RdKW20 ( | 1609 | 7 | 7 | 4 | 15 | 36 | 15 | 100 | 41.67 | 3 |
| CT18 ( | 3141 | 40 | 40 | 8 | 112 | 103 | 97 | 86.61 | 94.17 | 32 |
| Sd197 ( | 4062 | 30 | 30 | 11 | 154 | 162 | 148 | 96.10 | 91.36 | 19 |
| 26695 ( | 1593 | 13 | 13 | 0 | NA | NA | NA | NA | NA | 13 |
The outer-membrane proteins with known transmembrane β-strands are analyzed in detail to obtain the prediction sensitivity and precision.
Sensitivity: TP/(TP+FN), the percentage of all observed transmembrane β-strands that are predicted correctly by the model.
Precision: TP/(TP+FP), the percentage of all predicted transmembrane β-strands that are correctly predicted.
Predicted and known βTM proteins in eight bacterial genomes.
| Strains | No. of proteins | Uncharacterized proteins: total | Uncharacterized proteins: predicted as βTM | Known βTM proteins | %Genome |
|---|---|---|---|---|---|
| O157:H7 ( | 5285 | 1854 | 182 | 49 | 4.37 |
| K12 ( | 4140 | 21 | 0 | 62 | 1.51 |
| PA01 ( | 5571 | 2308 | 405 | 29 | 7.79 |
| CO92 ( | 3797 | 1064 | 142 | 29 | 4.50 |
| RdKW20 ( | 1609 | 389 | 48 | 7 | 3.41 |
| CT18 ( | 3141 | 889 | 100 | 40 | 4.54 |
| Sd197 ( | 4062 | 832 | 81 | 30 | 2.73 |
| 26695 ( | 1593 | 615 | 88 | 13 | 6.60 |
| TOTAL | 29198 | 7972 | 1046 | 259 | 4.47 |
Uncharacterized proteins in the genomes that are predicted as βTM proteins by PredβTM are listed. The percentage of each genome identified as βTM proteins (known and predicted by PredβTM) is also given.
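One reading of the %Genome column is the sum of known and predicted βTM proteins divided by the total number of proteins in the genome. The sketch below follows that reading, which is an assumption based on the caption rather than a formula given in the text; the function name is hypothetical.

```python
def percent_genome(known_btm, predicted_btm, total_proteins):
    """Percentage of a genome's proteins that are known or predicted βTM."""
    return round(100.0 * (known_btm + predicted_btm) / total_proteins, 2)


# O157:H7 row: 49 known, 182 predicted, 5285 proteins.
print(percent_genome(49, 182, 5285))
# TOTAL row: 259 known, 1046 predicted, 29198 proteins.
print(percent_genome(259, 1046, 29198))
```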