David Jimenez-Morales, Jie Liang.
Abstract
β-barrel membrane proteins play an important role in controlling the exchange and transport of ions and organic molecules across bacterial and mitochondrial outer membranes. They are also major regulators of apoptosis and are important determinants of bacterial virulence. In contrast to α-helical membrane proteins, their evolutionary pattern of residue substitutions has not been quantified, and there are no scoring matrices appropriate for their detection through sequence alignment. Using a Bayesian Monte Carlo estimator, we have calculated the instantaneous substitution rates of transmembrane domains of bacterial β-barrel membrane proteins. The scoring matrices constructed from the estimated rates, called bbTM for β-barrel Transmembrane Matrices, significantly improve the sensitivity in detecting homologs of β-barrel membrane proteins, while avoiding erroneous selection of both soluble proteins and other membrane proteins of similar composition. The estimated evolutionary patterns are general and can detect β-barrel membrane proteins very remote from those used for substitution rate estimation. Furthermore, despite the separation of 2–3 billion years since the proto-mitochondrion entered the proto-eukaryotic cell, mitochondrial outer membrane proteins in eukaryotes can also be detected accurately using these scoring matrices derived from bacteria. This is consistent with the suggestion that there are no eukaryote-specific signals for translocation. With these matrices, remote homologs of β-barrel membrane proteins with known structures can be reliably detected at genome scale, allowing construction of high-quality structural models of their transmembrane domains, at the rate of 131 structures per template protein. The scoring matrices will be useful for identification, classification, and functional inference of membrane proteins from genome and metagenome sequencing projects. The estimated substitution pattern will also help to identify key elements important for the structural and functional integrity of β-barrel membrane proteins, and will aid in the design of mutagenesis studies.
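The step from estimated substitution rates to a scoring matrix can be illustrated with the standard log-odds construction. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a reversible rate matrix Q, computes P(t) = exp(Qt), and scores each residue pair as a scaled log2 of P_ij(t)/π_j. The three-letter alphabet, the frequencies, the exchange values, the time `t`, and the `scale` factor are all invented for illustration, and the Bayesian Monte Carlo estimation of the rates is not shown.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, squarings=10, terms=20):
    """Approximate exp(m) by scaling and squaring a truncated Taylor series."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes scaled^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def rate_matrix(exch, freqs):
    """Reversible rate matrix: q_ij = exch_ij * freq_j, rows summing to zero."""
    n = len(freqs)
    q = [[exch[i][j] * freqs[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q

def log_odds_scores(q, freqs, t, scale=2.0):
    """Scores scale * log2(P_ij(t) / freq_j), with P(t) = exp(Q t)."""
    p = mat_exp([[x * t for x in row] for row in q])
    n = len(freqs)
    return [[scale * math.log2(p[i][j] / freqs[j]) for j in range(n)]
            for i in range(n)]

# Toy three-letter alphabet; every number below is invented for illustration.
freqs = [0.5, 0.3, 0.2]
exch = [[0.0, 1.0, 0.2],
        [1.0, 0.0, 0.5],
        [0.2, 0.5, 0.0]]
q = rate_matrix(exch, freqs)
scores = log_odds_scores(q, freqs, t=0.5)
for row in scores:
    print([round(s) for s in row])
```

Because the toy rate matrix is reversible, the resulting scores are symmetric. Conserved positions on the diagonal score positively, and pairs that rarely exchange score negatively.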
Year: 2011 PMID: 22069449 PMCID: PMC3206045 DOI: 10.1371/journal.pone.0026400
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The 11 template proteins, their composition, and hydrophobicity index values.
| PDB | TM/Total/# Strands | TM in | TM out | GES (TM) | GES (TM in) | GES (TM out) |
| 1A0S | 172/413/18 | 84 | 87 | −0.54 | −1.66 | 0.52 |
| 1BXW | 84/172/8 | 42 | 42 | −0.05 | −1.76 | 1.66 |
| 1E54 | 139/332/16 | 70 | 69 | −0.33 | −1.8 | 1.17 |
| 1FEP | 206/724/22 | 102 | 104 | −0.67 | −2.25 | 0.87 |
| 1I78 | 102/297/10 | 50 | 51 | −0.11 | −1.99 | 1.71 |
| 1KMO | 217/774/22 | 108 | 109 | −0.94 | −2.6 | 0.7 |
| 1NQE | 220/549/22 | 111 | 109 | −0.87 | −2.47 | 0.77 |
| 1QD6 | 124/240/12 | 59 | 64 | −0.63 | −2.64 | 1.16 |
| 1QJ8 | 75/148/8 | 35 | 40 | 0.2 | −1.02 | 1.27 |
| 2MPR | 178/427/16 | 90 | 87 | −0.75 | −2.5 | 1.04 |
| 2OMF | 153/340/16 | 76 | 77 | −0.66 | −2.38 | 1.04 |
| Mean | 152/401/16 | 75 | 76 | −0.49 | −2.10 | 1.08 |
TM: number of residues in the TM region; Total: total number of residues in the protein; # Strands: number of β-strands in the TM region; TM in: number of residues in the in-facing TM region; TM out: number of residues in the lipid-facing (out-facing) TM region. Hydrophobicity is measured by the GES index [33]; negative values indicate polarity and positive values hydrophobicity.
Figure 1. Estimated amino acid substitution rates.
Estimated instantaneous rates of substitution for residues in the TM segments and at different TM interfaces from 11 template β-barrel membrane proteins. The size of each bubble is proportional to the value of the estimated substitution rate. Instantaneous substitution rates are shown (A) for all TM residues; (B) for out-facing residues; and (C) for in-facing residues.
Figure 2. Similarity in substitution pattern for residues in the TM region of β-barrel membrane proteins.
Clustering trees showing the grouping of residues in the transmembrane regions by similarity in substitution patterns. Residues are clustered by pairwise Euclidean distance between the 19-dimensional vectors of instantaneous rates of residue substitutions.
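The clustering described in this caption can be sketched as follows. This is a minimal illustration only: the caption specifies pairwise Euclidean distance but not the linkage rule, so average linkage is an assumption here, and the residue names (`res1` to `res4`) and their short rate vectors are hypothetical stand-ins for the 19-dimensional vectors.

```python
import math
from itertools import combinations

def euclidean(u, v):
    """Euclidean distance between two equal-length rate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average_linkage(vectors):
    """Agglomerative clustering with average linkage (an assumed choice).

    vectors maps a residue name to its vector of substitution rates.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    def dist(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(euclidean(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in vectors]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical 3-dimensional rate vectors; real vectors would have 19 entries.
toy_rates = {
    "res1": [1.0, 0.9, 0.1],
    "res2": [0.9, 1.0, 0.1],
    "res3": [0.1, 0.1, 1.0],
    "res4": [0.1, 0.3, 0.8],
}
merges = average_linkage(toy_rates)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

The order of merges defines the tree: residues with similar substitution patterns join first, and the final merge joins the two most dissimilar groups.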
Specificity of scoring matrices in detecting β-barrel membrane proteins.
| B | P | S | P |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 20 | 0 |
| 0 | 0 | 0 | 0 | 0 | 52 | 0 |
| 0 | 0 | 0 | 0 | 0 | 142 | 5 |
| 0 | 0 | 0 | 0 | 5 | 319 | 42 |
| 0 | 0 | 0 | 0 | 45 | 689 | 181 |
Cumulative number of random sequences incorrectly identified as homologs of β-barrel membrane proteins at different E-values, resulting from Blast searches against a database of 362 randomized membrane protein sequences, using as queries the concatenated transmembrane segments of 20 template β-barrel membrane proteins.
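The cumulative counts reported in these tables can be derived from a list of per-hit E-values as in the minimal sketch below. The E-values and thresholds are invented for illustration and do not come from the study; the function name `cumulative_hits` is likewise hypothetical.

```python
from bisect import bisect_right

def cumulative_hits(evalues, thresholds):
    """Count, for each threshold, the hits with E-value at or below it."""
    ordered = sorted(evalues)
    return {t: bisect_right(ordered, t) for t in thresholds}

# Hypothetical E-values of hits returned by one search.
hit_evalues = [1e-12, 3e-6, 0.02, 0.5, 4.0]
# Hypothetical thresholds, from strict to permissive.
thresholds = [1e-10, 1e-5, 1e-2, 1, 10]

counts = cumulative_hits(hit_evalues, thresholds)
for t in thresholds:
    print(t, counts[t])
```

Because the counts are cumulative, they can only stay the same or grow as the threshold is relaxed, which is the pattern seen down each column of the tables.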
Specificity of scoring matrices: Blast searches against a data set of membrane proteins with other architectures and a data set of globular proteins (oMBp/Globular).
| B | P | S | P |
| 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |
| 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 2/2 | 0/0 |
| 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 21/3 | 0/0 |
| 0/0 | 0/0 | 0/0 | 0/0 | 0/1 | 98/5 | 1/1 |
| 0/0 | 0/0 | 0/0 | 0/0 | 3/6 | 457/13 | 3/2 |
| 0/0 | 0/0 | 0/0 | 0/0 | 13/26 | 1780/42 | 28/31 |
Cumulative number of sequences of membrane proteins with other architectures and of globular proteins incorrectly identified as homologs of β-barrel membrane proteins at different E-values, resulting from Blast searches against the oMBp/Globular data set. The oMBp data set contains 10,951 sequences (1,061 from Archaea and 9,890 from Eukaryota). The Globular data set contains 127,485 globular protein sequences (16,814 from Archaea and 110,671 from Eukaryota). The queries were the concatenated transmembrane sequences of the 20 template proteins.
Performance of bbTM matrices in detecting homologs of β-barrel membrane protein sequences from the “true-positive” database.
| B | P | S | P |
| 49 | 62 | 56 | 5 | 48 | 46 | 8 |
| 116 | 106 | 121 | 32 | 121 | 119 | 41 |
| 122 | 121 | 129 | 42 | 133 | 130 | 79 |
| 128 | 127 | 143 | 83 | 141 | 143 | 102 |
| 138 | 131 | 147 | 95 | 148 | 145 | 107 |
| 146 | 139 | 168 | 109 | 176 | 170 | 119 |
| 153 | 144 | 206 | 120 | 200 | 202 | 136 |
| 191 | 166 | 245 | 126 | 272 | 260 | 202 |
Cumulative number of proteins identified as homologs of the 20 template β-barrel membrane proteins at different E-values, obtained from Blast searches against the “true-positive” database of 3,079 sequences of β-barrel membrane proteins.
Performance of bbTM matrices in detecting homologs from the non-redundant NCBI protein sequence database.
| B | P | S | P |
| 821 | 934 | 897 | 65 | 605 | 608 | 103 |
| 1556 | 1579 | 1977 | 294 | 1781 | 1832 | 416 |
| 2020 | 1879 | 2211 | 504 | 2120 | 2749 | 649 |
| 2201 | 2135 | 2327 | 650 | 2309 | 4040 | 812 |
| 2262 | 2212 | 2377 | 708 | 2385 | 5516 | 1142 |
| 2322 | 2288 | 2464 | 856 | 2477 | 7495 | 1475 |
| 2407 | 2437 | 2602 | 1198 | 2570 | 8538 | 1677 |
| 2573 | 2573 | 2757 | 1503 | 2799 | 9192 | 1966 |
Cumulative number of proteins identified as homologs of the 20 template β-barrel membrane proteins at different E-values, obtained from Blast searches against the non-redundant NCBI protein database of 13,135,398 sequences.
Performance of bbTM matrices in detecting homologs of the human mitochondrial proteins VDAC, TOM40 and SAM50.
| 266 | 277 | 269 |
| 335 | 324 | 348 |
| 355 | 354 | 360 |
| 364 | 361 | 371 |
| 369 | 364 | 373 |
| 378 | 370 | 381 |
| 381 | 376 | 384 |
| 383 | 379 | 388 |
Cumulative number of proteins identified as homologs of the human mitochondrial β-barrel membrane proteins VDAC-1 (UniProt: VDAC1_HUMAN), TOM40 (UniProt: TOM40_HUMAN) and SAM50 (UniProt: SAM50_HUMAN) at different E-values, obtained from Blast searches against the non-redundant NCBI database of 13,135,398 sequences. All of these hits were confirmed to be mitochondrial proteins by manual inspection of their annotation.
Figure 3. Number of β-barrel membrane proteins homologous to the 20 proteins with known structures.
There are altogether 2,619 proteins in the OMPdb database [44] of β-barrel membrane proteins whose TM regions can be mapped onto one of the 20 proteins by using the bbTM scoring matrix. Structures of the TM regions of these proteins can then be predicted by using template-based structure prediction methods.
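The figure of 131 structures per template protein quoted in the abstract appears to follow from these totals, assuming it is a simple average over the 20 templates:

```latex
\frac{2619 \ \text{mapped proteins}}{20 \ \text{template proteins}} \approx 131 \ \text{structures per template}
```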