Libusha Kelly, Ursula Pieper, Narayanan Eswar, Franklin A Hays, Min Li, Zygy Roe-Zurz, Deanna L Kroetz, Kathleen M Giacomini, Robert M Stroud, Andrej Sali.
Abstract
Membrane proteins serve as cellular gatekeepers, regulators, and sensors. Prior studies have explored the functional breadth and evolution of proteins and families of particular interest, such as the diversity of transport-associated membrane protein families in prokaryotes and eukaryotes, the composition of integral membrane proteins, and family classification of all human G-protein coupled receptors. However, a comprehensive analysis of the content and evolutionary associations between membrane proteins and families in a diverse set of genomes is lacking. Here, a membrane protein annotation pipeline was developed to define the integral membrane genome and associations between 21,379 proteins from 34 genomes; most, but not all, of these proteins belong to 598 defined families. The pipeline was used to provide target input for a structural genomics project that successfully cloned, expressed, and purified 61 of our first 96 selected targets in yeast. Furthermore, the methodology was applied (1) to explore the evolutionary history of the substrate-binding transmembrane domains of the human ABC transporter superfamily, (2) to identify the multidrug resistance-associated membrane proteins in whole genomes, and (3) to identify putative new membrane protein families.
Year: 2009 PMID: 19760129 PMCID: PMC2780624 DOI: 10.1007/s10969-009-9069-8
Source DB: PubMed Journal: J Struct Funct Genomics ISSN: 1345-711X
Fig. 1 Membrane protein annotation pipeline. All Pfam-A families with at least three predicted transmembrane helices (TMH) were used to identify membrane protein families in 34 genomes (cyan). Sequences predicted to have three or more TMHs in each genome were collected (red). In parallel, Pfam family membership was defined where available for each sequence profile (yellow). Automated multiple sequence alignment profiles were generated for each sequence. A database of profiles was constructed, and each profile was compared to all other profiles in the database to link membrane proteins (blue). The annotation pipeline can be generally used as input to an experimental structure determination pipeline. Finally, resulting structures can be used as templates to generate comparative models for all homologous sequences. The five steps are detailed in Methods.
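The collection step in Fig. 1 (keeping sequences predicted to have three or more TMHs) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `select_membrane_candidates`, the sequence identifiers, and the dictionary of predicted helix counts are all hypothetical, and the TMH predictor is assumed to have been run elsewhere.

```python
from typing import Dict, List

MIN_TMH = 3  # threshold stated in the figure caption


def select_membrane_candidates(tmh_counts: Dict[str, int],
                               min_tmh: int = MIN_TMH) -> List[str]:
    """Return sorted IDs of sequences with at least `min_tmh` predicted TMHs."""
    return sorted(seq_id for seq_id, n in tmh_counts.items() if n >= min_tmh)


# Hypothetical predictor output: sequence ID -> predicted TMH count.
example_counts = {"seqA": 12, "seqB": 2, "seqC": 3, "seqD": 0}
candidates = select_membrane_candidates(example_counts)
print(candidates)
```

Keeping the threshold as a parameter makes it easy to test how sensitive the candidate set is to the three-helix cutoff.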
Fig. 2 Pfam membrane protein families in 34 organisms. Organism names are listed horizontally at the top (columns). 476 Pfam membrane protein families are on the vertical axis (rows). Colors indicate binning for the number of times a particular family appears in an organism. White indicates a particular family is not found in an organism; light blue means the family appears once; medium blue, between two and 49 times; and dark blue means the family is found 50 or more times. The red and yellow bars show the clustering of eukaryotes and prokaryotes, respectively. The heatmap was constructed using the R function “heatmap.2” with hierarchical clustering and default parameters [37].
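The four-level color binning described in the Fig. 2 caption can be expressed as a small function. The sketch below is in Python for illustration only (the figure itself was drawn with R's heatmap.2); the function name, the integer bin codes, and the example counts are assumptions, not values from the paper.

```python
def bin_family_count(count: int) -> int:
    """Map a family's copy number in one genome to a color bin.

    0 = absent (white), 1 = once (light blue),
    2 = 2 to 49 times (medium blue), 3 = 50 or more times (dark blue).
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if count == 1:
        return 1
    if count < 50:
        return 2
    return 3


# Hypothetical counts: family -> occurrences per organism (fixed column order).
counts = {"FamilyX": [0, 1, 7, 49], "FamilyY": [50, 120, 0, 2]}
binned = {fam: [bin_family_count(c) for c in row] for fam, row in counts.items()}
print(binned)
```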
Fig. 3 Target selection for membrane protein structural genomics. Structural coverage of the known IMG (Integral Membrane Genome) sequence space was defined by taking sequences from 598 IM Pfam families and clustering them at 30% sequence identity.
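Clustering at a 30% sequence identity threshold can be illustrated with a greedy, representative-based scheme. The caption does not say which clustering algorithm or identity measure was used, so everything here is an assumption: the greedy strategy, the `naive_identity` placeholder (which treats sequences as already aligned position by position), and the toy sequences.

```python
from typing import Callable, Dict, List


def naive_identity(a: str, b: str) -> float:
    """Fraction of identical positions over the shorter sequence.

    Placeholder only: assumes the sequences are already aligned position
    by position; a real pipeline would compute a proper alignment.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def greedy_cluster(seqs: Dict[str, str],
                   threshold: float = 0.30,
                   identity: Callable[[str, str], float] = naive_identity
                   ) -> Dict[str, List[str]]:
    """Assign each sequence to the first representative it matches at or
    above `threshold`; otherwise make it a new representative."""
    clusters: Dict[str, List[str]] = {}
    # Longest sequences first, so each representative is its cluster's longest member.
    for seq_id in sorted(seqs, key=lambda s: (-len(seqs[s]), s)):
        for rep in clusters:
            if identity(seqs[rep], seqs[seq_id]) >= threshold:
                clusters[rep].append(seq_id)
                break
        else:
            clusters[seq_id] = [seq_id]
    return clusters


# Toy sequences, invented for illustration.
toy = {"s1": "MKTLLVAGAA", "s2": "MKTLLVCCCC", "s3": "WWWWWWWWWW"}
clusters = greedy_cluster(toy)
print(clusters)
```

Each resulting cluster then stands for one region of sequence space; a solved structure for any member could, in principle, serve as a modeling template for the rest of that cluster.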
Fig. 4 Links between the transmembrane domains of human ABC transporters. The transmembrane domains of 48 human ABC transporter proteins were excised from their complete sequences. Profiles were generated for each transmembrane domain and run against the membrane protein profile database (Methods). Significantly related profiles are linked and colored according to organism with red, blue, and yellow representing eukaryotes, bacteria, and archaea, respectively. The two major clusters represent ABCA and G family members (dark and light green); and ABCB, C, and D family members (purple).
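Once significant profile-profile links are known, groups such as the two major clusters in Fig. 4 correspond to connected components of the link graph. The sketch below uses a union-find structure to recover them. It assumes the links have already been computed and filtered; the significance cutoff is not given in this excerpt, and the node names are placeholders rather than real transporter domains.

```python
from typing import Dict, Iterable, List, Set, Tuple


def link_clusters(nodes: Iterable[str],
                  links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group nodes into connected components given undirected links."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(x: str) -> str:
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        # Nodes that appear only in links are added on the fly.
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups: Dict[str, Set[str]] = {}
    for n in parent:
        groups.setdefault(find(n), set()).add(n)
    # Largest components first; ties broken by smallest member name.
    return sorted(groups.values(), key=lambda g: (-len(g), min(g)))


# Placeholder profiles and links, invented for illustration.
profiles = ["tmd1", "tmd2", "tmd3", "tmd4", "tmd5", "tmd6"]
significant_links = [("tmd1", "tmd2"), ("tmd3", "tmd4"), ("tmd4", "tmd5")]
components = link_clusters(profiles, significant_links)
print(components)
```

Profiles with no significant link, such as `tmd6` here, come out as singleton components.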