RCSB Protein Data Bank: Improved Annotation, Search, and Visualization of Membrane Protein Structures Archived in the PDB.

Sebastian Bittrich, Yana Rose, Joan Segura, Robert Lowe, John D Westbrook, Jose M Duarte, Stephen K Burley.

Abstract

MOTIVATION: Membrane proteins are encoded by approximately one fifth of human genes but account for more than half of all US FDA approved drug targets. Thanks to new technological advances, the number of membrane proteins archived in the PDB is growing rapidly. However, automatic identification of membrane proteins or inference of membrane location is not a trivial task.
RESULTS: We present recent improvements to the RCSB Protein Data Bank web portal (RCSB PDB, rcsb.org) that provide a wealth of new membrane protein annotations integrated from 4 external resources: OPM, PDBTM, MemProtMD, and mpstruc. We have substantially enhanced the presentation of data on membrane proteins. The number of membrane proteins with annotations available on rcsb.org was increased by ∼80%. Users can search for these annotations, explore corresponding tree hierarchies, display membrane segments at the 1D amino acid sequence level, and visualize the predicted location of the membrane layer in 3D.
AVAILABILITY: Annotations, search, tree data, and visualization are available at our rcsb.org web portal. Membrane visualization is supported by the open-source Mol* viewer (molstar.org and github.com/molstar/molstar).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press.

Year:  2021        PMID: 34864908      PMCID: PMC8826025          DOI: 10.1093/bioinformatics/btab813

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Membranes define cellular and organellar boundaries. They are composed of phospholipid bilayers. Membrane proteins are either embedded in or associated with the phospholipid bilayer. Membrane proteins are crucial for cell survival and communication across membranes, serving as molecular transporters, signal receptors, ion channels and even enzymes. Recent improvements in experimental methods (e.g. use of cryo-electron microscopy and inclusion of detergents, lipid molecules, vesicles and nanodiscs) are providing a wealth of new possibilities for membrane protein structure determination.

Membrane proteins have diverse spatiotemporal characteristics. Integral membrane proteins are permanently attached to a lipid bilayer while peripheral ones form transient complexes with the membrane. Transmembrane proteins traverse the membrane bilayer at least once, whereas monotopic membrane proteins are attached to a single face of the lipid bilayer. Information on membrane proteins provided by dedicated resources such as OPM (Lomize ), PDBTM (Kozma ), MemProtMD (Newport ) and mpstruc (White, 2009) differs in coverage (Shimizu ) and type of available information (see Supplementary Tables S1 and S2). Historically, this complexity made it challenging for users to explore the plethora of information on membrane proteins freely available from the Protein Data Bank (PDB) archive. Herein, we present new features that provide consistent ways to search, browse and visualize membrane proteins by integrating information from trusted external sources into the recently streamlined RCSB PDB data management and delivery architecture (Burley ; Rose ), emphasizing flexibility, fidelity, maintainability and sustainability.

2 Results

On June 16, 2021, the PDB archive housed 10 133 polymer entities annotated as membrane proteins by the previously integrated mpstruc resource. The newly integrated trusted resources (OPM, PDBTM and MemProtMD) increased coverage by ∼80% to 18 247 (see Supplementary Table S2). On the rcsb.org web portal, users can search, browse and visualize data on membrane proteins independent of annotation provenance. Links to the external data resources provide details, such as protein classification, amino acid sequence-level data or curated membrane locations (see Supplementary Table S1). To aid PDB data consumers in analyzing their search results, we display the distribution of hits annotated as membrane proteins in the search result Refinements panel (see Supplementary Fig. S1). Clicking on a membrane resource will drill down into a subset of the results. Membrane annotations are programmatically accessible via RCSB PDB Search (search.rcsb.org), Data (data.rcsb.org) and Annotation APIs (1d-coordinates.rcsb.org).
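As a rough illustration of the programmatic access mentioned above, the following Python sketch builds a request body for the Search API that asks for polymer entities annotated by one of the four resources. The endpoint path, the attribute name `rcsb_polymer_entity_annotation.type` and the helper `build_membrane_query` are assumptions made for this example and are not taken from the article; they should be checked against the current Search API documentation before use.

```python
import json

# Assumed Search API endpoint; verify against search.rcsb.org before use.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"

# The four membrane protein resources named in the text.
MEMBRANE_RESOURCES = ("OPM", "PDBTM", "MemProtMD", "mpstruc")


def build_membrane_query(resource: str, rows: int = 10) -> dict:
    """Build a request body selecting polymer entities annotated by `resource`.

    Hypothetical helper: it only assembles the JSON payload and sends nothing.
    """
    if resource not in MEMBRANE_RESOURCES:
        raise ValueError(f"unknown membrane resource: {resource!r}")
    return {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                # Assumed attribute name; check the Search API schema.
                "attribute": "rcsb_polymer_entity_annotation.type",
                "operator": "exact_match",
                "value": resource,
            },
        },
        "return_type": "polymer_entity",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


if __name__ == "__main__":
    # Print the payload; it could then be POSTed to SEARCH_URL with any HTTP client.
    print(json.dumps(build_membrane_query("OPM"), indent=2))
```

The payload is kept separate from the HTTP call so that the query can be inspected or combined with other search terms before it is sent.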

2.1 Improved structure summary page

We have revamped the RCSB PDB Structure Summary page (Fig.  1), which provides summary information for each PDB entry. PDB entries are designated with a four-character alphanumeric PDB ID (e.g. 3SN6) and contain at least one polymer entity; polymer entities refer to chemically unique molecules in an entry. Detailed definitions can be found in (Burley ).

Fig. 1. Tabs in the header of the RCSB PDB structure summary page provide an overview of available information on membrane proteins. (A) Visualization of predicted membrane orientation in Mol*. (B) Annotation details. (C) Orange boxes provide access to integrated external resources. (D) 1D visualization of membrane segments in Protein Feature View.

Annotations integrated from external resources and links enable users to access additional details. Entities are annotated as membrane proteins if applicable. The entire PDB structure is annotated as a membrane protein (Fig.  1C) if at least one entity is annotated as either transmembrane or membrane-associated by OPM, PDBTM, MemProtMD or mpstruc. The ‘Membrane Protein’ link in blue font (Fig.  1C) takes users to the Annotation tab of the structure entry (see Supplementary Fig. S3). With the exception of mpstruc, links in the orange boxes lead to structure-specific pages of the integrated external resources.
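The entry-level rule described above (an entry counts as a membrane protein if at least one of its entities is annotated by any of the four resources) can be sketched in a few lines of Python. The class `PolymerEntity` and both function names are hypothetical and exist only for this illustration; they do not reflect the actual RCSB PDB data model.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set

# The four resources named in the text.
RESOURCES = {"OPM", "PDBTM", "MemProtMD", "mpstruc"}


@dataclass
class PolymerEntity:
    """Hypothetical stand-in for a polymer entity and its annotation sources."""
    entity_id: str
    annotated_by: Set[str] = field(default_factory=set)


def is_membrane_entity(entity: PolymerEntity) -> bool:
    # An entity qualifies if any of the four resources annotates it.
    return bool(entity.annotated_by & RESOURCES)


def is_membrane_entry(entities: Iterable[PolymerEntity]) -> bool:
    # The whole entry qualifies as soon as one entity does.
    return any(is_membrane_entity(e) for e in entities)


# Made-up example entry with two entities; only the first is annotated.
entry = [PolymerEntity("1", {"OPM", "PDBTM"}), PolymerEntity("2")]
print(is_membrane_entry(entry))
```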

2.2 Visualize predicted membrane location in Mol*

We contributed a new implementation of the ANVIL algorithm (Postic ) to the Mol* package. The algorithm uses only 3D structure information to predict the membrane location. ANVIL is a simplified version of the TMDET algorithm (Tusnády ) used by PDBTM. The Mol* 3D viewer (Sehnal ) was extended with a customized set of membrane visualization tools (see Supplementary Fig. S2) that display predicted membrane boundaries. (N.B.: This visualization is independent of annotation provenance.) The RCSB image gallery allows access to this visualization for specific assemblies or the crystallographic asymmetric unit (Fig.  1A). Users should always visit external resources for reliable membrane location predictions. The ANVIL implementation is merely a visualization tool and may output flawed predictions (see Supplementary Table S3 for examples).
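To give a feel for how a membrane location can be inferred from 3D structure information alone, the following toy Python sketch slides a slab of fixed thickness along one axis and keeps the position that best separates hydrophobic from hydrophilic residues. This is not the ANVIL algorithm and not the Mol* implementation: the fixed axis, the scoring rule and the function `best_slab` are simplifying assumptions made purely for illustration.

```python
from typing import List, Tuple

# Each residue is reduced to (z coordinate in Å, is_hydrophobic).
Residue = Tuple[float, bool]


def best_slab(residues: List[Residue], thickness: float = 30.0,
              step: float = 1.0) -> Tuple[float, float]:
    """Return (lower, upper) z-bounds of the best-scoring slab.

    Toy score: number of hydrophobic residues inside the slab plus
    number of hydrophilic residues outside it.
    """
    if not residues:
        raise ValueError("no residues given")
    zs = [z for z, _ in residues]
    z0 = min(zs) - thickness          # start with the slab below all residues
    stop = max(zs)                    # end with the slab above all residues
    best_score, best_bounds = -1, (z0, z0 + thickness)
    while z0 <= stop:
        z1 = z0 + thickness
        # A residue scores when "inside the slab" matches "hydrophobic".
        score = sum(1 for z, hydrophobic in residues
                    if (z0 <= z <= z1) == hydrophobic)
        if score > best_score:
            best_score, best_bounds = score, (z0, z1)
        z0 += step
    return best_bounds


# Made-up coordinates: a hydrophobic core flanked by hydrophilic residues.
toy = [(z, True) for z in (-10.0, -5.0, 0.0, 5.0, 10.0)]
toy += [(z, False) for z in (-40.0, -30.0, -20.0, 20.0, 30.0, 40.0)]
print(best_slab(toy, thickness=20.0))
```

A real predictor would also have to search over membrane orientations and thicknesses, which is one reason simplified approaches can produce flawed results.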

2.3 Membrane protein annotations

The Annotations page of each membrane protein structure contains a summary of extant annotations (see Supplementary Fig. S3). OPM and mpstruc provide detailed hierarchies, whereas generic annotations are displayed for PDBTM and MemProtMD. Clicking a link highlighted with bold font will launch a search for polymer entities that share this annotation. All annotations are updated once per week.

2.4 Browse membrane annotation trees

Users can browse tree hierarchies provided by OPM and mpstruc using the Browse Annotations feature on the rcsb.org web portal. Increasingly fine-grained classifications are available by clicking on branches of the tree (see Supplementary Fig. S4). The link at the end of each line triggers the corresponding search and returns all matching entities. Like all annotation trees depicted on rcsb.org, the mpstruc and OPM trees can either be explored individually or accessed via the Advanced Search panel.
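The relationship between tree branches and search results can be pictured with a small Python sketch. It assumes, for illustration only, that selecting a branch returns the entities attached to that node and to all of its descendants; the `TreeNode` class, the node names and the entity identifiers are placeholders and are not taken from OPM or mpstruc.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TreeNode:
    """Hypothetical node of an annotation tree."""
    name: str
    entity_ids: List[str] = field(default_factory=list)
    children: List["TreeNode"] = field(default_factory=list)


def matching_entities(node: TreeNode) -> List[str]:
    """Collect entity IDs attached to `node` and to every descendant."""
    ids = list(node.entity_ids)
    for child in node.children:
        ids.extend(matching_entities(child))
    return ids


# Placeholder hierarchy: class -> superfamily -> two families.
family_a = TreeNode("family A", ["ent-1", "ent-2"])
family_b = TreeNode("family B", ["ent-3"])
root = TreeNode("class X", children=[
    TreeNode("superfamily Y", children=[family_a, family_b]),
])

print(matching_entities(root))      # coarse branch: all entities below it
print(matching_entities(family_b))  # fine-grained branch: fewer entities
```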

2.5 Explore membrane segments in the Protein Feature View

The OPM and PDBTM resources provide sequence-level data on segments that are embedded in or associated with a membrane. This information can be visualized in the Protein Feature View (Segura ), which allows exploring the relation of membrane segments to other 1D sequence features such as secondary structure elements or ligand binding sites, or 3D structure features (see Supplementary Fig. S5).
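Relating membrane segments to other 1D features amounts to comparing residue ranges on the same sequence. The Python sketch below counts, per feature type, how many residues fall inside membrane segments. The interval convention (1-based, inclusive), the function names and the example ranges are assumptions for illustration and do not describe the Protein Feature View or the 1D Coordinates API.

```python
from typing import Dict, List, Tuple

# 1-based, inclusive residue range on a single sequence (assumed convention).
Interval = Tuple[int, int]


def overlap(a: Interval, b: Interval) -> int:
    """Number of residues shared by two inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def residues_in_membrane(segments: List[Interval],
                         features: Dict[str, List[Interval]]) -> Dict[str, int]:
    """For each feature type, count residues that lie within membrane segments.

    Assumes the ranges within each list do not overlap one another.
    """
    return {
        name: sum(overlap(seg, iv) for seg in segments for iv in ranges)
        for name, ranges in features.items()
    }


# Made-up ranges for a single hypothetical polymer entity.
membrane_segments = [(10, 30), (50, 70)]
other_features = {
    "helix": [(5, 35), (48, 60)],
    "binding_site": [(31, 33)],
}
print(residues_in_membrane(membrane_segments, other_features))
```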

3 Conclusions

We report integration of information from four trusted membrane protein data resources. Coverage of membrane proteins in the rcsb.org web portal improved substantially and users now have access to new 1D and 3D visualizations for membrane proteins. Recent RCSB PDB-led innovations (Rose ; Segura ) and the Mol* 3D viewer (Sehnal ), collaboratively developed by the Protein Data Bank in Europe and RCSB PDB, enabled seamless integration of new features into the rcsb.org web portal search infrastructure.
References

1.  Transmembrane proteins in the Protein Data Bank: identification and classification.

Authors:  Gábor E Tusnády; Zsuzsanna Dosztányi; István Simon
Journal:  Bioinformatics       Date:  2004-06-04       Impact factor: 6.937

2.  Membrane positioning for high- and low-resolution protein structures through a binary classification approach.

Authors:  Guillaume Postic; Yassine Ghouzam; Vincent Guiraud; Jean-Christophe Gelly
Journal:  Protein Eng Des Sel       Date:  2015-12-19       Impact factor: 1.650

3.  Biophysical dissection of membrane proteins.

Authors:  Stephen H White
Journal:  Nature       Date:  2009-05-21       Impact factor: 49.962

4.  Comparative analysis of membrane protein structure databases.

Authors:  Kentaro Shimizu; Wei Cao; Gull Saad; Michiru Shoji; Tohru Terada
Journal:  Biochim Biophys Acta Biomembr       Date:  2018-01-10       Impact factor: 3.747

5.  Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures.

Authors:  David Sehnal; Sebastian Bittrich; Mandar Deshpande; Radka Svobodová; Karel Berka; Václav Bazgier; Sameer Velankar; Stephen K Burley; Jaroslav Koča; Alexander S Rose
Journal:  Nucleic Acids Res       Date:  2021-07-02       Impact factor: 16.971

6.  PDBTM: Protein Data Bank of transmembrane proteins after 8 years.

Authors:  Dániel Kozma; István Simon; Gábor E Tusnády
Journal:  Nucleic Acids Res       Date:  2012-11-30       Impact factor: 16.971

7.  The MemProtMD database: a resource for membrane-embedded protein structures and their lipid interactions.

Authors:  Thomas D Newport; Mark S P Sansom; Phillip J Stansfeld
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 19.160

8.  RCSB Protein Data Bank 1D Tools and Services.

Authors:  Joan Segura; Yana Rose; John Westbrook; Stephen K Burley; Jose M Duarte
Journal:  Bioinformatics       Date:  2020-12-12       Impact factor: 6.937

9.  RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences.

Authors:  Stephen K Burley; Charmi Bhikadiya; Chunxiao Bi; Sebastian Bittrich; Li Chen; Gregg V Crichlow; Cole H Christie; Kenneth Dalenberg; Luigi Di Costanzo; Jose M Duarte; Shuchismita Dutta; Zukang Feng; Sai Ganesan; David S Goodsell; Sutapa Ghosh; Rachel Kramer Green; Vladimir Guranović; Dmytro Guzenko; Brian P Hudson; Catherine L Lawson; Yuhe Liang; Robert Lowe; Harry Namkoong; Ezra Peisach; Irina Persikova; Chris Randle; Alexander Rose; Yana Rose; Andrej Sali; Joan Segura; Monica Sekharan; Chenghua Shao; Yi-Ping Tao; Maria Voigt; John D Westbrook; Jasmine Y Young; Christine Zardecki; Marina Zhuravleva
Journal:  Nucleic Acids Res       Date:  2021-01-08       Impact factor: 16.971

10.  RCSB Protein Data Bank: Architectural Advances Towards Integrated Searching and Efficient Access to Macromolecular Structure Data from the PDB Archive.

Authors:  Yana Rose; Jose M Duarte; Robert Lowe; Joan Segura; Chunxiao Bi; Charmi Bhikadiya; Li Chen; Alexander S Rose; Sebastian Bittrich; Stephen K Burley; John D Westbrook
Journal:  J Mol Biol       Date:  2020-11-10       Impact factor: 6.151
