Yana Rose1, Jose M Duarte1, Robert Lowe2, Joan Segura1, Chunxiao Bi1, Charmi Bhikadiya2, Li Chen2, Alexander S Rose1, Sebastian Bittrich1, Stephen K Burley3, John D Westbrook4.
Abstract
The US Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) serves many millions of unique users worldwide by delivering experimentally determined 3D structures of biomolecules integrated with >40 external data resources via RCSB.org, application programming interfaces (APIs), and FTP downloads. Herein, we present the architectural redesign of RCSB PDB data delivery services that build on existing PDBx/mmCIF data schemas. New data access APIs (data.rcsb.org) enable efficient delivery of all PDB archive data. A novel GraphQL-based API provides flexible, declarative data retrieval along with a simple-to-use REST API. A powerful new search system (search.rcsb.org) seamlessly integrates heterogeneous types of searches across the PDB archive. Searches may combine text attributes, protein or nucleic acid sequences, small-molecule chemical descriptors, 3D macromolecular shapes, and sequence motifs. The new RCSB.org architecture adheres to the FAIR Principles, empowering users to address a wide array of research problems in fundamental biology, biomedicine, biotechnology, bioengineering, and bioenergy.
Keywords: FAIR principles; computer architecture; databases; structural biology
Year: 2020 PMID: 33186584 PMCID: PMC9093041 DOI: 10.1016/j.jmb.2020.11.003
Source DB: PubMed Journal: J Mol Biol ISSN: 0022-2836 Impact factor: 6.151
Figure 1. Data management and delivery system underpinning the new RCSB architecture.
Figure 2. Schema usage by different components of the data management and delivery system.
Figure 3. Example search and data access queries: (a) query that combines text (1), sequence (2), structure shape (3), and chemical similarity (4) searches; (b) GraphQL API query including essential entry details (1–2), information details of the macromolecular entity data hierarchy (2–4) and small-molecules (5).
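
The two query styles summarized in Figure 3 can be illustrated with a minimal Python sketch that only builds request payloads and sends nothing over the network. Everything specific in it is an assumption not stated in the abstract: the endpoint paths, the JSON node layout (`terminal` and `group` nodes), the attribute and field names (`struct.title`, `exptl.method`), and the parameter names are recalled from the public RCSB API documentation and may differ between API versions. The helper names (`text_node`, `sequence_node`, `combine`, and so on) are hypothetical and used only for illustration.

```python
import json

# Assumed endpoints; the abstract names only the hosts data.rcsb.org and search.rcsb.org.
DATA_GRAPHQL_URL = "https://data.rcsb.org/graphql"
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_graphql_request(entry_id):
    """Build a GraphQL request body asking for a few entry-level fields.

    The field names below are assumptions based on the public schema.
    """
    query = """
    query ($id: String!) {
      entry(entry_id: $id) {
        rcsb_id
        struct { title }
        exptl { method }
      }
    }
    """
    return {"query": query, "variables": {"id": entry_id}}


def text_node(attribute, operator, value):
    """Terminal node for an attribute (text) search."""
    return {
        "type": "terminal",
        "service": "text",
        "parameters": {"attribute": attribute, "operator": operator, "value": value},
    }


def sequence_node(sequence, identity_cutoff=0.9, evalue_cutoff=1.0):
    """Terminal node for a protein sequence similarity search (assumed parameter names)."""
    return {
        "type": "terminal",
        "service": "sequence",
        "parameters": {
            "sequence_type": "protein",
            "value": sequence,
            "identity_cutoff": identity_cutoff,
            "evalue_cutoff": evalue_cutoff,
        },
    }


def combine(nodes, logical_operator="and"):
    """Group node joining several searches with a boolean operator."""
    return {"type": "group", "logical_operator": logical_operator, "nodes": list(nodes)}


def build_search_request(nodes, return_type="entry"):
    """Wrap the combined query in a top-level search request body."""
    return {"query": combine(nodes), "return_type": return_type}


def to_json(payload):
    """Serialize a payload the way it would be sent in a POST body."""
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    search = build_search_request([
        text_node("struct.title", "contains_phrase", "hemoglobin"),
        sequence_node("MVLSPADKTNVKAAW"),
    ])
    print(SEARCH_URL)
    print(to_json(search))
    print(DATA_GRAPHQL_URL)
    print(to_json(build_graphql_request("4HHB")))
```

Either payload could then be sent as a JSON POST body with any HTTP client. Keeping payload construction separate from transport makes the composite query easy to inspect before it is submitted, and shape or chemical-similarity searches would presumably be added as further terminal nodes in the same group.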