Yasset Perez-Riverol, Attila Csordas, Jingwen Bai, Manuel Bernal-Llinares, Suresh Hewapathirana, Deepti J Kundu, Avinash Inuganti, Johannes Griss, Gerhard Mayer, Martin Eisenacher, Enrique Pérez, Julian Uszkoreit, Julianus Pfeuffer, Timo Sachsenberg, Sule Yilmaz, Shivani Tiwary, Jürgen Cox, Enrique Audain, Mathias Walzer, Andrew F Jarnuczak, Tobias Ternent, Alvis Brazma, Juan Antonio Vizcaíno.
Abstract
The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has become the norm in the field. In parallel, re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault-tolerant storage backend, Application Programming Interface and web interface have been implemented, as part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. Finally, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.
Year: 2019 PMID: 30395289 PMCID: PMC6323896 DOI: 10.1093/nar/gky1106
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the PRIDE ecosystem, including the resources (PRIDE Archive and PRIDE Peptidome, in orange), tools (PRIDE Inspector and PX Submission Tool, in red), software libraries (in black), web interface and API (in green) and the external resources where PRIDE data are disseminated to (in purple).
Figure 2. Screenshots of the new PRIDE Archive web interface. (A) The project (dataset) page provides a general overview of every submitted dataset. (B) The PRIDE Archive search page, where it is possible for users to query PRIDE Archive using keywords and additional properties such as species, tissues and instruments, among others. (C) Real-time statistics (including number of submitted datasets per month, number of submitted datasets per instrument type, etc.) are now provided.
Figure 3. (A) Number of submitted datasets to PRIDE per month (from beginning of 2004 till September 2018). (B) Number of submitted datasets per experimental approach per year (from 2014 till September 2018).
Figure 4. Number of submitted datasets to PRIDE Archive per taxonomy identifier.
Figure 5. Data volume (in terabytes) downloaded from PRIDE Archive per year.
Figure 6. Screenshot of the Ensembl genome browser showing the visualization of peptide evidence as ‘TrackHubs’ coming from PRIDE Archive and PRIDE Cluster (now PRIDE Peptidome). All peptides shown come from mouse data (GRCm38).