Jie Ma1, Tao Chen1, Songfeng Wu1, Chunyuan Yang1, Mingze Bai2, Kunxian Shu2, Kenli Li3, Guoqing Zhang4, Zhong Jin5, Fuchu He1, Henning Hermjakob1,6, Yunping Zhu1.
Abstract
Sharing of research data in public repositories has become best practice in academia. With the accumulation of massive data, network bandwidth and storage requirements are rapidly increasing. The ProteomeXchange (PX) consortium implements a mode of centralized metadata and distributed raw data management, which promotes effective data sharing. To facilitate open access of proteome data worldwide, we have developed the integrated proteome resource iProX (http://www.iprox.org) as a public platform for collecting and sharing raw data, analysis results and metadata obtained from proteomics experiments. The iProX repository employs a web-based proteome data submission process and open sharing of mass spectrometry-based proteomics datasets. Also, it deploys extensive controlled vocabularies and ontologies to annotate proteomics datasets. Users can use a GUI to provide and access data through a fast Aspera-based transfer tool. iProX is a full member of the PX consortium; all released datasets are freely accessible to the public. iProX is based on a high availability architecture and has been deployed as part of the proteomics infrastructure of China, ensuring long-term and stable resource support. iProX will facilitate worldwide data analysis and sharing of proteomics experiments.
Year: 2019 PMID: 30252093 PMCID: PMC6323926 DOI: 10.1093/nar/gky869
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Workflow of the proteomics data submission and curation process in iProX. The upper layer (blue rectangles) illustrates the data submission process for users, whereas the bottom layer (orange rectangles) represents the data curation process for iProX curators.
Figure 2. Summary of the datasets released in iProX. Distributions of species, MS instrument and data size for the datasets publicly available in iProX (as of the end of July 2018).
Figure 3. System architecture and infrastructure of iProX. Based on a hyper-converged architecture, the application, virtualization and infrastructure layers are implemented for the iProX repository. The infrastructure layer includes the hardware resources for computing, storage and network, while the virtualization layer provides a unified, virtualized resource pool that supplies the running resources required by iProX in real time. The application layer uses virtual machines (load-balancing, application, database and data-file servers) to achieve load balancing and high availability. Load balancing is achieved with nginx and keepalived on the load-balancing and application servers, two database servers are synchronized in real time to guarantee high availability, and two file servers are used for high-speed data transfer and backup.
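
As a rough illustration of the load-balancing idea named in the Figure 3 caption, the sketch below rotates requests across application servers and skips any server marked as down. It is a toy model only: the class `RoundRobinBalancer`, its methods, and the server names `app-1` and `app-2` are hypothetical, and it does not reproduce the nginx or keepalived configuration actually used by iProX, which the source does not describe.

```python
class RoundRobinBalancer:
    """Toy round-robin selector that skips backends marked unhealthy.

    Hypothetical illustration only; not the iProX implementation.
    """

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        """Record that a backend failed a health check."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known backend is serving again."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend in rotation."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


# Assumed server names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2"])
before_failure = [balancer.pick(), balancer.pick()]

balancer.mark_down("app-1")  # simulate a failed application server
after_failure = [balancer.pick(), balancer.pick()]
```

With both servers healthy, requests alternate between them; once `app-1` is marked down, every request goes to `app-2` until `mark_up("app-1")` is called. In a real deployment this selection and health checking would be handled by the load balancer itself rather than by application code.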