Enabling Massive XML-Based Biological Data Management in HBase.

Jian Liu, Qiuru Liu, Lei Zhang, Shuhui Su, Yongzhuang Liu.   

Abstract

Publishing biological data in XML formats is attractive for organizations that would like to provide their bioinformatics resources in an extensible and machine-readable format. In the era of big data, managing massive XML-based biological data has emerged as a challenging issue. As XML-based biological data sets continue to grow, traditional declarative query languages often struggle to provide efficient query capabilities in terms of processing speed and scale. In this study, we report a novel platform to store and query massive XML-based biological data collections. A prototype tool for constructing HBase tables from XML-based biological data collections is first developed, and then a formal approach to transform the XML query model into the MapReduce query model is proposed. Finally, an evaluation of the query performance of the proposed approach on existing XML-based biological databases is presented, demonstrating the performance advantages of the proposed solution. The source code of the massive XML-based biological data management platform is freely available at https://github.com/lyotvincent/X2H.

Year:  2020        PMID: 31094692     DOI: 10.1109/TCBB.2019.2915811

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  4 in total

1.  An effective biomedical data migration tool from resource description framework to JSON.

Authors:  Jian Liu; Mo Yang; Lei Zhang; Weijun Zhou
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

2.  NeoPeptide: an immunoinformatic database of T-cell-defined neoantigens.

Authors:  Wei-Jun Zhou; Zhi Qu; Chao-Yang Song; Yang Sun; An-Li Lai; Ma-Yao Luo; Yu-Zhe Ying; Hu Meng; Zhao Liang; Yan-Jie He; Yu-Hua Li; Jian Liu
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

3.  Notifiable diseases interoperable framework toward improving Iran public health surveillance system: Lessons learned from COVID-19 pandemic.

Authors:  Mostafa Shanbehzadeh; Hadi Kazemi-Arpanahi; Ali Asghar Valipour; Atefeh Zahedi
Journal:  J Educ Health Promot       Date:  2021-05-31

4.  iProX in 2021: connecting proteomics data sharing with big data.

Authors:  Tao Chen; Jie Ma; Yi Liu; Zhiguang Chen; Nong Xiao; Yutong Lu; Yinjin Fu; Chunyuan Yang; Mansheng Li; Songfeng Wu; Xue Wang; Dongsheng Li; Fuchu He; Henning Hermjakob; Yunping Zhu
Journal:  Nucleic Acids Res       Date:  2022-01-07       Impact factor: 16.971
