
META-BASE: A Novel Architecture for Large-Scale Genomic Metadata Integration.

Anna Bernasconi, Arif Canakoglu, Marco Masseroli, Stefano Ceri.   

Abstract

The integration of genomic metadata is at once an important, difficult, and well-recognized challenge. It is important because a wealth of public data repositories is available to drive biological and clinical research; combining information from various heterogeneous and widely dispersed sources is paramount to a number of biological discoveries. It is difficult because the domain is complex and there is no agreement among the various metadata definitions, which refer to different vocabularies and ontologies. It is well-recognized in the bioinformatics community because, in common practice, repositories are accessed one by one, their specific metadata definitions are learned through long and tedious effort, and the practice is error-prone. In this paper, we describe META-BASE, an architecture for integrating metadata extracted from a variety of genomic data sources, based upon a structured transformation process. We present a variety of innovative techniques for data extraction, cleaning, normalization, and enrichment. We propose a general, open, and extensible pipeline that can easily incorporate any number of new data sources, and we present the resulting repository (already integrating several important sources), which is exposed by means of practical user interfaces to respond to biological researchers' needs.

Year:  2022        PMID: 32750853     DOI: 10.1109/TCBB.2020.2998954

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  3 in total

1.  GeMI: interactive interface for transformer-based Genomic Metadata Integration.

Authors:  Giuseppe Serna Garcia; Michele Leone; Anna Bernasconi; Mark J Carman
Journal:  Database (Oxford)       Date:  2022-06-03       Impact factor: 4.462

2.  Shortcomings of SARS-CoV-2 genomic metadata.

Authors:  Landen Gozashti; Russell Corbett-Detig
Journal:  BMC Res Notes       Date:  2021-05-17

3.  Genomic data integration and user-defined sample-set extraction for population variant analysis.

Authors:  Tommaso Alfonsi; Anna Bernasconi; Arif Canakoglu; Marco Masseroli
Journal:  BMC Bioinformatics       Date:  2022-09-29       Impact factor: 3.307

