
Ontology-Based Search of Genomic Metadata.

Javier D Fernandez, Maurizio Lenzerini, Marco Masseroli, Francesco Venco, Stefano Ceri.   

Abstract

The Encyclopedia of DNA Elements (ENCODE) is a large and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007. New biological knowledge can be extracted from these largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet searching for datasets relevant to knowledge discovery is poorly supported: the metadata describing ENCODE datasets are simple and incomplete, and are not described by a coherent underlying ontology. Here, we show how to overcome this limitation by adopting an approach to searching ENCODE metadata that uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base, starting from concepts extracted from ENCODE metadata, which we matched to and expanded with biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data in response to arbitrary biologists' queries. This correctly finds more datasets than a purely syntactic search, which is what the other available systems support. We empirically show the relevance of the retrieved datasets to the biologists' queries.
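
The difference between syntactic and ontology-based search described in the abstract can be illustrated with a minimal sketch. It is not the S.O.S. GeM implementation: the toy ontology, the dataset identifiers, the metadata strings, and the function names are all hypothetical, and a real system would take its concept relations from the ontologies integrated in the Unified Medical Language System rather than from a hand-written dictionary.

```python
from collections import deque

# Hypothetical toy ontology: concept -> more specific (child) concepts.
# Assumption for illustration only; not taken from UMLS or from S.O.S. GeM.
IS_A_CHILDREN = {
    "cancer": ["leukemia", "carcinoma"],
    "leukemia": ["chronic myeloid leukemia"],
    "carcinoma": ["hepatocellular carcinoma"],
}

# Hypothetical metadata records keyed by made-up experiment identifiers.
DATASETS = {
    "exp1": "ChIP-seq, cell line K562, chronic myeloid leukemia",
    "exp2": "RNA-seq, cell line HepG2, hepatocellular carcinoma",
    "exp3": "DNase-seq, cell line GM12878, lymphoblastoid",
    "exp4": "ChIP-seq, cell line HeLa-S3, cervical cancer",
}


def expand(term):
    """Return the term plus all of its descendants in the toy ontology."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in IS_A_CHILDREN.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def syntactic_search(query):
    """Match only records whose metadata literally contains the query string."""
    needle = query.lower()
    return sorted(k for k, text in DATASETS.items() if needle in text.lower())


def semantic_search(query):
    """Match records containing the query or any more specific concept."""
    terms = expand(query.lower())
    return sorted(
        k for k, text in DATASETS.items()
        if any(term in text.lower() for term in terms)
    )


if __name__ == "__main__":
    print("syntactic:", syntactic_search("cancer"))
    print("semantic: ", semantic_search("cancer"))
```

In this sketch, a query for "cancer" matches only the record that literally contains the word under syntactic search, whereas the expanded query also retrieves the leukemia and carcinoma records, because those concepts sit below "cancer" in the toy hierarchy.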

Year: 2015    PMID: 26529777    DOI: 10.1109/TCBB.2015.2495179

Source DB: PubMed    Journal: IEEE/ACM Trans Comput Biol Bioinform    ISSN: 1545-5963    Impact factor: 3.710


  2 in total

1.  Semantic concept schema of the linear mixed model of experimental observations.

Authors:  Hanna Ćwiek-Kupczyńska; Katarzyna Filipiak; Augustyn Markiewicz; Philippe Rocca-Serra; Alejandra N Gonzalez-Beltran; Susanna-Assunta Sansone; Emilie J Millet; Fred van Eeuwijk; Agnieszka Ławrynowicz; Paweł Krajewski
Journal:  Sci Data       Date:  2020-02-27       Impact factor: 6.444

2.  GenoSurf: metadata driven semantic search system for integrated genomic datasets.

Authors:  Arif Canakoglu; Anna Bernasconi; Andrea Colombo; Marco Masseroli; Stefano Ceri
Journal:  Database (Oxford)       Date:  2019-01-01       Impact factor: 3.451

