
Extending traditional query-based integration approaches for functional characterization of post-genomic data.

B A Eckman, A S Kosky, L A Laroco.

Abstract

MOTIVATION: To identify and characterize regions of functional interest in genomic sequence requires full, flexible query access to an integrated, up-to-date view of all related information, irrespective of where it is stored (within an organization or across the Internet) and its format (traditional database, flat file, web site, results of runtime analysis). Wide-ranging multi-source queries often return unmanageably large result sets, requiring non-traditional approaches to exclude extraneous data.
RESULTS: Target Informatics Net (TINet) is a readily extensible data integration system developed at GlaxoSmithKline (GSK), based on the Object-Protocol Model (OPM) multidatabase middleware system of Gene Logic Inc. Data sources currently integrated include: the Mouse Genome Database (MGD) and Gene Expression Database (GXD), GenBank, SwissProt, PubMed, GeneCards, the results of runtime BLAST and PROSITE searches, and GSK proprietary relational databases. Special-purpose class methods used to filter and augment query results include regular expression pattern-matching over BLAST HSP alignments and retrieving partial sequences derived from primary structure annotations. All data sources and methods are accessible through an SQL-like query language or a GUI, so that when new investigations arise no additional programming beyond query specification is required. The power and flexibility of this approach are illustrated in such integrated queries as: (1) 'find homologs in genomic sequence to all novel genes cloned and reported in the scientific literature within the past three months that are linked to the MeSH term 'neoplasms''; (2) 'using a neuropeptide precursor query sequence, return only HSPs where the target genomic sequences conserve the G[KR][KR] motif at the appropriate points in the HSP alignment'; and (3) 'of the human genomic sequences annotated with exon boundaries in GenBank, return only those with valid putative donor/acceptor sites and start/stop codons'.
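
ILLUSTRATION: The filter described in query (2) can be sketched in a few lines of Python. This is not TINet or OPM code, and the abstract does not specify how the class method is implemented. The function names (motif_columns, conserves_motif) and the input representation (two equal-length aligned strings with '-' for gaps, as printed in a BLAST HSP) are assumptions made for this example only. The sketch locates G[KR][KR] in the query, maps each match back to its alignment columns, and keeps the HSP only if the target shows the same motif in those columns.

```python
import re
from typing import List, Tuple

# Motif from query (2) in the abstract: G followed by two basic residues (K or R).
MOTIF = re.compile(r"G[KR][KR]")


def motif_columns(aligned_query: str) -> List[Tuple[int, int]]:
    """Return (start, end) alignment-column spans of each motif hit in the query.

    Gaps ('-') are ignored when matching, then the hit is mapped back to
    alignment columns so the target can be inspected at the same positions.
    """
    # cols[i] is the alignment column of the i-th ungapped query residue.
    cols = [i for i, ch in enumerate(aligned_query) if ch != "-"]
    ungapped = aligned_query.replace("-", "")
    return [(cols[m.start()], cols[m.end() - 1] + 1)
            for m in MOTIF.finditer(ungapped)]


def conserves_motif(aligned_query: str, aligned_target: str) -> bool:
    """True if every query motif hit is also a motif in the target columns.

    Hypothetical helper: returns False when the query has no motif at all,
    since there is then nothing to conserve.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    spans = motif_columns(aligned_query)
    if not spans:
        return False
    for start, end in spans:
        # Gaps in the target inside the span break the motif.
        piece = aligned_target[start:end].replace("-", "")
        if not MOTIF.fullmatch(piece):
            return False
    return True


# Made-up alignment rows, not data from the paper.
query_row = "MAGKRSL-GRRAA"
kept_hsp = "MSGRKSLQGKRAV"     # both motif positions conserved
dropped_hsp = "MSGRKSLQGAKAV"  # second motif position lost
```

Applied to a list of HSPs, conserves_motif would act as the predicate that discards alignments where the motif is lost, which is the kind of result-set reduction the abstract describes.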

Year:  2001        PMID: 11448877     DOI: 10.1093/bioinformatics/17.7.587

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


