A semantic analysis of the annotations of the human genome.

Purvesh Khatri, Bogdan Done, Archana Rao, Arina Done, Sorin Draghici.

Abstract

The correct interpretation of any biological experiment depends essentially on the accuracy and consistency of the existing annotation databases. Such databases are ubiquitous and are used by life scientists in most experiments. However, it is well known that such databases are incomplete and that many annotations may also be incorrect. In this paper we describe a technique that can be used to analyze the semantic content of such annotation databases. Our approach is able to extract implicit semantic relationships between genes and functions, which allows us to discover novel functions for known genes. It can also identify missing and inaccurate annotations in existing annotation databases, and thus help improve their accuracy. We used our technique to analyze the current annotations of the human genome. From this body of annotations, we were able to predict 212 additional gene-function assignments. A subsequent literature search found that 138 of these gene-function assignments are supported by existing peer-reviewed papers. An additional 23 assignments have been confirmed in the meantime by the addition of the respective annotations in later releases of the Gene Ontology database. Overall, the 161 confirmed assignments represent 75.95% of the proposed gene-function assignments. Only one of our predictions (0.4%) was contradicted by the existing literature. We could not find any relevant articles for 50 of our predictions (23.58%). The method is independent of the organism and can be used to analyze and improve the quality of the data of any public or private annotation database.
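
The validation figures in the abstract can be tallied with a few lines of arithmetic. The sketch below is only an illustration: the counts come from the abstract, while the variable names and the `share` helper are assumptions introduced here and are not part of the authors' method.

```python
# Counts as reported in the abstract; names are illustrative only.
predicted = 212              # proposed gene-function assignments
literature_supported = 138   # supported by peer-reviewed papers
confirmed_by_later_go = 23   # added in later Gene Ontology releases
contradicted = 1             # contradicted by existing literature
no_evidence = 50             # no relevant articles found

confirmed = literature_supported + confirmed_by_later_go


def share(count, total=predicted):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# The three outcome groups should account for every prediction.
assert confirmed + contradicted + no_evidence == predicted

for label, count in [
    ("confirmed", confirmed),
    ("contradicted", contradicted),
    ("no evidence", no_evidence),
]:
    print(f"{label:>12}: {count:3d} ({share(count):.2f}%)")
```

Recomputed from the counts, the three groups sum to 212 and the shares come out to roughly 75.94%, 0.47% and 23.58%, so the 75.95% and 0.4% quoted in the abstract appear to differ slightly from straightforward rounding.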

Year: 2005   PMID: 15955782   PMCID: PMC2435251   DOI: 10.1093/bioinformatics/bti538

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


