
Assessing identity, redundancy and confounds in Gene Ontology annotations over time.

Jesse Gillis, Paul Pavlidis.

Abstract

MOTIVATION: The Gene Ontology (GO) is heavily used in systems biology, but the potential for redundancy, confounds with other data sources and problems with stability over time have been little explored.
RESULTS: We report that GO annotations are stable over short periods, with 3% of genes not being most semantically similar to themselves between monthly GO editions. However, we find that genes can alter their 'functional identity' over time, with 20% of genes not matching to themselves (by semantic similarity) after 2 years. We further find that annotation bias in GO, in which some genes are more characterized than others, has declined in yeast, but generally increased in humans. Finally, we discovered that many entries in protein interaction databases are owing to the same published reports that are used for GO annotations, with 66% of assessed GO groups exhibiting this confound. We provide a case study to illustrate how this information can be used in analyses of gene sets and networks.
AVAILABILITY: Data available at http://chibi.ubc.ca/assessGO.
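
The self-matching test described in RESULTS can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the semantic similarity measure, so a plain Jaccard overlap of GO term sets stands in for it here. The function names, the tie-handling rule and the toy annotations are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical toy representation: gene symbol -> set of GO term IDs.
Annotations = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two term sets (a stand-in for semantic similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fraction_not_self_matched(old: Annotations, new: Annotations) -> float:
    """Fraction of genes present in both editions whose old annotations are
    more similar to some OTHER gene in the new edition than to themselves.

    Assumption: ties count as the gene keeping its identity.
    """
    shared = set(old) & set(new)
    if not shared:
        return 0.0
    lost = 0
    for gene in shared:
        self_score = jaccard(old[gene], new[gene])
        # Best similarity to any different gene in the later edition.
        best_other = max(
            (jaccard(old[gene], terms)
             for other, terms in new.items() if other != gene),
            default=0.0,
        )
        if best_other > self_score:
            lost += 1
    return lost / len(shared)


# Toy editions with made-up term IDs; gene C drifts toward gene B's new profile.
toy_old = {"A": {"GO:1", "GO:2"}, "B": {"GO:3"}, "C": {"GO:4", "GO:5"}}
toy_new = {"A": {"GO:1", "GO:2", "GO:6"}, "B": {"GO:4", "GO:5"},
           "C": {"GO:4", "GO:7", "GO:8"}}
toy_fraction = fraction_not_self_matched(toy_old, toy_new)
```

In this toy case only gene C fails to match itself, because its earlier term set now coincides with gene B's later one. A real analysis would use an ontology-aware similarity measure and full GO annotation editions.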


Year:  2013        PMID: 23297035      PMCID: PMC3570208          DOI: 10.1093/bioinformatics/bts727

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


References (30 in total)

1.  Intrinsic errors in genome annotation.

Authors:  D Devos; A Valencia
Journal:  Trends Genet       Date:  2001-08       Impact factor: 11.639

2.  The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro.

Authors:  Evelyn Camon; Michele Magrane; Daniel Barrell; David Binns; Wolfgang Fleischmann; Paul Kersey; Nicola Mulder; Tom Oinn; John Maslen; Anthony Cox; Rolf Apweiler
Journal:  Genome Res       Date:  2003-03-12       Impact factor: 9.043

3.  Molecular characterization and comparison of the components and multiprotein complexes in the postsynaptic proteome.

Authors:  Mark O Collins; Holger Husi; Lu Yu; Julia M Brandon; Chris N G Anderson; Walter P Blackstock; Jyoti S Choudhary; Seth G N Grant
Journal:  J Neurochem       Date:  2006-04       Impact factor: 5.372

4.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

5.  Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles.

Authors:  Warren A Cheung; Bf Francis Ouellette; Wyeth W Wasserman
Journal:  Genome Med       Date:  2012-09-28       Impact factor: 11.117

6.  On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report.

Authors:  Paul D Thomas; Valerie Wood; Christopher J Mungall; Suzanna E Lewis; Judith A Blake
Journal:  PLoS Comput Biol       Date:  2012-02-16       Impact factor: 4.475

7.  "Guilt by association" is the exception rather than the rule in gene networks.

Authors:  Jesse Gillis; Paul Pavlidis
Journal:  PLoS Comput Biol       Date:  2012-03-29       Impact factor: 4.475

8.  Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach.

Authors:  Carson Andorf; Drena Dobbs; Vasant Honavar
Journal:  BMC Bioinformatics       Date:  2007-08-03       Impact factor: 3.169

9.  A critical assessment of Mus musculus gene function prediction using integrated genomic evidence.

Authors:  Lourdes Peña-Castillo; Murat Tasan; Chad L Myers; Hyunju Lee; Trupti Joshi; Chao Zhang; Yuanfang Guan; Michele Leone; Andrea Pagnani; Wan Kyu Kim; Chase Krumpelman; Weidong Tian; Guillaume Obozinski; Yanjun Qi; Sara Mostafavi; Guan Ning Lin; Gabriel F Berriz; Francis D Gibbons; Gert Lanckriet; Jian Qiu; Charles Grant; Zafer Barutcuoglu; David P Hill; David Warde-Farley; Chris Grouios; Debajyoti Ray; Judith A Blake; Minghua Deng; Michael I Jordan; William S Noble; Quaid Morris; Judith Klein-Seetharaman; Ziv Bar-Joseph; Ting Chen; Fengzhu Sun; Olga G Troyanskaya; Edward M Marcotte; Dong Xu; Timothy R Hughes; Frederick P Roth
Journal:  Genome Biol       Date:  2008-06-27       Impact factor: 13.583

10.  Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

Authors:  Alexandra M Schnoes; Shoshana D Brown; Igor Dodevski; Patricia C Babbitt
Journal:  PLoS Comput Biol       Date:  2009-12-11       Impact factor: 4.475

Cited by (34 in total)

1.  Management of Dynamic Biomedical Terminologies: Current Status and Future Challenges. (Review)

Authors:  M Da Silveira; J C Dos Reis; C Pruski
Journal:  Yearb Med Inform       Date:  2015-08-13

2.  Unsupervised Extraction of Stable Expression Signatures from Public Compendia with an Ensemble of Neural Networks.

Authors:  Jie Tan; Georgia Doing; Kimberley A Lewis; Courtney E Price; Kathleen M Chen; Kyle C Cady; Barret Perchuk; Michael T Laub; Deborah A Hogan; Casey S Greene
Journal:  Cell Syst       Date:  2017-07-12       Impact factor: 10.304

3.  Multiset Statistics for Gene Set Analysis.

Authors:  Michael A Newton; Zhishi Wang
Journal:  Annu Rev Stat Appl       Date:  2015-04       Impact factor: 5.810

4.  Evolutionary Selection and Constraint on Human Knee Chondrocyte Regulation Impacts Osteoarthritis Risk.

Authors:  Daniel Richard; Zun Liu; Jiaxue Cao; Ata M Kiapour; Jessica Willen; Siddharth Yarlagadda; Evelyn Jagoda; Vijaya B Kolachalama; Jakob T Sieker; Gary H Chang; Pushpanathan Muthuirulan; Mariel Young; Anand Masson; Johannes Konrad; Shayan Hosseinzadeh; David E Maridas; Vicki Rosen; Roman Krawetz; Neil Roach; Terence D Capellini
Journal:  Cell       Date:  2020-03-26       Impact factor: 41.582

5.  Genetic variants in Alzheimer disease - molecular and brain network approaches. (Review)

Authors:  Chris Gaiteri; Sara Mostafavi; Christopher J Honey; Philip L De Jager; David A Bennett
Journal:  Nat Rev Neurol       Date:  2016-06-10       Impact factor: 42.937

6.  Monitoring changes in the Gene Ontology and their impact on genomic data analysis.

Authors:  Matthew Jacobson; Adriana Estela Sedeño-Cortés; Paul Pavlidis
Journal:  Gigascience       Date:  2018-08-01       Impact factor: 6.524

7.  Bias tradeoffs in the creation and analysis of protein-protein interaction networks.

Authors:  Jesse Gillis; Sara Ballouz; Paul Pavlidis
Journal:  J Proteomics       Date:  2014-01-27       Impact factor: 4.044

8.  Gene networks underlying convergent and pleiotropic phenotypes in a large and systematically-phenotyped cohort with heterogeneous developmental disorders.

Authors:  Tallulah Andrews; Stephen Meader; Anneke Vulto-van Silfhout; Avigail Taylor; Julia Steinberg; Jayne Hehir-Kwa; Rolph Pfundt; Nicole de Leeuw; Bert B A de Vries; Caleb Webber
Journal:  PLoS Genet       Date:  2015-03-17       Impact factor: 5.917

9.  Genome-Wide Detection and Analysis of Multifunctional Genes.

Authors:  Yuri Pritykin; Dario Ghersi; Mona Singh
Journal:  PLoS Comput Biol       Date:  2015-10-05       Impact factor: 4.475

10.  Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression.

Authors:  Sonja Lehtinen; Jon Lees; Jürg Bähler; John Shawe-Taylor; Christine Orengo
Journal:  PLoS One       Date:  2015-08-19       Impact factor: 3.240
