
FuncBase: a resource for quantitative gene function annotation.

John E Beaver, Murat Tasan, Francis D Gibbons, Weidong Tian, Timothy R Hughes, Frederick P Roth.

Abstract

SUMMARY: Computational gene function prediction can serve to focus experimental resources on high-priority experimental tasks. FuncBase is a web resource for viewing quantitative machine learning-based gene function annotations. Quantitative annotations of genes, including fungal and mammalian genes, with Gene Ontology terms are accompanied by a community feedback system. Evidence underlying function annotations is shown. For example, a custom Cytoscape viewer shows functional linkage graphs relevant to the gene or function of interest. FuncBase provides links to external resources, and may be accessed directly or via links from species-specific databases. AVAILABILITY: FuncBase as well as all underlying data and annotations are freely available via http://func.med.harvard.edu/

Year:  2010        PMID: 20495000      PMCID: PMC2894510          DOI: 10.1093/bioinformatics/btq265

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Computational prediction (e.g. of gene function, gene phenotype, protein interactions or genetic interactions) offers a statistically sound form of triage for reducing experimental tasks that would otherwise be prohibitive. For example, in genetic disease mapping, a candidate gene approach can reduce the study size required to establish significance. This is critically important, since large association studies are costly and may be infeasible for rare diseases. Functions are commonly represented by Gene Ontology (GO; Ashburner et al., 2000) terms, which encompass molecular functions, cellular locations and biological processes.

Experimentalists differ in their requirements for function prediction. To maximize new discoveries, some will wish to cast a wide net that may include many false positives. Others, for whom follow-up experiments are more resource-intensive, will wish to proceed conservatively. Therefore, FuncBase displays quantitative confidence measures by which predictions may be ranked. Because users typically have additional domain knowledge that they can draw upon to filter out unlikely predictions, FuncBase shows predictions in the context of underlying evidence.

FuncBase currently displays function annotations for several species. For each species, annotations are based on machine learning algorithms applied to an integrated data collection including protein motif annotation, phenotype and disease association, phylogenetic profiles, protein interactions and gene expression. Full descriptions of the underlying machine learning algorithms are provided in Tian et al. (2008), Peña-Castillo et al. (2008) and Taşan et al. (2008).
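As an illustration of why a quantitative confidence measure serves both kinds of experimentalist, the following minimal Python sketch ranks a handful of predictions and applies two different cut-offs. The `Prediction` record, the `rank_predictions` helper, the gene names, the scores and the assumption that scores lie in [0, 1] are all hypothetical; they are not part of FuncBase or its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    gene: str      # gene identifier (made-up example values below)
    go_term: str   # GO term identifier
    score: float   # assumed confidence in [0, 1]; higher = more confident


def rank_predictions(predictions, min_score=0.0):
    """Keep predictions scoring at least min_score, most confident first."""
    kept = [p for p in predictions if p.score >= min_score]
    return sorted(kept, key=lambda p: p.score, reverse=True)


# Illustrative records only; these are not real FuncBase annotations.
preds = [
    Prediction("geneA", "GO:0031111", 0.92),
    Prediction("geneB", "GO:0031111", 0.35),
    Prediction("geneC", "GO:0031111", 0.61),
]

# A low cut-off casts a wide net and tolerates more false positives.
wide_net = rank_predictions(preds, min_score=0.3)
# A high cut-off suits users whose follow-up experiments are costly.
conservative = rank_predictions(preds, min_score=0.9)
```

The same ranked list supports both strategies; only the threshold chosen by the user differs, which a binary call could not offer.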

2 BACKGROUND

For each gene-function pair examined, a gene function prediction algorithm may provide a binary ‘black or white’ classification, a ranking or a quantitative confidence measure. Interfaces displaying gene function predictions currently take one of three forms. In the first form, binary calls are incorporated into an existing species-specific database, such as the Saccharomyces Genome Database (SGD; Cherry et al., 1998) or the Mouse Genome Informatics resource (MGI; Bult et al., 2008). While ‘black or white’ calls are useful for archiving accepted knowledge about gene function, they are incomplete guides to grey areas of current knowledge. The second form of interface enables users to apply prediction algorithms to datasets provided by the user. This second form is taken by such websites as GeneMANIA (Mostafavi et al., 2008) and VIRGO (Massjouni et al., 2006). A third form, represented by FuncBase, STRING (von Mering et al., 2007) and BioPIXIE (Myers et al., 2005), is a browser of precalculated predictions ranked by confidence score, together with their literature verification status. Relaxing the requirement that quantitative predictions be generated ‘on the fly’ allows use of more computationally intensive prediction algorithms.

3 FEATURES

View predictions by gene or function: Predictions in FuncBase can be viewed either by function (GO term) or by gene. Users may search for their gene or function using a rich search syntax (Section 4) permitting entry of gene or protein synonyms from multiple identifier systems, and text-matching within gene or function descriptions (Fig. 1A).
Fig. 1.

Search (A) for an annotation report of a GO term (B) or gene (C). GO term reports show evidence of functional relationships (D) and function-related gene properties (E). The user may provide opinions (F) on any quantitative annotation. Gene reports also present evidence based on functional relationships (G).

Both function and gene views (examples shown in Figs 1B and C) allow predictions to be sorted by the confidence score from any available prediction method. GO annotations previously assigned by the corresponding species-specific authority are displayed next to each prediction.

View supporting evidence: Users may wish to further filter quantitative annotations based on their domain knowledge. Therefore, FuncBase displays key pieces of evidence underlying annotations. Some annotation algorithms take a guilt-by-profiling approach; for example, genes involved in ‘negative regulation of microtubule polymerization or depolymerization’ (GO:0031111) tend to contain a DH protein domain (InterPro pattern IPR000219). Therefore, each function view displays the gene properties that are most predictive of that function. A table (Fig. 1E), available by clicking an annotation row, indicates all properties held by the corresponding gene.

Some annotation algorithms take a guilt-by-association approach, in which GO annotations are ‘transferred’ between genes with evidence of a functional relationship (e.g. physical interaction between the corresponding proteins). Different variants of the functional linkage graphs are appropriate for different GO terms (see Taşan et al., 2008 and Tian et al., 2008), so function views display one graph (Fig. 1D), while gene views show three functional linkage graph versions corresponding to the three branches of the GO (Fig. 1G). Functional linkage graphs can be viewed in FuncBase as static images, or manipulated within Cytoscape (Shannon et al., 2003).
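The guilt-by-association idea can be sketched, under strong simplifying assumptions, as a weighted vote among a gene's neighbours in a functional linkage graph. The Python below is not the algorithm of Taşan et al. (2008) or Tian et al. (2008); it substitutes a plain weighted-neighbour fraction, and the function name `association_score`, the gene names and the linkage weights are hypothetical.

```python
def association_score(gene, term, linkage, annotations):
    """Weighted fraction of a gene's linked neighbours annotated with `term`.

    linkage:     dict mapping gene -> {neighbour: linkage weight > 0}
    annotations: dict mapping gene -> set of GO term identifiers
    Returns 0.0 for a gene with no linked neighbours.
    """
    neighbours = linkage.get(gene, {})
    total = sum(neighbours.values())
    if total == 0:
        return 0.0
    supporting = sum(
        weight
        for neighbour, weight in neighbours.items()
        if term in annotations.get(neighbour, set())
    )
    return supporting / total


# Made-up linkage graph and annotations, for illustration only.
linkage = {"geneX": {"geneA": 0.5, "geneB": 0.25, "geneC": 0.25}}
annotations = {
    "geneA": {"GO:0031111"},
    "geneB": set(),
    "geneC": {"GO:0031111"},
}

# geneA and geneC carry the term, so 0.75 of the linkage weight supports it.
score = association_score("geneX", "GO:0031111", linkage, annotations)
```

Displaying the linkage graph alongside such a score lets a user see which neighbours drive a prediction and discount those they consider unreliable.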
Quantitative annotations from multiple sources: A unique feature of FuncBase is its ability to accommodate prediction sets from multiple bioinformatics teams differing by input data or algorithm. For example, 10 prediction sets are available for Mus musculus. We invite others to submit predictions associated with peer-reviewed publications for sharing via FuncBase.

User feedback: FuncBase is governed by the philosophy that annotation in general and predictive annotation in particular is a work in progress, and that users will often bring domain knowledge that supersedes current or predicted annotation. Therefore, for every gene/function combination displayed, a form invites expert users to provide feedback on whether they agree, disagree or are uncertain about this annotation (Fig. 1F). Free text notes can be attached to any opinion. Current tallies of true and false responses are shared among all users and made available in summary form to the appropriate species authority. Community feedback on predictions gathered and shared in real time is novel to the FuncBase quantitative annotation resource.
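One way to picture the feedback tallies is as a counter per gene/function pair. The sketch below is an assumption about how such a tally could be kept in memory; the `FeedbackTally` class, its method names and the example records are hypothetical and do not describe the FuncBase database schema.

```python
from collections import Counter

# The three opinions a user may give, as described in the text.
VALID_OPINIONS = ("agree", "disagree", "uncertain")


class FeedbackTally:
    """In-memory tally of user opinions per (gene, GO term) pair."""

    def __init__(self):
        self._counts = {}  # (gene, go_term) -> Counter of opinions
        self._notes = {}   # (gene, go_term) -> list of free-text notes

    def record(self, gene, go_term, opinion, note=None):
        """Store one opinion and, optionally, an attached free-text note."""
        if opinion not in VALID_OPINIONS:
            raise ValueError(f"unknown opinion: {opinion!r}")
        key = (gene, go_term)
        self._counts.setdefault(key, Counter())[opinion] += 1
        if note:
            self._notes.setdefault(key, []).append(note)

    def summary(self, gene, go_term):
        """Return the current count of every opinion, including zeros."""
        counts = self._counts.get((gene, go_term), Counter())
        return {opinion: counts[opinion] for opinion in VALID_OPINIONS}

    def notes(self, gene, go_term):
        """Return a copy of the notes attached to this pair."""
        return list(self._notes.get((gene, go_term), []))


# Made-up feedback, for illustration only.
tally = FeedbackTally()
tally.record("geneA", "GO:0031111", "agree", note="consistent with our assay")
tally.record("geneA", "GO:0031111", "agree")
tally.record("geneA", "GO:0031111", "disagree")
result = tally.summary("geneA", "GO:0031111")
```

A summary of this shape is what could be shown to all users and passed on to a species authority.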

4 IMPLEMENTATION

The back-end of FuncBase consists of the Pylons MVC framework, the Lucene search provider and the PostgreSQL database server. The front-end uses ExtJS (JavaScript) and a modified version of Cytoscape 2.6. Most website actions are accomplished through asynchronous browser–server communication. Functional linkage graph layout is via BioLayoutKK within Cytoscape, using linkage certainties as edge weights.
REFERENCES

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

3.  SGD: Saccharomyces Genome Database.

Authors:  J M Cherry; C Adler; C Ball; S A Chervitz; S S Dwight; E T Hester; Y Jia; G Juvik; T Roe; M Schroeder; S Weng; D Botstein
Journal:  Nucleic Acids Res       Date:  1998-01-01       Impact factor: 16.971

4.  Discovery of biological networks from diverse functional genomic data.

Authors:  Chad L Myers; Drew Robson; Adam Wible; Matthew A Hibbs; Camelia Chiriac; Chandra L Theesfeld; Kara Dolinski; Olga G Troyanskaya
Journal:  Genome Biol       Date:  2005-12-19       Impact factor: 13.583

5.  STRING 7--recent developments in the integration and prediction of protein interactions.

Authors:  Christian von Mering; Lars J Jensen; Michael Kuhn; Samuel Chaffron; Tobias Doerks; Beate Krüger; Berend Snel; Peer Bork
Journal:  Nucleic Acids Res       Date:  2006-11-10       Impact factor: 16.971

6.  Entrez Gene: gene-centered information at NCBI.

Authors:  Donna Maglott; Jim Ostell; Kim D Pruitt; Tatiana Tatusova
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

7.  The Mouse Genome Database (MGD): mouse biology and model systems.

Authors:  Carol J Bult; Janan T Eppig; James A Kadin; Joel E Richardson; Judith A Blake
Journal:  Nucleic Acids Res       Date:  2007-12-23       Impact factor: 16.971

8.  An en masse phenotype and function prediction system for Mus musculus.

Authors:  Murat Taşan; Weidong Tian; David P Hill; Francis D Gibbons; Judith A Blake; Frederick P Roth
Journal:  Genome Biol       Date:  2008-06-27       Impact factor: 13.583

9.  A critical assessment of Mus musculus gene function prediction using integrated genomic evidence.

Authors:  Lourdes Peña-Castillo; Murat Tasan; Chad L Myers; Hyunju Lee; Trupti Joshi; Chao Zhang; Yuanfang Guan; Michele Leone; Andrea Pagnani; Wan Kyu Kim; Chase Krumpelman; Weidong Tian; Guillaume Obozinski; Yanjun Qi; Sara Mostafavi; Guan Ning Lin; Gabriel F Berriz; Francis D Gibbons; Gert Lanckriet; Jian Qiu; Charles Grant; Zafer Barutcuoglu; David P Hill; David Warde-Farley; Chris Grouios; Debajyoti Ray; Judith A Blake; Minghua Deng; Michael I Jordan; William S Noble; Quaid Morris; Judith Klein-Seetharaman; Ziv Bar-Joseph; Ting Chen; Fengzhu Sun; Olga G Troyanskaya; Edward M Marcotte; Dong Xu; Timothy R Hughes; Frederick P Roth
Journal:  Genome Biol       Date:  2008-06-27       Impact factor: 13.583

10.  GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function.

Authors:  Sara Mostafavi; Debajyoti Ray; David Warde-Farley; Chris Grouios; Quaid Morris
Journal:  Genome Biol       Date:  2008-06-27       Impact factor: 13.583
