
Extensive complementarity between gene function prediction methods.

Vedrana Vidulin, Tomislav Šmuc, Fran Supek.

Abstract

MOTIVATION: The number of sequenced genomes rises steadily, but we still lack knowledge of the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions.
RESULTS: Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a high-ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them.
AVAILABILITY AND IMPLEMENTATION: The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/
CONTACT: fran.supek@irb.hr
SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
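
The integration strategy named in the abstract (keep the single most-confident prediction per gene/function instead of requiring agreement between methods) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `integrate_max_confidence`, the method labels, the gene and GO-term labels, and all scores are hypothetical placeholders, and the sketch assumes that confidence scores from different methods are already on a comparable scale.

```python
def integrate_max_confidence(predictions):
    """Combine per-method predictions by keeping the highest confidence.

    predictions: dict mapping a method name to a dict that maps
                 (gene, go_term) pairs to a confidence score.
    Returns a dict mapping (gene, go_term) to (confidence, method),
    where method is the one that produced the highest score.
    Assumption: scores are comparable across methods.
    """
    best = {}
    for method, scores in predictions.items():
        for pair, confidence in scores.items():
            # Keep this prediction only if it beats the current best
            # for the same gene/function pair.
            if pair not in best or confidence > best[pair][0]:
                best[pair] = (confidence, method)
    return best


# Toy input with made-up labels and scores (hypothetical, not from the paper).
predictions = {
    "method_1": {("geneA", "GO:term_x"): 0.91, ("geneB", "GO:term_x"): 0.20},
    "method_2": {("geneB", "GO:term_x"): 0.85, ("geneB", "GO:term_y"): 0.40},
}

integrated = integrate_max_confidence(predictions)
for (gene, term), (confidence, method) in sorted(integrated.items()):
    print(gene, term, confidence, method)
```

Because no agreement between methods is required, a gene/function pair predicted by only one method (such as `geneA` above) is retained, which matches the abstract's observation that different methods tend to cover non-overlapping sets of genes.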

Year:  2016        PMID: 27522084     DOI: 10.1093/bioinformatics/btw532

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  5 in total

1.  The evolutionary signal in metagenome phyletic profiles predicts many gene functions.

Authors:  Vedrana Vidulin; Tomislav Šmuc; Sašo Džeroski; Fran Supek
Journal:  Microbiome       Date:  2018-07-10       Impact factor: 14.650

2.  Integrated entropy-based approach for analyzing exons and introns in DNA sequences.

Authors:  Junyi Li; Li Zhang; Huinian Li; Yuan Ping; Qingzhe Xu; Rongjie Wang; Renjie Tan; Zhen Wang; Bo Liu; Yadong Wang
Journal:  BMC Bioinformatics       Date:  2019-06-10       Impact factor: 3.169

3.  INGA 2.0: improving protein function prediction for the dark proteome.

Authors:  Damiano Piovesan; Silvio C E Tosatto
Journal:  Nucleic Acids Res       Date:  2019-07-02       Impact factor: 16.971

4.  Predicting multicellular function through multi-layer tissue networks.

Authors:  Marinka Zitnik; Jure Leskovec
Journal:  Bioinformatics       Date:  2017-07-15       Impact factor: 6.937

5.  Patterns of diverse gene functions in genomic neighborhoods predict gene function and phenotype.

Authors:  Matej Mihelčić; Tomislav Šmuc; Fran Supek
Journal:  Sci Rep       Date:  2019-12-20       Impact factor: 4.379
