Hierarchical Microbial Functions Prediction by Graph Aggregated Embedding.

Yujie Hou, Xiong Zhang, Qinyan Zhou, Wenxing Hong, Ying Wang.

Abstract

Matching 16S rRNA gene sequencing data to a metabolic reference database is a meaningful way to predict the metabolic function of bacteria and archaea, bringing greater insight into the workings of the microbial community. However, some operational taxonomic units (OTUs) cannot be functionally profiled, especially for microbial communities from non-human samples cultured in defective media. We therefore report the development of Hierarchical micrObial functions Prediction by graph aggregated Embedding (HOPE), which uses co-occurrence patterns and nucleotide sequences to predict microbial functions. HOPE integrates the topological structure of microbial co-occurrence networks with the k-mer compositions of OTU sequences and embeds them into a lower-dimensional continuous latent space while maximally preserving the topological relationships among OTUs. Our framework also accounts for the high imbalance among the KEGG Orthology (KO) functions of microbes, which usually yields poor performance: a hierarchical multitask learning module is used in HOPE to alleviate the challenge posed by the long-tailed distribution among classes. To test the performance of HOPE, we compare it with HOPE-one, HOPE-seq, and GraphSAGE on three microbial metagenomic 16S rRNA sequencing datasets: abalone gut, human gut, and the gut of Penaeus monodon. Experiments demonstrate that HOPE outperforms the baselines on almost all indexes in all experiments. Furthermore, HOPE shows significant generalization ability. The basic idea of HOPE is also suitable for related scenarios, such as predicting gene function from gene co-expression networks. The source code of HOPE is freely available at https://github.com/adrift00/HOPE.
Copyright © 2021 Hou, Zhang, Zhou, Hong and Wang.
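
The abstract describes combining k-mer compositions of OTU sequences with the structure of a co-occurrence network. The following is a minimal illustrative sketch of that general idea, not the authors' implementation (see the HOPE repository for that). It assumes normalized k-mer frequencies as node features and a single GraphSAGE-style mean aggregation step over neighbours; the function names `kmer_composition` and `mean_aggregate`, the toy sequences, and the toy adjacency matrix are all hypothetical.

```python
from itertools import product

import numpy as np


def kmer_composition(seq: str, k: int = 3) -> np.ndarray:
    """Return the normalized frequency of every DNA k-mer in `seq`."""
    alphabet = "ACGT"
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = np.zeros(len(index), dtype=float)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1.0
    total = counts.sum()
    return counts / total if total > 0 else counts


def mean_aggregate(features: np.ndarray, adjacency: np.ndarray) -> np.ndarray:
    """Concatenate each node's features with the mean of its neighbours' features."""
    degree = adjacency.sum(axis=1, keepdims=True)
    # Isolated nodes (degree 0) keep an all-zero neighbour mean.
    neighbour_mean = adjacency @ features / np.maximum(degree, 1.0)
    return np.concatenate([features, neighbour_mean], axis=1)


# Toy data (hypothetical): three OTU sequences and an undirected co-occurrence graph.
otu_seqs = ["ACGTACGTAC", "ACGTTTGTAC", "GGGCCCGGGC"]
features = np.stack([kmer_composition(s, k=2) for s in otu_seqs])
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)

aggregated = mean_aggregate(features, adjacency)
print(aggregated.shape)  # (3, 32): 16 own 2-mer frequencies + 16 neighbour means
```

In a trained model, an aggregated vector like this would typically pass through learned weight matrices and a nonlinearity before classification; those parts are omitted here because the abstract does not specify them.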

Keywords:  deep learning; functions prediction; graph embedding; hierarchical multi task learning; microbial co-occurrence networks

Year:  2021        PMID: 33584804      PMCID: PMC7874084          DOI: 10.3389/fgene.2020.608512

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


