Identifying viruses from metagenomic data using deep learning.

Jie Ren, Kai Song, Chao Deng, Nathan A Ahlgren, Jed A Fuhrman, Yi Li, Xiaohui Xie, Ryan Poplin, Fengzhu Sun.

Abstract

BACKGROUND: The recent development of metagenomic sequencing makes it possible to sequence microbial genomes, including viral genomes, at massive scale without the need for laboratory culture. Existing reference-based and gene homology-based methods are inefficient at identifying unknown viruses or short viral sequences in metagenomic data.
METHODS: Here we developed a reference-free and alignment-free machine learning method, DeepVirFinder, for identifying viral sequences in metagenomic data using deep learning.
RESULTS: Trained on viral RefSeq sequences discovered before May 2015 and evaluated on those discovered after that date, DeepVirFinder outperformed the state-of-the-art method VirFinder at all contig lengths, achieving AUROC of 0.93, 0.95, 0.97, and 0.98 for 300, 500, 1000, and 3000 bp sequences, respectively. Enlarging the training data with millions of additional purified viral sequences from metavirome samples further improved accuracy for under-represented virus groups. Applying DeepVirFinder to real human gut metagenomic samples, we identified 51,138 viral sequences belonging to 175 bins in patients with colorectal carcinoma (CRC). Ten bins were associated with cancer status, suggesting that viruses may play important roles in CRC.
CONCLUSIONS: Powered by deep learning and high-throughput metagenomic sequencing data, DeepVirFinder significantly improved the accuracy of viral identification and will assist the study of viruses in the era of metagenomics.
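
NOTE: The results above are reported as AUROC (area under the receiver operating characteristic curve). As a minimal sketch of how that metric can be computed for a viral/non-viral classifier, the Python function below uses the rank-based (Mann-Whitney U) formulation. It is not taken from DeepVirFinder: the function name `auroc`, the 1/0 label encoding, and the example scores are assumptions made for illustration only.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for viral, 0 for non-viral (hypothetical encoding).
    scores: classifier outputs, where higher means "more likely viral".
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sort indices by score, then give tied scores their average 1-based rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts (positive, negative) pairs ranked correctly; ties count 0.5.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Made-up scores for four contigs; not data from the paper.
    print(auroc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

An AUROC of 1.0 means every viral sequence scores higher than every non-viral one, and 0.5 corresponds to random ranking, which is the scale on which the 0.93 to 0.98 values above should be read.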

Keywords:  deep learning; machine learning; metagenome; virus identification

Year:  2020        PMID: 34084563      PMCID: PMC8172088          DOI: 10.1007/s40484-019-0187-4

Source DB:  PubMed          Journal:  Quant Biol        ISSN: 2095-4689

