
Alignment-free $d_2^*$ oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences.

Nathan A Ahlgren, Jie Ren, Yang Young Lu, Jed A Fuhrman, Fengzhu Sun.

Abstract

Viruses and their host genomes often share similar oligonucleotide frequency (ONF) patterns, which can be used to predict the host of a given virus by finding the host with the greatest ONF similarity. We comprehensively compared 11 ONF metrics using several k-mer lengths for predicting host taxonomy from among ∼32,000 prokaryotic genomes for 1427 virus isolate genomes whose true hosts are known. The background-subtracting measure $d_2^*$ at k = 6 gave the highest host prediction accuracy (33%, genus level) with reasonable computational times. Requiring a maximum dissimilarity score for making predictions (thresholding) and taking the consensus of the 30 most similar hosts further improved accuracy. Using a previous dataset of 820 bacteriophage and 2699 bacterial genomes, $d_2^*$ host prediction accuracies with thresholding and consensus methods (genus level: 64%) exceeded those of previous Euclidean distance ONF (32%) or homology-based (22-62%) methods. When applied to metagenomically assembled marine SUP05 viruses and the human gut virus crAssphage, $d_2^*$-based predictions partially overlapped with the previously inferred hosts of these viruses (i.e. some predictions agreed and some differed). The extent of overlap improved when only host genomes or metagenomic contigs from the same habitat or samples as the query viruses were used. The $d_2^*$ ONF method will greatly improve the characterization of novel, metagenomic viruses.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
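
The workflow summarized in the abstract (score each candidate host by $d_2^*$ dissimilarity, optionally decline to predict above a maximum score, then take a consensus over the most similar hosts) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. It assumes the commonly cited form of $d_2^*$, in which k-mer counts are centered by their expected counts, and it uses a simple i.i.d. base-frequency background, single-strand counting, and a majority vote for the consensus. All function and parameter names (`d2star`, `predict_host`, `max_dissimilarity`, `n_consensus`) are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers that consist only of A/C/G/T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in ALPHABET for c in word):
            counts[word] += 1
    return counts


def expected_counts(seq, k):
    """Expected k-mer counts under an i.i.d. base model (an assumption here)."""
    seq = seq.upper()
    base = Counter(c for c in seq if c in ALPHABET)
    total = sum(base.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T characters")
    freq = {c: base[c] / total for c in ALPHABET}
    n_words = max(len(seq) - k + 1, 0)
    expected = {}
    for letters in product(ALPHABET, repeat=k):
        p = 1.0
        for c in letters:
            p *= freq[c]
        expected["".join(letters)] = n_words * p
    return expected


def d2star(seq_x, seq_y, k):
    """d2*-style dissimilarity in [0, 1] from background-centered k-mer counts."""
    cx, cy = kmer_counts(seq_x, k), kmer_counts(seq_y, k)
    ex, ey = expected_counts(seq_x, k), expected_counts(seq_y, k)
    dot = norm_x = norm_y = 0.0
    for word in ex:
        scale = sqrt(ex[word] * ey[word])
        if scale == 0.0:
            continue  # word impossible under one of the background models
        x = cx[word] - ex[word]  # centered count in sequence X
        y = cy[word] - ey[word]  # centered count in sequence Y
        dot += x * y / scale
        norm_x += x * x / scale
        norm_y += y * y / scale
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.5  # no usable signal; treat as uncorrelated
    return 0.5 * (1.0 - dot / (sqrt(norm_x) * sqrt(norm_y)))


def predict_host(virus, hosts, k=6, max_dissimilarity=None, n_consensus=1):
    """Predict a host taxon from (taxon_label, sequence) pairs.

    Returns None when there are no hosts or when the best score exceeds
    max_dissimilarity (thresholding). Otherwise returns the most common
    taxon among the n_consensus most similar hosts (majority vote, which
    is an assumption about how the consensus is formed).
    """
    scored = sorted((d2star(virus, seq, k), taxon) for taxon, seq in hosts)
    if not scored:
        return None
    if max_dissimilarity is not None and scored[0][0] > max_dissimilarity:
        return None
    top = [taxon for _, taxon in scored[:n_consensus]]
    return Counter(top).most_common(1)[0][0]
```

With the settings reported in the abstract, a call would look like `predict_host(virus_seq, host_list, k=6, n_consensus=30)`, plus a `max_dissimilarity` cutoff chosen by the user. The abstract does not state a threshold value, so none is suggested here.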

Year:  2016        PMID: 27899557      PMCID: PMC5224470          DOI: 10.1093/nar/gkw1002

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


