
efam: an expanded, metaproteome-supported HMM profile database of viral protein families.

Ahmed A Zayed, Dominik Lücking, Mohamed Mohssen, Dylan Cronin, Ben Bolduc, Ann C Gregory, Katherine R Hargreaves, Paul D Piehowski, Richard A White, Eric L Huang, Joshua N Adkins, Simon Roux, Cristina Moraru, Matthew B Sullivan.

Abstract

MOTIVATION: Viruses infect, reprogram, and kill microbes, leading to profound ecosystem consequences, from elemental cycling in oceans and soils to microbiome-modulated diseases in plants and animals. Although metagenomic datasets are increasingly available, identifying viruses in them is challenging due to poor representation and annotation of viral sequences in databases.
RESULTS: Here we establish efam, an expanded collection of Hidden Markov Model (HMM) profiles that represent viral protein families conservatively identified from the Global Ocean Virome 2.0 dataset. This resulted in 240,311 HMM profiles, each with at least 2 protein sequences, making efam >7-fold larger than the next largest, pan-ecosystem viral HMM profile database. Adjusting the criteria for viral contig confidence from "conservative" to "eXtremely Conservative" resulted in 37,841 HMM profiles in our efam-XC database. To assess the value of this resource, we integrated efam-XC into VirSorter viral discovery software to discover viruses from less-studied, ecologically distinct oxygen minimum zone (OMZ) marine habitats. This expanded database led to an increase in viruses recovered from every tested OMZ virome by ∼24% on average (up to ∼42%) and especially improved the recovery of often-missed shorter contigs (<5 kb). Additionally, to help elucidate lesser-known viral protein functions, we annotated the profiles using multiple databases from the DRAM pipeline and virion-associated metaproteomic data, which doubled the number of annotations obtainable by standard, single-database annotation approaches. Together, these marine resources (efam and efam-XC) are provided as searchable, compressed HMM databases that will be updated bi-annually to help maximize viral sequence discovery and study from any ecosystem.
AVAILABILITY: The resources are available on the iVirus platform at (doi.org/10.25739/9vze-4143).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
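As an illustration of how a searchable HMM database of this kind might be used, the sketch below post-processes the tabular output of a HMMER search. It assumes the predicted proteins were searched with something like `hmmsearch --tblout hits.tbl efam.hmm proteins.faa` (the file names are hypothetical), and it keeps the best-scoring profile for each protein. The profile identifiers, the example rows, and the 1e-5 E-value cutoff are illustrative assumptions, not values from the paper, and this is not the procedure VirSorter itself applies.

```python
from typing import Dict, Tuple


def best_hits(tblout_text: str, max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float, float]]:
    """Map each protein to its best (profile, E-value, bit score) hit.

    Expects the text of an `hmmsearch --tblout` file, where column 1 is the
    target (protein) name, column 3 the query (profile) name, column 5 the
    full-sequence E-value and column 6 the full-sequence bit score.
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # malformed row; ignore rather than guess
        protein, profile = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue  # assumed significance cutoff, adjust as needed
        if protein not in best or score > best[protein][2]:
            best[protein] = (profile, evalue, score)
    return best


# Made-up rows in tblout layout (truncated to the first seven columns).
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
contig1_gene1  -  efam_000001  -  1.2e-30  105.3  0.1
contig1_gene1  -  efam_000042  -  3.0e-12   45.0  0.0
contig2_gene3  -  efam_000007  -  0.5       10.2  0.0
"""

if __name__ == "__main__":
    for protein, (profile, evalue, score) in best_hits(EXAMPLE).items():
        print(protein, profile, evalue, score)
```

Keeping only the top-scoring profile per protein is one simple choice; counting how many proteins on a contig have any significant hit would be an equally reasonable summary, depending on the downstream question.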
© The Author(s) 2021. Published by Oxford University Press.

Year: 2021   PMID: 34132786   PMCID: PMC9502166   DOI: 10.1093/bioinformatics/btab451

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.931


