Evaluating assembly and variant calling software for strain-resolved analysis of large DNA viruses.

Zhi-Luo Deng, Akshay Dhingra, Adrian Fritz, Jasper Götting, Philipp C Münch, Lars Steinbrück, Thomas F Schulz, Tina Ganzenmüller, Alice C McHardy.

Abstract

Infection with human cytomegalovirus (HCMV) can cause severe complications in immunocompromised individuals and congenitally infected children. Characterizing heterogeneous viral populations and their evolution by high-throughput sequencing of clinical specimens requires accurate assembly of individual strains or sequence variants and suitable variant calling methods. However, the performance of most methods has not been assessed for populations composed of low-divergence viral strains with large genomes, such as HCMV. In an extensive benchmarking study, we evaluated 15 assemblers and 6 variant callers on 10 lab-generated benchmark data sets created with two different library preparation protocols, to identify best practices and challenges for analyzing such data. Most assemblers, especially metaSPAdes and IVA, performed well across a range of metrics in recovering abundant strains. However, only one assembler, Savage, recovered low-abundance strains, and it did so in a highly fragmented manner. Two variant callers, LoFreq and VarScan2, excelled across all strain abundances. The two callers shared a large fraction of their false positive variant calls, which were strongly enriched in T to G changes in a 'G.G' context. The magnitude of this context-dependent systematic error is linked to the experimental protocol. We provide all benchmarking data, results and the entire benchmarking workflow, named QuasiModo (Quasispecies Metric determination on omics), under the GNU General Public License v3.0 (https://github.com/hzi-bifo/Quasimodo), to enable full reproducibility and further benchmarking on these and other data.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
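
The context-dependent error described in the abstract can be illustrated with a minimal sketch. This is not the QuasiModo implementation; it assumes that 'G.G' denotes a substituted base flanked by G on both sides, that variant calls are available as hypothetical (0-based position, alternate base) pairs, and that only the forward strand is considered. The function names and the toy reference sequence are made up for illustration.

```python
from collections import Counter


def substitution_context(reference: str, pos: int, alt: str) -> str:
    """Label a substitution with its flanking bases, e.g. 'G[T>G]G'.

    `pos` is a 0-based index into `reference` (an assumption of this sketch).
    """
    if pos <= 0 or pos >= len(reference) - 1:
        raise ValueError("position needs a flanking base on both sides")
    ref = reference[pos]
    if alt == ref:
        raise ValueError("alternate base equals reference base")
    return f"{reference[pos - 1]}[{ref}>{alt}]{reference[pos + 1]}"


def context_counts(reference: str, calls) -> Counter:
    """Count substitutions per trinucleotide context label."""
    return Counter(substitution_context(reference, pos, alt) for pos, alt in calls)


def fraction_in_context(counts: Counter, label: str = "G[T>G]G") -> float:
    """Fraction of all counted substitutions that carry the given label."""
    total = sum(counts.values())
    return counts[label] / total if total else 0.0


# Toy data (hypothetical): two T>G calls flanked by G, one unrelated call.
toy_reference = "AGTGCCGTGA"
toy_false_positives = [(2, "G"), (7, "G"), (4, "A")]
toy_counts = context_counts(toy_reference, toy_false_positives)
toy_fraction = fraction_in_context(toy_counts)
```

Comparing such a fraction between false positive calls and all callable sites would be one way to check for enrichment. A real analysis would also need to handle the reverse strand (where T to G appears as A to C) and 1-based coordinates as used in VCF files.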

Keywords:  HCMV; benchmark; genome assembly; strain mixtures; variant calling; virus

Year: 2021    PMID: 34020538    PMCID: PMC8138829    DOI: 10.1093/bib/bbaa123

Source DB: PubMed    Journal: Brief Bioinform    ISSN: 1467-5463    Impact factor: 11.622

