Interpretable detection of novel human viruses from genome sequencing data.

Jakub M Bartoszewicz, Anja Seidel, Bernhard Y Renard.

Abstract

Viruses evolve extremely quickly, so reliable methods for viral host prediction are necessary to safeguard biosecurity and biosafety alike. Novel human-infecting viruses are difficult to detect with standard bioinformatics workflows. Here, we predict whether a virus can infect humans directly from next-generation sequencing (NGS) reads. We show that deep neural architectures significantly outperform both shallow machine learning and standard, homology-based algorithms, cutting the error rates in half and generalizing to taxonomic units distant from those presented during training. Further, we develop a suite of interpretability tools and show that it can also be applied to models beyond the host prediction task. We propose a new approach for convolutional filter visualization to disentangle the information content of each nucleotide from its contribution to the final classification decision. Nucleotide-resolution maps of the learned associations between pathogen genomes and the infectious phenotype can be used to detect regions of interest in novel agents, for example, the SARS-CoV-2 coronavirus, which was unknown before it caused the COVID-19 pandemic in 2020. All methods presented here are implemented as easy-to-install packages that not only enable analysis of NGS datasets without requiring any deep learning skills, but also allow advanced users to easily train and explain new models for genomics.
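
The abstract does not say how reads are presented to the network or how read-level predictions are combined. As a rough illustration only, the sketch below assumes one-hot encoding of each read (a common input representation for convolutional models on DNA) and a simple mean over per-read scores. The names one_hot_encode and mean_score are hypothetical and are not the API of the authors' packages.

```python
from typing import List

# Column order of the one-hot encoding (an arbitrary choice for this sketch).
ALPHABET = "ACGT"


def one_hot_encode(read: str) -> List[List[int]]:
    """Encode a read as one 4-element row per base.

    Bases outside ALPHABET (for example N) become all-zero rows.
    """
    rows = []
    for base in read.upper():
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def mean_score(read_scores: List[float]) -> float:
    """Average per-read scores into one sample-level score.

    Assumption: plain averaging; the paper may aggregate differently.
    """
    if not read_scores:
        raise ValueError("no read scores given")
    return sum(read_scores) / len(read_scores)
```

A trained classifier would sit between the two functions, mapping each encoded read to a score; that model is deliberately left out here because the abstract gives no architectural details.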


Year: 2021 PMID: 33554119 PMCID: PMC7849996 DOI: 10.1093/nargab/lqab004

Source DB: PubMed Journal: NAR Genom Bioinform ISSN: 2631-9268

63 in total

1. Editorial commentary: Unbiased next-generation sequencing and new pathogen discovery: undeniable advantages and still-existing drawbacks.

Authors: Arianna Calistri; Giorgio Palù
Journal: Clin Infect Dis Date: 2015-01-07 Impact factor: 9.079

2. Deep Motif Dashboard: visualizing and understanding genomic sequences using deep neural networks.

Authors: Jack Lanchantin; Ritambhara Singh; Beilun Wang; Yanjun Qi
Journal: Pac Symp Biocomput Date: 2017

3. Unified rational protein engineering with sequence-based deep representation learning.

Authors: Ethan C Alley; Grigory Khimulya; Surojit Biswas; Mohammed AlQuraishi; George M Church
Journal: Nat Methods Date: 2019-10-21 Impact factor: 28.547

4. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation.

Authors: Sebastian Bach; Alexander Binder; Grégoire Montavon; Frederick Klauschen; Klaus-Robert Müller; Wojciech Samek
Journal: PLoS One Date: 2015-07-10 Impact factor: 3.240

5. Convolutional neural network architectures for predicting DNA-protein binding.

Authors: Haoyang Zeng; Matthew D Edwards; Ge Liu; David K Gifford
Journal: Bioinformatics Date: 2016-06-15 Impact factor: 6.937

6. Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study.

Authors: Anupama Jha; Joseph K Aicher; Matthew R Gazzara; Deependra Singh; Yoseph Barash
Journal: Genome Biol Date: 2020-06-19 Impact factor: 13.583

7. Host Taxon Predictor - A Tool for Predicting Taxon of the Host of a Newly Discovered Virus.

Authors: Wojciech Gałan; Maciej Bąk; Małgorzata Jakubowska
Journal: Sci Rep Date: 2019-03-05 Impact factor: 4.379

8. Next Steps for Access to Safe, Secure DNA Synthesis.

Authors: James Diggans; Emily Leproust
Journal: Front Bioeng Biotechnol Date: 2019-04-24

9. Integrating regulatory DNA sequence and gene expression to predict genome-wide chromatin accessibility across cellular contexts.

Authors: Surag Nair; Daniel S Kim; Jacob Perricone; Anshul Kundaje
Journal: Bioinformatics Date: 2019-07-15 Impact factor: 6.937

10. A new coronavirus associated with human respiratory disease in China.

Authors: Fan Wu; Su Zhao; Bin Yu; Yan-Mei Chen; Wen Wang; Zhi-Gang Song; Yi Hu; Zhao-Wu Tao; Jun-Hua Tian; Yuan-Yuan Pei; Ming-Li Yuan; Yu-Ling Zhang; Fa-Hui Dai; Yi Liu; Qi-Min Wang; Jiao-Jiao Zheng; Lin Xu; Edward C Holmes; Yong-Zhen Zhang
Journal: Nature Date: 2020-02-03 Impact factor: 49.962

8 in total

1. Review: The science of the host-virus network.

Authors: Gregory F Albery; Daniel J Becker; Liam Brierley; Cara E Brook; Rebecca C Christofferson; Lily E Cohen; Tad A Dallas; Evan A Eskew; Anna Fagre; Maxwell J Farrell; Emma Glennon; Sarah Guth; Maxwell B Joseph; Nardus Mollentze; Benjamin A Neely; Timothée Poisot; Angela L Rasmussen; Sadie J Ryan; Stephanie Seifert; Anna R Sjodin; Erin M Sorrell; Colin J Carlson
Journal: Nat Microbiol Date: 2021-11-24 Impact factor: 30.964

2. AMAISE: a machine learning approach to index-free sequence enrichment.

Authors: Meera Krishnamoorthy; Piyush Ranjan; John R Erb-Downward; Robert P Dickson; Jenna Wiens
Journal: Commun Biol Date: 2022-06-09

3. Explainable deep neural networks for novel viral genome prediction.

Authors: Chandra Mohan Dasari; Raju Bhukya
Journal: Appl Intell (Dordr) Date: 2021-06-25 Impact factor: 5.019

4. Characterizing and Evaluating the Zoonotic Potential of Novel Viruses Discovered in Vampire Bats.

Authors: Laura M Bergner; Nardus Mollentze; Richard J Orton; Carlos Tello; Alice Broos; Roman Biek; Daniel G Streicker
Journal: Viruses Date: 2021-02-06 Impact factor: 5.048

5. Predicting the animal hosts of coronaviruses from compositional biases of spike protein and whole genome sequences through machine learning.

Authors: Liam Brierley; Anna Fowler
Journal: PLoS Pathog Date: 2021-04-20 Impact factor: 6.823

6. Correcting the Estimation of Viral Taxa Distributions in Next-Generation Sequencing Data after Applying Artificial Neural Networks.

Authors: Moritz Kohls; Magdalena Kircher; Jessica Krepel; Pamela Liebig; Klaus Jung
Journal: Genes (Basel) Date: 2021-10-31 Impact factor: 4.096

7. Review: Chaos game representation and its applications in bioinformatics.

Authors: Hannah Franziska Löchel; Dominik Heider
Journal: Comput Struct Biotechnol J Date: 2021-11-10 Impact factor: 7.271

8. Identifying and prioritizing potential human-infecting viruses from their genome sequences.

Authors: Nardus Mollentze; Simon A Babayan; Daniel G Streicker
Journal: PLoS Biol Date: 2021-09-28 Impact factor: 8.029
