Cataloguing the taxonomic origins of sequences from a heterogeneous sample using phylogenomics: applications in adventitious agent detection.

Robert L Charlebois, Siemon H S Ng, Lucy Gisonni-Lex, Laurent Mallet.

Abstract

We have designed and implemented a software system, named PhyloID™, that can be used to detect putative adventitious agents in biological samples characterized by next-generation sequencing. PhyloID is run in two steps, each being a self-contained automated process amenable to GMP validation. The first module, MiLY, is responsible for assembling individual sequence reads into contigs, and annotating all sequences with a unique sequence identifier, the number of reads in each contig, and the length of the sequence. The trimmed, assembled and annotated data are then processed by PhyloID's second module, NGmapper. NGmapper takes the FASTA-formatted output from MiLY and identifies the taxonomic origins of the contigs and singletons therein. It compares each sequence's BLASTN hit profile against the patterns of evolutionary relationships described within phylogenomic distance matrices for all of the various taxonomic groups, in order to find the best fit. NGmapper then produces lists of taxonomic assignments in both summarized and detailed form, and tree files for viewing results graphically. We optimized PhyloID's parameters and measured its performance using simulated metagenomic data and subsets of the reference phylogenies. PhyloID's precision and recall in identifying simulated sequences were measured by information retrieval analysis, focusing on read length, read number, sequence accuracy, background complexity, taxonomy and reference data coverage. We found PhyloID to be highly accurate and quantitative in its taxonomic mapping of sequences, with excellent precision, sensitivity and robustness. The degree of taxonomic representation in public databases remains a limitation, as expected for any sequence classifier, but community sequencing efforts are poised to overcome this problem.
To illustrate real-world usage of the application, we also describe some simple spike-recovery experiments as well as a multi-site comparative characterization of a viral suspension. These data help to illustrate, corroborate, and extend the results obtained using simulated data.

LAY ABSTRACT: In order to address gaps in the detection of contaminating viruses and microorganisms in vaccines and other biologicals, manufacturers are exploring the use of new technologies that promise greater sensitivity and breadth of coverage. One challenge in implementing such new methods is the complexity of analysis of the "big data" generated by these new instruments: hundreds of millions of sequence reads (segments of genetic material from viruses and cells) need to be compared against a vast and growing number of entries in genetic databases, in order to come up with a confident identification. These large-scale analyses must furthermore be carried out within the strict regulatory environment that governs the industry. We have developed an automated software pipeline named PhyloID™ that is capable of identifying viruses and microorganisms from large-scale sequence data. Using simulated data as well as real samples, we show that PhyloID is both sensitive and accurate in identifying any type of potential contaminant. Such a powerful new assay will be an important addition to the adventitious agent testing package, providing further assurance about product safety.

© PDA, Inc. 2014.
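The abstract reports precision and recall for taxonomic assignments of simulated sequences but does not give the calculation. The following is a minimal sketch of how such per-taxon scores are conventionally computed, not PhyloID's own code: the function name `precision_recall`, the dictionary inputs, and the example taxon labels are all hypothetical, and unclassified sequences are assumed to be simply absent from the assignment table.

```python
from collections import Counter


def precision_recall(assigned, truth):
    """Return {taxon: (precision, recall)}.

    assigned and truth both map sequence ID -> taxon label (hypothetical
    inputs, not a PhyloID format). Sequences missing from `assigned` are
    treated as unclassified; IDs absent from `truth` are ignored.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for seq_id, true_taxon in truth.items():
        called = assigned.get(seq_id)
        if called == true_taxon:
            tp[true_taxon] += 1
        else:
            fn[true_taxon] += 1      # the true taxon was missed
            if called is not None:
                fp[called] += 1      # a wrong taxon was reported

    scores = {}
    for taxon in set(tp) | set(fp) | set(fn):
        called_total = tp[taxon] + fp[taxon]
        true_total = tp[taxon] + fn[taxon]
        precision = tp[taxon] / called_total if called_total else None
        recall = tp[taxon] / true_total if true_total else None
        scores[taxon] = (precision, recall)
    return scores


# Illustrative data only: four simulated contigs, one misassigned, one unclassified.
truth = {"c1": "Virus A", "c2": "Virus A", "c3": "Host", "c4": "Host"}
assigned = {"c1": "Virus A", "c2": "Host", "c3": "Host"}
scores = precision_recall(assigned, truth)
```

In this toy case "Virus A" is never reported wrongly but one of its two contigs is missed, so its precision is 1.0 and its recall is 0.5. A taxon that is never reported gets a precision of `None` rather than zero, since the ratio is undefined.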

Keywords:  Adventitious agent detection; Bioinformatics; Metagenomics

Year:  2014        PMID: 25475635     DOI: 10.5731/pdajpst.2014.01023

Source DB:  PubMed          Journal:  PDA J Pharm Sci Technol        ISSN: 1079-7440


  3 in total

1.  Report of the second international conference on next generation sequencing for adventitious virus detection in biologics for humans and animals.

Authors:  Arifa S Khan; Johannes Blümel; Dieter Deforce; Marion F Gruber; Carmen Jungbäck; Ivana Knezevic; Laurent Mallet; David Mackay; Jelle Matthijnssens; Maureen O'Leary; Sebastiaan Theuns; Joseph Victoria; Pieter Neels
Journal:  Biologicals       Date:  2020-07-11       Impact factor: 1.856

2.  A Multicenter Study To Evaluate the Performance of High-Throughput Sequencing for Virus Detection.

Authors:  Arifa S Khan; Siemon H S Ng; Olivier Vandeputte; Aisha Aljanahi; Avisek Deyati; Jean-Pol Cassart; Robert L Charlebois; Lanyn P Taliaferro
Journal:  mSphere       Date:  2017-09-13       Impact factor: 4.389

3.  Sensitivity and breadth of detection of high-throughput sequencing for adventitious virus detection.

Authors:  Robert L Charlebois; Sarmitha Sathiamoorthy; Carine Logvinoff; Lucy Gisonni-Lex; Laurent Mallet; Siemon H S Ng
Journal:  NPJ Vaccines       Date:  2020-07-17       Impact factor: 7.344
