Seth Commichaux1,2,3, Kiran Javkar4,5,6, Padmini Ramachandran7, Niranjan Nagarajan8, Denis Bertrand8, Yi Chen7, Elizabeth Reed7, Narjol Gonzalez-Escalona7, Errol Strain9, Hugh Rand7, Mihai Pop5, Andrea Ottesen10.
1. Center for Food Safety and Applied Nutrition, Food and Drug Administration, Laurel, MD, USA. Seth.Commichaux@fda.hhs.gov.
2. Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA.
3. Biological Science Graduate Program, University of Maryland, College Park, MD, USA.
4. Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA.
5. Department of Computer Science, University of Maryland, College Park, MD, USA.
6. Joint Institute for Food Safety and Applied Nutrition, University of Maryland, College Park, MD, USA.
7. Center for Food Safety and Nutrition, Food and Drug Administration, College Park, MD, USA.
8. Computational and Systems Biology, Genome Institute of Singapore, Singapore, 13862, Singapore.
9. Center for Food Safety and Applied Nutrition, Food and Drug Administration, Laurel, MD, USA.
10. Center for Veterinary Medicine, Food and Drug Administration, Laurel, MD, USA.
Abstract
BACKGROUND: Whole genome sequencing (WGS) of cultured pathogens is the state-of-the-art public health response for the bioinformatic source tracking of illness outbreaks. Quasimetagenomics can substantially reduce the amount of culturing needed before a high-quality genome can be recovered. Highly accurate short read data are analyzed for single nucleotide polymorphisms and multi-locus sequence types to differentiate strains, but short reads cannot span many genomic repeats, which results in highly fragmented assemblies. Long reads can span repeats, yielding much more contiguous assemblies, but have lower accuracy than short reads.

RESULTS: We evaluated the accuracy of Listeria monocytogenes assemblies from enrichments (quasimetagenomes) of naturally contaminated ice cream using long read (Oxford Nanopore) and short read (Illumina) sequencing data. The accuracy of ten assembly approaches, over a range of sequencing depths, was evaluated by comparing the sequence similarity of genes in the assemblies to a complete reference genome. Long read assemblies reconstructed a circularized genome as well as a 71 kbp plasmid after 24 h of enrichment; however, high error rates prevented high-fidelity gene assembly, even at 150X depth of coverage. Short read assemblies accurately reconstructed the core genes after 28 h of enrichment but produced highly fragmented genomes. Hybrid approaches showed promising results but had biases that depended on the initial assembly strategy. Short read assemblies scaffolded with long reads accurately assembled the core genes after just 24 h of enrichment but were highly fragmented. Long read assemblies polished with short reads reconstructed a circularized genome and plasmid and assembled all the genes after 24 h of enrichment, but with less fidelity for the core genes than the short read assemblies.

CONCLUSION: The integration of long and short read sequencing of quasimetagenomes expedited the reconstruction of a high-quality pathogen genome compared to either platform alone. Incorporating long read analyses into the standard short read WGS outbreak response can add a new and more complete level of information about genome structure, gene order and mobile elements to the public health response.
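The gene-level accuracy comparison described in RESULTS can be illustrated with a minimal sketch. The abstract does not name the tool used for the comparison, so this only shows the general idea under stated assumptions: each assembled gene has already been aligned to its reference counterpart (gaps written as "-"), and the function names, gene names, and sequences below are hypothetical.

```python
from typing import Dict, List


def percent_identity(aligned_ref: str, aligned_asm: str) -> float:
    """Percent of alignment columns where the assembly matches the reference.

    Assumes the two sequences are already aligned (equal length, gaps as '-').
    A gap in either sequence counts as a mismatch.
    """
    if len(aligned_ref) != len(aligned_asm):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    ref = aligned_ref.upper()
    asm = aligned_asm.upper()
    matches = sum(1 for r, a in zip(ref, asm) if r == a and r != "-")
    return 100.0 * matches / len(ref)


def genes_at_or_above(identities: Dict[str, float], threshold: float = 100.0) -> List[str]:
    """Names of genes whose identity to the reference meets the threshold."""
    return sorted(gene for gene, pid in identities.items() if pid >= threshold)


# Hypothetical toy alignments (reference, assembly); not data from the study.
toy_alignments = {
    "geneA": ("ATGGCGTTAA", "ATGGCGTTAA"),  # identical
    "geneB": ("ATGGCGTTAA", "ATGG-GTTAA"),  # one deleted base in the assembly
}

identities = {
    gene: percent_identity(ref, asm) for gene, (ref, asm) in toy_alignments.items()
}
perfect_genes = genes_at_or_above(identities, threshold=100.0)
```

In this toy case a single missing base drops geneB below a 100% threshold, which is the kind of small error that would keep an assembly from reproducing a gene with high fidelity.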