Ayorinde O Afolayan, Johan Fabian Bernal, June M Gayeta, Melissa L Masim, Varun Shamanna, Monica Abrudan, Khalil Abudahab, Silvia Argimón, Celia C Carlos, Sonia Sia, Kadahalli L Ravikumar, Iruka N Okeke, Pilar Donado-Godoy, David M Aanensen, Anthony Underwood.
Abstract
Performing whole genome sequencing (WGS) for the surveillance of antimicrobial resistance offers the ability to determine not only the antimicrobials to which rates of resistance are increasing, but also the evolutionary mechanisms and transmission routes responsible for the increase at local, national, and global scales. To derive WGS-based outputs, a series of processes are required, beginning with sample and metadata collection, followed by nucleic acid extraction, library preparation, sequencing, and analysis. Throughout this pathway there are many data-related operations required (informatics) combined with more biologically focused procedures (bioinformatics). For a laboratory aiming to implement pathogen genomics, the informatics and bioinformatics activities can be a barrier to starting on the journey; for a laboratory that has already started, these activities may become overwhelming. Here we describe these data bottlenecks and how they have been addressed in laboratories in India, Colombia, Nigeria, and the Philippines, as part of the National Institute for Health Research Global Health Research Unit on Genomic Surveillance of Antimicrobial Resistance. The approaches taken include the use of reproducible data parsing pipelines and genome sequence analysis workflows, using technologies such as Data-flo, the Nextflow workflow manager, and containerization of software dependencies. By overcoming barriers to WGS implementation in countries where genome sampling for some species may be underrepresented, a body of evidence can be built to determine the concordance of antimicrobial sensitivity testing and genome-derived resistance, and novel high-risk clones and unknown mechanisms of resistance can be discovered.
Keywords: WGS; antimicrobial resistance; bioinformatics; metadata; whole genome sequencing
Year: 2021 PMID: 34850839 PMCID: PMC8634317 DOI: 10.1093/cid/ciab785
Source DB: PubMed Journal: Clin Infect Dis ISSN: 1058-4838 Impact factor: 9.079
Figure 1. An overview of one potential pathway from sample to phenotypic and genomic outputs. The bottleneck icons mark steps in the process that can cause particular implementation challenges. ① Sample metadata cleaning and validation. ② Conversion of antimicrobial sensitivity testing minimum inhibitory concentration data into standardized formats for downstream processing and interpretation. ③ Quantitative quality assessment of raw reads and assemblies. ④ Processing raw reads to detect the presence or absence of genetic loci, genes, specific nonsynonymous mutations, and variants. ⑤ Aggregating results to produce human-readable reports. Abbreviations: AMR, antimicrobial resistance; AST, antimicrobial sensitivity testing; MLST, multilocus sequence typing; ST, sequence type.
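As an illustration of bottleneck ③, the following is a minimal sketch of one quantitative assembly check. It assumes that N50 and contig count are among the metrics assessed, which the caption does not state; the function names and thresholds are hypothetical and are not taken from the pipelines described here.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    covered = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= total:
            return length


def passes_assembly_qc(contig_lengths, min_n50, max_contigs):
    """Hypothetical pass/fail rule: N50 high enough and not too fragmented.

    The thresholds are assumptions supplied by the caller; real values
    would be chosen per species and per sequencing protocol.
    """
    return n50(contig_lengths) >= min_n50 and len(contig_lengths) <= max_contigs
```

For example, `n50([100, 200, 300, 400])` returns `300`, because the two longest contigs (400 + 300) are the first to cover at least half of the 1000 bases.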
Figure 2. Diagram showing the flow of data from sample receipt to final outputs and highlighting the solutions used for each step. The numbers refer to the same data bottlenecks described in Figure 1. The diagram starts when each bacterial sample is submitted, accompanied by associated metadata. The sample is processed by traditional phenotypic antimicrobial sensitivity testing to produce minimum inhibitory concentration data. In parallel, genomic DNA from the sample is extracted and sequenced, and whole genome sequencing data are processed through reproducible bioinformatics pipelines to produce multiple outputs such as multilocus sequence type, antimicrobial resistance determinant prediction, and single-nucleotide polymorphism–based phylogenies. These data are aggregated using Data-flo and stored in Google Sheets, where they can be combined and manipulated using downstream processes such as R scripts or Data-flo pipelines to make final visualizations or reports. Abbreviations: AMR, antimicrobial resistance; AST, antimicrobial sensitivity testing; MIC, minimum inhibitory concentration; MLST, multilocus sequence typing; QC, quality control; RIS, Resistant, Intermediate, Susceptible; SNP, single-nucleotide polymorphism; WGS, whole genome sequencing.
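The conversion of MIC data into RIS categories can be sketched as below. This is a minimal illustration and not the Data-flo implementation: the function names are hypothetical, the input format (an optional qualifier such as `<=` or `>` followed by a number) is an assumption, and the breakpoints are passed in by the caller because real values depend on the organism, the antimicrobial, and the guideline in use.

```python
import re

# Optional qualifier (<=, >=, <, >, =) followed by a non-negative number.
MIC_PATTERN = re.compile(r"^\s*(<=|>=|<|>|=)?\s*(\d+(?:\.\d+)?)\s*$")


def parse_mic(raw):
    """Split a raw MIC string into (qualifier, value); '=' if no qualifier."""
    normalized = raw.replace("≤", "<=").replace("≥", ">=")
    match = MIC_PATTERN.match(normalized)
    if match is None:
        raise ValueError(f"unrecognized MIC value: {raw!r}")
    return match.group(1) or "=", float(match.group(2))


def classify_mic(raw, susceptible_max, resistant_min):
    """Return 'S', 'I', 'R', or None when a censored value is ambiguous.

    Assumed rule: MIC <= susceptible_max is S, MIC >= resistant_min is R,
    and anything in between is I.
    """
    qualifier, value = parse_mic(raw)
    if qualifier in ("<", "<="):
        # The true MIC lies at or below the reported value.
        return "S" if value <= susceptible_max else None
    if qualifier in (">", ">="):
        # The true MIC lies at or above the reported value.
        return "R" if value >= resistant_min else None
    if value <= susceptible_max:
        return "S"
    if value >= resistant_min:
        return "R"
    return "I"
```

With illustrative breakpoints of `susceptible_max=1` and `resistant_min=4`, the value `"<=0.25"` is classified as `"S"`, `"2"` as `"I"`, and `">8"` as `"R"`, while `">0.5"` returns `None` because the true MIC could fall in any category.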
Genome Assembly Training Stream
| Tier Outcome | Tier Title | User Proficiency | Notes |
|---|---|---|---|
| Understand the principles and be able to perform hands-on analysis using web tools | Genome assembly tutorial: principles and web-based analysis | Genomic scientist | Use the Galaxy web platform to run example samples and assess the output |
| Be able to implement analysis using command line | Genome assembly tutorial: command-line analysis | Command-line user | Run assembly with the command-line tools underpinning the Galaxy interface. The same parameters are employed so that the connection between running the assembly via the website and on the command line is apparent. |
| Be able to run reproducible high-throughput analysis and interpret the results | Genome assembly tutorial: reproducible batch processing | Command-line user | An in-depth knowledge of the command line is not required for this training. |
Figure 3. Implementation vignettes. Abbreviations: AMR, antimicrobial resistance; KIMS, Kempegowda Institute of Medical Sciences; SOP, standard operating procedure; WGS, whole genome sequencing.