Kathy E Raven, Beth Blane, Narender Kumar, Danielle Leek, Eugene Bragin, Francesc Coll, Julian Parkhill, Sharon J Peacock.
Abstract
Bacterial sequencing will become increasingly adopted in routine microbiology laboratories. Here, we report the findings of a technical evaluation of almost 800 clinical methicillin-resistant Staphylococcus aureus (MRSA) isolates, in which we sought to define key quality metrics to support MRSA sequencing in clinical practice. We evaluated the accuracy of mapping to a generic reference versus clonal complex (CC)-specific mapping, which is more computationally challenging. Focusing on isolates that were genetically related (<50 single nucleotide polymorphisms (SNPs)) and belonged to prevalent sequence types, concordance between these methods was 99.5 %. We used MRSA MPROS0386 to control for base calling accuracy by the sequencer, and used multiple repeat sequences of the control to define a permitted range of SNPs different to the mapping reference for this control (equating to 3 standard deviations from the mean). Repeat sequences of the control were also used to demonstrate that SNP calling was most accurate across differing coverage depths (above 35×, the lowest depth in our study) when the depth required to call a SNP as present was at least 4-8×. Using 786 MRSA sequences, we defined a robust measure for mec gene detection to reduce false-positives arising from contamination, which was no greater than 2 standard deviations below the average depth of coverage across the genome. Sequencing from bacteria harvested from clinical plates runs an increased risk of contamination with the same or different species, and we defined a cut-off of 30 heterozygous sites >50 bp apart to identify same-species contamination for MRSA. These metrics were combined into a quality-control (QC) flowchart to determine whether sequence runs and individual clinical isolates passed QC, which could be adapted by future automated analysis systems to enable rapid hands-off sequence analysis by clinical laboratories.
Keywords: MRSA; quality metrics; translational; whole-genome sequencing
Year: 2020 PMID: 32228804 PMCID: PMC7276698 DOI: 10.1099/mgen.0.000354
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. CC-specific versus a single mapping reference. (a) Graph showing the relationship between pairwise SNP distances based on mapping to a CC22 reference and mapping to a CC-specific reference, for all isolate pairs <50 SNPs apart that belonged to STs with >10 isolates in the collection, based on either method. Line indicates an exact match between the two methods. (b) Graph showing the number of SNPs different between CC22 and CC-specific mapping based on isolate pairs belonging to CC22 (orange) or non-CC22 (blue) STs.
Fig. 2. Sequencing depth to call a SNP as present. Graph showing the number of SNPs identified against the HO 5096 0412 mapping reference with MGEs removed (y-axis) in 43 sequencing replicates of MPROS0386 with different average depths of coverage (x-axis) based on varying thresholds for the depth of coverage required to call a SNP as present.
Fig. 3. Defining a cut-off for the detection of mec genes. (a) Comparison of the depth of coverage across the mapping reference to the depth of coverage identified across mecA or mecC for each isolate in the study collection (n=786). Red dot indicates an obvious outlier. (b) Graph showing the depth of coverage across the mapping reference and across mec genes for the 786 study isolates, in comparison to potential QC cut-offs of 1, 2 or 3 standard deviations from the mean depth across the mapping reference.
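The mec gene cut-off described in the abstract (mec depth no more than 2 standard deviations below the average genome depth) can be illustrated with a minimal sketch. The function name `mec_depth_passes` and its inputs are hypothetical, and the sketch assumes the standard deviation is taken over per-position depth across the mapping reference for the same isolate, which the source does not spell out.

```python
from statistics import mean, stdev


def mec_depth_passes(genome_depths, mec_depth, n_sd=2.0):
    """Return True if the mec gene depth is no more than n_sd standard
    deviations below the mean depth across the mapping reference.

    genome_depths: per-position depths across the reference (assumed input).
    mec_depth: mean depth across mecA or mecC for the same isolate.
    n_sd: number of standard deviations; 2 is the cut-off from the abstract.
    """
    average = mean(genome_depths)
    spread = stdev(genome_depths)  # sample SD; an assumption of this sketch
    return mec_depth >= average - n_sd * spread
```

With a mean genome depth of 50× and a standard deviation of about 7×, a mec depth of 45× would pass under these assumptions, whereas a depth of 5× (as might arise from low-level contamination) would not.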
Fig. 4. Defining a cut-off for same-species contamination. (a) Graph showing the number of heterozygous sites >50 bp apart for each of the 786 study isolates. Outliers are highlighted in red. (b) Magnification of Fig. 4a showing only those isolates with <100 heterozygous sites >50 bp apart. The line indicates the suggested cut-off of 30 heterozygous sites.
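One way to count heterozygous sites >50 bp apart is sketched below. The source does not state how spacing was enforced, so this sketch assumes a greedy pass over sorted positions that keeps a site only if it lies more than 50 bp from the last kept site. It also assumes an isolate is flagged when the count exceeds 30; whether the cut-off is inclusive is not stated in the source. Both function names are hypothetical.

```python
def count_spaced_sites(positions, min_gap=50):
    """Count heterozygous sites that lie more than min_gap bp apart,
    using a greedy pass over sorted positions (an assumed method)."""
    count = 0
    last_kept = None
    for pos in sorted(positions):
        if last_kept is None or pos - last_kept > min_gap:
            count += 1
            last_kept = pos
    return count


def same_species_contamination(positions, cutoff=30, min_gap=50):
    """Flag an isolate whose spaced heterozygous-site count exceeds cutoff.
    Treating the cut-off as exclusive is an assumption of this sketch."""
    return count_spaced_sites(positions, min_gap) > cutoff
```

For example, positions 100, 120, 200, 251 and 260 would give a count of 3 under this scheme, because 120 and 260 fall within 50 bp of a site that has already been kept.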
Fig. 5. A final QC flowchart for passing/failing positive (MPROS0386), negative (NCTC12241) and no template controls and clinical isolates during clinical MRSA sequencing. Note that 0.4 and 4 % of reads matching another species in Kraken equates to contamination of 1 and 10 %, respectively. If any of the controls fail any of the QC metrics, the entire sequence run will fail and require re-sequencing. If clinical isolates fail any of the QC metrics, that single isolate is failed and should be repeated without further analysis.
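The run-level and isolate-level decision rule stated in the caption can be sketched as follows. The function `evaluate_run` and its dictionary inputs are hypothetical; each boolean is assumed to already summarise whether that control or isolate passed every QC metric, since the individual flowchart steps are not reproduced here.

```python
def evaluate_run(control_passed, isolate_passed):
    """Apply the pass/fail rule from the flowchart caption.

    control_passed: maps control name -> True if it passed all QC metrics.
    isolate_passed: maps isolate name -> True if it passed all QC metrics.
    Returns a dict with the run outcome and the isolates to repeat.
    """
    # Any failed control fails the entire run, which is then re-sequenced.
    if not all(control_passed.values()):
        return {"run_passed": False, "repeat_isolates": []}
    # Otherwise only the individual failing isolates are repeated.
    repeats = sorted(name for name, ok in isolate_passed.items() if not ok)
    return {"run_passed": True, "repeat_isolates": repeats}
```

A run in which the positive, negative and no template controls all pass, but one clinical isolate fails, would therefore be accepted with that single isolate listed for repeat sequencing.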