Jordy P M Coolen, Casper Jamin, Paul H M Savelkoul, John W A Rossen, Heiman F L Wertheim, Sébastien P Matamoros, Lieke B van Alphen.
Abstract
Whole-genome sequencing is becoming the de facto standard for bacterial outbreak surveillance and infection prevention. Its adoption is accompanied by a variety of bioinformatic tools and requires bioinformatics expertise for implementation. However, little is known about the concordance of reported outbreaks when different bioinformatic workflows are used. This multi-centre proficiency test among 13 major Dutch healthcare-affiliated centres assessed bacterial whole-genome outbreak analysis. Participating centres received two randomized bacterial datasets of Illumina sequences, one of Klebsiella pneumoniae and one of vancomycin-resistant Enterococcus faecium, and were asked to apply their own bioinformatic workflows. Centres reported back on antimicrobial resistance, multi-locus sequence typing (MLST), and outbreak clusters. The reported clusters were analysed using a method that compares landscapes of phylogenetic trees by calculating Kendall-Colijn distances. Furthermore, the fasta files were analysed by state-of-the-art single nucleotide polymorphism (SNP) analysis to mitigate the differences introduced by each centre and to determine standardized SNP cut-offs. The reported outbreak clusters revealed discrepancies between centres, even when almost identical bioinformatic workflows were used. Due to stringent filtering, some centres failed to detect extended-spectrum beta-lactamase genes and MLST loci. Applying a standardized method to determine outbreak clusters on the reported de novo assemblies did not result in uniformity of outbreak-cluster composition among centres.
Keywords: Bioinformatics; Infection Prevention Control; Outbreak analysis; Proficiency test; Whole genome sequencing; bacterial typing
Year: 2021 PMID: 34356004 PMCID: PMC8549354 DOI: 10.1099/mgen.0.000612
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. UPGMA tree-of-centres for both the KP and the VRE datasets. The trees indicate how the reported outbreak outcomes of all 13 centres relate to one another. Majority and geometric median calculations are added to the UPGMA trees. The data next to the UPGMA trees show the bioinformatic workflow used per centre, divided into read cleaning, assembly, and outbreak analysis tools. Furthermore, the cluster definitions applied per centre are plotted in barplots, and the outcome per centre is indicated in the barplots with cluster composition. Legends are integrated in the figure.
Fig. 2. Sample-to-sample relations as reported by the 13 participating centres. The figure is divided into the K. pneumoniae (KP) outcome and the VRE outcome. All samples are named according to the naming used throughout the study. The legend can be found in the right corner of the figure. The ST reported by the majority of the centres is added at each cluster.
Fig. 3. Sweep cut-off analysis results. The barplots illustrate the mean differences between the outbreak clusters reported among centres. For example, a distance of 0 means that centres reported identical outbreak clusters. The mean distance is calculated using the Kendall–Colijn distance metric. (a) Sweep cut-off analysis of the KP samples using the wgSNP method. (b) Sweep cut-off analysis of the VRE samples using the wgSNP method. (c) Sweep cut-off analysis of the KP samples using the cgSNP method. (d) Sweep cut-off analysis of the VRE samples using the cgSNP method.
Fig. 4. Illustration of differences in sample-to-sample relations between centre 9 and centre 12. For sweep cut-offs of 5, 10, 15 and 20 SNPs using the cgSNP method, the figure shows the differences in outbreak cluster composition between centre 9 and centre 12, for both the KP and the VRE samples.
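
The idea behind the sweep cut-off analysis in Fig. 3 and Fig. 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the workflow used in the study: the sample names and SNP distances are invented, clusters are formed by single linkage at each cut-off, and the count of disagreeing sample pairs is a simplified stand-in for the Kendall–Colijn distance, which is defined on trees.

```python
from itertools import combinations


def single_linkage_clusters(samples, snp_dist, cutoff):
    """Group samples linked by a chain of pairwise SNP distances <= cutoff."""
    parent = {s: s for s in samples}

    def find(s):
        # Follow parent links to the cluster representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(samples, 2):
        if snp_dist[frozenset((a, b))] <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for s in samples:
        clusters.setdefault(find(s), set()).add(s)
    return {frozenset(c) for c in clusters.values()}


def co_clustered_pairs(clusters):
    """All unordered sample pairs that share a cluster."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}


def pair_disagreement(clusters_a, clusters_b):
    """Number of sample pairs clustered together by one centre but not the other.

    Simplified stand-in for the Kendall-Colijn distance (assumption).
    """
    return len(co_clustered_pairs(clusters_a) ^ co_clustered_pairs(clusters_b))


def sweep(samples, dist_a, dist_b, cutoffs):
    """Disagreement between two centres at each SNP cut-off."""
    return {
        t: pair_disagreement(
            single_linkage_clusters(samples, dist_a, t),
            single_linkage_clusters(samples, dist_b, t),
        )
        for t in cutoffs
    }


def as_matrix(triples):
    """Build a symmetric lookup from (sample, sample, distance) triples."""
    return {frozenset((a, b)): d for a, b, d in triples}


# Hypothetical samples and SNP distances for two hypothetical centres.
samples = ["S1", "S2", "S3", "S4"]
centre_a = as_matrix([("S1", "S2", 3), ("S1", "S3", 12), ("S1", "S4", 40),
                      ("S2", "S3", 11), ("S2", "S4", 41), ("S3", "S4", 45)])
centre_b = as_matrix([("S1", "S2", 4), ("S1", "S3", 18), ("S1", "S4", 40),
                      ("S2", "S3", 17), ("S2", "S4", 42), ("S3", "S4", 44)])

result = sweep(samples, centre_a, centre_b, cutoffs=[5, 10, 15, 20])
print(result)
```

In this toy input the two centres agree at cut-offs of 5, 10 and 20 SNPs but disagree at 15, because only one centre's distances place S3 within reach of S1 and S2 at that threshold. A value of 0 corresponds to identical cluster composition, as in the captions above.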