Mostafa Y Abdel-Glil, Alexandra Chiaverini, Giuliano Garofolo, Antonio Fasanella, Antonio Parisi, Dag Harmsen, Keith A Jolley, Mandy C Elschner, Herbert Tomaso, Jörg Linde, Domenico Galante.
Abstract
Whole-genome sequencing (WGS) has been established for bacterial subtyping and is regularly used to study pathogen transmission, to investigate outbreaks, and to perform routine surveillance. Core-genome multilocus sequence typing (cgMLST) is a bacterial subtyping method that uses WGS data to provide a high-resolution strain characterization. This study aimed at developing a novel cgMLST scheme for Bacillus anthracis, a notorious pathogen that causes anthrax in livestock and humans worldwide. The scheme comprises 3,803 genes that were conserved in 57 B. anthracis genomes spanning the whole phylogeny. The scheme has been evaluated and applied to 584 genomes from 50 countries. On average, 99.5% of the cgMLST targets were detected. The cgMLST results confirmed the classical canonical single-nucleotide-polymorphism (SNP) grouping of B. anthracis into major clades and subclades. Genetic distances calculated based on cgMLST were comparable to distances from whole-genome-based SNP analysis with similar phylogenetic topology and comparable discriminatory power. Additionally, the application of the cgMLST scheme to anthrax outbreaks from Germany and Italy led to a definition of a cutoff threshold of five allele differences to trace epidemiologically linked strains for cluster typing and transmission analysis. Finally, the association of two clusters of B. anthracis with human cases of injectional anthrax in four European countries was confirmed using cgMLST. In summary, this study presents a novel cgMLST scheme that provides high-resolution strain genotyping for B. anthracis. This scheme can be used in parallel with SNP typing methods to facilitate rapid and harmonized interlaboratory comparisons, essential for global surveillance and outbreak analysis. The scheme is publicly available for application by users, including those with little bioinformatics knowledge.
Keywords: Bacillus anthracis; canonical SNP; cgMLST; genome typing; whole-genome typing
Year: 2021 PMID: 33827898 PMCID: PMC8218748 DOI: 10.1128/JCM.02889-20
Source DB: PubMed Journal: J Clin Microbiol ISSN: 0095-1137 Impact factor: 5.948
FIG 1 Phylogenetic analysis and geographical origin distribution of global B. anthracis genomes. (A) Comparison between the neighbor-joining tree (left) and the maximum likelihood tree (right) constructed for the 584 genomes. The neighbor-joining tree is based on pairwise allelic distances, ignoring untypeable genes; the maximum likelihood tree is based on whole-genome SNPs after filtering regions with high SNP density using Gubbins. Tree visualizations were performed using iTOL. (B) Geographical origin distribution of the 584 B. anthracis genomes used in the evaluation of the core-genome MLST. The updated canonical SNP groups from Sahl et al. (20) were added and color coded.
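The pairwise allelic distance mentioned in panel A can be illustrated with a minimal sketch. It assumes each genome is represented as a list of allele numbers in a fixed locus order and that an untypeable gene is recorded as `None`; the function names, the `None` marker, and the toy profiles are hypothetical and are not taken from the study.

```python
from itertools import combinations


def allelic_distance(p, q):
    """Count loci typed in both profiles whose allele numbers differ.

    Loci marked None (assumed here to mean 'untypeable') are ignored.
    """
    return sum(
        1
        for a, b in zip(p, q)
        if a is not None and b is not None and a != b
    )


def distance_matrix(profiles):
    """Return a symmetric nested dict of pairwise allelic distances."""
    names = list(profiles)
    dist = {name: {name: 0} for name in names}
    for x, y in combinations(names, 2):
        d = allelic_distance(profiles[x], profiles[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


# Toy profiles over four loci (illustrative values only).
profiles = {
    "A": [1, 1, 2, 5],
    "B": [1, 2, 2, None],
    "C": [3, 2, 2, 7],
}
matrix = distance_matrix(profiles)
```

Such a matrix would be the input to a neighbor-joining implementation; the tree-building step itself is not sketched here.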
FIG 2 Geographical origin distribution (A) and a minimum-spanning tree (B) illustrating the last three anthrax outbreaks that occurred in cattle populations in Germany. Each node represents a unique cgMLST allele profile. Colored nodes represent the location of isolation. Numbers on connecting lines refer to the number of different alleles. Previously published genomes are marked with a star (*).
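A minimum-spanning tree over allele profiles, with edges weighted by the number of different alleles, can be sketched with Prim's algorithm. This is an illustrative sketch under stated assumptions, not the software used for the figure: the profile encoding (lists of allele numbers, `None` for untypeable loci), the function names, and the sample names `s1` to `s4` are all hypothetical.

```python
def allele_differences(p, q):
    """Number of loci typed in both profiles that carry different alleles."""
    return sum(
        1
        for a, b in zip(p, q)
        if a is not None and b is not None and a != b
    )


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns a list of (node_in_tree, new_node, weight)."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge that connects the tree to a new node.
        weight, u, v = min(
            (allele_differences(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, weight))
        in_tree.add(v)
    return edges


# Toy profiles over four loci (illustrative values only).
profiles = {
    "s1": [1, 1, 1, 1],
    "s2": [1, 1, 1, 2],
    "s3": [1, 1, 2, 2],
    "s4": [4, 4, 4, 4],
}
tree = minimum_spanning_tree(profiles)
```

The edge weights returned here correspond to the numbers drawn on the connecting lines of such a tree.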
FIG 3 Geographical origin distribution (A) and minimum-spanning tree (B) illustrating 35 distinct spatiotemporal anthrax outbreaks that occurred in Italy. Each node represents a unique cgMLST allele profile. Colored nodes represent the city of isolation, while node labels correspond to different MLVA profiles. Numbers on connecting lines refer to the number of different alleles. (C) Phylogenetic analysis of the strains using a neighbor-joining tree based on the whole-genome SNP data.
FIG 4 Geographical origin distribution (A) and minimum-spanning tree (B) illustrating 57 B. anthracis strains from human cases (heroin users) with injectional anthrax in four different European countries. Each node represents a unique cgMLST allele profile. The sizes of the nodes represent the number of isolates. Colored nodes represent the different clusters identified based on whole-genome SNPs and cgMLST. Numbers on connecting lines refer to the numbers of different alleles. cgMLST profiles with fewer than five different alleles to the central genotype are shaded.
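Applying an allele-difference cutoff to group potentially linked strains can be sketched as single-linkage clustering. The sketch below is hedged: the abstract reports a cutoff of five allele differences, but treating it as inclusive (`<= cutoff`), using single linkage, and the names `link_clusters`, `allele_differences`, and the toy profiles are assumptions made for illustration only.

```python
from itertools import combinations


def allele_differences(p, q):
    """Number of loci typed in both profiles that carry different alleles."""
    return sum(
        1
        for a, b in zip(p, q)
        if a is not None and b is not None and a != b
    )


def link_clusters(profiles, cutoff=5):
    """Single-linkage clusters: join profiles differing by <= cutoff alleles.

    The inclusive comparison is an assumption, not a statement of the
    study's exact rule.
    """
    parent = {name: name for name in profiles}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for x, y in combinations(profiles, 2):
        if allele_differences(profiles[x], profiles[y]) <= cutoff:
            parent[find(x)] = find(y)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy profiles over eight loci (illustrative values only).
profiles = {
    "a": [1, 1, 1, 1, 1, 1, 1, 1],
    "b": [1, 1, 1, 1, 1, 1, 2, 2],
    "c": [1, 1, 1, 2, 2, 2, 2, 2],
    "d": [9, 9, 9, 9, 9, 9, 9, 9],
}
clusters = link_clusters(profiles, cutoff=5)
```

With single linkage, two profiles can fall in the same cluster through an intermediate profile even when their direct distance exceeds the cutoff, so results should be read alongside the pairwise distances.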