Clémentine Henri1, Pimlapas Leekitcharoenphon2, Heather A Carleton3, Nicolas Radomski1, Rolf S Kaas2, Jean-François Mariet1, Arnaud Felten1, Frank M Aarestrup2, Peter Gerner Smidt3, Sophie Roussel1, Laurent Guillier1, Michel-Yves Mistou1, René S Hendriksen2.
Abstract
Background/objectives: Whole genome sequencing (WGS) has proven to be a powerful subtyping tool for foodborne pathogenic bacteria such as L. monocytogenes. The value of genome-scale analysis for national surveillance, outbreak detection, and source tracking has been widely documented. The genomic data, however, can be exploited with many different bioinformatics methods, such as single nucleotide polymorphism (SNP) analysis, core-genome multi locus sequence typing (cgMLST), whole-genome multi locus sequence typing (wgMLST), or multi locus predicted protein sequence typing (MLPPST) on either the core genome (cgMLPPST) or the pan-genome (wgMLPPST). Currently, few studies compare these analytical approaches. Our objective was to assess and compare the genomic methods that can be implemented to cluster isolates of L. monocytogenes.
Keywords: Listeria monocytogenes; PFGE; SNPs; WGS; cgMLST; conventional MLST; surveillance; wgMLST
Year: 2017 PMID: 29238330 PMCID: PMC5712588 DOI: 10.3389/fmicb.2017.02351
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Phylogenetic trees of the 208 L. monocytogenes strains, based on the genomic MLST schemes defined by BioNumerics® (core and pan) and on SNP analysis with EGD-e and EGD as references. Trees were drawn in circular layout using iTOL. The inner circle represents the lineage of each strain, the second ring the PCR-serotype, the third ring the pulsotype cluster, and the last two rings the results of conventional seven-locus MLST typing (CC and ST). Color codes for lineage, PCR-serotype, and conventional seven-locus MLST CC are shown in the figure legend. (A) The analysis was performed on the 1,748-core-gene scheme from BioNumerics, and the dendrogram was built using the UPGMA algorithm with the allele calls treated as categorical data. (B) The analysis was performed on the pan-gene scheme from BioNumerics, and the dendrogram was built using the UPGMA algorithm with the allele calls treated as categorical data. (C) The SNP tree was constructed from SNPs identified using the CSI Phylogeny pipeline available from the Center for Genomic Epidemiology (www.genomicepidemiology.org). EGD-e was used as the reference genome to call SNPs. The SNP alignments were subjected to maximum-likelihood tree construction using PhyML with 100 bootstrap replicates. (D) The SNP tree was constructed from SNPs identified using the CSI Phylogeny pipeline available from the Center for Genomic Epidemiology (www.genomicepidemiology.org). EGD was used as the reference genome to call SNPs. The SNP alignments were subjected to maximum-likelihood tree construction using PhyML with 100 bootstrap replicates.
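To illustrate what "UPGMA with the allele calls considered categorical data" in panels (A) and (B) can mean in practice, here is a minimal Python sketch. It is not the BioNumerics implementation: the distance definition (fraction of differing loci among loci called in both profiles), the `MISSING` marker, the function names, and the toy profiles are all assumptions made for illustration.

```python
from itertools import combinations

MISSING = None  # assumed marker for a locus without an allele call


def allelic_distance(a, b):
    """Fraction of loci called in both profiles whose allele numbers differ.

    Allele numbers are treated as categorical labels: only equality matters,
    not the numeric gap between them.
    """
    shared = [(x, y) for x, y in zip(a, b)
              if x is not MISSING and y is not MISSING]
    if not shared:
        return 1.0  # assumption: no comparable locus means maximal distance
    return sum(x != y for x, y in shared) / len(shared)


def upgma(profiles):
    """Build a nested-tuple tree by average-linkage (UPGMA) clustering."""
    dist = {frozenset(pair): allelic_distance(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}

    def average(members_1, members_2):
        # UPGMA distance: unweighted mean over all cross-cluster strain pairs.
        total = sum(dist[frozenset((i, j))] for i in members_1 for j in members_2)
        return total / (len(members_1) * len(members_2))

    # Each cluster is (subtree, list of strain names it contains).
    clusters = [(name, [name]) for name in profiles]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average(clusters[ij[0]][1], clusters[ij[1]][1]))
        (tree_1, members_1), (tree_2, members_2) = clusters[i], clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(((tree_1, tree_2), members_1 + members_2))
    return clusters[0][0]


# Toy allele profiles over four loci (hypothetical strains, not study data).
profiles = {
    "A": [1, 1, 1, 1],
    "B": [1, 1, 1, 2],
    "C": [2, 2, 2, 2],
    "D": [2, 2, 2, MISSING],
}
tree = upgma(profiles)
```

Because allele numbers are arbitrary labels, alleles 1 and 2 are exactly as far apart as alleles 1 and 950. This is the main design difference from the SNP trees in panels (C) and (D), which are built from a nucleotide alignment.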
Backward comparison with routine typing methods.
| Genomic method | Lineage (%) | PCR-serotype (%) | MLST (%) | PFGE (%) |
| --- | --- | --- | --- | --- |
| Core genome MLST | 100.0 | 96.6 | 99.5 | 67.3 |
| Whole genome MLST | 100.0 | 97.6 | 97.1 | 68.8 |
| SNP tree EGD-e | 100.0 | 99.0 | 94.7 | 69.2 |
| SNP tree EGD | 100.0 | 97.6 | 94.7 | 67.8 |
| CgMLPPST tree based on the study panel | 100.0 | 98.1 | 97.1 | 73.1 |
| WgMLPPST tree (Shell) | 99.0 | 96.6 | 87.50 | 62.5 |
| WgMLPPST tree (Cloud) | 83.2 | 88.0 | 87.02 | 70.2 |
The performance of the genomic methods was measured by their concordance with routine methods (lineage, PCR-serotype, MLST, PFGE). A value of 100% means that all strains from a given routine-method group clustered together in the corresponding tree. For instance, all strains clustered according to their lineage (I or II) in the cgMLST, wgMLST, SNP, and core-gene trees, but only 99.0 and 83.2% of strains did so in the two wgMLPPST trees (Shell and Cloud, respectively). See the detailed counts in the Supplementary Table.
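The text does not spell out how the concordance percentage was counted, so the following Python sketch shows one plausible reading only. It assumes that a strain is concordant when it sits in the largest clade made up exclusively of its own routine-typing group. The nested-tuple tree format, the function names, and the example strains are hypothetical.

```python
from collections import Counter


def leaves(tree):
    """List the leaf names of a nested-tuple tree."""
    if isinstance(tree, tuple):
        return [name for child in tree for name in leaves(child)]
    return [tree]


def concordance(tree, group_of):
    """Percentage of strains located in the largest pure clade of their group.

    `group_of` maps every leaf of `tree` to its routine-typing group
    (for example lineage I or II) and must contain exactly those leaves.
    """
    best = Counter()  # group -> size of its largest single-group clade

    def visit(node):
        members = leaves(node)
        groups = {group_of[m] for m in members}
        if len(groups) == 1:
            group = groups.pop()
            best[group] = max(best[group], len(members))
            return  # nested clades cannot be larger than this one
        for child in node:
            visit(child)

    visit(tree)
    return 100.0 * sum(best.values()) / len(group_of)


# Hypothetical six-strain tree in which s6 (lineage I) falls among lineage II.
example_tree = ((("s1", "s2"), "s3"), (("s4", "s5"), "s6"))
lineage = {"s1": "I", "s2": "I", "s3": "I", "s4": "II", "s5": "II", "s6": "I"}
score = concordance(example_tree, lineage)
```

Under this reading, the misplaced strain `s6` is the only non-concordant one, so five of six strains count as concordant. Other definitions, such as allowing a small number of intruding strains in a clade, would give different percentages.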
Figure 2. Visual comparison of genome SNP trees using EGD-e or EGD as reference. Using R software, the SNP trees built for the study panel of 208 L. monocytogenes strains were compared. The two trees were placed face to face and corresponding strains were linked (on the left, the SNP tree using EGD as reference; on the right, the SNP tree using EGD-e as reference). The connections between strains are colored according to the CC of the strains (refer to the color code). The two references are indicated in red. Nodes were rotated to match corresponding strains in both trees as closely as possible. Similar clusters are connected by straight lines, while curved lines connect strains from distinct clusters.
Figure 3. Visual comparison of cgMLST and wgMLST. R software was used to compare the cgMLST and wgMLST trees built for the study panel of 208 L. monocytogenes strains. The two trees were placed face to face and corresponding strains were linked (on the left, cgMLST; on the right, wgMLST). The connections between strains are colored according to the CC of the strains (refer to the color code). Nodes were rotated to match corresponding strains in both trees as closely as possible. Similar clusters are connected by straight lines, while curved lines connect strains from distinct clusters.
Figure 4. Visual comparison of genome SNP and wgMLST. R software was used to compare the genome SNP and wgMLST trees built for the study panel (on the left, the SNP tree; on the right, wgMLST). The two trees were placed face to face and corresponding strains were linked. The connections between strains are colored according to the CC of the strains (refer to the color code). Nodes were rotated to match corresponding strains in both trees as closely as possible. Similar clusters are connected by straight lines, while curved lines connect strains from distinct clusters.