Yi Chen1, Yan Luo2, Heather Carleton3, Ruth Timme2, David Melka2, Tim Muruvanda2, Charles Wang2, George Kastanis2, Lee S Katz3, Lauren Turner4, Angela Fritzinger4, Terence Moore5, Robert Stones6, Joseph Blankenship2, Monique Salter2, Mickey Parish2, Thomas S Hammack2, Peter S Evans2, Cheryl L Tarr3, Marc W Allard2, Errol A Strain2, Eric W Brown2.
Abstract
Epidemiological findings of a listeriosis outbreak in 2013 implicated Hispanic-style cheese produced by company A, and pulsed-field gel electrophoresis (PFGE) and whole genome sequencing (WGS) were performed on clinical isolates and representative isolates collected from company A cheese and environmental samples during the investigation. The results strengthened the evidence for cheese as the vehicle. Surveillance sampling and WGS 3 months later revealed that the equipment purchased by company B from company A yielded an environmental isolate highly similar to all outbreak isolates. The results of whole genome and core genome multilocus sequence typing and single nucleotide polymorphism (SNP) analyses were compared to demonstrate the maximum discriminatory power obtained by using multiple analyses, which were needed to differentiate outbreak-associated isolates from a PFGE-indistinguishable isolate collected from a nonimplicated food source in 2012. This unrelated isolate differed from the outbreak isolates by only 7 to 14 SNPs, and as a result, the minimum spanning tree from the whole genome analyses and certain variant calling approaches and phylogenetic algorithms for core genome-based analyses could not differentiate between the outbreak-associated and unrelated isolates. Our data also suggest that SNP/allele counts should always be combined with WGS clustering analysis generated by phylogenetically meaningful algorithms on a sufficient number of isolates, and that the SNP/allele threshold alone does not provide sufficient evidence to delineate an outbreak. The putative prophages were conserved across all the outbreak isolates. All outbreak isolates belonged to clonal complex 5 and serotype 1/2b and had an identical inlA sequence, which did not have premature stop codons.
IMPORTANCE In this outbreak, multiple analytical approaches were used for maximum discriminatory power. A PFGE-matched, epidemiologically unrelated isolate had high genetic similarity to the outbreak-associated isolates, with as few as 7 SNP differences. Therefore, the SNP/allele threshold should not be used as the only evidence to define the scope of an outbreak. It is critical that the SNP/allele counts be complemented by WGS clustering analysis generated by phylogenetically meaningful algorithms to distinguish outbreak-associated isolates from epidemiologically unrelated isolates. Careful selection of a variant calling approach and phylogenetic algorithm is critical for core-genome-based analyses. The whole-genome-based analyses were able to construct the highly resolved phylogeny needed to support the findings of the outbreak investigation. Ultimately, epidemiologic evidence and multiple WGS analyses should be combined to increase confidence levels during outbreak investigations.
Keywords: Listeria monocytogenes; core genome multilocus sequence typing; outbreak; whole genome multilocus sequence typing; whole genome sequencing
Year: 2017 PMID: 28550058 PMCID: PMC5514676 DOI: 10.1128/AEM.00633-17
Source DB: PubMed Journal: Appl Environ Microbiol ISSN: 0099-2240 Impact factor: 4.792
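The abstract's central point, that a SNP threshold alone cannot delineate an outbreak and should be paired with a clustering check, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the isolate names, the toy rooted tree written as nested tuples, the pairwise distances, and the threshold of 20 SNPs are hypothetical and are not taken from the study's pipelines or data. The sketch ignores branch support such as bootstrap values.

```python
def leaf_sets(tree):
    """Return (leaves of this subtree, leaf sets of every internal node).

    A tree is a leaf name (str) or a tuple of subtrees (hypothetical format).
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    clades = []
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)  # the clade rooted at this internal node
    return leaves, clades


def is_monophyletic(tree, group):
    """True if some internal node has exactly `group` as its leaves."""
    _, clades = leaf_sets(tree)
    return frozenset(group) in clades


def within_threshold(distances, group, candidate, threshold):
    """True if `candidate` is within `threshold` SNPs of any group member."""
    return any(distances[frozenset((candidate, g))] <= threshold for g in group)


# Hypothetical toy data: three outbreak isolates and one unrelated isolate.
outbreak = {"out1", "out2", "out3"}
tree = ((("out1", "out2"), "out3"), "unrelated")
distances = {
    frozenset(("unrelated", "out1")): 7,
    frozenset(("unrelated", "out2")): 9,
    frozenset(("unrelated", "out3")): 14,
}

# A threshold alone would pull the unrelated isolate into the outbreak ...
print(within_threshold(distances, outbreak, "unrelated", threshold=20))
# ... whereas the tree shows the outbreak isolates forming their own clade.
print(is_monophyletic(tree, outbreak))
```

In this toy case the threshold test and the clade test disagree about the unrelated isolate, which is the situation in which the abstract recommends relying on both kinds of evidence together with the epidemiology.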
FIG 1. Maximum likelihood tree constructed from SNPs identified by using the CFSAN SNP Pipeline. Isolate identifiers are followed by the abbreviation of the state where they were isolated and the type of sample. The bootstrap value for clade I and the minimum and maximum numbers of pairwise chromosomal SNPs among clade I isolates are listed near the root. The environmental isolate from company B, the New York (NY) cheese isolate, and the California (CA) clinical isolate are highlighted in red, blue, and green boxes, respectively.
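The minimum and maximum pairwise SNP counts mentioned in the caption can be computed from an aligned SNP matrix roughly as follows. This is a minimal sketch, not the CFSAN SNP Pipeline's own code: the function names, the choice to skip positions containing `-` or `N`, and the toy sequences are assumptions made for illustration.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count differing positions in two aligned SNP strings.

    Positions where either sequence has a gap or N are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1 for x, y in zip(a, b) if x != y and x not in skip and y not in skip
    )


def pairwise_range(snp_matrix):
    """Return (min, max) pairwise SNP distance over all isolate pairs."""
    dists = [
        snp_distance(snp_matrix[i], snp_matrix[j])
        for i, j in combinations(sorted(snp_matrix), 2)
    ]
    return min(dists), max(dists)


# Hypothetical toy alignment of variable sites only.
toy = {
    "iso1": "ACGTAC",
    "iso2": "ACGTAT",
    "iso3": "ACGAAT",
    "iso4": "ACGTAC",
}
print(pairwise_range(toy))  # (0, 2)
```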
Single nucleotide polymorphisms that specifically distinguished clade I isolates from the cheese isolate from New York (CFSAN009740) and clinical isolate from California (PNUSAL000355)
| SNP position | Nucleotide at the position in isolate(s) from: | Synonymous change? | Amino acid at the position in isolate(s) from: | Gene locus tag, putative protein function, and corresponding gene locus tag in wgMLST pan-genome | ||||
|---|---|---|---|---|---|---|---|---|
| Clade I | NY | CA | Clade I | NY | CA | |||
| 479720 | T | Yes | CG42_RS02440, ZIP family metal transporter, lmo0414 | |||||
| 607603 | G | G | No | A | A | CG42_RS02995, LacI family transcriptional regulator, lmo0535 | ||
| 782555 | T | Yes | CG42_RS03880, flagellar cap protein FliD, lmo0707 | |||||
| 1080475 | A | No | E | CG42_RS05405, copper homeostasis protein CutC, lmo1018 | ||||
| 1298795 | C | C | No | P | P | CG42_RS06585, DNA primase, LMON_1266 | ||
| 1334724 | C | C | No | T | T | CG42_RS06775, histidine phosphatase family protein, lmo1244 | ||
| 1740888 | T | Yes | CG42_RS08730, VOC family protein, lmo1635 | |||||
| 1762440 | C | Intergenic | ||||||
| 1775838 | C | C | No | A | A | CG42_RS08875, rRNA methyltransferase, lmo1662 | ||
| 2275331 | T | T | Yes | CG42_RS11330, sugar ABC transporter substrate-binding protein, lmo2125 | ||||
| 2311944 | A | A | Yes | CG42_RS11530, xylose isomerase, lmo2160 | ||||
| 2532881 | C | C | No | W | W | CG42_RS12665, glutamate decarboxylase, lmo2434 | ||
The reported SNP position, protein ID, and putative functions are based on the complete and annotated chromosome of isolate CFSAN010068 (GenBank accession number NZ_CP014250.1). All specific SNPs are located on the chromosome.
Underlining indicates that the nucleotide is different from that in clade I isolate.
The locus is in the putative prophage region.
FIG 2. Phylogenetic trees constructed based on wgMLST loci that had summary allele calls for at least one isolate, based on NJ by wgMLST (A) and UPGMA by cgMLST (B). The company B isolate, the New York (NY) cheese isolate, and the California (CA) clinical isolate are highlighted in red, blue, and green boxes, respectively.
FIG 3. Minimum spanning tree based on wgMLST loci that had summary allele calls for all the isolates. Clade I isolates illustrated in Fig. 1 and 2, except the company B environmental isolate, are shown in white circles, and isolate identifiers are not shown. The New Mexico clinical isolate, California clinical isolate, New York cheese isolate, and company B environmental isolate are in black, green, blue, and red, respectively. The area of each circle is proportional to the number of isolates represented. The number of allele differences between two circles is listed on the line connecting the two circles. The length of each connecting line is proportional to the log of the number of allele differences.
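How a minimum spanning tree of this kind can be derived from allele profiles is sketched below, under stated assumptions. The profile format (isolate name mapped to a dict of locus to allele number, with `None` for a missing call), the function names, and the toy data are hypothetical. The sketch uses plain Prim's algorithm with alphabetical tie-breaking and does not attempt to reproduce any tie-breaking or priority rules that dedicated software may apply.

```python
from itertools import combinations


def shared_loci(profiles):
    """Loci present and called (not None) in every isolate."""
    common = set.intersection(*(set(p) for p in profiles.values()))
    return sorted(
        locus
        for locus in common
        if all(p[locus] is not None for p in profiles.values())
    )


def allele_distance(p, q, loci):
    """Number of loci at which two profiles carry different alleles."""
    return sum(1 for locus in loci if p[locus] != q[locus])


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns edges as (from, to, allele differences)."""
    loci = shared_loci(profiles)
    names = sorted(profiles)
    dist = {
        frozenset((a, b)): allele_distance(profiles[a], profiles[b], loci)
        for a, b in combinations(names, 2)
    }
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge leaving the tree; ties broken by isolate names.
        a, b = min(
            ((u, v) for u in in_tree for v in names if v not in in_tree),
            key=lambda e: (dist[frozenset(e)], e),
        )
        edges.append((a, b, dist[frozenset((a, b))]))
        in_tree.add(b)
    return edges


# Hypothetical toy profiles; locus l4 is dropped because C has no call.
profiles = {
    "A": {"l1": 1, "l2": 1, "l3": 1, "l4": 1},
    "B": {"l1": 1, "l2": 1, "l3": 2, "l4": 1},
    "C": {"l1": 2, "l2": 2, "l3": 2, "l4": None},
}
print(minimum_spanning_tree(profiles))  # [('A', 'B', 1), ('B', 'C', 2)]
```

Restricting the comparison to loci called in all isolates mirrors the caption's description; it keeps distances comparable across pairs at the cost of discarding loci with any missing call.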
Alleles that specifically distinguished clade I isolates from the cheese isolate from New York (CFSAN009740) and the clinical isolate from California (PNUSAL000355)
| Locus in the pan-genome | Allele profile for isolate(s) from: | Putative protein function and corresponding gene locus tag in CFSAN010068 genome | ||
|---|---|---|---|---|
| Clade I | NY | CA | ||
| lmo0414 | 88 | ZIP family metal transporter, | ||
| lmo0459 | 5 | 5 | Transcriptional regulator, CG42_RS02650 | |
| lmo0460 | 14 | 14 | Membrane-associated lipoprotein, CG42_RS02660 | |
| lmo0535 | 5 | 5 | LacI family transcriptional regulator, CG42_RS02995 | |
| lmo0707 | 100 | Flagellar cap protein FliD, CG42_RS03880 | ||
| lmo1018 | 84 | Copper homeostasis protein CutC, CG42_RS05405 | ||
| LMON_1266 | 4 | 4 | DNA primase, CG42_RS06585 | |
| lmo1244 | 13 | 13 | Histidine phosphatase family protein, CG42_RS06775 | |
| lmo1337 | 4 | 4 | Rhomboid family intramembrane serine protease, | |
| lmo1635 | 36 | VOC family protein, | ||
| lmo1662 | 11 | 11 | rRNA methyltransferase, | |
| lmo2125 | 2 | 2 | Sugar ABC transporter substrate-binding protein, CG42_RS11330 | |
| lmo2160 | 17 | 17 | Xylose isomerase, | |
| lmo2434 | 15 or 129 | 15 | Glutamate decarboxylase, CG42_RS12665 | |
The locus was included in the wgMLST scheme but not in the cgMLST scheme.
In the BioNumerics allele database, numbers to designate the same alleles for CDC users are different from those for general users.
Underlining indicates that the nucleotide is different from that in clade I isolates.
The functions of genes were identified as hypothetical proteins in the EGD-e annotation (GenBank accession number NC_003210.1), and so the functions of corresponding regions in isolate CFSAN010068 (GenBank accession number NZ_CP014250.1) are listed.
The locus was identified from the complete genome of EGD (NC_022568.1) as part of the pan-genome panel. The designations for other loci are from the EGD-e genome.
Isolates analyzed in the present study
| Strain identifier | GenBank accession no. | Source state | Sample type | Collection date |
|---|---|---|---|---|
| PNUSAL000140 | | New Mexico | Clinical | July 2013 |
| PNUSAL000355 | | California | Clinical | October 2013 |
| CFSAN009740 | | New York | Cheese | December 2012 |
| PNUSAL000569 | | Maryland | Clinical | August 2013 |
| PNUSAL000571 | | Maryland | Clinical | August 2013 |
| PNUSAL000570 | | Maryland | Clinical | August 2013 |
| PNUSAL000517 | | Maryland | Clinical | October 2013 |
| PNUSAL000520 | | Maryland | Clinical | November 2013 |
| CFSAN011016 | | Maryland | Cheese | February 2014 |
| CFSAN011017 | | Maryland | Cheese | February 2014 |
| CFSAN011018 | | Maryland | Cheese | February 2014 |
| CFSAN010068 | | Maryland | Cheese | February 2014 |
| CFSAN010069 | | Maryland | Cheese | February 2014 |
| CFSAN010070 | | Maryland | Cheese | February 2014 |
| CFSAN010071 | | Maryland | Cheese | February 2014 |
| CFSAN010072 | | Maryland | Cheese | February 2014 |
| CFSAN010073 | | Maryland | Cheese | February 2014 |
| CFSAN010074 | | Maryland | Cheese | February 2014 |
| CFSAN010075 | | Maryland | Cheese | February 2014 |
| CFSAN010076 | | Maryland | Cheese | February 2014 |
| CFSAN010077 | | Maryland | Cheese | February 2014 |
| CFSAN011015 | | Maryland | Cheese | February 2014 |
| CFSAN010972 | | Washington, DC | Cheese | February 2014 |
| CFSAN010973 | | Washington, DC | Cheese | February 2014 |
| CFSAN010088 | | Delaware | Environment | February 2014 |
| CFSAN010089 | | Delaware | Environment | February 2014 |
| CFSAN010090 | | Delaware | Environment | February 2014 |
| CFSAN010091 | | Delaware | Environment | February 2014 |
| CFSAN010092 | | Delaware | Environment | February 2014 |
| CFSAN010093 | | Delaware | Environment | February 2014 |
| CFSAN010094 | | Delaware | Environment | February 2014 |
| CFSAN010095 | | Delaware | Environment | February 2014 |
| CFSAN010096 | | Delaware | Environment | February 2014 |
| CFSAN010097 | | Delaware | Environment | February 2014 |
| CFSAN010098 | | Delaware | Environment | February 2014 |
| CFSAN018314 | | Delaware | Environment | May 2014 |
| CFSAN010067 | | Virginia | Cheese | February 2014 |
| CFSAN010078 | | Virginia | Cheese | February 2014 |
| CFSAN010079 | | Virginia | Cheese | February 2014 |
| CFSAN010080 | | Virginia | Cheese | February 2014 |
| CFSAN010081 | | Virginia | Cheese | February 2014 |
| CFSAN010082 | | Virginia | Cheese | February 2014 |
| CFSAN010083 | | Virginia | Cheese | February 2014 |
| CFSAN010084 | | Virginia | Cheese | February 2014 |
| CFSAN010085 | | Virginia | Cheese | February 2014 |
| CFSAN010086 | | Virginia | Cheese | February 2014 |
| CFSAN010087 | | Virginia | Cheese | February 2014 |
| CFSAN010754 | | Virginia | Cheese | February 2014 |
| CFSAN010755 | | Virginia | Cheese | February 2014 |
| CFSAN010756 | | Virginia | Cheese | February 2014 |
| CFSAN010757 | | Virginia | Cheese | February 2014 |
| CFSAN010758 | | Virginia | Cheese | February 2014 |
| CFSAN010759 | | Virginia | Cheese | February 2014 |
| CFSAN010760 | | Virginia | Cheese | February 2014 |
| CFSAN010761 | | Virginia | Cheese | February 2014 |
| CFSAN010762 | | Virginia | Cheese | February 2014 |
| CFSAN010763 | | Virginia | Cheese | February 2014 |
All isolates were serotype 1/2b, CC5. All isolates except PNUSAL000140 had the PFGE pattern GX6A16.0259/GX6A12.2046 (AscI/ApaI).
For identification of SNPs via the CFSAN SNP Pipeline, the completely closed genome of the reference isolate and raw reads from other isolates were used. The closed genome was not used in the wgMLST/cgMLST analyses.